https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#Head https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://www.nanopub.org/nschema#hasAssertion https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#assertion https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://www.nanopub.org/nschema#hasProvenance https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#provenance https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://www.nanopub.org/nschema#hasPublicationInfo https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#pubinfo https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.nanopub.org/nschema#Nanopublication https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#assertion http://id.crossref.org/issn/2451-8492 http://purl.org/dc/terms/title Data Science https://doi.org/10.3233/DS-240059 http://purl.org/dc/terms/abstract Measuring data drift is essential in machine learning applications where model scoring (evaluation) is done on data samples that differ from those used in training. The Kullback-Leibler divergence is a common measure of shifted probability distributions, for which discretized versions are invented to deal with binned or categorical data. We present the Unstable Population Indicator, a robust, flexible and numerically stable, discretized implementation of Jeffrey's divergence, along with an implementation in a Python package that can deal with continuous, discrete, ordinal and nominal data in a variety of popular data types. We show the numerical and statistical properties in controlled experiments. It is not advised to employ a common cut-off to distinguish stable from unstable populations, but rather to let that cut-off depend on the use case. https://doi.org/10.3233/DS-240059 http://purl.org/dc/terms/date 2024-06-26 https://doi.org/10.3233/DS-240059 http://purl.org/dc/terms/hasPart https://w3id.org/kpxl/ios/ds/np/RA0XRooQKz2A7aoP0VJLS2NKcvQv-n7RwPoYtcD4wtTPc https://doi.org/10.3233/DS-240059 http://purl.org/dc/terms/isPartOf http://id.crossref.org/issn/2451-8492 https://doi.org/10.3233/DS-240059 http://purl.org/dc/terms/title Measuring Data Drift with the Unstable Population Indicator https://doi.org/10.3233/DS-240059 http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://purl.org/spar/fabio/ResourcePaper https://orcid.org/0000-0003-2581-8370 http://schema.org/affiliation https://ror.org/04dkp9463 https://orcid.org/0000-0003-2581-8370 http://schema.org/affiliation https://ror.org/05xvt9f17 https://orcid.org/0000-0003-2581-8370 http://schema.org/email datascience@marcelhaas.com https://orcid.org/0000-0003-2581-8370 http://xmlns.com/foaf/0.1/name Marcel R. Haas https://orcid.org/0009-0003-5030-0108 http://schema.org/affiliation https://ror.org/04b8v1s79 https://orcid.org/0009-0003-5030-0108 http://schema.org/affiliation https://ror.org/04dkp9463 https://orcid.org/0009-0003-5030-0108 http://schema.org/email L.Sibbald@tilburguniversity.edu https://orcid.org/0009-0003-5030-0108 http://xmlns.com/foaf/0.1/name Lisette Sibbald https://ror.org/04b8v1s79 http://xmlns.com/foaf/0.1/name Department of Methodology and Statistics and Department of Cognitive Neuropsychology, Tilburg University, Prof. Cobbenhagenlaan 125, 5037 DB Tilburg, The Netherlands https://ror.org/04dkp9463 http://xmlns.com/foaf/0.1/name Business Intelligence, University of Amsterdam, Spui 21, 1012WX Amsterdam, The Netherlands https://ror.org/05xvt9f17 http://xmlns.com/foaf/0.1/name Public Health and Primary Care, Leiden University Medical Center, Albinusdreef 2, The Netherlands https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#author-list http://www.w3.org/1999/02/22-rdf-syntax-ns#_1 https://orcid.org/0000-0003-2581-8370 https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#author-list__1 http://www.w3.org/1999/02/22-rdf-syntax-ns#_2 https://orcid.org/0009-0003-5030-0108 https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#provenance https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#assertion http://www.w3.org/ns/prov#wasAttributedTo https://orcid.org/0000-0003-2581-8370 https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#assertion http://www.w3.org/ns/prov#wasAttributedTo https://orcid.org/0009-0003-5030-0108 https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#pubinfo https://orcid.org/0000-0002-1267-0234 http://xmlns.com/foaf/0.1/name Tobias Kuhn https://orcid.org/0000-0003-2581-8370 http://xmlns.com/foaf/0.1/name Marcel R. Haas https://orcid.org/0009-0003-5030-0108 http://xmlns.com/foaf/0.1/name Lisette Sibbald https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#author-list http://www.w3.org/1999/02/22-rdf-syntax-ns#_1 https://orcid.org/0000-0003-2581-8370 https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#author-list http://www.w3.org/1999/02/22-rdf-syntax-ns#_2 https://orcid.org/0009-0003-5030-0108 https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#sig http://purl.org/nanopub/x/hasAlgorithm RSA https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#sig http://purl.org/nanopub/x/hasPublicKey MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCjDGQCS1S+SRnERDuYDXOugdYUP0efEquHJEEHAbU/uLzBVlga89zqrNPCS7fBE6lArBUWEmT8eLKdMapyqvAzI1J3jUWTMhDJF+XFBkUiuiFfNSc4vJJcmi0yujtnuzXsRIG202jyaP4f5ULoskFwaZOSBZJfiE0dsB3D7DTIAQIDAQAB https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#sig http://purl.org/nanopub/x/hasSignature Ox+5X6nHLumNtHd4Ka2ICEWhUX+v6KVWn4UKDEEAixySaGj9TJt/mBFpssxtxcrM29g070GCs1SakxQ2Re3c6lUEEkHh/E4MLDc9ReR2vZoLi2oUzJfKzWC+WuTjML12q88gZUw9uoWThRpPW+j4XOn8dUrPk8DffrF/R1+Hrg8= https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#sig http://purl.org/nanopub/x/hasSignatureTarget https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#sig http://purl.org/nanopub/x/signedBy https://orcid.org/0000-0002-1267-0234 https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://purl.org/dc/terms/created 2024-07-12T09:07:29.273Z https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://purl.org/dc/terms/creator https://orcid.org/0000-0002-1267-0234 https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://purl.org/dc/terms/isPartOf https://doi.org/10.3233/DS-240059 https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://purl.org/dc/terms/license https://creativecommons.org/licenses/by/4.0/ https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://purl.org/nanopub/x/hasNanopubType http://purl.org/spar/fabio/ScholarlyWork https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://purl.org/nanopub/x/hasNanopubType https://w3id.org/kpxl/ios/ds/terms/DataScienceNanopub https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://purl.org/nanopub/x/introduces https://doi.org/10.3233/DS-240059 https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://purl.org/nanopub/x/supersedes https://w3id.org/kpxl/ios/ds/np/RALO1noJ6z4w0bumoQuKpUVKT7HE_zagqAA8Qy4djeLg0 https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://purl.org/nanopub/x/wasCreatedAt https://nanodash.petapico.org/ https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://purl.org/ontology/bibo/authorList https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#author-list https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ http://www.w3.org/2000/01/rdf-schema#label Article: Measuring Data Drift with the Unstable Population Indicator https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ https://w3id.org/np/o/ntemplate/wasCreatedFromProvenanceTemplate http://purl.org/np/RAi6zZAwhaJ23Hzg4lIjlPir6Take3ZQp-lS9skfBEwfQ https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate http://purl.org/np/RAA2MfqdBCzmz9yVWjKLXNbyfBNcwsMmOqcNUxkk1maIM https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate http://purl.org/np/RAh1gm83JiG5M6kDxXhaYT1l49nCzyrckMvTzcPn-iv90 https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate http://purl.org/np/RAjpBMlw3owYhJUBo3DtsuDlXsNAJ8cnGeWAutDVjuAuI https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate https://w3id.org/np/RA5R_qv3VsZIrDKd8Mr37x3HoKCsKkwN5tJVqgQsKhjTE https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate https://w3id.org/np/RAIabr2sRVJ-YOIwZRD__BVMJKnq3QtQw_mjLIGSACPAI https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate https://w3id.org/np/RA_JdI7pfDcyvEXLr_gper3h8egmNggeTqkJbyHrlMEdo https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate https://w3id.org/np/RAoWx0AJvNw-WqkGgZO4k8udNCg6kMcGZARN3DgO_5TII https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ https://w3id.org/np/o/ntemplate/wasCreatedFromTemplate https://w3id.org/np/RAeQJfX3lMDqtzyddnRmlBvxSoWohzEKzsaMKWrR8K6J0