Matches in Nanopublications for { ?s ?p ?o <https://w3id.org/kpxl/ios/ds/np/RAp2-E77MOiPhLIbTOtkjV7l_4y1kYc63ZhZaflJ547FQ#assertion>. }
Showing items 1 to 25 of 25 with 100 items per page.
- 0000-0003-2581-8370 type Person assertion.
- DS-230059 type ResourcePaper assertion.
- 0009-0003-5030-0108 type Person assertion.
- 04b8v1s79 type Organization assertion.
- 04dkp9463 type Organization assertion.
- 05xvt9f17 type Organization assertion.
- 0000-0003-2581-8370 name "Marcel R. Haas" assertion.
- 0009-0003-5030-0108 name "Lisette Sibbald" assertion.
- 04b8v1s79 name "Department of Methodology and Statistics and Department of Cognitive Neuropsychology, Tilburg University, Prof. Cobbenhagenlaan 125, 5037 DB Tilburg, The Netherlands" assertion.
- 04dkp9463 name "Business Intelligence, University of Amsterdam, Spui 21, 1012WX Amsterdam, The Netherlands" assertion.
- 05xvt9f17 name "Public Health and Primary Care, Leiden University Medical Center, Albinusdreef 2, The Netherlands" assertion.
- 2451-8492 title "Data Science" assertion.
- DS-230059 title "Measuring Data Drift with the Unstable Population Indicator" assertion.
- DS-230059 date "2024" assertion.
- DS-230059 authoredBy 0000-0003-2581-8370 assertion.
- DS-230059 authoredBy 0009-0003-5030-0108 assertion.
- DS-230059 isPartOf 2451-8492 assertion.
- DS-230059 abstract "Measuring data drift is essential in machine learning applications where model scoring (evaluation) is done on data samples that differ from those used in training. The Kullback-Leibler divergence is a common measure of shifted probability distributions, for which discretized versions are invented to deal with binned or categorical data. We present the Unstable Population Indicator, a robust, flexible and numerically stable, discretized implementation of Jeffrey's divergence, along with an implementation in a Python package that can deal with continuous, discrete, ordinal and nominal data in a variety of popular data types. We show the numerical and statistical properties in controlled experiments. It is not advised to employ a common cut-off to distinguish stable from unstable populations, but rather to let that cut-off depend on the use case." assertion.
- 0000-0003-2581-8370 email "datascience@marcelhaas.com" assertion.
- 0009-0003-5030-0108 email "L.Sibbald@tilburguniversity.edu" assertion.
- 0000-0003-2581-8370 affiliation 04dkp9463 assertion.
- 0000-0003-2581-8370 affiliation 05xvt9f17 assertion.
- 0009-0003-5030-0108 affiliation 04b8v1s79 assertion.
- 0009-0003-5030-0108 affiliation 04dkp9463 assertion.
- DS-230059 hasPart RA4SqymT32eltSYbr41lDKMBV3Zr8nEBEXRFhfOrN6f3k assertion.
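The abstract of DS-230059 above describes a discretized Jeffrey's divergence for binned or categorical data. The following is a minimal sketch of that general idea, not the authors' Python package or their exact definition of the Unstable Population Indicator. The function name `jeffreys_divergence`, the `smoothing` parameter, and the use of additive smoothing to keep empty categories finite are all assumptions made for illustration.

```python
import math
from collections import Counter


def jeffreys_divergence(reference, sample, smoothing=1.0):
    """Symmetric (Jeffreys) divergence between two categorical samples.

    Hypothetical helper for illustration only. `smoothing` is an assumed
    additive pseudo-count (must be > 0) that keeps the logarithm finite
    when a category is missing from one of the two samples.
    """
    if smoothing <= 0:
        raise ValueError("smoothing must be positive")

    ref_counts = Counter(reference)
    new_counts = Counter(sample)

    # Use the union of categories so that both distributions share one support.
    categories = set(ref_counts) | set(new_counts)
    k = len(categories)

    ref_total = sum(ref_counts.values()) + smoothing * k
    new_total = sum(new_counts.values()) + smoothing * k

    divergence = 0.0
    for category in categories:
        p = (ref_counts[category] + smoothing) / ref_total
        q = (new_counts[category] + smoothing) / new_total
        # Jeffreys divergence is KL(p||q) + KL(q||p) = sum (p - q) * ln(p / q).
        divergence += (p - q) * math.log(p / q)
    return divergence


training = ["a", "a", "b", "b", "c", "c"]
scoring = ["a", "a", "a", "a", "b", "d"]
drift = jeffreys_divergence(training, scoring)
```

Each term `(p - q) * ln(p / q)` is non-negative, so the result is zero for identical distributions and grows as they diverge. In line with the abstract, any threshold applied to `drift` would need to be chosen per use case rather than taken as a fixed cut-off.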