Title: A prediction model for genetic cholestatic disease in infancy using the machine learning approach
Source: Journal of Pediatric Gastroenterology and Nutrition 2025, Jul 30. [E–publication]
Date of publication: July 2025
Publication type: Article
Abstract: Objectives: Cholestasis in infancy poses a complex clinical conundrum for pediatric hepatologists, warranting timely diagnosis, especially for genetic diseases. This study aims to create machine learning (ML)-based prediction models, referred to as Jaundice Diagnosis Easy for Baby (JADE-B), to identify the subjects prone to genetic causes of cholestasis.
Methods: We retrieved patient data from the Integrated Medical Database at a university-affiliated tertiary medical center from 2006 to 2018. Patients with cholestatic disease were identified using liver-disease-specific International Classification of Diseases codes. A total of 47 clinical and laboratory parameters were used as ML inputs to predict a positive genetic disease, defined as a disease-specific genetic diagnosis matched with phenotype. Four classifiers (logistic regression, XGBoost [XGB], LightGBM [LGBM], and random forest) were used to build the models.
Results: From a pool of 1845 patients, 1008 infants below 1 year of age diagnosed with cholestatic liver disease were included in the analysis. A set of 47 clinical and laboratory features was used to train the ML models. We built five sets of models (Models 1-5), yielding areas under the receiver operating characteristic curve of 0.869, 0.884, 0.855, 0.852, and 0.836, respectively. A JADE-B model was built using 20 simple and widely accessible clinical parameters at disease onset, up to 1 month, to predict patients with genetic disorders.
Conclusions: The machine learning model prioritizes cholestatic infants for genetic diagnostic testing and referral and optimizes the use of genetic diagnostic resources.
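Note on the reported metric: the area under the receiver operating characteristic curve (AUROC) used to compare Models 1-5 can be read as the probability that a randomly chosen infant with a genetic diagnosis receives a higher predicted score than a randomly chosen infant without one. The following is a minimal sketch of that calculation, not the authors' code; the function name `auroc` and the toy labels and scores are hypothetical and are not study data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption, for illustration only):
# 1 = genetic diagnosis confirmed, 0 = not confirmed.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.91, 0.20, 0.65, 0.70, 0.10, 0.80, 0.35, 0.05]
toy_auc = auroc(toy_labels, toy_scores)
print(f"toy AUROC = {toy_auc:.3f}")  # prints "toy AUROC = 0.933"
```

In the toy data, 14 of the 15 positive-negative pairs are ranked correctly, giving 14/15. A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.836 to 0.884 should be read.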