Machine Learning for Compositional Microbiome Data to Predict a Clinical Outcome: Are Interpretable Methods Up to the Task?
Abstract
Microbiome data are inherently compositional and pose unique challenges for clinical prediction. This study compares interpretable and complex machine learning approaches for predicting a composite clinical outcome in children with cystic fibrosis (CF) treated with lumacaftor/ivacaftor. The binary composite outcome combined nutritional and pulmonary improvement at 12 months. We evaluated classic Lasso and Random Forest models (with forced clinical covariates) and two methods tailored to microbiome data, FLORAL and a Graph Neural Network (GNN), across compositional transformations (CLR, ALR, ILR, arcsine, PA, rCLR, and spline-based discretization). Feature stability was assessed across cross-validation folds.

Random Forests and Lasso achieved the highest and most stable discrimination (ROC-AUC around 0.9), outperforming FLORAL and GNN. While transformation choice had limited impact on predictive accuracy, it affected feature correlations and may influence model interpretability. Model performance was probably constrained by the outcome complexity and missing clinical values, underscoring the need for more robust regularization and representation strategies. These findings highlight both the promise and limits of interpretable models for microbiome-based prediction in cystic fibrosis.
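To make the log-ratio transformations named in the abstract concrete, here is a minimal Python sketch of CLR and ALR applied to a samples-by-taxa count matrix. It is not the authors' pipeline: the function names `clr` and `alr`, the pseudocount of 0.5 used to handle zeros, and the choice of the last taxon as the ALR reference are all assumptions made for illustration.

```python
import numpy as np


def clr(counts, pseudocount=0.5):
    """Centered log-ratio: log of each part minus the row mean of the logs."""
    # The pseudocount (an assumed value) avoids log(0) for unobserved taxa.
    x = np.asarray(counts, dtype=float) + pseudocount
    x = x / x.sum(axis=1, keepdims=True)  # closure to proportions
    log_x = np.log(x)
    return log_x - log_x.mean(axis=1, keepdims=True)


def alr(counts, ref=-1, pseudocount=0.5):
    """Additive log-ratio: log of each part relative to a reference taxon."""
    x = np.asarray(counts, dtype=float) + pseudocount
    log_x = np.log(x)
    ratios = log_x - log_x[:, [ref]]      # subtract the reference column
    return np.delete(ratios, ref, axis=1)  # drop the all-zero reference column


# Toy data: 2 samples, 3 taxa (made-up counts, not from the study).
toy = [[10, 0, 5], [3, 3, 3]]
z = clr(toy)
a = alr(toy)
```

Each CLR row sums to zero, so the transformed features are linearly dependent. ALR avoids this by dropping one taxon, at the cost of making the result depend on which reference is chosen. Differences of this kind are one way a transformation can alter feature correlations even when predictive accuracy barely changes.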
More information
| Title according to WOS: | ID WOS:001691773100029 Not found in local WOS DB |
| Journal title: | 2025 15TH IEEE INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION SYSTEMS, ICPRS |
| Publisher: | IEEE |
| Publication date: | 2025 |
| DOI: | 10.1109/ICPRS66293.2025.11302842 |
| Notes: | ISI |