Prediction of Children's Subjective Well-Being from Physical Activity and Sports Participation Using Machine Learning Techniques: Evidence from a Multinational Study
Keywords: children, physical activity, machine learning, subjective well-being, physical literacy, SHAP, XGBoost, sports participation
Abstract
Highlights:
What are the main findings? Machine learning models, particularly XGBoost and LightGBM, predict children's subjective well-being with up to 50% explained variance, surpassing traditional regression. Sports participation, including exercise frequency, emerges as a key predictor, with linear benefits observed across diverse global samples.
What is the implication of the main finding? These results support the development of targeted sports programs to enhance child well-being, leveraging advanced predictive tools. The findings advocate for integrating physical literacy into educational policies to address global inactivity trends in youth.

Background/Objectives: Traditional models such as ordinary least squares (OLS) struggle to capture non-linear relationships in children's subjective well-being (SWB), which is associated with physical activity. This study evaluated machine learning (ML) for predicting SWB, focusing on sports participation, and explored theoretical prediction limits using a global dataset. It addresses a gap in understanding complex patterns across diverse cultural contexts.

Methods: We analyzed 128,184 records from the ISCWeB survey (ages 6–14, 35 countries), with self-reported data on sports frequency, emotional states, and family support. To assess cross-country generalizability, we used GroupKFold cross-validation (grouped by country) and leave-one-country-out (LOCO) validation, which yielded a mean R² = 0.45 ± 0.05 and indicated robustness beyond country-specific patterns. We used SHAP for interpretability and bootstrapping for error estimation. No pre-registration was required for this secondary analysis.

Results: XGBoost and LightGBM outperformed OLS, achieving R² up to 0.504 in restricted datasets (sensitivity analysis excluding affective leakage: R² = 0.35). Sports-related variables (e.g., exercise frequency) were positively associated with SWB predictions (SHAP values: +0.15–0.25; incremental ΔR² = 0.06 over a base of demographic, family, and school variables). Using test–retest reliability from the literature (r = 0.74), the estimated irreducible RMSE was 0.941; XGBoost achieved RMSE = 1.323, approaching the predictability bound, with 68.1% of explainable variance captured after noise adjustment. Partial dependence plots showed linear associations with exercise, with no satiation, and a slight decline with age.

Conclusions: ML improves SWB prediction in children, highlights associations with sports participation, and approaches the bound on predictable variance. These findings suggest potential for data-driven tools to identify patterns, such as through physical literacy pathways, that can inform physical activity interventions. However, longitudinal studies are needed to explore causality and address cultural biases in self-reports. © 2025 by the authors.
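The noise-adjusted figures in the Results can be checked with a few lines of arithmetic. The sketch below is one reading that is consistent with the reported numbers, not necessarily the authors' exact procedure. It assumes the classical test theory relation, in which a measure with test–retest reliability r and standard deviation σ has an irreducible RMSE of σ·sqrt(1 − r), and in which at most a fraction r of the outcome variance is explainable. The outcome standard deviation is not stated in the abstract, so `implied_sd` is backed out from the reported floor and is an assumption; the function names are illustrative.

```python
import math


def irreducible_rmse(sd: float, reliability: float) -> float:
    """Noise floor under classical test theory: sd * sqrt(1 - r)."""
    return sd * math.sqrt(1.0 - reliability)


def share_of_explainable_variance(r2: float, reliability: float) -> float:
    """Fraction of the reliable (non-noise) variance that a model explains: R^2 / r."""
    return r2 / reliability


# Values quoted in the abstract.
RELIABILITY = 0.74   # test-retest r from the literature
R2_MODEL = 0.504     # best reported R^2
RMSE_FLOOR = 0.941   # reported irreducible RMSE

# Assumption: the outcome SD is not reported, so infer the value that
# reproduces the reported floor under the formula above.
implied_sd = RMSE_FLOOR / math.sqrt(1.0 - RELIABILITY)

share = share_of_explainable_variance(R2_MODEL, RELIABILITY)

print(f"implied SD of the SWB score: {implied_sd:.3f}")
print(f"share of explainable variance: {share:.1%}")  # 68.1%
```

Under these assumptions, dividing the best R² by the reliability reproduces the 68.1% reported in the abstract, which suggests the noise adjustment rescales R² by the reliable share of variance.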
More information
| Title according to WOS: | Prediction of Children's Subjective Well-Being from Physical Activity and Sports Participation Using Machine Learning Techniques: Evidence from a Multinational Study |
| Title according to SCOPUS: | Prediction of Children's Subjective Well-Being from Physical Activity and Sports Participation Using Machine Learning Techniques: Evidence from a Multinational Study |
| Journal title: | Children |
| Volume: | 12 |
| Issue: | 8 |
| Publisher: | Multidisciplinary Digital Publishing Institute (MDPI) |
| Publication date: | 2025 |
| Language: | English |
| DOI: | 10.3390/children12081083 |
| Notes: | ISI, SCOPUS |