(1.烟台市国民体质监测中心,山东 烟台 264003;2.滨州医学院 公共卫生与管理学院,山东 烟台 264003;3.滨州医学院 临床医学院,山东 烟台 264003;4.烟台市疾病预防控制中心,山东 烟台 264003) 摘 要:应用机器学习模型对青少年运动员在新冠肺炎疫情流行期间的应对能力状况进行快速识别和预测。利用“问卷星”在线调查平台对来自烟台市6家体育运动学校1 699名7~17岁青少年运动员开展问卷调查,收集疫情防护知识与行为信息并计算应对能力得分,利用动态聚类分析将应对能力分为高、低两个响应级别,并建立随机森林模型度量各类应对能力影响因素的重要性;以各类影响因素为输入特征,建立BP神经网络、支持向量机和多元自适应回归样条3种机器学习模型对响应级别进行分类预测,并与Logistic回归模型进行预测准度和分类性能的比较。结果显示:参与调查的青少年运动员对疫情防护知识知晓程度不够全面,将近1/2的运动员无法克服紧张和恐慌心理,约3/4的运动员无法完成训练计划;疫情应对能力影响因素重要性排序结果显示,年龄、训练项目和居住地区位居前3位;机器学习模型预测结果显示,与Logistic回归相比,基于径向基核函数的支持向量机模型平均准确率(80.32%)最高,提升7.15%,多元自适应回归样条加法模型的灵敏度(0.86)最高,提升12.24%,5-3-2-1双隐含层结构BP神经网络模型的特异度(0.83)最高,提升62.11%。结果表明:机器学习模型对青少年运动员新冠肺炎疫情应对能力的模拟具有可行性,预测准确度和分类性能优于Logistic回归模型,特异度最高的BPN模型更擅长判断疫情应对能力较弱的青少年运动员,推荐用于疫情防控期间干预指导目标的快速识别。 |
YE Chun-ming1, ZHAO Sheng-wen2, YANG Xiu-hong3, LIU Hai-yun4
(1. Yantai National Physique Monitoring Center, Yantai 264003, China; 2. School of Public Health and Management, Binzhou Medical University, Yantai 264003, China; 3. School of Clinical Medicine, Binzhou Medical University, Yantai 264003, China; 4. Yantai Center for Disease Control and Prevention, Yantai 264003, China) Abstract: The authors applied machine learning models to rapidly identify and predict the coping ability of teenage athletes during the COVID-19 epidemic. Using the “Questionnaire Star” (问卷星) online survey platform, the authors surveyed 1 699 teenage athletes aged 7-17 from 6 sports schools in Yantai, collected information on their epidemic protection knowledge and behavior, and calculated coping ability scores. Dynamic clustering analysis was used to divide coping ability into two response levels, high and low, and a random forest model was built to measure the importance of the factors affecting coping ability. With these factors as input features, the authors built 3 machine learning models (BP neural network, support vector machine and multivariate adaptive regression splines) to classify and predict the response level, and compared their prediction accuracy and classification performance with those of a Logistic regression model.
The results show that the surveyed teenage athletes' knowledge of epidemic protection was not comprehensive enough, nearly 1/2 of the athletes were unable to overcome nervousness and panic, and approximately 3/4 of the athletes were unable to complete their training plans. In the importance ranking of the factors affecting coping ability, age, training program and residence area ranked in the top 3. Compared with Logistic regression, the support vector machine model based on the radial basis kernel function had the highest average accuracy (80.32%), an improvement of 7.15%; the additive multivariate adaptive regression spline model had the highest sensitivity (0.86), an improvement of 12.24%; and the BP neural network model with a 5-3-2-1 double hidden layer structure had the highest specificity (0.83), an improvement of 62.11%. The results indicate that it is feasible to use machine learning models to model teenage athletes' ability to cope with the COVID-19 epidemic, and that their prediction accuracy and classification performance are better than those of the Logistic regression model. The BPN model, which has the highest specificity, is better at identifying teenage athletes with weaker coping ability, and is recommended for the rapid identification of intervention and guidance targets during epidemic prevention and control.
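The three measures compared in the abstract (accuracy, sensitivity, specificity) and the "improvement over Logistic regression" figures can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: the high response level is coded 1 and treated as the positive class (consistent with the statement that the highest-specificity model best identifies athletes with weaker coping ability), the improvement is computed as a relative percentage gain, and all labels below are hypothetical toy data, not the study's data. The function names are illustrative only.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity (assumes both classes occur)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # share of high-level athletes found
        "specificity": tn / (tn + fp),  # share of low-level athletes found
    }


def relative_gain_percent(model_value, baseline_value):
    """Relative improvement of a model over a baseline, in percent.

    Assumption: the abstract's percentages are relative gains; it does
    not say whether they are relative or absolute differences.
    """
    return 100.0 * (model_value - baseline_value) / baseline_value


# Hypothetical labels: 1 = high response level, 0 = low response level.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_model = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]     # e.g. a machine learning model
y_baseline = [1, 1, 1, 1, 0, 0, 0, 1, 1, 1]  # e.g. a Logistic regression

model = classification_metrics(y_true, y_model)
baseline = classification_metrics(y_true, y_baseline)
for name in ("accuracy", "sensitivity", "specificity"):
    gain = relative_gain_percent(model[name], baseline[name])
    print(f"{name}: model={model[name]:.2f}, "
          f"baseline={baseline[name]:.2f}, gain={gain:+.2f}%")
```

In this toy data the model gains specificity at the cost of sensitivity. The abstract reflects the same kind of trade-off: no single model is best on all three measures, so the recommended model depends on which group of athletes needs to be identified.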