
Full metadata record

| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Lee, Joohyun (이주현) | - |
| dc.date.accessioned | 2023-05-17T05:18:42Z | - |
| dc.date.available | 2023-05-17T05:18:42Z | - |
| dc.date.issued | 2022-02 | - |
| dc.identifier.citation | SCIENTIFIC REPORTS, v. 12, no. 1, article no. 2250, pp. 1-11 | - |
| dc.identifier.issn | 2045-2322 | - |
| dc.identifier.uri | https://www.nature.com/articles/s41598-022-06333-1 | en_US |
| dc.identifier.uri | https://repository.hanyang.ac.kr/handle/20.500.11754/180694 | - |
| dc.description.abstract | The prevalence of cardiocerebrovascular disease (CVD) is continuously increasing, and it is the leading cause of human death. Since it is difficult for physicians to screen thousands of people, high-accuracy and interpretable methods are needed. We developed four machine learning (ML)-based CVD classifiers (multi-layer perceptron, support vector machine, random forest, and light gradient boosting) based on the Korea National Health and Nutrition Examination Survey (KNHANES). We resampled and rebalanced the KNHANES data using complex sampling weights such that the rebalanced dataset mimics a dataset sampled uniformly from the overall population. For clear risk factor analysis, we removed multicollinear and CVD-irrelevant variables using VIF-based filtering and the Boruta algorithm. We applied the synthetic minority oversampling technique and random undersampling before ML training. The proposed classifiers achieved excellent performance, with AUCs over 0.853. Using Shapley value-based risk factor analysis, we identified age, sex, and the prevalence of hypertension as the most significant risk factors of CVD. Additionally, age, hypertension, and BMI were positively correlated with CVD prevalence, while sex (female), alcohol consumption, and monthly income were negatively correlated. The results showed that feature selection and class balancing effectively improve the interpretability of the models. | - |
| dc.description.sponsorship | This work was supported by the research fund of Hanyang University (HY-2021-2593). | - |
| dc.language | en | - |
| dc.publisher | NATURE PORTFOLIO | - |
| dc.title | Machine learning-based diagnosis and risk factor analysis of cardiocerebrovascular disease based on KNHANES | - |
| dc.type | Article | - |
| dc.relation.no | 1 | - |
| dc.relation.volume | 12 | - |
| dc.identifier.doi | 10.1038/s41598-022-06333-1 | - |
| dc.relation.page | 1-11 | - |
| dc.relation.journal | SCIENTIFIC REPORTS | - |
| dc.contributor.googleauthor | Oh, Taeseob | - |
| dc.contributor.googleauthor | Kim, Dongkyun | - |
| dc.contributor.googleauthor | Lee, Siryeol | - |
| dc.contributor.googleauthor | Won, Changwon | - |
| dc.contributor.googleauthor | Kim, Sunyoung | - |
| dc.contributor.googleauthor | Yang, Ji-soo | - |
| dc.contributor.googleauthor | Yu, Junghwa | - |
| dc.contributor.googleauthor | Kim, Byungsung | - |
| dc.contributor.googleauthor | Lee, Joohyun | - |
| dc.sector.campus | E | - |
| dc.sector.daehak | College of Engineering (공학대학) | - |
| dc.sector.department | Division of Electronic Engineering (전자공학부) | - |
| dc.identifier.pid | joohyunlee | - |
| dc.identifier.article | 2250 | - |
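
The abstract mentions VIF-based filtering to remove multicollinear variables but this record gives no details of the procedure. The following is a minimal sketch of one common variant, assuming an iterative "drop the feature with the highest variance inflation factor" loop and a rule-of-thumb threshold of 10. The function names `vif` and `vif_filter`, the threshold, and the synthetic data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def vif(X: np.ndarray, j: int) -> float:
    """Variance inflation factor of column j: 1 / (1 - R^2), where R^2 comes
    from regressing column j on the remaining columns plus an intercept.
    Assumes column j is not constant."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    return float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)


def vif_filter(X, names, threshold=10.0):
    """Repeatedly drop the feature with the largest VIF until every remaining
    VIF is at or below the threshold (the threshold is an assumed default)."""
    X = np.asarray(X, dtype=float)
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        sub = X[:, keep]
        vifs = [vif(sub, k) for k in range(len(keep))]
        worst = int(np.argmax(vifs))
        if vifs[worst] <= threshold:
            break
        keep.pop(worst)
    return [names[i] for i in keep]


# Synthetic illustration (not KNHANES data): c is almost exactly a + b.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + b + 0.01 * rng.normal(size=500)
kept = vif_filter(np.column_stack([a, b, c]), ["a", "b", "c"])
print(kept)
```

In this toy setup the near-redundant column `c` is the one expected to be removed, leaving two roughly independent features. A real pipeline would apply the filter to the survey variables before any further feature selection or class balancing.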



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
