Repository at Hanyang University: Evaluation of penalized and machine learning methods for asthma disease prediction in the Korean Genome and Epidemiology Study (KoGES)

Full metadata record

DC Field	Value	Language
dc.contributor.author	Choi, Sungkyoung	-
dc.date.accessioned	2024-03-05T06:29:43Z	-
dc.date.available	2024-03-05T06:29:43Z	-
dc.date.issued	2024-02-02	-
dc.identifier.citation	BMC BIOINFORMATICS	en_US
dc.identifier.issn	1471-2105	en_US
dc.identifier.uri	https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-024-05677-x	en_US
dc.identifier.uri	https://repository.hanyang.ac.kr/handle/20.500.11754/189485	-
dc.description.abstract	Background: Genome-wide association studies have successfully identified genetic variants associated with human disease. Various statistical approaches based on penalized and machine learning methods have recently been proposed for disease prediction. In this study, we evaluated the performance of several such methods for predicting asthma using the Korean Chip (KORV1.1) from the Korean Genome and Epidemiology Study (KoGES). Results: First, single-nucleotide polymorphisms were selected via single-variant tests using logistic regression with adjustment for several epidemiological factors. Next, we evaluated the following methods for disease prediction: ridge, least absolute shrinkage and selection operator, elastic net, smoothly clipped absolute deviation, support vector machine, random forest, boosting, bagging, naive Bayes, and k-nearest neighbor. Finally, we compared their predictive performance based on the area under the receiver operating characteristic curve, precision, recall, F1-score, Cohen's kappa, balanced accuracy, error rate, Matthews correlation coefficient, and area under the precision-recall curve. Additionally, three oversampling algorithms were used to deal with class imbalance. Conclusions: Our results show that penalized methods exhibit better predictive performance for asthma than machine learning methods. On the other hand, in the oversampling study, random forest and boosting methods overall showed better prediction performance than penalized methods.	en_US
dc.description.sponsorship	This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No.2018R1C1B6008277 and 2022R1F1A1072274). This work was supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No.RS-2022–00155885, Artificial Intelligence Convergence Innovation Human Resources Development (Hanyang University ERICA)). This research was supported by the Bio & Medical Technology Development Program of the National Research Foundation (NRF) funded by the Korean government (MSIT) (2019M3E5D3073365). This study was conducted using bioresources from the National Biobank of Korea, the Korea Disease Control and Prevention Agency, Republic of Korea (KBN-2020-106).	en_US
dc.language	en_US	en_US
dc.publisher	BMC	en_US
dc.relation.ispartofseries	v. 25, Article number: 56;1-27	-
dc.subject	NAIVE Bayes classification	en_US
dc.subject	MACHINE learning	en_US
dc.subject	RECEIVER operating characteristic curves	en_US
dc.subject	GENOME-wide association studies	en_US
dc.subject	EPIDEMIOLOGY	en_US
dc.subject	K-nearest neighbor classification	en_US
dc.subject	GENOMES	en_US
dc.subject	SUPPORT vector machines	en_US
dc.subject	Asthma	en_US
dc.subject	Disease risk prediction model	en_US
dc.subject	Ensemble methods	en_US
dc.subject	Genome-wide association study	en_US
dc.subject	GWAS	en_US
dc.subject	KoGES	en_US
dc.subject	Korean Genome and Epidemiology Study	en_US
dc.subject	Large-scale genetic data	en_US
dc.subject	Machine learning methods	en_US
dc.subject	Oversampling	en_US
dc.subject	Penalized methods	en_US
dc.title	Evaluation of penalized and machine learning methods for asthma disease prediction in the Korean Genome and Epidemiology Study (KoGES)	en_US
dc.type	Article	en_US
dc.relation.no	1	-
dc.relation.volume	25	-
dc.identifier.doi	10.1186/s12859-024-05677-x	en_US
dc.relation.page	1-27	-
dc.relation.journal	BMC BIOINFORMATICS	-
dc.contributor.googleauthor	Choi, Yongjun	-
dc.contributor.googleauthor	Cha, Junho	-
dc.contributor.googleauthor	Choi, Sungkyoung	-
dc.relation.code	2024007632	-
dc.sector.campus	E	-
dc.sector.daehak	COLLEGE OF SCIENCE AND CONVERGENCE TECHNOLOGY[E]	-
dc.sector.department	DEPARTMENT OF MATHEMATICAL DATA SCIENCE	-
dc.identifier.pid	day0413	-
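
The abstract above lists several threshold-based evaluation metrics (precision, recall, F1-score, Cohen's kappa, balanced accuracy, error rate, and Matthews correlation coefficient). As a minimal sketch of how these are computed from a binary confusion matrix, the following uses only the Python standard library. The function names and the toy labels are hypothetical illustrations, not the authors' code or data, and the two curve-based metrics (area under the ROC and precision-recall curves) are left out because they need predicted scores rather than hard labels.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def binary_metrics(y_true, y_pred):
    """Return the threshold-based metrics named in the abstract."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    n = tp + tn + fp + fn

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / n

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "cohen_kappa": kappa,
        "balanced_accuracy": (recall + specificity) / 2,
        "error_rate": 1 - accuracy,
        "mcc": mcc,
    }


# Toy, made-up labels with class imbalance (3 cases, 7 controls).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With imbalanced classes, as in the toy labels, plain accuracy can look favorable while kappa, balanced accuracy, and the Matthews correlation coefficient stay noticeably lower, which is one reason a study might report all of them side by side.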

Appears in Collections: COLLEGE OF SCIENCE AND CONVERGENCE TECHNOLOGY[E] > ETC
