
Full metadata record

DC Field Value Language
dc.contributor.author최성경-
dc.date.accessioned2024-03-05T06:29:43Z-
dc.date.available2024-03-05T06:29:43Z-
dc.date.issued2024-02-02-
dc.identifier.citationBMC BIOINFORMATICSen_US
dc.identifier.issn1471-2105en_US
dc.identifier.urihttps://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-024-05677-xen_US
dc.identifier.urihttps://repository.hanyang.ac.kr/handle/20.500.11754/189485-
dc.description.abstractBackground: Genome-wide association studies have successfully identified genetic variants associated with human disease. Various statistical approaches based on penalized and machine learning methods have recently been proposed for disease prediction. In this study, we evaluated the performance of several such methods for predicting asthma using the Korean Chip (KORV1.1) from the Korean Genome and Epidemiology Study (KoGES). Results: First, single-nucleotide polymorphisms were selected via single-variant tests using logistic regression, adjusting for several epidemiological factors. Next, we evaluated the following methods for disease prediction: ridge, least absolute shrinkage and selection operator, elastic net, smoothly clipped absolute deviation, support vector machine, random forest, boosting, bagging, naive Bayes, and k-nearest neighbor. Finally, we compared their predictive performance based on the area under the receiver operating characteristic curve, precision, recall, F1-score, Cohen's kappa, balanced accuracy, error rate, Matthews correlation coefficient, and area under the precision-recall curve. Additionally, three oversampling algorithms were used to deal with class imbalance. Conclusions: Our results show that penalized methods exhibit better predictive performance for asthma than machine learning methods. On the other hand, in the oversampling study, random forest and boosting methods overall showed better predictive performance than penalized methods.en_US
dc.description.sponsorshipThis work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2018R1C1B6008277 and 2022R1F1A1072274). This work was supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-2022–00155885, Artificial Intelligence Convergence Innovation Human Resources Development (Hanyang University ERICA)). This research was supported by the Bio & Medical Technology Development Program of the National Research Foundation (NRF) funded by the Korean government (MSIT) (2019M3E5D3073365). This study was conducted using bioresources from the National Biobank of Korea, the Korea Disease Control and Prevention Agency, Republic of Korea (KBN-2020-106).en_US
dc.languageen_USen_US
dc.publisherBMCen_US
dc.relation.ispartofseriesv. 25, Article number: 56;1-27-
dc.subjectNAIVE Bayes classificationen_US
dc.subjectMACHINE learningen_US
dc.subjectRECEIVER operating characteristic curvesen_US
dc.subjectGENOME-wide association studiesen_US
dc.subjectEPIDEMIOLOGYen_US
dc.subjectK-nearest neighbor classificationen_US
dc.subjectGENOMESen_US
dc.subjectSUPPORT vector machinesen_US
dc.subjectAsthmaen_US
dc.subjectDisease risk prediction modelen_US
dc.subjectEnsemble methodsen_US
dc.subjectGenome-wide association studyen_US
dc.subjectGWASen_US
dc.subjectKoGESen_US
dc.subjectKorean Genome and Epidemiology Studyen_US
dc.subjectLarge-scale genetic dataen_US
dc.subjectMachine learning methodsen_US
dc.subjectOversamplingen_US
dc.subjectPenalized methodsen_US
dc.titleEvaluation of penalized and machine learning methods for asthma disease prediction in the Korean Genome and Epidemiology Study (KoGES)en_US
dc.typeArticleen_US
dc.relation.no1-
dc.relation.volume25-
dc.identifier.doi10.1186/s12859-024-05677-xen_US
dc.relation.page1-27-
dc.relation.journalBMC BIOINFORMATICS-
dc.contributor.googleauthorChoi, Yongjun-
dc.contributor.googleauthorCha, Junho-
dc.contributor.googleauthorChoi, Sungkyoung-
dc.relation.code2024007632-
dc.sector.campusE-
dc.sector.daehakCOLLEGE OF SCIENCE AND CONVERGENCE TECHNOLOGY[E]-
dc.sector.departmentDEPARTMENT OF MATHEMATICAL DATA SCIENCE-
dc.identifier.pidday0413-
Appears in Collections:
COLLEGE OF SCIENCE AND CONVERGENCE TECHNOLOGY[E](과학기술융합대학) > ETC