A Study on Korean Reference Variome Analysis Based on Whole-Genome Sequences

Title
A Study on Korean Reference Variome Analysis Based on Whole-Genome Sequences (전유전체 서열 기반 한국인 참조 변이체 분석 연구)
Other Titles
Variation analysis study in 62 Korean whole genomes
Author
박선혜
Alternative Author(s)
Park, Sun Hye
Advisor(s)
고인송
Issue Date
2016-02
Publisher
Hanyang University
Degree
Master
Abstract
Recent advances in next-generation sequencing (NGS), driven by massively parallel sequencing, and in bioinformatics have lowered the cost of producing and analyzing genomic data. These advances have enabled large international projects, such as the International HapMap Project and the 1000 Genomes Project, that examine the relationship between DNA structure and population across diverse ethnic groups. However, these international projects do not include Korean genome data, so the information needed to understand Korean-specific genetic traits remains limited. As sequencing costs continue to fall, whole-genome data production is also increasing within individual countries.

The Korean Personal Genome Project (KPGP) is a participatory research project conducted in Korea by the Genome Research Foundation (GRF; Personal Genomics Institute, PGI). Whole-genome sequence data from 62 healthy Korean individuals in KPGP are publicly available from the FTP sites of The Genomics Institute (TGI) and the Korean Bioinformation Center (KOBIC). The KPGP whole-genome data were produced on the Illumina HiSeq platform at an average coverage of 35.25x.

This study analyzed the 62 KPGP whole genomes to identify several types of genetic variation, namely single nucleotide variations (SNVs), insertions and deletions (INDELs, ≤ 200 bp), large deletions (> 200 bp), and copy number variations (CNVs), and constructed a pilot reference variome for Koreans. At the sequence level, 10,208,160 SNVs and 1,056,125 short INDELs were identified. Of these SNVs, 1,540,469 were Korean-specific. In addition, 37,973 non-synonymous SNVs, which affect protein coding, were found. At the structural level, 4,112 large deletions and 1,946 CNVs were identified.

As a pilot population-scale whole-genome sequencing (WGS) study of Koreans, this work should help in constructing a Korean reference variome from large-scale samples. The pilot reference variome should also be useful for future research on diseases relevant to Koreans.
URI
https://repository.hanyang.ac.kr/handle/20.500.11754/126631
http://hanyang.dcollection.net/common/orgView/200000428785
Appears in Collections:
GRADUATE SCHOOL OF BIOMEDICAL SCIENCE AND ENGINEERING[S] > BIOMEDICAL INFORMATICS > Theses (Master)