Repository at Hanyang University: Efficient Algorithms to Identify Peptides from Massive High-resolution MS/MS Spectra

Title: Efficient Algorithms to Identify Peptides from Massive High-resolution MS/MS Spectra

Author: 김현우

Advisor(s): 박희진

Issue Date: 2016-08

Publisher: Hanyang University

Degree: Doctor

Abstract: Peptide identification is an important problem in proteomics. One of the most popular scoring schemes for peptide identification is XCorr (cross-correlation). Because calculating XCorr is computationally intensive, considerable effort has gone into developing fast XCorr engines. However, existing XCorr engines are poorly suited to high-resolution tandem mass spectrometry: they are slow, and XCorr calculation accounts for most of the running time. We present a high-speed XCorr engine for high-resolution tandem mass spectrometry, built on a novel algorithm for calculating XCorr. The algorithm calculates XCorr 1.25 to 49 times faster than previous algorithms at a fragment tolerance of 0.01 Da.

Recently, proteogenomics has emerged as a new research field that combines proteomics and genomics. Proteogenomics research has used six-frame translation of the whole genome or amino acid exon graphs to overcome the limitations of reference protein sequence databases. However, six-frame translation is not suitable for annotating genes that span multiple exons, and amino acid exon graphs cannot conveniently represent novel splice variants and exon skipping events between exons with incompatible reading frames. We propose NextSearch (Nucleotide EXon-graph Transcriptome Search), a proteogenomic pipeline based on a nucleotide exon graph. The pipeline consists of two parts: construction of a compact nucleotide exon graph that systematically incorporates novel splice variations, and a search tool that identifies peptides by searching the nucleotide exon graph directly against tandem mass spectra. Because the exon graph stores nucleotide sequences, it can easily represent novel splice variations and exon skipping events between exons with incompatible reading frames. Peptides are identified by searching this nucleotide exon graph directly, without converting it into protein sequences in FASTA format, which reduces sequence database storage by an order of magnitude. NextSearch outputs the proteome-genome/transcriptome mapping results in a General Feature Format (GFF) file, which can be visualized with public tools such as the UCSC Genome Browser.
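
For readers unfamiliar with XCorr, the following is a minimal sketch of the conventional formulation that the abstract refers to as computationally intensive. It is not the algorithm developed in the thesis. The function names, the small residue-mass table, the restriction to singly charged b and y ions with unit intensity, the 75-bin offset, and the choice of a bin width equal to the 0.01 Da fragment tolerance are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Monoisotopic residue masses (Da) for a few amino acids only (illustrative subset).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "L": 113.08406, "K": 128.09496, "E": 129.04259, "R": 156.10111,
}
PROTON = 1.007276
WATER = 18.010565


def bin_index(mz: float, bin_width: float) -> int:
    """Map an m/z value to the index of its bin."""
    return int(mz / bin_width)


def bin_spectrum(peaks: Sequence[Tuple[float, float]], bin_width: float,
                 n_bins: int) -> List[float]:
    """Discretize (m/z, intensity) peaks, keeping the largest intensity per bin."""
    binned = [0.0] * n_bins
    for mz, intensity in peaks:
        i = bin_index(mz, bin_width)
        if 0 <= i < n_bins:
            binned[i] = max(binned[i], intensity)
    return binned


def preprocess(binned: Sequence[float], offset: int = 75) -> List[float]:
    """Subtract from each bin the mean of its neighbours within +/- offset bins.

    After this step, XCorr reduces to a single dot product with the
    theoretical spectrum. Prefix sums keep the pass linear in the bin count.
    """
    n = len(binned)
    prefix = [0.0]
    for value in binned:
        prefix.append(prefix[-1] + value)
    out = []
    for i in range(n):
        lo, hi = max(0, i - offset), min(n, i + offset + 1)
        neighbours = prefix[hi] - prefix[lo] - binned[i]  # window sum without bin i
        out.append(binned[i] - neighbours / (2 * offset))
    return out


def theoretical_bins(peptide: str, bin_width: float) -> List[int]:
    """Bins of singly charged b and y fragment ions of the peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    bins = set()
    prefix_mass = 0.0
    for m in masses[:-1]:
        prefix_mass += m
        bins.add(bin_index(prefix_mass + PROTON, bin_width))                  # b ion
        bins.add(bin_index(total - prefix_mass + WATER + PROTON, bin_width))  # y ion
    return sorted(bins)


def xcorr(peaks: Sequence[Tuple[float, float]], peptide: str,
          bin_width: float = 0.01, max_mz: float = 2000.0) -> float:
    """Simplified XCorr: preprocessed spectrum dotted with unit-intensity ions."""
    n_bins = int(max_mz / bin_width) + 1
    processed = preprocess(bin_spectrum(peaks, bin_width, n_bins))
    return sum(processed[i] for i in theoretical_bins(peptide, bin_width) if i < n_bins)


if __name__ == "__main__":
    # Synthetic spectrum with one peak at the centre of each fragment bin of "GASP".
    peaks = [((i + 0.5) * 0.01, 1.0) for i in theoretical_bins("GASP", 0.01)]
    print("GASP:", xcorr(peaks, "GASP"))
    print("VLKE:", xcorr(peaks, "VLKE"))
```

In this sketch, a 0.01 Da bin width turns a spectrum up to 2000 m/z into roughly 200,000 bins, and the preprocessing pass visits every one of them for each spectrum. That cost, repeated over many spectra, is the kind of overhead the abstract describes for high-resolution data.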

URI: https://repository.hanyang.ac.kr/handle/20.500.11754/125561 http://hanyang.dcollection.net/common/orgView/200000429276

Appears in Collections: GRADUATE SCHOOL[S](대학원) > ELECTRONICS AND COMPUTER ENGINEERING(전자컴퓨터통신공학과) > Theses (Ph.D.)
