Compact variant-rich customized sequence database and a fast and sensitive database search for efficient proteogenomic analyses

Park, Heejin; Bae, Junwoo; Kim, Hyunwoo; Kim, Sangok; Kim, Hokeun; Mun, Dong-Gi; Joh, Yoonsung; Lee, Wonyeop; Chae, Sehyun; Lee, Sanghyuk; Kim, Hark Kyun; Hwang, Daehee; Lee, Sang-Won; Paek, Eunok

doi:10.1002/pmic.201400225

Full metadata record

DC Field	Value	Language
dc.contributor.author	Park, Heejin	-
dc.contributor.author	Bae, Junwoo	-
dc.contributor.author	Kim, Hyunwoo	-
dc.contributor.author	Kim, Sangok	-
dc.contributor.author	Kim, Hokeun	-
dc.contributor.author	Mun, Dong-Gi	-
dc.contributor.author	Joh, Yoonsung	-
dc.contributor.author	Lee, Wonyeop	-
dc.contributor.author	Chae, Sehyun	-
dc.contributor.author	Lee, Sanghyuk	-
dc.contributor.author	Kim, Hark Kyun	-
dc.contributor.author	Hwang, Daehee	-
dc.contributor.author	Lee, Sang-Won	-
dc.contributor.author	Paek, Eunok	-
dc.date.accessioned	2021-09-05T02:35:00Z	-
dc.date.available	2021-09-05T02:35:00Z	-
dc.date.created	2021-06-15	-
dc.date.issued	2014-12	-
dc.identifier.issn	1615-9853	-
dc.identifier.uri	https://scholar.korea.ac.kr/handle/2021.sw.korea/96662	-
dc.description.abstract	In proteogenomic analysis, construction of a compact, customized database from mRNA-seq data and a sensitive search of both reference and customized databases are essential to accurately determine protein abundances and structural variations at the protein level. However, these tasks have not been systematically explored, but rather performed in an ad-hoc fashion. Here, we present an effective method for constructing a compact database containing comprehensive sequences of sample-specific variants (single nucleotide variants, insertions/deletions, and stop-codon mutations) derived from Exome-seq and RNA-seq data. It, however, occupies less space by storing variant peptides, not variant proteins. We also present an efficient search method for both customized and reference databases. The separate searches of the two databases increase the search time, and a unified search is less sensitive to identify variant peptides due to the smaller size of the customized database, compared to the reference database, in the target-decoy setting. Our method searches the unified database once, but performs target-decoy validations separately. Experimental results show that our approach is as fast as the unified search and as sensitive as the separate searches. Our customized database includes mutation information in the headers of variant peptides, thereby facilitating the inspection of peptide-spectrum matches.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	WILEY-BLACKWELL	-
dc.subject	RNA-SEQ DATA	-
dc.subject	PEPTIDE IDENTIFICATION	-
dc.subject	PROTEIN IDENTIFICATION	-
dc.subject	CONSTRUCTION	-
dc.subject	PROTEOMICS	-
dc.subject	FRAMEWORK	-
dc.subject	STRATEGY	-
dc.subject	CANCER	-
dc.subject	PAIRS	-
dc.title	Compact variant-rich customized sequence database and a fast and sensitive database search for efficient proteogenomic analyses	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	Lee, Sang-Won	-
dc.identifier.doi	10.1002/pmic.201400225	-
dc.identifier.scopusid	2-s2.0-84913526299	-
dc.identifier.wosid	000345915200012	-
dc.identifier.bibliographicCitation	PROTEOMICS, v.14, no.23-24, pp.2742 - 2749	-
dc.relation.isPartOf	PROTEOMICS	-
dc.citation.title	PROTEOMICS	-
dc.citation.volume	14	-
dc.citation.number	23-24	-
dc.citation.startPage	2742	-
dc.citation.endPage	2749	-
dc.type.rims	ART	-
dc.type.docType	Article	-
dc.description.journalClass	1	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Biochemistry & Molecular Biology	-
dc.relation.journalWebOfScienceCategory	Biochemical Research Methods	-
dc.relation.journalWebOfScienceCategory	Biochemistry & Molecular Biology	-
dc.subject.keywordPlus	RNA-SEQ DATA	-
dc.subject.keywordPlus	PEPTIDE IDENTIFICATION	-
dc.subject.keywordPlus	PROTEIN IDENTIFICATION	-
dc.subject.keywordPlus	CONSTRUCTION	-
dc.subject.keywordPlus	PROTEOMICS	-
dc.subject.keywordPlus	FRAMEWORK	-
dc.subject.keywordPlus	STRATEGY	-
dc.subject.keywordPlus	CANCER	-
dc.subject.keywordPlus	PAIRS	-
dc.subject.keywordAuthor	Bioinformatics	-
dc.subject.keywordAuthor	Early onset gastric cancer	-
dc.subject.keywordAuthor	Peptide identification	-
dc.subject.keywordAuthor	Proteogenomics	-
dc.subject.keywordAuthor	Sequence database	-
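
The abstract describes searching a unified database once and then performing target-decoy validation separately for reference and customized (variant) matches. The following is a minimal sketch of that idea only, not the authors' implementation. The `PSM` fields, the `source` labels, the "higher score is better" convention, and the simple decoys/targets FDR estimate are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """One peptide-spectrum match from a single unified search (hypothetical layout)."""
    spectrum_id: str
    peptide: str
    score: float    # assumption: higher means a better match
    is_decoy: bool  # True if the match came from a decoy sequence
    source: str     # assumption: "reference" or "variant"


def accept_at_fdr(psms, max_fdr):
    """Return target PSMs within one group that pass a decoys/targets FDR cutoff."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs to keep
    for i, p in enumerate(ranked, start=1):
        if p.is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep the deepest rank at which the estimated FDR is still acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = i
    return [p for p in ranked[:cutoff] if not p.is_decoy]


def validate_separately(psms, max_fdr=0.01):
    """Split unified-search PSMs by source and apply the FDR cutoff per group."""
    groups = defaultdict(list)
    for p in psms:
        groups[p.source].append(p)
    return {src: accept_at_fdr(group, max_fdr) for src, group in groups.items()}
```

Because each group is ranked against its own decoys, the much smaller set of variant matches is not judged against the decoy distribution of the large reference database. That corresponds to the sensitivity concern the abstract raises about a fully unified target-decoy validation.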

Appears in Collections: College of Science > Department of Chemistry > 1. Journal Articles