Deep learning of mutation-gene-drug relations from the literature

Lee, Kyubum; Kim, Byounggun; Choi, Yonghwa; Kim, Sunkyu; Shin, Wonho; Lee, Sunwon; Park, Sungjoon; Kim, Seongsoon; Tan, Aik Choon; Kang, Jaewoo

doi:10.1186/s12859-018-2029-1

Detailed Information

Cited 0 time in webofscience

Cited 0 time in scopus

Metadata Downloads

Deep learning of mutation-gene-drug relations from the literature

Authors: Lee, Kyubum; Kim, Byounggun; Choi, Yonghwa; Kim, Sunkyu; Shin, Wonho; Lee, Sunwon; Park, Sungjoon; Kim, Seongsoon; Tan, Aik Choon; Kang, Jaewoo

Issue Date: 25-1월-2018

Publisher: BMC

Keywords: Deep learning; Convolutional neural networks; Information extraction; Text mining; NLP; BioNLP; Mutation; Precision medicine

Citation: BMC BIOINFORMATICS, v.19

Indexed: SCIE
SCOPUS

Journal Title: BMC BIOINFORMATICS

Volume: 19

URI: https://scholar.korea.ac.kr/handle/2021.sw.korea/77935

DOI: 10.1186/s12859-018-2029-1

ISSN: 1471-2105

Abstract: Background: Molecular biomarkers that can predict drug efficacy in cancer patients are crucial components for the advancement of precision medicine. However, identifying these molecular biomarkers remains a laborious and challenging task. Next-generation sequencing of patients and preclinical models have increasingly led to the identification of novel gene-mutation-drug relations, and these results have been reported and published in the scientific literature. Results: Here, we present two new computational methods that utilize all the PubMed articles as domain specific background knowledge to assist in the extraction and curation of gene-mutation-drug relations from the literature. The first method uses the Biomedical Entity Search Tool (BEST) scoring results as some of the features to train the machine learning classifiers. The second method uses not only the BEST scoring results, but also word vectors in a deep convolutional neural network model that are constructed from and trained on numerous documents such as PubMed abstracts and Google News articles. Using the features obtained from both the BEST search engine scores and word vectors, we extract mutation-gene and mutation-drug relations from the literature using machine learning classifiers such as random forest and deep convolutional neural networks. Our methods achieved better results compared with the state-of-the-art methods. We used our proposed features in a simple machine learning model, and obtained F1-scores of 0.96 and 0.82 for mutation-gene and mutation-drug relation classification, respectively. We also developed a deep learning classification model using convolutional neural networks, BEST scores, and the word embeddings that are pre-trained on PubMed or Google News data. Using deep learning, the classification accuracy improved, and F1-scores of 0.96 and 0.86 were obtained for the mutation-gene and mutation-drug relations, respectively. Conclusion: We believe that our computational methods described in this research could be used as an important tool in identifying molecular biomarkers that predict drug responses in cancer patients. We also built a database of these mutation-gene-drug relations that were extracted from all the PubMed abstracts. We believe that our database can prove to be a valuable resource for precision medicine researchers.

Files in This Item: There are no files associated with this item.

Appears in Collections: Graduate School > Department of Computer Science and Engineering > 1. Journal Articles

Show full item record

qrcode

Related Researcher

Researcher Kang, Jae woo photo

Kang, Jae woo: 컴퓨터학과

Read more

Altmetrics

Total Views & Downloads

STATISTICS: Total View :9,520,692; Today View :12,630

RSS_1.0 RSS_2.0 ATOM_1.0

(02841) 서울특별시 성북구 안암로 14502-3290-1114

Certain data included herein are derived from the © Web of Science of Clarivate Analytics. All rights reserved.
You may not copy or re-distribute this material in whole or in part without the prior written consent of Clarivate Analytics.

Detailed Information

Related Researcher

Altmetrics

Total Views & Downloads

BROWSE