A Pilot Study of Biomedical Text Comprehension using an Attention-Based Deep Neural Reader: Design and Experimental Analysis

Kim, Seongsoon; Park, Donghyeon; Choi, Yonghwa; Lee, Kyubum; Kim, Byounggun; Jeon, Minji; Kim, Jihye; Tan, Aik Choon; Kang, Jaewoo

doi:10.2196/medinform.8751

Authors: Kim, Seongsoon; Park, Donghyeon; Choi, Yonghwa; Lee, Kyubum; Kim, Byounggun; Jeon, Minji; Kim, Jihye; Tan, Aik Choon; Kang, Jaewoo

Issue Date: Jan-2018

Publisher: JMIR PUBLICATIONS, INC

Keywords: machine comprehension; biomedical text comprehension; deep learning; machine comprehension dataset

Citation: JMIR MEDICAL INFORMATICS, v.6, no.1

Indexed: SCIE; SCOPUS

Journal Title: JMIR MEDICAL INFORMATICS

Volume: 6

Number: 1

URI: https://scholar.korea.ac.kr/handle/2021.sw.korea/78398

DOI: 10.2196/medinform.8751

ISSN: 2291-9694

Abstract:

Background: With the development of artificial intelligence (AI) technology centered on deep learning, computers have evolved to the point where they can read a given text and answer a question based on its context. This task is known as machine comprehension. Existing machine comprehension tasks mostly use datasets of general texts, such as news articles or elementary school-level storybooks. However, no attempt has been made to determine whether an up-to-date deep learning-based machine comprehension model can also process scientific literature containing expert-level knowledge, especially in the biomedical domain.

Objective: This study aims to investigate whether a machine comprehension model can process biomedical articles as well as it processes general texts. Since there is no dataset for the biomedical literature comprehension task, our work includes generating a large-scale question answering dataset using PubMed and manually evaluating the generated dataset.

Methods: We present an attention-based deep neural model tailored to the biomedical domain. To further enhance the performance of our model, we used pretrained word vectors and biomedical entity type embeddings. We also developed an ensemble method that combines the results of several independent models to reduce the variance of the models' answers.

Results: The experimental results showed that our proposed deep neural network model outperformed the baseline model by more than 7% on the new dataset. We also evaluated human performance on the new dataset. The human evaluation showed that our deep neural model outperformed humans in comprehension by 22% on average.

Conclusions: In this work, we introduced a new task of machine comprehension in the biomedical domain using a deep neural model. Since there was no large-scale dataset for training deep neural models in the biomedical domain, we created the new cloze-style datasets Biomedical Knowledge Comprehension Title (BMKC_T) and Biomedical Knowledge Comprehension Last Sentence (BMKC_LS) (together referred to as BioMedical Knowledge Comprehension) using the PubMed corpus. The experimental results showed that the performance of our model is much higher than that of humans. We observed that our model performed consistently better regardless of the degree of difficulty of a text, whereas humans have difficulty when performing biomedical literature comprehension tasks that require expert-level knowledge.
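
The abstract names two ideas that are easier to picture with a toy example: a cloze-style question (a sentence with one entity blanked out) and an ensemble that combines the answers of several independent models. The Python sketch below is only an illustration under stated assumptions and is not the authors' implementation. The placeholder token, the function names `make_cloze_example` and `ensemble_predict`, the use of simple probability averaging, and the toy sentences and probabilities are all hypothetical; the abstract does not specify how the queries were built or how the ensemble combines model outputs.

```python
from collections import defaultdict

# Hypothetical blank token; the abstract does not say which token the datasets use.
PLACEHOLDER = "@placeholder"


def make_cloze_example(context, query_sentence, answer_entity):
    """Blank out answer_entity in query_sentence to form a cloze-style question.

    Assumption: the query comes from a sentence such as a title or a last
    sentence, and the answer is an entity mentioned in that sentence.
    """
    if answer_entity not in query_sentence:
        raise ValueError("answer entity must occur in the query sentence")
    return {
        "context": context,
        "query": query_sentence.replace(answer_entity, PLACEHOLDER),
        "answer": answer_entity,
    }


def ensemble_predict(model_outputs):
    """Average per-candidate probabilities over models and pick the top candidate.

    model_outputs is a list with one dict per model, mapping each candidate
    answer to that model's probability. Averaging is one common way to reduce
    variance across models; it is an assumption here, not the paper's method.
    """
    if not model_outputs:
        raise ValueError("need at least one model output")
    totals = defaultdict(float)
    for probs in model_outputs:
        for candidate, p in probs.items():
            totals[candidate] += p
    averaged = {c: total / len(model_outputs) for c, total in totals.items()}
    best = max(averaged, key=averaged.get)
    return best, averaged


# Toy data, invented purely for illustration.
example = make_cloze_example(
    context="BRCA1 mutations are associated with an increased risk of breast cancer.",
    query_sentence="BRCA1 and breast cancer risk",
    answer_entity="BRCA1",
)

# Three hypothetical models scoring the same two candidate entities.
outputs = [
    {"BRCA1": 0.6, "TP53": 0.4},
    {"BRCA1": 0.3, "TP53": 0.7},
    {"BRCA1": 0.7, "TP53": 0.3},
]
best, averaged = ensemble_predict(outputs)
print(example["query"])
print(best, averaged)
```

In this toy run the second model prefers the wrong candidate, but the averaged scores still favor the correct one, which is the kind of variance reduction the Methods paragraph attributes to the ensemble.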

Appears in Collections: Graduate School > Department of Computer Science and Engineering > 1. Journal Articles

Kang, Jae woo: Department of Computer Science and Engineering
