Detailed Information

Full-text chemical identification with improved generalizability and tagging consistency (open access)

Authors
Kim, Hyunjae; Sung, Mujeen; Yoon, Wonjin; Park, Sungjoon; Kang, Jaewoo
Issue Date
28-Sep-2022
Publisher
OXFORD UNIV PRESS
Citation
DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION, v.2022
Indexed
SCIE
SCOPUS
Journal Title
DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION
Volume
2022
URI
https://scholar.korea.ac.kr/handle/2021.sw.korea/146590
DOI
10.1093/database/baac074
ISSN
1758-0463
Abstract
Chemical identification involves finding chemical entities in text (i.e. named entity recognition) and assigning unique identifiers to the entities (i.e. named entity normalization). While current models are developed and evaluated based on article titles and abstracts, their effectiveness has not been thoroughly verified in full text. In this paper, we identify two limitations of models in tagging full-text articles: (1) low generalizability to unseen mentions and (2) tagging inconsistency. To address these limitations, we use simple training and post-processing methods, such as transfer learning and mention-wise majority voting. We also present a hybrid model for the normalization task that utilizes the high recall of a neural model while maintaining the high precision of a dictionary model. In the BioCreative VII NLM-Chem track challenge, our best model achieves 86.72 and 78.31 F1 scores in named entity recognition and normalization, respectively, significantly outperforming the median (83.73 and 77.49 F1 scores) and taking first place in named entity recognition. In a post-challenge evaluation, we re-implement our model and obtain an 84.70 F1 score in the normalization task, outperforming the best score in the challenge by 3.34 F1 points.
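
The abstract names mention-wise majority voting as a post-processing step against tagging inconsistency but does not describe how it is implemented. The following is a minimal sketch of one plausible reading, not the authors' code: within a single document, every occurrence of the same surface string receives the label predicted most often for that string. The function name `majority_vote`, the `(surface_text, label)` input format, the exact-string matching, and the tie-breaking rule (first label seen wins) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def majority_vote(mentions):
    """Relabel mentions so identical surface strings share one label.

    `mentions` is a list of (surface_text, label) pairs predicted for one
    document (hypothetical format). Each pair's label is replaced by the
    label most frequently predicted for that exact surface string.
    """
    # Count how often each label was predicted for each surface string.
    votes = defaultdict(Counter)
    for text, label in mentions:
        votes[text][label] += 1

    # Pick the most frequent label per string; on a tie, Counter.most_common
    # returns the label that was encountered first.
    winner = {text: counts.most_common(1)[0][0] for text, counts in votes.items()}

    return [(text, winner[text]) for text, _ in mentions]


if __name__ == "__main__":
    predictions = [
        ("aspirin", "Chemical"),
        ("aspirin", "O"),
        ("aspirin", "Chemical"),
        ("water", "O"),
    ]
    for text, label in majority_vote(predictions):
        print(text, label)
```

In this sketch the single inconsistent "aspirin" prediction is overwritten by the majority label, which is the kind of document-level consistency the abstract refers to. A real system would also need to decide how to match mentions (case, tokenization) and whether to vote on entity spans or identifiers.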
Files in This Item
There are no files associated with this item.
Appears in
Collections
Graduate School > Department of Computer Science and Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Kang, Jae woo
Department of Computer Science