Detailed Information

How Do Your Biomedical Named Entity Recognition Models Generalize to Novel Entities?

Full metadata record
DC Field | Value | Language
dc.contributor.author | Kim, Hyunjae | -
dc.contributor.author | Kang, Jaewoo | -
dc.date.accessioned | 2022-04-18T18:42:40Z | -
dc.date.available | 2022-04-18T18:42:40Z | -
dc.date.created | 2022-04-18 | -
dc.date.issued | 2022 | -
dc.identifier.issn | 2169-3536 | -
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/140329 | -
dc.description.abstract | The volume of biomedical literature on new biomedical concepts is rapidly increasing, which necessitates a reliable biomedical named entity recognition (BioNER) model for identifying new and unseen entity mentions. However, it is questionable whether existing models can effectively handle them. In this work, we systematically analyze three types of recognition abilities of BioNER models: memorization, synonym generalization, and concept generalization. We find that although current best models achieve state-of-the-art overall performance on benchmarks, they have limitations in identifying synonyms and new biomedical concepts, indicating that their generalization abilities are overestimated. We also investigate failure cases of models and identify several difficulties in recognizing unseen mentions in biomedical literature: (1) models tend to exploit dataset biases, which hinders their ability to generalize, and (2) several biomedical names have novel morphological patterns with weak name regularity, and models fail to recognize them. We apply a statistics-based debiasing method to our problem as a simple remedy and show an improvement in generalization to unseen mentions. We hope that our analyses and findings will facilitate further research into the generalization capabilities of NER models in a domain where their reliability is of utmost importance. | -
dc.language | English | -
dc.language.iso | en | -
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | -
dc.title | How Do Your Biomedical Named Entity Recognition Models Generalize to Novel Entities? | -
dc.type | Article | -
dc.contributor.affiliatedAuthor | Kang, Jaewoo | -
dc.identifier.doi | 10.1109/ACCESS.2022.3157854 | -
dc.identifier.scopusid | 2-s2.0-85126334532 | -
dc.identifier.wosid | 000773239000001 | -
dc.identifier.bibliographicCitation | IEEE ACCESS, v.10, pp.31513 - 31523 | -
dc.relation.isPartOf | IEEE ACCESS | -
dc.citation.title | IEEE ACCESS | -
dc.citation.volume | 10 | -
dc.citation.startPage | 31513 | -
dc.citation.endPage | 31523 | -
dc.type.rims | ART | -
dc.type.docType | Article | -
dc.description.journalClass | 1 | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalResearchArea | Engineering | -
dc.relation.journalResearchArea | Telecommunications | -
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | -
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | -
dc.relation.journalWebOfScienceCategory | Telecommunications | -
dc.subject.keywordAuthor | text mining | -
dc.subject.keywordAuthor | Biological system modeling | -
dc.subject.keywordAuthor | COVID-19 | -
dc.subject.keywordAuthor | Benchmark testing | -
dc.subject.keywordAuthor | Training | -
dc.subject.keywordAuthor | Micromechanical devices | -
dc.subject.keywordAuthor | Analytical models | -
dc.subject.keywordAuthor | Surface morphology | -
dc.subject.keywordAuthor | Bioinformatics (in engineering in medicine and biology) | -
dc.subject.keywordAuthor | natural language processing | -
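
Illustrative note on the abstract: the three recognition abilities it names (memorization, synonym generalization, concept generalization) can be pictured as a partition of test-set mentions by their overlap with the training set. The Python sketch below is only one plausible reading, not the authors' code. It assumes that each mention carries a normalized concept identifier, that "memorization" means the surface form was seen in training, that "synonym generalization" means the surface form is new but its concept was seen, and that "concept generalization" means both are new. The names `Mention`, `split_by_ability`, and `recall_per_split`, the case-insensitive matching, and the toy identifiers are all hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional, Set


class Mention(NamedTuple):
    surface: str     # mention text as it appears in the document
    concept_id: str  # normalized concept identifier (hypothetical values below)


def split_by_ability(train: Iterable[Mention],
                     test: Iterable[Mention]) -> Dict[str, List[Mention]]:
    """Partition test mentions by overlap with training mentions (assumed definitions)."""
    train = list(train)
    seen_surfaces: Set[str] = {m.surface.lower() for m in train}
    seen_concepts: Set[str] = {m.concept_id for m in train}

    buckets: Dict[str, List[Mention]] = {
        "memorization": [],
        "synonym_generalization": [],
        "concept_generalization": [],
    }
    for m in test:
        if m.surface.lower() in seen_surfaces:
            buckets["memorization"].append(m)            # surface form seen in training
        elif m.concept_id in seen_concepts:
            buckets["synonym_generalization"].append(m)  # new name, known concept
        else:
            buckets["concept_generalization"].append(m)  # new name, new concept
    return buckets


def recall_per_split(buckets: Dict[str, List[Mention]],
                     recognized_surfaces: Set[str]) -> Dict[str, Optional[float]]:
    """Recall of a model on each bucket; None when a bucket is empty."""
    recognized = {s.lower() for s in recognized_surfaces}
    result: Dict[str, Optional[float]] = {}
    for name, mentions in buckets.items():
        if not mentions:
            result[name] = None
            continue
        hits = sum(1 for m in mentions if m.surface.lower() in recognized)
        result[name] = hits / len(mentions)
    return result


# Toy data with made-up concept identifiers, for illustration only.
train_mentions = [Mention("aspirin", "C1"), Mention("heart attack", "C2")]
test_mentions = [
    Mention("Aspirin", "C1"),                # same surface form as in training
    Mention("myocardial infarction", "C2"),  # unseen synonym of a seen concept
    Mention("COVID-19", "C3"),               # unseen concept
]
buckets = split_by_ability(train_mentions, test_mentions)
recalls = recall_per_split(buckets, {"aspirin", "myocardial infarction"})
```

Reporting recall per bucket rather than a single overall score is what would expose the gap the abstract describes: a model can score well overall while doing poorly on the synonym and concept buckets.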
Files in This Item
There are no files associated with this item.
Appears in Collections
Graduate School > Department of Computer Science and Engineering > 1. Journal Articles
