An all-words sense tagging method for resource-deficient languages

Authors
Yi, Bong-Jun; Lee, Do-Gil; Rim, Hae-Chang
Issue Date
September 2017
Publisher
OXFORD UNIV PRESS
Citation
DIGITAL SCHOLARSHIP IN THE HUMANITIES, v.32, no.3, pp.672 - 688
Indexed
SSCI
AHCI
SCOPUS
Journal Title
DIGITAL SCHOLARSHIP IN THE HUMANITIES
Volume
32
Number
3
Start Page
672
End Page
688
URI
https://scholar.korea.ac.kr/handle/2021.sw.korea/82363
DOI
10.1093/llc/fqw031
ISSN
2055-7671
Abstract
All-words sense tagging is the task of determining the correct senses of all content words in a given text. Many methods utilizing various language resources, such as a machine-readable dictionary (MRD), a sense-tagged corpus, and WordNet, have been proposed for tagging the senses of all words rather than a small number of sample words. However, sense tagging methods that require vast resources cannot be used for resource-deficient languages. The conventional sense tagging method for resource-deficient languages, which utilizes only an MRD, suffers from low recall and low precision because it determines senses only when a gloss word in the dictionary exactly matches a context word. In this study, we propose an all-words sense tagging method that is particularly effective for resource-deficient languages. It requires only an MRD, which is the essential resource for all-words sense tagging, and a raw corpus, which is easily acquired and freely available. The proposed method finds semantically related context words based on co-occurrence information extracted from the raw corpus and uses these words to tag the senses of the target word. An evaluation of the proposed algorithm on a Korean test corpus of approximately 15 million words shows that it can automatically tag the senses of all content words with high precision. Furthermore, we show that a semantic concordancer can be developed based on the automatically sense-tagged corpus.
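The contrast the abstract draws between exact gloss matching and co-occurrence-based matching can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the scoring function, so the Dice coefficient, the sentence-level co-occurrence window, the function names, and the English toy data below are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(sentences):
    """Count words and unordered word pairs that share a sentence (assumed window)."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = set(sentence)
        word_counts.update(words)
        # sorted() guarantees each pair is stored as (smaller, larger)
        for a, b in combinations(sorted(words), 2):
            pair_counts[(a, b)] += 1
    return word_counts, pair_counts


def dice(a, b, word_counts, pair_counts):
    """Dice coefficient as a stand-in relatedness measure (an assumption)."""
    if a == b:
        return 1.0
    pair = (a, b) if a < b else (b, a)
    denom = word_counts[a] + word_counts[b]
    return 2.0 * pair_counts[pair] / denom if denom else 0.0


def exact_match_tag(glosses, context):
    """Baseline: pick the sense whose gloss shares the most words with the context.

    Returns None when no gloss word exactly matches a context word.
    """
    best, best_score = None, 0
    for sense, gloss in glosses.items():
        score = len(set(gloss) & set(context))
        if score > best_score:
            best, best_score = sense, score
    return best


def cooccurrence_tag(glosses, context, word_counts, pair_counts):
    """Pick the sense whose gloss words are most related to the context words."""
    best, best_score = None, 0.0
    for sense, gloss in glosses.items():
        # For each context word, take its strongest link to any gloss word.
        score = sum(
            max((dice(g, c, word_counts, pair_counts) for g in gloss), default=0.0)
            for c in context
        )
        if score > best_score:
            best, best_score = sense, score
    return best


# Hypothetical toy data: a tiny "raw corpus" and two dictionary glosses for "bank".
raw_corpus = [
    ["river", "water", "shore"],
    ["money", "deposit", "loan"],
    ["fish", "river", "water"],
    ["loan", "interest", "money"],
]
glosses = {
    "bank_1": ["institution", "money", "deposit"],
    "bank_2": ["land", "river", "edge"],
}
context = ["fish", "water"]

word_counts, pair_counts = build_cooccurrence(raw_corpus)
baseline_sense = exact_match_tag(glosses, context)
proposed_sense = cooccurrence_tag(glosses, context, word_counts, pair_counts)
```

On this toy input the exact-match baseline leaves the word untagged (`baseline_sense` is `None`), because neither "fish" nor "water" appears in a gloss, while the co-occurrence variant selects `bank_2` through the corpus link between "river" and the context words. This mirrors the recall problem the abstract attributes to the MRD-only method.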
Appears in
Collections
Associate Research Center > Research Institute of Korean Studies > 1. Journal Articles
College of Informatics > Department of Computer Science and Engineering > 1. Journal Articles
