Unsupervised Lexical Entry Acquisition Model based on Representation of Human Mental Lexicon
- Authors
- Yu, Wonhee; Park, Doo-Soon; Suh, Taeweon; Lim, Heuiseok
- Issue Date
- Jul-2011
- Publisher
- INT INFORMATION INST
- Keywords
- Mental Lexicon; Lexical Acquisition; Language Learning; Machine Readable Dictionary
- Citation
- INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL, v.14, no.7, pp.2229 - 2241
- Indexed
- SCIE; SCOPUS
- Journal Title
- INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL
- Volume
- 14
- Number
- 7
- Start Page
- 2229
- End Page
- 2241
- URI
- https://scholar.korea.ac.kr/handle/2021.sw.korea/112076
- ISSN
- 1343-4500
- Abstract
- This paper proposes a computational lexical entry acquisition model based on a representation model of the mental lexicon. The proposed model acquires lexical entries from a raw corpus by unsupervised learning, as humans do. The model is composed of a full-form acquisition module and a morpheme acquisition module. In the full-form acquisition module, core full-forms are acquired automatically according to frequency and recency thresholds. In the morpheme acquisition module, a substring that occurs repeatedly in different full-forms is chosen as a candidate morpheme. The candidate is then confirmed as a morpheme using an entropy measure over the syllables in the string. We tested the model on a raw Korean corpus of about 16 million full-forms. The results show that the model successfully acquires major Korean full-forms and morphemes, with average precision of 100% and 99.04%, respectively. In addition, we observed a vocabulary spurt during learning, a phenomenon characteristic of children's language learning.
- Appears in Collections
- Graduate School > Department of Computer Science and Engineering > 1. Journal Articles
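
The two modules described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the thresholds, the forgetting rule, or the exact entropy formula, so everything below is an assumption made for illustration. The function names (`acquire_full_forms`, `candidate_morphemes`, `successor_entropy`), the parameters (`min_freq`, `max_gap`, `min_forms`), the rule that a count resets when a form has not been seen for more than `max_gap` tokens, and the use of Shannon entropy over the syllable that follows a candidate are all hypothetical. Each precomposed Hangul character is treated as one syllable.

```python
import math
from collections import Counter, defaultdict


def acquire_full_forms(tokens, min_freq=3, max_gap=1000):
    """Acquire full-forms seen min_freq times without a gap above max_gap.

    Assumption: a form is "forgotten" (its count resets) when the distance
    to its previous occurrence exceeds max_gap tokens.
    """
    count, last_seen, lexicon = {}, {}, set()
    for pos, tok in enumerate(tokens):
        if tok in last_seen and pos - last_seen[tok] > max_gap:
            count[tok] = 0  # too long since the last sighting
        count[tok] = count.get(tok, 0) + 1
        last_seen[tok] = pos
        if count[tok] >= min_freq:
            lexicon.add(tok)
    return lexicon


def candidate_morphemes(full_forms, min_forms=2):
    """Return substrings that occur in at least min_forms distinct full-forms."""
    seen = defaultdict(set)
    for form in full_forms:
        for i in range(len(form)):
            for j in range(i + 1, len(form) + 1):
                seen[form[i:j]].add(form)
    return {s: forms for s, forms in seen.items() if len(forms) >= min_forms}


def successor_entropy(sub, full_forms):
    """Shannon entropy (bits) of the syllable that follows sub.

    Assumption: a high value suggests a morpheme boundary after sub;
    the end of a full-form is counted as its own symbol.
    """
    following = Counter()
    for form in full_forms:
        start = form.find(sub)
        while start != -1:
            end = start + len(sub)
            following[form[end] if end < len(form) else "<END>"] += 1
            start = form.find(sub, start + 1)
    total = sum(following.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in following.values())
```

For example, given the full-forms 학교에, 학교가, and 학교를, the candidate 학교 is followed by three different syllables, so its successor entropy is high, while 학 is always followed by 교 and has zero entropy. Under the assumptions above, a threshold on this value would accept 학교 and reject 학.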