Phraseological Analysis of Learner Corpus Based on Language Model

Authors
Sanghoun Song (송상헌)
Issue Date
February 2018
Publisher
Korean Society for Language and Information (한국언어정보학회)
Citation
Language and Information (언어와 정보), v.22, no.1, pp.123-152
Indexed
KCI
Journal Title
Language and Information (언어와 정보)
Volume
22
Number
1
Start Page
123
End Page
152
URI
https://scholar.korea.ac.kr/handle/2021.sw.korea/139786
ISSN
1226-7430
Abstract
The present study addresses how close English expressions produced by Korean native speakers are to common expressions used by English native speakers. To this end, this article provides a quantitative study of the Yonsei English Learner Corpus using a skill set derived from computational linguistics. The focus of the current work is on a language model of English texts written by Korean university students. A language model refers to a collection of N-grams with logarithmic probabilities described in the ARPA format, and this model serves to discriminate native-like sentences from awkward sentences. The present study compares a language model acquired from an L2 corpus to language models acquired from two L1 corpora in English: namely, English Gigaword and Europarl. The present study utilizes Genia Sentence Splitter to separate the sentences and SRILM to create the language models in a computationally tractable way. On the one hand, a deep analysis of N-grams is presented. This analysis consists of two subtasks. First, the N-grams are tallied and evaluated using common metrics of computational linguistics. Second, as an evaluation of the language model, the perplexity of each language model is measured and compared to a reference point drawn from five test data sources. On the other hand, an analysis
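The perplexity measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes per-token base-10 log probabilities of the kind an ARPA-format model stores; the function name and the sample values are hypothetical and are not taken from the article or from SRILM output.

```python
import math


def perplexity(log10_probs):
    """Perplexity of a token sequence from per-token log10 probabilities.

    Computed as 10 ** (-(1/N) * sum(log10 p_i)); lower means the model
    finds the sequence less surprising.
    """
    if not log10_probs:
        raise ValueError("need at least one token probability")
    avg = math.fsum(log10_probs) / len(log10_probs)
    return 10 ** (-avg)


# Hypothetical log10 probabilities for the tokens of two sentences
# (illustrative values only, not data from the study).
native_like = [-1.0, -1.0, -1.0, -1.0]
awkward = [-1.0, -3.0, -2.0, -2.0]

print(perplexity(native_like))
print(perplexity(awkward))
```

Under these assumed values the second sentence receives the higher perplexity, which is the sense in which a language model can separate native-like sentences from awkward ones.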
Appears in Collections
College of Liberal Arts > Department of Linguistics > 1. Journal Articles