
Automatic Keyword Extraction and Analysis from the Large Scale Newspaper Corpus Based on t-score (대규모 신문 기사의 자동 키워드 추출과 분석 -t-점수를 이용하여-)

Authors
Kim, Ilhwan (김일환); Lee, Do-Gil (이도길)
Issue Date
2011
Publisher
The Association for Korean Linguistics (한국어학회)
Keywords
keyword; keywordness; extraction of keyword; frequency of use; t-score; Trends 21 corpus ([물결 21] 코퍼스); newspaper
Citation
Korean Linguistics (한국어학), v.53, pp.145-194
Indexed
KCI
Journal Title
Korean Linguistics (한국어학)
Volume
53
Start Page
145
End Page
194
URI
https://scholar.korea.ac.kr/handle/2021.sw.korea/113705
ISSN
1226-9123
Abstract
Kim, Ilhwan & Lee, Do-Gil. 2011. 11. Automatic Keyword Extraction and Analysis from the Large Scale Newspaper Corpus Based on t-score. Korean Linguistics 53, 145-194. As the types and sizes of documents have increased radically in recent years, automatically extracting proper keywords from them has become important. This paper proposes an automatic method for extracting keywords and analyzes their characteristics. The keywords are extracted from the Trends 21 corpus, a collection of four major Korean daily newspapers published from 2000 to 2009. We use the t-score to measure keywordness. Keywords were extracted along two dimensions: year and topic. We present the top 100 keywords for 6 topics and 10 years. To verify whether these keywords are representative of the texts, we compared them with the headline news of 2009. The two main contributions of this work are as follows: 1) it presents keywords extracted automatically from large-scale corpora, without human intervention, by a verifiable and objective method, and 2) it analyzes the characteristics of the keywords by topic and year.
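The abstract names the t-score as the keywordness measure but does not give the formula. As a rough illustration only, the sketch below assumes the common corpus-linguistics form t = (O - E) / sqrt(O), where O is a word's observed count in a target subcorpus (for example, one year or one topic) and E is the count expected if the word occurred at the rate seen in the whole corpus. The function names `t_scores` and `top_keywords`, and the choice of the whole corpus as the reference, are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def t_scores(target_tokens, reference_tokens):
    """Score each target word by t = (O - E) / sqrt(O).

    Assumed formula, not confirmed by the paper: O is the observed count in
    the target subcorpus, E the count expected at the reference-corpus rate.
    """
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    n_target = sum(target.values())
    n_reference = sum(reference.values())
    if n_reference == 0:
        raise ValueError("reference corpus must not be empty")

    scores = {}
    for word, observed in target.items():
        # Expected count if the word were used at the reference rate.
        expected = reference[word] / n_reference * n_target
        scores[word] = (observed - expected) / math.sqrt(observed)
    return scores


def top_keywords(target_tokens, reference_tokens, k=100):
    """Return the k words with the highest assumed t-score."""
    scores = t_scores(target_tokens, reference_tokens)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: the "target" is one slice (e.g. one year) of the whole corpus.
whole_corpus = ["flu"] * 3 + ["economy"] * 5
one_year = ["flu", "flu", "flu", "economy"]
print(top_keywords(one_year, whole_corpus, k=1))
```

Under this assumed formula, a word scores high when it is both frequent in the slice and over-represented relative to the whole corpus, which matches the abstract's idea of ranking keywords by year and by topic.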
Appears in
Collections
Associate Research Center > Research Institute of Korean Studies > 1. Journal Articles
