Word Sense Similarity Clustering Based on Vector Space Model and HAL

김동성

doi:10.19066/cogsci.2012.23.3.001


Title: Word Sense Similarity Clustering Based on Vector Space Model and HAL (original Korean title: 벡터 공간 모델과 HAL에 기초한 단어 의미 유사성 군집)

Other Titles: Word Sense Similarity Clustering Based on Vector Space Model and HAL

Authors: 김동성

Issue Date: 2012

Publisher: Korean Society for Cognitive Science (한국인지과학회)

Keywords: Distributional Hypothesis; Vector Space Model; Machine Learning; HAL; Supervised/Non-supervised Learning; Psycholinguistics; Clustering; Dimensionality Reduction; Corpus Linguistics

Citation: Korean Journal of Cognitive Science (인지과학), v.23, no.3, pp.295-322

Indexed: KCI

Journal Title: Korean Journal of Cognitive Science (인지과학)

Volume: 23

Number: 3

Start Page: 295

End Page: 322

URI: https://scholar.korea.ac.kr/handle/2021.sw.korea/110534

DOI: 10.19066/cogsci.2012.23.3.001

ISSN: 1226-4067

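Abstract: This study clusters words by semantic similarity using the vector space model and HAL (Hyperspace Analog to Language). We adopt HAL, which measures the association between words through a context window of fixed size (Lund and Burgess 1996). Because association scores are distorted differently for high-frequency and low-frequency words, we apply the vector space model and measure the cosine similarity of word pairs to reduce that distortion (Salton et al. 1975, Widdows 2004). The space produced by HAL and the vector space model is high-dimensional, so we applied PCA (Principal Component Analysis) and SVD (Singular Value Decomposition) to reduce its dimensionality. For similarity clustering we applied both unsupervised and supervised methods: clustering for the unsupervised approach, and SVM (Support Vector Machine), the Naive Bayes classifier, and Maximum Entropy for the supervised approach. From a linguistic perspective, this study measures semantic similarity using the Distributional Hypothesis of Harris (1954) and Firth (1957). From a psycholinguistic perspective, it combines the vector space model and HAL as a model for explaining semantic memory. From a computational language processing perspective, it applies both supervised and unsupervised machine learning methods.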
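The pipeline named in the abstract (HAL co-occurrence, cosine similarity, SVD-based reduction) can be illustrated with a minimal Python sketch. It is not the author's implementation: the toy corpus, the window size of 2, the linear distance weighting, and the function names `hal_matrix`, `word_vectors`, `cosine_similarity_matrix`, and `reduce_svd` are all assumptions made for illustration, and the record does not give the paper's actual corpus or parameters.

```python
import numpy as np

def hal_matrix(tokens, window=2):
    """HAL-style co-occurrence matrix (assumed weighting scheme).

    A context word that precedes the target by `dist` positions
    (1 <= dist <= window) adds a weight of window - dist + 1,
    so nearer words contribute more.
    """
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    m = np.zeros((len(vocab), len(vocab)))
    for pos, target in enumerate(tokens):
        for dist in range(1, window + 1):
            if pos - dist < 0:
                break
            context = tokens[pos - dist]
            m[index[target], index[context]] += window - dist + 1
    return vocab, m

def word_vectors(m):
    # Concatenate each word's row (preceding context) with its
    # column (following context) to form one vector per word.
    return np.hstack([m, m.T])

def cosine_similarity_matrix(x):
    # Length-normalize rows so similarity depends on direction,
    # not on raw frequency.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    unit = x / norms
    return unit @ unit.T

def reduce_svd(x, k):
    # Keep the k leading singular directions (truncated SVD).
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :k] * s[:k]

# Hypothetical toy corpus, for illustration only.
tokens = "the cat drinks milk the dog drinks water the cat chases the dog".split()
vocab, m = hal_matrix(tokens, window=2)
vectors = word_vectors(m)
sim = cosine_similarity_matrix(vectors)
reduced = reduce_svd(vectors, k=2)
```

The rows of `reduced` (or of `vectors`) could then be passed to a clustering routine for the unsupervised setting, or to a classifier for the supervised one.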

Files in This Item: There are no files associated with this item.

Appears in Collections: ETC > 1. Journal Articles


