Exploring the Thematic Structure in Corpora with Topic Modeling (토픽모델링을 이용한 코퍼스의 주제구조 탐색)

Full metadata record
DC Field | Value | Language
dc.contributor.author | 홍정하 | -
dc.contributor.author | 최재웅 | -
dc.date.accessioned | 2021-09-03T13:54:47Z | -
dc.date.available | 2021-09-03T13:54:47Z | -
dc.date.created | 2021-06-17 | -
dc.date.issued | 2017 | -
dc.identifier.issn | 1598-1886 | -
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/86015 | -
dc.description.abstract | This paper aims to demonstrate, from a corpus-linguistic perspective, the applicability of topic modeling, which can organize and summarize large archives of texts. To do this, we investigate the thematic structures that an R package implementing topic modeling based on LDA (latent Dirichlet allocation) uncovers in the Brown Corpus, and we use statistical techniques such as comparison clouds, principal component analysis, and phylogenetic trees to analyze and visualize the results. The paper shows (i) that the Brown Corpus has a core thematic structure that divides into texts tending toward past tense and spoken language and texts tending toward present tense and written language, (ii) that the former texts are mainly about women, home, and battle, while the latter are primarily related to the humanities, society, and the economy, and (iii) that the linguistics texts reveal an interdisciplinary nature, related to mathematics and engineering as well as to the humanities and social sciences. | -
dc.language | Korean | -
dc.language.iso | ko | -
dc.publisher | 서강대학교 언어정보연구소 | -
dc.title | 토픽모델링을 이용한 코퍼스의 주제구조 탐색 | -
dc.title.alternative | Exploring the Thematic Structure in Corpora with Topic Modeling | -
dc.type | Article | -
dc.contributor.affiliatedAuthor | 최재웅 | -
dc.identifier.doi | 10.29211/soli.2017.30..009 | -
dc.identifier.bibliographicCitation | 언어와 정보 사회, v.30, pp.239 - 276 | -
dc.relation.isPartOf | 언어와 정보 사회 | -
dc.citation.title | 언어와 정보 사회 | -
dc.citation.volume | 30 | -
dc.citation.startPage | 239 | -
dc.citation.endPage | 276 | -
dc.type.rims | ART | -
dc.identifier.kciid | ART002214526 | -
dc.description.journalClass | 2 | -
dc.description.journalRegisteredClass | kci | -
dc.subject.keywordAuthor | 토픽모델링 | -
dc.subject.keywordAuthor | LDA알고리즘 | -
dc.subject.keywordAuthor | 주제구조 | -
dc.subject.keywordAuthor | 텍스트분류 | -
dc.subject.keywordAuthor | 불용어 | -
dc.subject.keywordAuthor | 비교클라우드 | -
dc.subject.keywordAuthor | 주성분분석 | -
dc.subject.keywordAuthor | 계통수도 | -
dc.subject.keywordAuthor | topic modeling | -
dc.subject.keywordAuthor | LDA(latent Dirichlet allocation) | -
dc.subject.keywordAuthor | thematic structure | -
dc.subject.keywordAuthor | text classification | -
dc.subject.keywordAuthor | stop word | -
dc.subject.keywordAuthor | comparison cloud | -
dc.subject.keywordAuthor | principal component analysis | -
dc.subject.keywordAuthor | phylogenetic tree | -
Appears in Collections
College of Liberal Arts > Department of Linguistics > 1. Journal Articles
