Localized user-driven topic discovery via boosted ensemble of nonnegative matrix factorization
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Suh, Sangho | - |
dc.contributor.author | Shin, Sungbok | - |
dc.contributor.author | Lee, Joonseok | - |
dc.contributor.author | Reddy, Chandan K. | - |
dc.contributor.author | Choo, Jaegul | - |
dc.date.accessioned | 2021-09-02T06:44:52Z | - |
dc.date.available | 2021-09-02T06:44:52Z | - |
dc.date.created | 2021-06-16 | - |
dc.date.issued | 2018-09 | - |
dc.identifier.issn | 0219-1377 | - |
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/73253 | - |
dc.description.abstract | Nonnegative matrix factorization (NMF) has been widely used in topic modeling of large-scale document corpora, where a set of underlying topics is extracted as a low-rank factor matrix from NMF. However, the resulting topics often convey only general, and thus redundant, information about the documents rather than information that might be minor but potentially meaningful to users. To address this problem, we present a novel ensemble method based on nonnegative matrix factorization that discovers meaningful local topics. Our method carries the idea of an ensemble model, which has shown advantages in supervised learning, over into an unsupervised topic modeling context. That is, our model successively performs NMF on a residual matrix obtained from previous stages and generates a sequence of topic sets. The update algorithm we employ is novel in two respects. The first lies in its use of the residual matrix, inspired by a state-of-the-art gradient boosting model, and the second stems from applying a sophisticated local weighting scheme to the given matrix to enhance the locality of topics, which in turn delivers high-quality, focused topics of interest to users. We subsequently extend this ensemble model by adding keyword- and document-based user interaction to introduce user-driven topic discovery. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | SPRINGER LONDON LTD | - |
dc.subject | CONSTRAINED LEAST-SQUARES | - |
dc.subject | ALGORITHMS | - |
dc.title | Localized user-driven topic discovery via boosted ensemble of nonnegative matrix factorization | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Choo, Jaegul | - |
dc.identifier.doi | 10.1007/s10115-017-1147-9 | - |
dc.identifier.scopusid | 2-s2.0-85049339359 | - |
dc.identifier.wosid | 000437002900001 | - |
dc.identifier.bibliographicCitation | KNOWLEDGE AND INFORMATION SYSTEMS, v.56, no.3, pp.503 - 531 | - |
dc.relation.isPartOf | KNOWLEDGE AND INFORMATION SYSTEMS | - |
dc.citation.title | KNOWLEDGE AND INFORMATION SYSTEMS | - |
dc.citation.volume | 56 | - |
dc.citation.number | 3 | - |
dc.citation.startPage | 503 | - |
dc.citation.endPage | 531 | - |
dc.type.rims | ART | - |
dc.type.docType | Article | - |
dc.description.journalClass | 1 | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
dc.subject.keywordPlus | CONSTRAINED LEAST-SQUARES | - |
dc.subject.keywordPlus | ALGORITHMS | - |
dc.subject.keywordAuthor | Topic modeling | - |
dc.subject.keywordAuthor | Ensemble learning | - |
dc.subject.keywordAuthor | Matrix factorization | - |
dc.subject.keywordAuthor | Gradient boosting | - |
dc.subject.keywordAuthor | Local weighting | - |
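
The stage-wise idea described in the abstract (run NMF, compute a residual, run NMF again on that residual) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names `nmf` and `boosted_nmf`, the plain multiplicative-update solver, the random input matrix, and the choice to clip negative residual entries to zero are all assumptions made for illustration, and the local weighting scheme and user interaction mentioned in the abstract are omitted entirely.

```python
import numpy as np

def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Rank-k NMF of a nonnegative matrix X via multiplicative updates.

    Hypothetical helper: a basic Lee-Seung style solver, used here only
    as a stand-in for whatever NMF solver the paper actually employs.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps   # term-by-topic factor
    H = rng.random((k, n)) + eps   # topic-by-document factor
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def boosted_nmf(X, k, stages):
    """Run NMF successively on the residual left by earlier stages.

    Assumption: the residual is X minus the current reconstruction,
    with negative entries clipped to zero so it stays a valid NMF input.
    """
    residual = X.astype(float).copy()
    topic_sets = []
    for s in range(stages):
        W, H = nmf(residual, k, seed=s)
        topic_sets.append((W, H))          # one topic set per stage
        residual = np.maximum(residual - W @ H, 0.0)
    return topic_sets, residual

# Toy term-document matrix (random data, for illustration only).
rng = np.random.default_rng(42)
X = rng.random((30, 20))
topic_sets, residual = boosted_nmf(X, k=3, stages=4)
```

Each element of `topic_sets` holds the factors from one stage, so later stages describe whatever structure the earlier stages left unexplained. Because every reconstruction is nonnegative and the residual is clipped at zero, the residual in this sketch can only shrink entry by entry from one stage to the next.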