Document clustering method using dimension reduction and support vector clustering to overcome sparseness
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Jun, Sunghae | - |
dc.contributor.author | Park, Sang-Sung | - |
dc.contributor.author | Jang, Dong-Sik | - |
dc.date.accessioned | 2021-09-05T08:07:30Z | - |
dc.date.available | 2021-09-05T08:07:30Z | - |
dc.date.created | 2021-06-15 | - |
dc.date.issued | 2014-06-01 | - |
dc.identifier.issn | 0957-4174 | - |
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/98265 | - |
dc.description.abstract | Many studies on developing technologies have been published as articles, papers, or patents. We use and analyze these documents to find scientific and technological trends. In this paper, we consider document clustering as a method of document data analysis. In general, we have trouble analyzing documents directly because document data are not suitable for statistical and machine learning methods of analysis. Therefore, we have to transform document data into structured data for analytical purposes. For this process, we use text mining techniques. The structured data are very sparse, and hence, it is difficult to analyze them. This study proposes a new method to overcome the sparsity problem of document clustering. We build a combined clustering method using dimension reduction and K-means clustering based on support vector clustering and Silhouette measure. In particular, we attempt to overcome the sparseness in patent document clustering. To verify the efficacy of our work, we first conduct an experiment using news data from the machine learning repository of the University of California at Irvine. Second, using patent documents retrieved from the United States Patent and Trademark Office, we carry out patent clustering for technology forecasting. (C) 2013 Elsevier Ltd. All rights reserved. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | PERGAMON-ELSEVIER SCIENCE LTD | - |
dc.subject | CLASSIFICATION | - |
dc.subject | INVENTION | - |
dc.subject | PLATFORM | - |
dc.subject | TRENDS | - |
dc.subject | INDIA | - |
dc.subject | MAP | - |
dc.title | Document clustering method using dimension reduction and support vector clustering to overcome sparseness | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Park, Sang-Sung | - |
dc.contributor.affiliatedAuthor | Jang, Dong-Sik | - |
dc.identifier.doi | 10.1016/j.eswa.2013.11.018 | - |
dc.identifier.scopusid | 2-s2.0-84890497754 | - |
dc.identifier.wosid | 000331019800006 | - |
dc.identifier.bibliographicCitation | EXPERT SYSTEMS WITH APPLICATIONS, v.41, no.7, pp.3204 - 3212 | - |
dc.relation.isPartOf | EXPERT SYSTEMS WITH APPLICATIONS | - |
dc.citation.title | EXPERT SYSTEMS WITH APPLICATIONS | - |
dc.citation.volume | 41 | - |
dc.citation.number | 7 | - |
dc.citation.startPage | 3204 | - |
dc.citation.endPage | 3212 | - |
dc.type.rims | ART | - |
dc.type.docType | Article | - |
dc.description.journalClass | 1 | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalResearchArea | Operations Research & Management Science | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.relation.journalWebOfScienceCategory | Operations Research & Management Science | - |
dc.subject.keywordPlus | CLASSIFICATION | - |
dc.subject.keywordPlus | INVENTION | - |
dc.subject.keywordPlus | PLATFORM | - |
dc.subject.keywordPlus | TRENDS | - |
dc.subject.keywordPlus | INDIA | - |
dc.subject.keywordPlus | MAP | - |
dc.subject.keywordAuthor | Document clustering | - |
dc.subject.keywordAuthor | Sparseness problem | - |
dc.subject.keywordAuthor | Patent clustering | - |
dc.subject.keywordAuthor | Dimension reduction | - |
dc.subject.keywordAuthor | K-means clustering based on support vector clustering | - |
dc.subject.keywordAuthor | Silhouette measure | - |
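
The abstract above names two ideas that a small example can make concrete: the sparseness of the structured (document-term) data obtained by text mining, and the Silhouette measure used to judge a clustering. The following is a minimal sketch under stated assumptions, not the authors' method: the toy corpus, the whitespace tokenizer, the Euclidean distance, and all function names (`doc_term_matrix`, `sparsity`, `silhouette`) are illustrative choices that do not come from the record, and the sketch does not implement dimension reduction or support vector clustering.

```python
from collections import Counter
from math import sqrt


def doc_term_matrix(docs):
    """Turn raw documents into a dense document-term count matrix.

    Assumption: lowercase whitespace tokenization stands in for the
    text mining step; the record does not specify the preprocessing.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def sparsity(matrix):
    """Fraction of zero cells in the matrix."""
    cells = [v for row in matrix for v in row]
    return sum(1 for v in cells if v == 0) / len(cells)


def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette(matrix, labels):
    """Mean silhouette width; requires at least two clusters.

    Points in singleton clusters are scored 0 by convention.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(matrix[i], matrix[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(matrix[i], matrix[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy corpus: two battery-related and two sensor-related texts.
docs = [
    "battery cell charging circuit",
    "battery charging circuit design",
    "image sensor pixel array",
    "image sensor pixel readout",
]
vocab, dtm = doc_term_matrix(docs)
good = silhouette(dtm, [0, 0, 1, 1])  # grouping by topic
bad = silhouette(dtm, [0, 1, 0, 1])   # grouping across topics
print(f"sparsity={sparsity(dtm):.2f} good={good:.2f} bad={bad:.2f}")
```

Even in this four-document corpus most cells of the matrix are zero, and the topic-aligned labeling receives the higher silhouette value, which is the sense in which such a measure can be used to compare candidate clusterings.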