A Fast K-prototypes Algorithm Using Partial Distance Computation
- Authors
- Kim, Byoungwook
- Issue Date
- 4월-2017
- Publisher
- MDPI AG
- Keywords
- clustering algorithm; k-prototypes algorithm; partial distance computation
- Citation
- SYMMETRY-BASEL, v.9, no.4
- Indexed
- SCIE
SCOPUS
- Journal Title
- SYMMETRY-BASEL
- Volume
- 9
- Number
- 4
- URI
- https://scholar.korea.ac.kr/handle/2021.sw.korea/84005
- DOI
- 10.3390/sym9040058
- ISSN
- 2073-8994
- Abstract
- The k-means is one of the most popular and widely used clustering algorithm; however, it is limited to numerical data only. The k-prototypes algorithm is an algorithm famous for dealing with both numerical and categorical data. However, there have been no studies to accelerate it. In this paper, we propose a new, fast k-prototypes algorithm that provides the same answers as those of the original k-prototypes algorithm. The proposed algorithm avoids distance computations using partial distance computation. Our k-prototypes algorithm finds minimum distance without distance computations of all attributes between an object and a cluster center, which allows it to reduce time complexity. A partial distance computation uses a fact that a value of the maximum difference between two categorical attributes is 1 during distance computations. If data objects havem categorical attributes, the maximum difference of categorical attributes between an object and a cluster center is m. Our algorithm first computes distance with numerical attributes only. If a difference of the minimum distance and the second smallest with numerical attributes is higher than m, we can find the minimum distance between an object and a cluster center without distance computations of categorical attributes. The experimental results show that the computational performance of the proposed k-prototypes algorithm is superior to the original k-prototypes algorithm in our dataset.
- Files in This Item
- There are no files associated with this item.
- Appears in
Collections - ETC > 1. Journal Articles
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.