Data Partitioning in Hierarchical Clustering: A Parameter-Insensitive Approach
- Title
- Data Partitioning in Hierarchical Clustering: A Parameter-Insensitive Approach
- Author
- Kim, Sang-Wook (김상욱)
- Keywords
- Data partitioning; Hierarchical clustering; Parameter-insensitive
- Issue Date
- 2013-10
- Publisher
- International Information Institute LTD
- Citation
- Information (Japan), 2013, 16(10), P.7699-7709
- Abstract
- In this paper, we propose a parameter-insensitive data partitioning approach for Chameleon, a hierarchical clustering algorithm. We first show that the quality of the clusters produced by Chameleon is significantly affected by the size of the initial sub-clusters, mainly because Chameleon recursively splits a dataset into two equal-sized clusters until the cluster size approaches the one specified by the user. Through preliminary experiments, we also show that this problem appears in real situations. The proposed method splits a given dataset into every possible number of clusters by using existing algorithms that allow arbitrary-sized sub-clusters in partitioning. It then evaluates the quality of every set of initial sub-clusters with our measurement function and selects as optimal the set with the highest measurement value. Finally, it repeatedly merges these optimal initial sub-clusters to produce the final clustering result. We perform extensive experiments, and the results show that the proposed approach is insensitive to parameters and produces final clusters of higher quality than the previous approach.
- URI
- https://dl.acm.org/citation.cfm?doid=2448556.2448628
- ISSN
- 1344-8994; 1343-4500
- DOI
- 10.1145/2448556.2448628
- Appears in Collections:
- COLLEGE OF ENGINEERING[S](공과대학) > COMPUTER SCIENCE(컴퓨터소프트웨어학부) > Articles
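The three steps named in the abstract (partition the data into each candidate number of sub-clusters, score every candidate set, then merge the best set) could be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's method: the partitioner is a small k-means stand-in, `quality` is a hypothetical separation-to-compactness ratio rather than the authors' measurement function, and the merge step uses plain centroid distance instead of Chameleon's merge criterion. All function names and the sample data are invented for the example.

```python
import math
from itertools import combinations

def centroid(points):
    """Mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def partition(points, k, iters=20):
    """Stand-in partitioner (assumption): Lloyd's k-means with
    farthest-first seeding. Resulting clusters may have any size."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    groups = []
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        groups = [g for g in groups if g]          # drop empty clusters
        centers = [centroid(g) for g in groups]
    return groups

def quality(groups):
    """Hypothetical measurement (assumption): smallest centroid-to-centroid
    distance divided by the mean point-to-centroid distance."""
    cents = [centroid(g) for g in groups]
    if len(cents) < 2:
        return 0.0
    total = sum(len(g) for g in groups)
    compact = sum(math.dist(p, c)
                  for g, c in zip(groups, cents) for p in g) / total
    separation = min(math.dist(a, b) for a, b in combinations(cents, 2))
    return separation / (compact + 1e-9)

def best_initial_subclusters(points, k_max):
    """Try every k in 2..k_max and keep the highest-scoring partition."""
    candidates = [partition(points, k) for k in range(2, k_max + 1)]
    return max(candidates, key=quality)

def merge(groups, n_final):
    """Stand-in merge step (assumption): repeatedly join the two
    sub-clusters whose centroids are closest."""
    groups = [list(g) for g in groups]
    while len(groups) > n_final:
        i, j = min(combinations(range(len(groups)), 2),
                   key=lambda ij: math.dist(centroid(groups[ij[0]]),
                                            centroid(groups[ij[1]])))
        absorbed = groups.pop(j)                   # j > i, so index i stays valid
        groups[i].extend(absorbed)
    return groups

# Made-up sample data: three well-separated groups of unequal size.
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11), (10, 12), (11, 12),
          (20, 0), (21, 0)]

initial = best_initial_subclusters(POINTS, k_max=5)
final = merge(initial, n_final=2)
```

Because the candidate partitions are not forced to be equal-sized, the selected initial sub-clusters in this toy run can follow the natural group sizes in the data, which is the property the abstract attributes to the proposed approach.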