A Contrast Pattern Based Clustering Quality Index for Categorical Data

Title: A Contrast Pattern Based Clustering Quality Index for Categorical Data
Publication Type: Conference Paper
Year of Publication: 2009
Authors: Qingbao Liu, Guozhu Dong
Conference Name: IEEE International Conference on Data Mining series (ICDM 2009)
Conference Location: Miami, Florida
Abstract

Since clustering is unsupervised and highly explorative, clustering validation (i.e., assessing the quality of clustering solutions) has been an important and long-standing research problem. Existing validity measures have significant shortcomings. This paper proposes a novel Contrast Pattern based Clustering Quality index (CPCQ) for categorical data, which utilizes the quality and diversity of the contrast patterns (CPs) that contrast the clusters in clusterings. High-quality CPs can characterize clusters and discriminate them from each other. Experiments show that the CPCQ index (1) can recognize that expert-determined classes are the best clusters for many datasets from the UCI repository; (2) does not give inappropriate preference to a larger number of clusters; and (3) does not require a user to provide a distance function.

Full Text

Qingbao Liu, Guozhu Dong, A Contrast Pattern based Clustering Quality Index for Categorical Data, IEEE International Conference on Data Mining Series (ICDM 2009), Miami, Florida, December 6-9, 2009.

Related resource URL: http://www.cs.umbc.edu/ICDM09/

Related Files: