Selection of the number of clusters via the bootstrap method

journal contribution
posted on 21.08.2012 by Yixin Fang, Junhui Wang
Here the problem of selecting the number of clusters in cluster analysis is considered. Recently, the concept of clustering stability, which measures the robustness of any given clustering algorithm, has been utilized in Wang (2010) for selecting the number of clusters through cross validation. In this manuscript, an estimation scheme for clustering instability is developed based on the bootstrap, and then the number of clusters is selected so that the corresponding estimated clustering instability is minimized. The proposed selection criterion’s effectiveness is demonstrated on simulations and real examples.
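
The selection rule described in the abstract can be illustrated with a minimal sketch. This is one plausible reading of the abstract, not the authors' implementation. It assumes k-means as the base clustering algorithm, and it assumes the distance between two clusterings is the fraction of point pairs on which they disagree about co-membership. The function names (`kmeans`, `assign`, `clustering_distance`, `bootstrap_instability`, `select_k`) and the defaults (`n_boot`, `n_init`, `n_iter`) are hypothetical choices made for illustration.

```python
import numpy as np

def assign(X, centers):
    """Label each row of X with the index of its nearest center."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(X, k, rng, n_init=5, n_iter=50):
    """Plain Lloyd's algorithm with random restarts (assumed base clusterer)."""
    best_centers, best_cost = None, np.inf
    for _ in range(n_init):
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            labels = assign(X, centers)
            # Keep the old center if a cluster happens to be empty.
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        cost = ((X - centers[assign(X, centers)]) ** 2).sum()
        if cost < best_cost:
            best_centers, best_cost = centers, cost
    return best_centers

def clustering_distance(a, b):
    """Fraction of ordered point pairs (diagonal included) on which two
    labelings disagree about whether the pair shares a cluster."""
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    return float(np.mean(same_a != same_b))

def bootstrap_instability(X, k, n_boot=20, seed=0):
    """Average distance between clusterings fitted on pairs of bootstrap
    samples and then both applied to the original data."""
    rng = np.random.default_rng(seed)
    n = len(X)
    dists = []
    for _ in range(n_boot):
        Xa = X[rng.integers(0, n, size=n)]  # first bootstrap sample
        Xb = X[rng.integers(0, n, size=n)]  # second bootstrap sample
        labels_a = assign(X, kmeans(Xa, k, rng))
        labels_b = assign(X, kmeans(Xb, k, rng))
        dists.append(clustering_distance(labels_a, labels_b))
    return float(np.mean(dists))

def select_k(X, k_values, n_boot=20, seed=0):
    """Return the candidate k with the smallest estimated instability."""
    scores = {k: bootstrap_instability(X, k, n_boot, seed) for k in k_values}
    return min(scores, key=scores.get), scores
```

In this sketch the candidate list should start at k = 2, because a single cluster always has zero instability and would be selected trivially. A call such as `select_k(X, range(2, 7))` returns the chosen k together with the estimated instability for each candidate.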

History

Publisher Statement

NOTICE: this is the author’s version of a work that was accepted for publication in Computational Statistics and Data Analysis. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computational Statistics and Data Analysis, Vol 56, Issue 3, (MAR 1 2012). DOI: 10.1016/j.csda.2011.09.003

Publisher

Elsevier

Language

en_US

ISSN

0167-9473

Issue date

1 March 2012
