Generalizations and Unification of Centroid-based Clustering Methods

Title: Generalizations and Unification of Centroid-based Clustering Methods
Author: Canas, Daniel Alberto
Advisors: Dr. Robert Funderlic, Committee Chair
Dr. Jon Doyle, Committee Member
Dr. Steffen Heber, Committee Member
Abstract: There are many clustering methods that are referred to as k-means-like. We give the minimal necessary and sufficient components for the mechanism of the k-means (iterative and partitional) clustering method of a finite set of objects, X. Thus k-means is generalized and the methods that mimic k-means are unified. We name these k-center clustering methods. The fundamental mechanism of k-center methods exposes the usual misconceptions about k-means, such as (a) "distance" satisfies some of the properties of a mathematical metric, (b) there is a need to measure "distance" between objects in X, and (c) the centers of clusters have the same nature as the objects of X. Moreover, k-center methods have a common formula to choose or calculate centers of clusters. We characterize the convergent common objective function by expressing it in terms of (a) a distance measure for closeness between center objects and the objects in X and (b) the coherence of clusters. We give a three-object example to demonstrate the components of the formal mechanism of a k-center method. We then give examples of various known methods that belong to the class of k-center methods. We exhibit an extensive and thorough comparison of the qualitative k-modes and the numerical spherical k-means. Included are paradigm applications, a matrix environment, an understanding of the duality of a dissimilarity and similarity measure, and an understanding of normalized X and the normalized centers of subsets of X.
Date: 2004-12-01
Degree: MS
Discipline: Computer Science
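
Illustrative sketch (not taken from the thesis): the abstract describes an iterative, partitional mechanism that needs only a closeness measure between a center and an object, plus a rule for choosing a center from a cluster. The Python below is one possible reading of that idea, instantiated with a k-modes-style mismatch count. The names k_center, mismatches, and mode_center, and the three sample objects, are assumptions made for this example and are not the thesis's own notation or its three-object example.

```python
from collections import Counter


def k_center(objects, initial_centers, dissimilarity, choose_center, max_iter=100):
    """Generic assign-then-update loop (a hypothetical reading of "k-center").

    dissimilarity(center, obj) compares a center with an object only;
    no object-to-object distance is ever computed.
    choose_center(members) returns a center for a non-empty list of objects.
    """
    centers = list(initial_centers)  # copy so the caller's list is not mutated
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each object joins the cluster of its closest center.
        new_assignment = [
            min(range(len(centers)), key=lambda j: dissimilarity(centers[j], x))
            for x in objects
        ]
        if new_assignment == assignment:
            break  # the partition stopped changing
        assignment = new_assignment
        # Update step: recompute each center from its current members.
        for j in range(len(centers)):
            members = [x for x, a in zip(objects, assignment) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = choose_center(members)
    return assignment, centers


def mismatches(center, obj):
    """k-modes-style dissimilarity: number of attributes that differ."""
    return sum(c != o for c, o in zip(center, obj))


def mode_center(members):
    """Attribute-wise most frequent value among the cluster members."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*members))


# Three made-up categorical objects, clustered with k = 2.
objects = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "square"),
]
assignment, centers = k_center(
    objects, [objects[0], objects[2]], mismatches, mode_center
)
```

Swapping in a different dissimilarity and center rule (for example, cosine dissimilarity with normalized mean vectors) would give a spherical k-means-style variant under the same loop, which is the kind of unification the abstract points to.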

Files in this item

etd.pdf (PDF, 249.2Kb)
