Browsing by Author "Jason Osborne, Committee Member"
- Estimating the Number of Clusters in Cluster Analysis (2007-03-08) Dasah, Julius Berry; David Dickey, Committee Member; Leonard Stefanski, Committee Co-Chair; Dennis Boos, Committee Chair; Jason Osborne, Committee Member

  In many applied fields of study, such as medicine, psychology, ecology, taxonomy and finance, one has to deal with massive amounts of noisy but structured data. A question that often arises in this context is whether the observations in these data fall into some "natural" groups and, if so, how many groups there are. This dissertation proposes a new quantity, called the maximal jump function, for assessing the number of groups in a data set. The estimated maximal jump function measures the excess transformed distortion attainable by fitting an extra cluster to a data set. By distortion, we mean the average distance between each observation and its nearest cluster center. The distortion $d_g$, in this sense, is a measure of the error incurred by fitting $g$ clusters to a data set. Three stopping rules based on the maximal jump function are proposed for determining the number of groups in a data set. A new procedure for clustering data sets with a common covariance structure is also introduced. The proposed methods are tested on a wide variety of real data, including DNA microarray data sets, as well as on high-dimensional simulated data possessing numerous "noisy" features/dimensions. To show the effectiveness of the proposed methods, comparisons are made to some well-known clustering methods.
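
  As a rough illustration of the distortion described in the abstract, the sketch below computes the average distance from each observation to its nearest cluster center. It is only a minimal sketch under stated assumptions: Euclidean distance is assumed (the abstract does not name the metric), the toy points and hand-picked centers are hypothetical, and the function name `distortion` is ours. The transformation and the maximal jump function themselves are not specified in the abstract, so they are not implemented here.

```python
import math

def distortion(points, centers):
    """Average distance from each point to its nearest center.

    Assumption: Euclidean distance; the abstract does not specify the metric.
    """
    return sum(min(math.dist(p, c) for c in centers) for p in points) / len(points)

# Hypothetical toy data: two tight pairs of points, far apart from each other.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# Hand-picked centers (not fitted by any clustering algorithm).
d1 = distortion(points, [(5.0, 1.0)])               # g = 1
d2 = distortion(points, [(0.0, 1.0), (10.0, 1.0)])  # g = 2

print(d1, d2)  # d1 equals sqrt(26); d2 equals 1.0
```

  In this toy case the distortion falls sharply when the second center is added, which is the kind of drop that a rule for choosing the number of groups would look for.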
