Class Definition: GapEvaluation

statistics: GapEvaluation

Gap evaluation for clustering solutions

The GapEvaluation class implements the gap statistic criterion for evaluating clustering solutions. A GapEvaluation object is a specialization of ClusterCriterion. It provides properties and methods to compute the gap statistic and its Monte-Carlo reference expectations, and to select the optimal number of clusters according to a chosen search method.

Create a GapEvaluation object by using the evalclusters function or by calling the class constructor directly.

See also: evalclusters, ClusterCriterion, CalinskiHarabaszEvaluation, DaviesBouldinEvaluation, SilhouetteEvaluation

Source Code: GapEvaluation

Properties

A positive integer specifying how many reference datasets are generated to compute the expected log within-cluster dispersion via Monte-Carlo simulation. This property is read-only.

A character vector, function handle, or numeric vector specifying the distance measure. A character vector or function handle names a distance accepted by pdist and is passed on to the clustering routines; a numeric vector is interpreted as a precomputed distance vector. This property is read-only.

A character vector naming the reference distribution used to generate reference datasets. Supported values include 'pca' and 'uniform'. This property is read-only.
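As an illustration of what a reference dataset is, the sketch below draws one dataset under the 'uniform' option, read here as sampling each feature uniformly over its observed range. It is a minimal Python sketch under that assumption, not the package's code; the helper name `uniform_reference` is hypothetical. The 'pca' option is commonly described as the same idea applied in a box aligned with the principal components of the data, and is not shown.

```python
import random


def uniform_reference(data, rng):
    """Draw one reference dataset uniformly over the bounding box of `data`.

    `data` is a list of rows (one row per observation). The result has the
    same number of rows and columns as `data`.
    """
    n_features = len(data[0])
    # Observed range of every feature (column).
    lows = [min(row[j] for row in data) for j in range(n_features)]
    highs = [max(row[j] for row in data) for j in range(n_features)]
    return [
        [rng.uniform(lows[j], highs[j]) for j in range(n_features)]
        for _ in data
    ]


data = [[0.0, 10.0], [1.0, 12.0], [4.0, 11.0], [2.0, 15.0]]
reference = uniform_reference(data, random.Random(0))
```

Repeating this draw once per reference dataset, clustering each draw, and averaging the resulting log dispersions gives the Monte-Carlo expectation used by the criterion.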

A character vector specifying the method used to select the optimal number of clusters from the gap statistic. Supported values include 'globalMaxSE' and 'firstMaxSE'. This property is read-only.
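The two search methods can differ in the number of clusters they pick. The Python sketch below follows the rules usually documented for these method names: 'globalMaxSE' takes the smallest K whose gap is within one standard error of the largest gap, and 'firstMaxSE' takes the smallest K with Gap(K) >= Gap(K+1) - SE(K+1). The function names, the sample values, and the fallback to the last K are assumptions made for illustration; the package's implementation may differ in details.

```python
def global_max_se(k_list, gap, se):
    """Smallest K with gap >= (largest gap) - (SE at the largest gap)."""
    i_max = max(range(len(gap)), key=lambda i: gap[i])
    threshold = gap[i_max] - se[i_max]
    for k, g in zip(k_list, gap):
        if g >= threshold:
            return k


def first_max_se(k_list, gap, se):
    """Smallest K with gap[K] >= gap[K+1] - se[K+1].

    Assumption: if no K satisfies the rule, the last K is returned.
    """
    for i in range(len(gap) - 1):
        if gap[i] >= gap[i + 1] - se[i + 1]:
            return k_list[i]
    return k_list[-1]


# Made-up values, chosen so that the two rules disagree.
k_list = [1, 2, 3, 4, 5]
gap = [0.10, 0.45, 0.40, 0.60, 0.58]
se = [0.02, 0.03, 0.03, 0.04, 0.04]

k_global = global_max_se(k_list, gap, se)  # 4: first gap within one SE of the maximum
k_first = first_max_se(k_list, gap, se)    # 2: first local plateau of the gap curve
```

With a gap curve that has an early local maximum, as here, 'firstMaxSE' stops at the first plateau while 'globalMaxSE' looks at the whole curve.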

A numeric vector containing the Monte-Carlo estimate of the expected values for the natural logarithm of the within-cluster dispersion, computed across the generated reference datasets. This property is read-only.

A numeric vector containing the observed values of the natural logarithm of the within-cluster dispersion computed on the actual data. This property is read-only.
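For one clustering solution, the log within-cluster dispersion can be sketched as follows. This Python example assumes squared Euclidean distance, in which case the pooled pairwise dispersion of the gap-statistic formulation reduces to the sum of squared distances from each point to its cluster centroid. The function name `log_within_dispersion` is hypothetical and the data are made up.

```python
import math


def log_within_dispersion(points, labels):
    """Natural log of the pooled within-cluster sum of squares.

    Assumes squared Euclidean distance between observations.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        # Squared distance of every member to its cluster centroid.
        total += sum(
            sum((p[j] - centroid[j]) ** 2 for j in range(dim)) for p in members
        )
    return math.log(total)


points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
log_w_two = log_within_dispersion(points, [1, 1, 2, 2])  # log(4)
log_w_one = log_within_dispersion(points, [1, 1, 1, 1])  # log(104)
```

One such value is computed for each inspected number of clusters, which is why the property is a vector.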

A numeric vector containing the standard error of the expected values for the natural logarithm of the within-cluster dispersion. This property is read-only.

A numeric vector containing the standard deviation of the Monte-Carlo estimates of the log within-cluster dispersion. This property is read-only.
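The quantities described in the last four properties are related. A minimal Python sketch for a single number of clusters, following the usual gap-statistic formulation: the gap is the expected reference log dispersion minus the observed one, and the standard error is the Monte-Carlo standard deviation scaled by sqrt(1 + 1/B). The function name and the numbers are illustrative, and the use of the population standard deviation is an assumption, since the text above does not say which variant the package uses.

```python
import math
import statistics


def gap_summary(log_w, reference_log_w):
    """Return (gap, expected_log_w, std_log_w, se) for one value of K.

    `log_w` is the observed log dispersion; `reference_log_w` holds the
    log dispersions of the B reference datasets.
    """
    b = len(reference_log_w)
    expected_log_w = statistics.fmean(reference_log_w)
    # Assumption: population standard deviation (divide by B).
    std_log_w = statistics.pstdev(reference_log_w)
    se = std_log_w * math.sqrt(1.0 + 1.0 / b)
    return expected_log_w - log_w, expected_log_w, std_log_w, se


gap, expected_log_w, std_log_w, se = gap_summary(1.5, [2.0, 2.2, 2.4, 2.2])
```

A larger gap means the observed clustering is tighter than what the reference distribution produces for the same number of clusters.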

Methods

GapEvaluation: obj = addK (obj, K)

Add a new array of cluster counts, K, to inspect in the GapEvaluation object. This updates the internal storage for the Monte-Carlo results and evaluates the gap statistic for the newly requested cluster counts.

ClusterCriterion: plot (obj)
ClusterCriterion: h = plot (obj)

Plot the gap statistic (criterion values) versus the inspected numbers of clusters and display error bars representing the Monte-Carlo standard deviations. Optionally returns the axes handle.

GapEvaluation: obj = compact (obj)

Return a compact representation of the GapEvaluation object. Currently not implemented; calling this method will issue a warning.