linkage
Produce a hierarchical clustering dendrogram.
d is the dissimilarity matrix relative to n observations,
formatted as a ((n-1)*n/2)x1 vector as produced by pdist.
Alternatively, x contains data formatted for input to pdist,
metric is a metric for pdist, and arglist is a cell array
containing arguments that are passed to pdist.
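The condensed vector layout can be illustrated with a small Python sketch. It is not part of the Octave package: the helper names condensed_distances and condensed_index are hypothetical, the Euclidean metric is chosen only for the example, and the pair ordering (1,2), (1,3), ..., (n-1,n) is an assumption about how pdist arranges its output.

```python
import math
from itertools import combinations

def condensed_distances(points):
    """Euclidean distances for all pairs (i, j) with i < j, as a flat list."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]

def condensed_index(n, i, j):
    """0-based position of the pair (i, j), i != j, in the flat list."""
    if i > j:
        i, j = j, i
    # Pairs that start with an index smaller than i come first,
    # then the pairs (i, i+1), ..., (i, j).
    return i * n - i * (i + 1) // 2 + (j - i - 1)

points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 1.0)]
n = len(points)
d = condensed_distances(points)

# n observations give (n-1)*n/2 pairwise dissimilarities.
assert len(d) == (n - 1) * n // 2
# Distance between observations 0 and 2 (0-based), looked up in either order.
d_02 = d[condensed_index(n, 2, 0)]
```

Storing only one triangle of the symmetric matrix halves the memory needed, at the cost of the index arithmetic shown in condensed_index.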
linkage starts by putting each observation into a singleton
cluster and numbering those from 1 to n. Then it merges two
clusters, chosen according to method, to create a new cluster
numbered n+1, and so on until all observations are grouped into
a single cluster numbered 2n-1. Row k of the
(n-1)x3 output matrix relates to cluster n+k: the first
two columns are the numbers of the two component clusters and column
3 contains their distance.
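The merging and numbering scheme described above can be sketched as a naive Python loop. This is an illustration under stated assumptions, not the Octave implementation: the function name single_linkage is hypothetical, the minimum-distance rule with a Euclidean metric stands in for method, and the tie-breaking and the order of the two cluster numbers within a row may differ from what linkage returns.

```python
import math

def single_linkage(points):
    """Return the n-1 merge rows [a, b, dist] using 1-based cluster numbers."""
    n = len(points)
    # Active clusters: cluster number -> list of member indices (0-based).
    clusters = {i + 1: [i] for i in range(n)}
    rows = []
    for k in range(1, n):
        best = None
        labels = sorted(clusters)
        # Examine every pair of active clusters and keep the closest one.
        for pos, a in enumerate(labels):
            for b in labels[pos + 1:]:
                dist = min(math.dist(points[p], points[q])
                           for p in clusters[a] for q in clusters[b])
                if best is None or dist < best[2]:
                    best = (a, b, dist)
        a, b, dist = best
        # Row k describes the new cluster, which receives number n + k.
        clusters[n + k] = clusters.pop(a) + clusters.pop(b)
        rows.append([a, b, dist])
    return rows

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 3.0)]
rows = single_linkage(points)
# Four observations give three rows; the last merge creates cluster 7.
```

With these four points, observations 1 and 2 merge first into cluster 5, observations 3 and 4 merge into cluster 6, and the final row joins clusters 5 and 6.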
method defines how the distance between two clusters is computed and how distances are recomputed when two clusters are merged:
"single" (default): The distance between two clusters is the minimum distance between two elements, one from each cluster. Produces a cluster tree known as the minimum spanning tree.
"complete": The furthest distance between two elements, one from each cluster.
"average": Unweighted pair group method with averaging (UPGMA). The mean distance between all pairs of elements, one from each cluster.
"weighted": Weighted pair group method with averaging (WPGMA). When two clusters A and B are joined together, the new distance to a cluster C is the mean of the distances A-C and B-C.
"centroid": Unweighted pair group method using centroids (UPGMC). Assumes a Euclidean metric. The distance between cluster centroids, each centroid being the center of mass of a cluster.
"median": Weighted pair group method using centroids (WPGMC). Assumes a Euclidean metric. The distance between cluster centroids. When two clusters are joined together, the new centroid is the midpoint between the joined centroids.
"ward": Ward’s sum of squared deviations about the group mean (ESS). Also known as minimum variance or inner squared distance. Assumes a Euclidean metric. Measures how much the moment of inertia of the merged cluster exceeds the sum of those of the individual clusters.
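The cluster distances described above can be written out directly from their definitions. The following Python sketch is illustrative only: all function names are hypothetical, clusters are plain lists of coordinate tuples, and ward_increase returns the raw increase in the sum of squared deviations, which is not necessarily the value that linkage stores in column 3.

```python
import math
from itertools import product

def centroid(cluster):
    """Center of mass of a list of points."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))

def ess(cluster):
    """Sum of squared deviations of the members about the group mean."""
    c = centroid(cluster)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in cluster)

def single(a, b):
    """Minimum distance between one element of a and one element of b."""
    return min(math.dist(p, q) for p, q in product(a, b))

def complete(a, b):
    """Furthest distance between one element of a and one element of b."""
    return max(math.dist(p, q) for p, q in product(a, b))

def average(a, b):
    """Mean distance over all cross-cluster pairs (UPGMA)."""
    return sum(math.dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))

def centroid_distance(a, b):
    """Distance between the two centers of mass (UPGMC)."""
    return math.dist(centroid(a), centroid(b))

def ward_increase(a, b):
    """How much the ESS of the merged cluster exceeds the sum of the parts."""
    return ess(a + b) - ess(a) - ess(b)

def average_update(d_ac, d_bc, size_a, size_b):
    """UPGMA distance from the merged cluster A+B to C, weighted by size."""
    return (size_a * d_ac + size_b * d_bc) / (size_a + size_b)

def weighted_update(d_ac, d_bc):
    """WPGMA distance from the merged cluster A+B to C: a plain mean."""
    return (d_ac + d_bc) / 2

a = [(0.0, 0.0), (0.0, 2.0)]
b = [(4.0, 0.0), (4.0, 2.0)]
gap = single(a, b)            # closest cross-cluster pair
inertia = ward_increase(a, b) # ESS of a + b minus ESS of a and of b
```

The two update functions show the difference between the unweighted and weighted averaging rules: the first weights each old distance by cluster size, so every original observation counts equally, while the second treats the two merged clusters equally regardless of size.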
Reference: Ward, J. H. Hierarchical Grouping to Optimize an Objective Function. J. Am. Statist. Assoc. 1963, 58, 236-244, http://iv.slis.indiana.edu/sw/data/ward.pdf.
See also: pdist, squareform