Function Reference: pdist2

statistics: D = pdist2 (X, Y)
statistics: D = pdist2 (X, Y, Distance)
statistics: D = pdist2 (X, Y, Distance, DistParameter)
statistics: D = pdist2 (…, Name, Value)
statistics: [D, I] = pdist2 (…, Name, Value)

Compute pairwise distance between two sets of vectors.

D = pdist2 (X, Y) calculates the Euclidean distance between each pair of observations in X and Y. Let X be an M×P matrix representing M points in P-dimensional space and Y be an N×P matrix representing another set of points in the same space. This function computes the M×N distance matrix D, where D(i,j) is the distance between X(i,:) and Y(j,:).
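As a minimal sketch of the default behaviour, assuming the statistics package is installed and loaded, the following computes the distance matrix for two small point sets. The points are made up for illustration; the expected values follow from the 3-4-5 right triangle.

```octave
pkg load statistics

X = [0 0; 3 4];        # M = 2 points in P = 2 dimensions
Y = [0 0; 3 0; 6 8];   # N = 3 points in the same space

D = pdist2 (X, Y)      # 2x3 matrix, D(i,j) = norm (X(i,:) - Y(j,:))
## D =
##     0    3   10
##     5    4    5
```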

D = pdist2 (X, Y, Distance) returns the distance between each pair of observations in X and Y using the metric specified by Distance, which can be any of the following options.

"euclidean": Euclidean distance.
"squaredeuclidean": Squared Euclidean distance.
"seuclidean": Standardized Euclidean distance. Each coordinate difference between the rows in X and the query matrix Y is scaled by dividing by the corresponding element of the standard deviation computed from X. A different scaling vector can be specified with the subsequent DistParameter input argument.
"mahalanobis": Mahalanobis distance, computed using a positive definite covariance matrix. A different covariance matrix can be specified with the subsequent DistParameter input argument.
"cityblock": City block distance.
"minkowski": Minkowski distance. The default exponent is 2. A different exponent can be specified with the subsequent DistParameter input argument.
"chebychev": Chebychev distance (maximum coordinate difference).
"cosine": One minus the cosine of the included angle between points (treated as vectors).
"correlation": One minus the sample linear correlation between observations (treated as sequences of values).
"hamming": Hamming distance, which is the fraction of coordinates that differ.
"jaccard": One minus the Jaccard coefficient, which is the fraction of nonzero coordinates that differ.
"spearman": One minus the sample Spearman’s rank correlation between observations (treated as sequences of values).
@distfun: Custom distance function handle. The distance function must be of the form function D2 = distfun (XI, YI), where XI is a 1×P vector containing a single observation in P-dimensional space, YI is an N×P matrix containing an arbitrary number of observations in the same P-dimensional space, and D2 is an N×1 vector of distances, where D2(k) is the distance between observations XI and YI(k,:).
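The custom handle form can be sketched as follows. The handle name absdiff and the data are hypothetical, chosen for illustration only; the handle reimplements the city block metric so that its result can be compared against the built-in option.

```octave
pkg load statistics

X = [0 0; 3 4];
Y = [0 0; 3 0; 6 8];

## Hypothetical custom metric: sum of absolute coordinate differences.
## XI is one observation (1xP) and YI holds N observations (NxP).
## Broadcasting makes XI - YI an NxP matrix, and summing along
## dimension 2 yields one distance per row of YI (an Nx1 vector).
absdiff = @(XI, YI) sum (abs (XI - YI), 2);

D_custom  = pdist2 (X, Y, absdiff);
D_builtin = pdist2 (X, Y, "cityblock");

## The two results are expected to agree, so this should display 0.
disp (max (abs (D_custom(:) - D_builtin(:))))
```

Writing the handle so that it returns one distance per row of YI keeps it consistent with the form described above.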

D = pdist2 (X, Y, Distance, DistParameter) returns the distance using the metric specified by Distance and DistParameter. DistParameter can only be specified when the selected Distance is "seuclidean", "minkowski", or "mahalanobis".
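A short sketch of DistParameter for two of these metrics, assuming the statistics package is loaded. The parameter values are picked so that each result can be checked against a simpler metric: a Minkowski exponent of 1 reduces to the city block distance, and a unit scaling vector reduces the standardized Euclidean distance to the plain Euclidean distance. Passing the scaling vector as a row with one entry per column of X is an assumption of this example.

```octave
pkg load statistics

X = [0 0; 3 4; 1 2];
Y = [0 0; 3 0; 6 8];

## Minkowski distance with exponent 1 instead of the default 2.
D_mink = pdist2 (X, Y, "minkowski", 1);
D_city = pdist2 (X, Y, "cityblock");

## Standardized Euclidean distance with an explicit scaling vector
## (all ones) instead of the standard deviation computed from X.
D_seuc = pdist2 (X, Y, "seuclidean", ones (1, columns (X)));
D_eucl = pdist2 (X, Y, "euclidean");

## Both differences are expected to be zero up to rounding.
disp (max (abs (D_mink(:) - D_city(:))))
disp (max (abs (D_seuc(:) - D_eucl(:))))
```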

D = pdist2 (…, Name, Value) modifies the computation using Name-Value pair arguments, in addition to any of the input arguments of the previous syntaxes.

  • D = pdist2 (X, Y, Distance, "Smallest", K) computes the distance using the metric specified by Distance and returns the K smallest pairwise distances to observations in X for each observation in Y in ascending order.
  • D = pdist2 (X, Y, Distance, DistParameter, "Largest", K) computes the distance using the metric specified by Distance and DistParameter and returns the K largest pairwise distances in descending order.
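The two options above can be sketched on a small made-up data set, assuming the statistics package is loaded. The full distance matrix for these points is [0 3 10; 5 4 5], so, per the description above, the smallest distances for the three observations in Y should be 0, 3, and 5. The example also assumes that "Largest" is likewise evaluated per observation in Y, which would give 5, 4, and 10.

```octave
pkg load statistics

X = [0 0; 3 4];
Y = [0 0; 3 0; 6 8];

## For each observation in Y, keep only the smallest distance to X.
Dmin = pdist2 (X, Y, "euclidean", "Smallest", 1);

## Keep only the largest distance instead, here with an explicit
## Minkowski exponent passed as DistParameter.
Dmax = pdist2 (X, Y, "minkowski", 2, "Largest", 1);

disp (Dmin)
disp (Dmax)
```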

[D, I] = pdist2 (…, Name, Value) also returns the matrix I, which contains the indices of the observations in X corresponding to the distances in D. You must specify either "Smallest" or "Largest" as an optional Name-Value pair argument to compute the second output argument.
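One use of the index output is a nearest-neighbour lookup. The following is a minimal sketch on made-up data, assuming the statistics package is loaded; with K = 1, I is expected to hold one index into X per observation in Y, so I(:) can be used to select rows of X directly.

```octave
pkg load statistics

X = [0 0; 3 4];
Y = [0 0; 3 0; 6 8];

## Smallest distance from each observation in Y to X, together with
## the index of the observation in X that attains it.
[D, I] = pdist2 (X, Y, "euclidean", "Smallest", 1);

## Row k of nearest is the observation in X closest to Y(k,:).
nearest = X(I(:), :);
disp (nearest)
```

For larger K, or when only neighbour indices are needed, knnsearch (listed under See also) may be the more direct tool.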

See also: pdist, knnsearch, rangesearch
