Statistics: fitgmdist

Function Reference: `fitgmdist`

statistics: GMdist = fitgmdist (data, k, param1, value1, …)

Fit a Gaussian mixture model with k components to data. Each row of data is a data sample. Each column is a variable.

Optional parameters are:

"start": Initialization conditions. Possible values are:
- "randSample" (default) Draws the initial means uniformly at random from the rows of data.
- "plus" Uses k-means++ to initialize the means.
- "cluster" Performs an initial clustering with 10% of the data.
- vector A vector with one entry per row of data. Each entry is an integer from 1 to k that specifies the component to which that row is initially allocated. The mean, variance, and weight of each component are calculated from this allocation.
- structure A structure with fields mu, Sigma and ComponentProportion.
For "randSample", "plus", and "cluster", the initial variance of each component is the variance of the entire data sample.
"Replicates": Number of random restarts to perform.
"RegularizationValue" or "Regularize": A small number added to the diagonal entries of the covariance to prevent singular covariances.
"SharedCovariance" or "SharedCov" (logical). True if all components must share the same covariance, which reduces the number of free parameters.
"CovarianceType" or "CovType" (string). Possible values are:
- "full" (default) Allow arbitrary covariance matrices.
- "diagonal" Force covariances to be diagonal, to reduce the number of free parameters.
"Options": A structure with all of the following fields:
- MaxIter Maximum number of EM iterations (default 100).
- TolFun EM terminates when the increase in likelihood falls below this threshold (default 1e-6).
- Display Possible values are:
  - "off" (default): Display nothing.
  - "final": Display the total number of iterations and likelihood once the execution completes.
  - "iter": Display the iteration number and likelihood after each iteration.
"Weight": A column vector or $N \times 2$ matrix. The first column consists of non-negative weights given to the samples. If these are all integers, this is equivalent to specifying weight(i) copies of row i of data, but potentially faster. If a row of data is used to represent samples that are similar but not identical, then the second column of weight indicates the variance of those original samples. Specifically, in the EM algorithm, the contribution of row i towards the variance is set to at least weight(i,2), to prevent spurious components with zero variance.
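
The options above can be combined in a single call. The following is a minimal sketch, not a recommended configuration: the synthetic data, the seed, and every parameter value are illustrative assumptions. It passes an "Options" structure containing all three fields, uses k-means++ initialization with several restarts, and then fits a second model from an explicit allocation vector. Reading the fitted means through a `mu` property assumes the returned object exposes the same field names as the "Start" structure.

```octave
pkg load statistics

## Synthetic two-cluster data (illustrative values only)
randn ("state", 42);
rand ("state", 42);
data = [randn(100, 2) + 2; randn(100, 2) - 2];

## "Options" must contain all three fields: MaxIter, TolFun and Display
opts = struct ("MaxIter", 200, "TolFun", 1e-8, "Display", "off");

## k-means++ initialization, diagonal covariances, 5 random restarts
gm_plus = fitgmdist (data, 2, "Start", "plus", ...
                     "CovarianceType", "diagonal", ...
                     "Replicates", 5, ...
                     "RegularizationValue", 1e-6, ...
                     "Options", opts);

## Initial allocation given as a vector: rows 1-100 start in
## component 1, rows 101-200 start in component 2
idx = [ones(100, 1); 2 * ones(100, 1)];
gm_vec = fitgmdist (data, 2, "Start", idx);
```

Restarts are only meaningful for the randomized initializations, which is why "Replicates" is used with "plus" here and not with the fixed allocation vector.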

See also: gmdistribution, kmeans

Source Code: fitgmdist

Example: 1


 ## Generate a two-cluster problem
 C1 = randn (100, 2) + 2;
 C2 = randn (100, 2) - 2;
 data = [C1; C2];

 ## Perform clustering
 GMModel = fitgmdist (data, 2);

 ## Plot the result
 figure
 [heights, bins] = hist3 (data);
 [xx, yy] = meshgrid (bins{1}, bins{2});
 bbins = [xx(:), yy(:)];
 ## Pass the grid coordinates so the axes are in data units
 contour (xx, yy, reshape (GMModel.pdf (bbins), size (xx)));

Example: 2


 Angle_Theta = [ 30 + 10 * randn(1, 10),  60 + 10 * randn(1, 10) ]';
 nbOrientations = 2;
 initial_orientations = [38.0; 18.0];
 initial_weights = ones (1, nbOrientations) / nbOrientations;
 initial_Sigma = 10 * ones (1, 1, nbOrientations);
 start = struct ("mu", initial_orientations, "Sigma", initial_Sigma, ...
                 "ComponentProportion", initial_weights);
 GMModel_Theta = fitgmdist (Angle_Theta, nbOrientations, "Start", start , ...
                            "RegularizationValue", 0.0001)

Gaussian mixture distribution with 2 components in 1 dimension(s)
Clust 1: weight 0.701113
	Mean: 50.5551 
	Variance:135.42
Clust 2: weight 0.298887
	Mean: 19.3242 
	Variance:23.764
AIC=175.832 BIC=180.811 NLogL=82.9162 Iter=10 Cged=1 Reg=0.0001
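
Once a model has been fitted, its parameters can be read back and used to assign samples to components. The sketch below makes two assumptions: that the returned gmdistribution object exposes the properties `mu`, `Sigma` and `ComponentProportion` (the same names as the fields of the "Start" structure), and that it provides `cluster` and `posterior` methods. The data and seed are illustrative.

```octave
pkg load statistics

## Synthetic two-cluster data, as in Example 1 (illustrative values)
randn ("state", 7);
rand ("state", 7);
data = [randn(100, 2) + 2; randn(100, 2) - 2];
GMModel = fitgmdist (data, 2);

## Fitted parameters (property names assumed to mirror the "Start" fields)
mu = GMModel.mu;                     # one row per component
Sigma = GMModel.Sigma;               # covariance of each component
w = GMModel.ComponentProportion;     # mixing weights

## Hard assignment of each row to its most probable component
idx = cluster (GMModel, data);

## Posterior probability of each component for each row
P = posterior (GMModel, data);
```

Each row of `P` sums to 1, and `idx` contains values from 1 to k.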
