Statistics: fillmissing

Function Reference: `fillmissing`

statistics: B = fillmissing (A, "constant", v)
statistics: B = fillmissing (A, method)
statistics: B = fillmissing (A, move_method, window_size)
statistics: B = fillmissing (A, fill_function, window_size)
statistics: B = fillmissing (…, dim)
statistics: B = fillmissing (…, PropertyName, PropertyValue)
statistics: [B, idx] = fillmissing (…)

Replace missing entries of array A either with values in v or with values computed by one of the methods described below. Which values count as ’missing’ depends on the data type of A, as identified by the function ismissing, and is currently defined as:

NaN: single, double
" " (white space): char
{""} (white space in cell): string cells.

A can be a numeric scalar or array, a character vector or array, or a cell array of character vectors (a.k.a. string cells).

v can be a scalar or an array of replacement values whose data type is compatible with A. If v is an array, its number of elements must equal the number of elements orthogonal to the operating dimension. E.g., if size(A) = [3 5 4], operating along dim = 2 requires v to contain either 1 or 3x4=12 elements.

If requested, the optional output idx is a logical array of the same shape as A in which 1’s mark the locations in A that were filled.
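
The constant-fill rules above can be sketched as follows. This is a minimal example, assuming the statistics package is installed and loadable with `pkg load statistics`; the input matrix and the names B1, B2 are illustrative only.

```octave
pkg load statistics   % assumption: the statistics package is installed

A = [1 NaN 3; NaN 5 NaN; 7 8 NaN];

% Scalar v: every missing entry receives the same value.
B1 = fillmissing (A, "constant", 0);

% Vector v: operating along dim 1 (the default for this matrix), v needs
% one element per column, i.e. 3 elements for this 3x3 input.
% idx marks which entries of A were filled.
[B2, idx] = fillmissing (A, "constant", [10 20 30]);

disp (B1);
disp (B2);
disp (idx);
```

Here each column of B2 should use its own fill value, and idx should be true exactly where A held NaN.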

Alternate Input Arguments and Values:

method - replace missing values with:

next

previous

nearest

next, previous, or nearest non-missing value (nearest defaults to next when equidistant, as determined by SamplePoints).

linear

linear interpolation of neighboring, non-missing values

spline

piecewise cubic spline interpolation of neighboring, non-missing values

pchip

’shape preserving’ piecewise cubic spline interpolation of neighboring, non-missing values
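
A short sketch comparing the methods listed above on an interior gap, assuming the statistics package is installed. The vectors A and Q are made-up test data; Q samples x.^2 at x = 1, 2, 4, 5 with the value at x = 3 missing.

```octave
pkg load statistics   % assumption: the statistics package is installed

A = [1 NaN NaN NaN 9];

B_next = fillmissing (A, "next");      % [1 9 9 9 9]
B_prev = fillmissing (A, "previous");  % [1 1 1 1 9]
B_near = fillmissing (A, "nearest");   % [1 1 9 9 9], the tie at index 3 goes to "next"
B_lin  = fillmissing (A, "linear");    % [1 3 5 7 9]

% Cubic interpolants on samples of x.^2; the gap sits at x = 3.
Q = [1 4 NaN 16 25];
B_spl = fillmissing (Q, "spline");     % close to 9 at the gap
B_pch = fillmissing (Q, "pchip");      % stays between the neighbors 4 and 16

disp ([B_next; B_prev; B_near; B_lin]);
disp ([B_spl; B_pch]);
```
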
move_method - moving window calculated replacement values:

movmean

movmedian

moving average or median using a window determined by window_size. window_size must be either a positive scalar value or a two-element positive vector of sizes [nb, na], measured in the same units as SamplePoints. For scalar values, the window is centered on the missing element and includes all data points within a distance of half of window_size on either side of the window center point. Note that for compatibility, when using a scalar value, the backward window limit is inclusive and the forward limit is exclusive. If a two-element window_size vector is specified, the window includes all points within a distance of nb backward and na forward from the current element at the window center (both limits inclusive).
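
The window rules above can be illustrated with a small sketch, assuming the statistics package is installed and default sample points 1, 2, 3, ... are in effect. The data vector is invented for illustration; the missing element sits at location 3.

```octave
pkg load statistics   % assumption: the statistics package is installed

A = [1 3 NaN 8 20];

% Scalar window 3: covers [1.5, 4.5), so the neighbors 3 and 8 are averaged.
B3 = fillmissing (A, "movmean", 3);

% Scalar window 4: covers [1, 5); location 1 is included (inclusive backward
% limit), location 5 is excluded (exclusive forward limit).
B4 = fillmissing (A, "movmean", 4);

% Two-element window [nb na] = [2 1]: covers locations 1 to 4, both inclusive.
B21 = fillmissing (A, "movmedian", [2 1]);

printf ("%g %g %g\n", B3(3), B4(3), B21(3));
```

Under these rules the three fills at location 3 should be 5.5, 4, and 3 respectively.
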
fill_function - custom method specified as a function handle. The supplied fill function must accept three inputs in the following order for each missing gap in the data:

A_values -

elements of A within the window on either side of the gap as determined by window_size. (Note these elements can include missing values from other nearby gaps.)

A_locs -

locations of the reference data, A_values, in terms of the default or specified SamplePoints.

gap_locs -

location of the gap data points that need to be filled in terms of the default or specified SamplePoints.

The supplied function must return a scalar or a vector with the same number of elements as gap_locs. The required window_size parameter follows similar rules as for the moving average and median methods described above, with two exceptions: (1) each gap is processed as a single element, rather than gap elements being processed individually, and (2) the window extended on either side of the gap has inclusive endpoints regardless of how window_size is specified.
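
As a hedged illustration of the three-argument contract, the sketch below defines two hypothetical fill functions (lin_fill and mean_fill are names invented here, not part of the package) and applies them with a two-element window. It assumes the statistics package is installed.

```octave
pkg load statistics   % assumption: the statistics package is installed

A = [1 2 NaN NaN 5 6];

% Hypothetical fill function returning one value per gap location:
% a straight line through the reference data inside the window.
lin_fill = @(A_values, A_locs, gap_locs) interp1 (A_locs, A_values, gap_locs);

% Hypothetical fill function returning a scalar that is used for the whole gap.
mean_fill = @(A_values, A_locs, gap_locs) mean (A_values);

% Window [1 1]: one sample-point unit before and after the gap (inclusive),
% so the reference data are the values 2 and 5 at locations 2 and 5.
B_lin  = fillmissing (A, lin_fill, [1 1]);
B_mean = fillmissing (A, mean_fill, [1 1]);

disp (B_lin);
disp (B_mean);
```

Because the gap is handed to the function as one unit, lin_fill receives both gap locations (3 and 4) in a single call.
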
dim - specify a dimension for vector operation (default = first non-singleton dimension)
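
A minimal sketch of how dim changes the result, assuming the statistics package is installed. The matrix is illustrative; its single missing entry has different neighbors along columns and along rows.

```octave
pkg load statistics   % assumption: the statistics package is installed

A = [1 2 3; 4 NaN 12; 7 8 9];

B_default = fillmissing (A, "linear");     % first non-singleton dimension (dim 1)
B_dim1    = fillmissing (A, "linear", 1);  % down columns: between 2 and 8
B_dim2    = fillmissing (A, "linear", 2);  % along rows: between 4 and 12

printf ("%g %g %g\n", B_default(2,2), B_dim1(2,2), B_dim2(2,2));
```
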
PropertyName-PropertyValue pairs

SamplePoints

PropertyValue is a vector of sample point values representing the sorted and unique x-axis values of the data in A. If unspecified, the default is assumed to be the vector [1 : size (A, dim)]. The values in SamplePoints will affect methods and properties that rely on the effective distance between data points in A, such as interpolants and moving window functions where the window_size specified for moving window functions is measured relative to the SamplePoints.
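
The effect of SamplePoints on an interpolating method can be sketched as follows, assuming the statistics package is installed. The vector x is an invented, unevenly spaced set of sample locations.

```octave
pkg load statistics   % assumption: the statistics package is installed

A = [0 NaN 10];
x = [0 1 10];   % illustrative, non-uniform sample points

B_default = fillmissing (A, "linear");                     % locations 1, 2, 3
B_sampled = fillmissing (A, "linear", "SamplePoints", x);  % locations 0, 1, 10

disp (B_default);
disp (B_sampled);
```

With default spacing the gap lies halfway between its neighbors and is filled with 5; with the sample points given, it lies one tenth of the way and is filled with 1.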

EndValues

Apply a separate handling method for missing values at the front or back of the array. PropertyValue can be:
- A constant scalar or array with the same shape requirements as v.
- none - Do not fill end gap values.
- extrap - Use the same procedure as method to fill the end gap values.
- Any valid method listed above except for movmean, movmedian, and fill_function. Those methods can only be applied to end gap values with extrap.
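
The EndValues options above can be compared on a vector with missing values at both ends. This is a sketch assuming the statistics package is installed; the data are illustrative.

```octave
pkg load statistics   % assumption: the statistics package is installed

A = [NaN 2 NaN 4 NaN];

B_none   = fillmissing (A, "linear", "EndValues", "none");     % ends stay NaN
B_extrap = fillmissing (A, "linear", "EndValues", "extrap");   % ends extrapolated linearly
B_near   = fillmissing (A, "linear", "EndValues", "nearest");  % ends copy nearest value
B_const  = fillmissing (A, "linear", "EndValues", 0);          % ends set to 0

disp ([B_none; B_extrap; B_near; B_const]);
```

In every case the interior gap at location 3 is still filled by the main method, here linear interpolation.
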
MissingLocations

PropertyValue must be a logical array the same size as A indicating locations of known missing data with a value of true. Cannot be combined with MaxGap.
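
A sketch of MissingLocations used to flag a sentinel value as missing, assuming the statistics package is installed. The sentinel -999 and the name mask are illustrative choices.

```octave
pkg load statistics   % assumption: the statistics package is installed

A = [1 2 -999 4];
mask = (A == -999);   % true where the data are known to be missing

[B, idx] = fillmissing (A, "linear", "MissingLocations", mask);

disp (B);
disp (idx);
```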

MaxGap

PropertyValue is a numeric scalar indicating the maximum gap length to fill, measured on the same distance scale as the sample points. Gap length is calculated as the difference in locations of the sample points on either side of the gap, and gaps larger than MaxGap are ignored by fillmissing. Cannot be combined with MissingLocations.
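
The gap-length rule can be sketched on a vector with one short and one long gap, assuming the statistics package is installed and default sample points are used. The data are illustrative.

```octave
pkg load statistics   % assumption: the statistics package is installed

% Gap at location 2 is bounded by locations 1 and 3 (length 2).
% Gap at locations 4 to 6 is bounded by locations 3 and 7 (length 4).
A = [1 NaN 3 NaN NaN NaN 7];

[B, idx] = fillmissing (A, "linear", "MaxGap", 3);

disp (B);
disp (idx);
```

Only the first gap is shorter than MaxGap, so only location 2 should be filled.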

Compatibility Notes:

Numerical and logical inputs for A and v may be specified in any combination. The output will be the same class as A, with v converted to that data type for filling. Only single and double have defined ’missing’ values, so unless the MissingLocations option is used to identify missing values in logical or other numeric data types, the output for those types will always be B = A with idx = false(size(A)).
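
The class behavior described above can be sketched as follows, assuming the statistics package is installed. The variable names are illustrative.

```octave
pkg load statistics   % assumption: the statistics package is installed

% Integer input has no defined missing value, so nothing is filled.
A_int = int32 ([1 2 3]);
[B_int, idx_int] = fillmissing (A_int, "constant", 0);

% MissingLocations is the way to mark missing entries in integer data.
B_marked = fillmissing (A_int, "constant", 0, ...
                        "MissingLocations", logical ([0 1 0]));

% The output keeps the class of A; v is converted for filling.
A_single = single ([1 NaN 3]);
B_single = fillmissing (A_single, "constant", 2);

disp (class (B_int));
disp (class (B_single));
```
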
All interpolation methods can be individually applied to EndValues.
MATLAB’s fill_function method currently has several inconsistencies with the other methods (tested against version 2022a), and Octave’s implementation has chosen the following consistent behavior over compatibility: (1) a column full of missing data is considered part of EndValues, (2) such columns are then excluded from fill_function processing because the moving window is always empty, and (3) operations in dimensions higher than 2 perform identically to operations in dims 1 and 2, most notably on vectors.
Method "makima" is not yet implemented in interp1, which is used by fillmissing. Attempting to call this method will produce an error until the method is implemented in interp1.

See also: ismissing, rmmissing, standardizeMissing

Source Code: fillmissing
