Function Reference: fillmissing

statistics: B = fillmissing (A, 'constant', v)
statistics: B = fillmissing (A, method)
statistics: B = fillmissing (A, move_method, window_size)
statistics: B = fillmissing (A, fill_function, window_size)
statistics: B = fillmissing (…, dim)
statistics: B = fillmissing (…, PropertyName, PropertyValue)
statistics: [B, idx] = fillmissing (…)

Replace missing entries of array A either with values in v or as determined by other specified methods. ’missing’ values are determined by the data type of A as identified by the function ismissing, currently defined as:

  • NaN: single, double
  • " " (white space): char
  • {""} (white space in cell): string cells.
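As a quick check of these definitions, the following sketch calls ismissing on one input of each class. It assumes the statistics package is installed and can be loaded with pkg load statistics; the sample inputs and variable names are illustrative only.

```octave
pkg load statistics   % assumption: the statistics package is installed

num_mask  = ismissing ([1 NaN 3]);        % NaN marks a missing double
char_mask = ismissing ('a b');            % white space marks a missing char
cell_mask = ismissing ({'a', '', 'c'});   % empty string marks a missing cell entry
```

Each mask has the same shape as its input, with true at the positions that fillmissing would treat as missing.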

A can be a numeric scalar or array, a character vector or array, or a cell array of character vectors (a.k.a. string cells).

v can be a scalar or an array of replacement values for the missing values in A, with a data type compatible for insertion into A. A non-scalar v must contain as many elements as there are elements orthogonal to the operating dimension. E.g., if size(A) = [3 5 4], operating along dim = 2 requires v to contain either 1 or 3x4=12 elements.
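The shape rule for v can be illustrated on a small matrix. This is a minimal sketch, assuming the statistics package is installed; the matrix and fill values are made up for illustration.

```octave
pkg load statistics   % assumption: the statistics package is installed

A = [1 NaN 3; NaN 5 NaN];   % size (A) = [2 3]

% Scalar v: every missing entry receives the same value.
B_scalar = fillmissing (A, 'constant', 0);

% Operating along dim = 1 (the default here), 3 elements are orthogonal
% to the operating dimension, so a non-scalar v holds one value per column.
B_cols = fillmissing (A, 'constant', [10 20 30]);

% Operating along dim = 2, v holds one value per row instead.
B_rows = fillmissing (A, 'constant', [10; 20], 2);
```

With these inputs, B_cols fills each NaN with its column's value and B_rows fills each NaN with its row's value.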

If requested, the optional output idx will contain a logical array of the same shape as A, with true (1) marking the locations in A that were filled.
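A short sketch of the second output, assuming the statistics package is installed; the input vector and the n_filled helper variable are illustrative.

```octave
pkg load statistics   % assumption: the statistics package is installed

A = [1 NaN 3 NaN];
[B, idx] = fillmissing (A, 'constant', 0);

% idx is true exactly where A was missing and B received a fill value,
% so it can be used to count or inspect the filled entries.
n_filled = nnz (idx);
```

Indexing with B(idx) then returns only the values that fillmissing inserted.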

Alternate Input Arguments and Values:

  • method - replace missing values with:
    next
    previous
    nearest

    next, previous, or nearest non-missing value (nearest defaults to next when equidistant, as determined by SamplePoints).

    linear

    linear interpolation of neighboring, non-missing values

    spline

    piecewise cubic spline interpolation of neighboring, non-missing values

    pchip

    ’shape preserving’ piecewise cubic spline interpolation of neighboring, non-missing values

  • move_method - moving window calculated replacement values:
    movmean
    movmedian

    moving average or median using a window determined by window_size. window_size must be either a positive scalar value or a two-element positive vector of sizes [nb, na] measured in the same units as SamplePoints. For scalar values, the window is centered on the missing element and includes all data points within a distance of half of window_size on either side of the window center point. Note that for compatibility, when using a scalar value, the backward window limit is inclusive and the forward limit is exclusive. If a two-element window_size vector is specified, the window includes all points within a distance of nb backward and na forward from the current element at the window center (both limits inclusive).

  • fill_function - custom method specified as a function handle. The supplied fill function must accept three inputs in the following order for each missing gap in the data:
    A_values -

    elements of A within the window on either side of the gap as determined by window_size. (Note these elements can include missing values from other nearby gaps.)

    A_locs -

    locations of the reference data, A_values, in terms of the default or specified SamplePoints.

    gap_locs -

    location of the gap data points that need to be filled in terms of the default or specified SamplePoints.

    The supplied function must return a scalar or a vector with the same number of elements as gap_locs. The required window_size parameter follows similar rules as for the moving average and median methods described above, with two exceptions: (1) each gap is processed as a single element, rather than gap elements being processed individually, and (2) the window extended on either side of the gap has inclusive endpoints regardless of how window_size is specified.

  • dim - specify a dimension for vector operation (default = first non-singleton dimension)
  • PropertyName-PropertyValue pairs
    SamplePoints

    PropertyValue is a vector of sample point values representing the sorted and unique x-axis values of the data in A. If unspecified, the default is assumed to be the vector [1 : size (A, dim)]. The values in SamplePoints affect methods and properties that rely on the effective distance between data points in A, such as interpolants and moving window functions, for which window_size is measured relative to SamplePoints.

    EndValues

    Apply a separate handling method for missing values at the front or back of the array. PropertyValue can be:

    • A constant scalar or array with the same shape requirements as v.
    • none - Do not fill end gap values.
    • extrap - Use the same procedure as method to fill the end gap values.
    • Any valid method listed above except for movmean, movmedian, and fill_function. Those methods can only be applied to end gap values with extrap.
    MissingLocations

    PropertyValue must be a logical array of the same size as A, with a value of true indicating locations of known missing data. Cannot be combined with MaxGap.

    MaxGap

    PropertyValue is a numeric scalar indicating the maximum gap length to fill, and assumes the same distance scale as the sample points. Gap length is calculated as the difference in locations of the sample points on either side of the gap, and gaps larger than MaxGap are ignored by fillmissing. Cannot be combined with MissingLocations.
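The arguments and properties listed above can be exercised together on small vectors. The following is a minimal sketch rather than a complete tour: it assumes the statistics package is installed, all data and variable names are illustrative, and fill_fcn is a hypothetical custom fill function that simply interpolates linearly across each gap with interp1.

```octave
pkg load statistics   % assumption: the statistics package is installed

% Interpolation and neighbor-copy methods (no equidistant ties in this data)
x = [1 NaN NaN 4 NaN NaN 7];
b_linear   = fillmissing (x, 'linear');
b_previous = fillmissing (x, 'previous');
b_next     = fillmissing (x, 'next');
b_nearest  = fillmissing (x, 'nearest');

% Moving-window methods: scalar window centered on the missing element,
% and a two-element window [nb, na] looking 2 back and 0 forward
y = [1 2 NaN 4 5];
b_centered = fillmissing (y, 'movmean', 3);
b_backward = fillmissing (y, 'movmean', [2 0]);

% Custom fill function taking (A_values, A_locs, gap_locs) and returning
% one value per gap location (hypothetical example function)
z = [1 2 NaN NaN 5 6];
fill_fcn = @(A_values, A_locs, gap_locs) interp1 (A_locs, A_values, gap_locs);
b_custom = fillmissing (z, fill_fcn, 4);

% SamplePoints: unevenly spaced x-axis values change the interpolant
b_spaced = fillmissing ([1 NaN 5], 'linear', 'SamplePoints', [0 1 4]);

% EndValues: treat leading and trailing gaps separately from interior gaps
e = [NaN 2 NaN 4 NaN];
b_none = fillmissing (e, 'linear', 'EndValues', 'none');
b_near = fillmissing (e, 'linear', 'EndValues', 'nearest');

% MaxGap: the first gap spans 3 - 1 = 2 and is filled;
% the second spans 7 - 3 = 4 and is left untouched
g = [1 NaN 3 NaN NaN NaN 7];
b_maxgap = fillmissing (g, 'linear', 'MaxGap', 3);
```

The data for the custom fill function is deliberately linear, so the result does not depend on exactly which neighbors fall inside the window.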

Compatibility Notes:

  • Numerical and logical inputs for A and v may be specified in any combination. The output will be the same class as A, with v converted to that data type for filling. Only single and double have defined ’missing’ values, so for logical and other numeric data types the output will always be B = A with idx = false(size(A)), unless the MissingLocations option identifies the missing values.
  • All interpolation methods can be individually applied to EndValues.
  • MATLAB’s fill_function method currently has several inconsistencies with the other methods (tested against version 2022a), and Octave’s implementation has chosen the following consistent behavior over compatibility: (1) a column full of missing data is considered part of EndValues, (2) such columns are then excluded from fill_function processing because the moving window is always empty, and (3) operations in dimensions higher than 2 perform identically to operations in dims 1 and 2, most notably on vectors.
  • Method "makima" is not yet implemented in interp1, which is used by fillmissing. Attempting to call this method will produce an error until the method is implemented in interp1.
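The first compatibility note, on classes without a defined missing value, can be sketched as follows. This assumes the statistics package is installed; the int32 data, the mask, and the fill value are illustrative.

```octave
pkg load statistics   % assumption: the statistics package is installed

A = int32 ([1 2 3]);

% Integer classes have no defined missing value, so nothing is filled.
[B, idx] = fillmissing (A, 'constant', 0);

% MissingLocations marks the entries to treat as missing explicitly;
% the double fill value is converted to the class of A.
mask = logical ([0 1 0]);
[C, idx_c] = fillmissing (A, 'constant', 9.0, 'MissingLocations', mask);
```

Here B is returned unchanged with an all-false idx, while C has its second element replaced and keeps the int32 class.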

See also: ismissing, rmmissing, standardizeMissing
