boxplot
Produce a box plot.
A box plot is a graphical display that simultaneously describes several important features of a data set, such as center, spread, departure from symmetry, and identification of observations that lie unusually far from the bulk of the data.
Input arguments (case-insensitive) recognized by boxplot are:
notched = 0 (default) produces a rectangular box plot.
notched within the interval (0,1) produces a notch of the specified depth. Values of notched outside (0,1) are amusing if not exactly practical.
symbol = '.': points between 1.5 and 3 times the IQR are marked with '.' and points outside 3 times the IQR with 'o'.
symbol = ['x','*']: points between 1.5 and 3 times the IQR are marked with 'x' and points outside 3 times the IQR with '*'.
whisker = 0: boxplot displays all data values outside the box using the plotting symbol for points that lie outside 3 times the IQR.
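The two outlier classes described above can be reproduced by hand. The following is a minimal sketch, not the code boxplot itself runs. It assumes that distances are measured from the box edges (the first and third quartiles) and that the quartiles come from quantile(); boxplot may use a slightly different quartile estimator. The data vector and variable names are illustrative only.

```octave
# Illustrative data: nine typical values, one moderate and one extreme outlier.
x = [10:18, 28, 100]';

# Quartiles via quantile(); boxplot may use a slightly different estimator.
q = quantile (x, [0.25; 0.50; 0.75]);
box_iqr = q(3) - q(1);

# Distance of each point beyond the nearest box edge (0 for points inside the box).
dist = max ([q(1) - x, x - q(3), zeros(size (x))], [], 2);

# Points between 1.5 and 3 times the IQR would get the first symbol,
# points beyond 3 times the IQR the second symbol.
mild    = dist > 1.5 * box_iqr & dist <= 3 * box_iqr;
extreme = dist > 3 * box_iqr;

printf ("mild outliers:    %s\n", mat2str (x(mild)'));
printf ("extreme outliers: %s\n", mat2str (x(extreme)'));
```

With this data the value 28 falls in the 1.5 to 3 IQR band and 100 lies beyond 3 IQR, so with symbol = ['x','*'] they would be drawn as 'x' and '*' respectively.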
Option | Value | Description |
'Notch' | 'on' | Notch the box by 0.25 of the box width. |
 | 'off' | Produce a straight box. |
 | scalar | Proportional width of the notch. |
'Symbol' | '.' | Defines only the symbol for outliers between 1.5 and 3 IQR. |
 | ['x','*'] | The second character defines the symbol for outliers beyond 3 IQR. |
'Orientation' | 'vertical' | Default value; can also be given as the number 1. |
 | 'horizontal' | Can also be given as the number 0. |
'Whisker' | scalar | Multiplier of the IQR (default is 1.5). |
'OutlierTags' | 'on' or 1 | Plot the vector index of each outlier next to its point. |
 | 'off' or 0 | No tags are plotted (default value). |
'Sample_IDs' | cell | A cell vector with one cell for each data set, each containing a nested cell vector with the ID of every sample (each ID should be a string). If this option is passed, all outliers are tagged with their sample's ID string instead of their vector index. |
'BoxWidth' | 'proportional' | Make the width of each box proportional to the number of samples in its data set (default value). |
 | 'fixed' | Make all boxes equally wide. |
'Widths' | scalar | Scaling factor for box widths (default value is 0.4). |
'CapWidths' | scalar | Scaling factor for whisker cap widths (default value is 1, which results in a cap half-length of 'Widths'/8). |
'BoxStyle' | 'outline' | Draw boxes as outlines (default value). |
 | 'filled' | Fill boxes with a color (outlines are still plotted). |
'Positions' | vector | Numerical vector that defines the position of each data set along the group axis. It must have as many elements as there are groups. The default is [1:number of groups]. |
'Labels' | cell | A cell vector of strings containing the name of each group. By default each group is labeled numerically according to its order in the data set. |
'Colors' | character string or Nx3 numerical matrix | If a single character or a 1x3 vector of RGB values is given, it specifies the fill color of all boxes when BoxStyle = 'filled'. If a character string or an Nx3 matrix is given, the fill color of box #1 corresponds to the first character or first matrix row, and the fill colors of the following boxes correspond to the following characters or rows. If the character string or Nx3 matrix is exhausted, the color selection wraps around. |
Supplemental arguments not described above (…) are concatenated and passed to the plot() function.
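The nested structure expected by 'Sample_IDs' is easier to see in code. The sketch below follows the description in the table: one outer cell per data set, each holding a cell vector with one string per sample. It assumes that boxplot is provided by the statistics package and that each nested cell vector must have as many entries as its data set; the ID strings, the data and the deliberately planted outliers are made up for illustration.

```octave
pkg load statistics   # assumption: boxplot is provided by the statistics package

randn ("seed", 3);    # for reproducibility
a = [randn(12, 1) * 5 + 140; 175];   # last value is a deliberate outlier
b = [randn(15, 1) * 8 + 135; 90];    # last value is a deliberate outlier

# Hypothetical ID strings, one per sample, kept in one cell vector per data set.
ids_a = arrayfun (@(k) sprintf ("A%02d", k), (1:numel (a))', ...
                  "UniformOutput", false);
ids_b = arrayfun (@(k) sprintf ("B%02d", k), (1:numel (b))', ...
                  "UniformOutput", false);

# One outer cell per data set, in the same order as the data sets.
boxplot ({a, b}, "Sample_IDs", {ids_a, ids_b}, ...
         "Labels", {"Class A", "Class B"});
title ("Outliers tagged with sample IDs");
```

If the option behaves as described, the planted outliers should be labeled "A13" and "B16" rather than with the indices 13 and 16.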
The returned matrix s has one column for each data set, with rows as follows:
1 | Minimum |
2 | 1st quartile |
3 | 2nd quartile (median) |
4 | 3rd quartile |
5 | Maximum |
6 | Lower confidence limit for median |
7 | Upper confidence limit for median |
The returned structure h contains handles to the plot elements, allowing customization of the visualization using set/get functions.
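As a brief illustration of the two return values, the sketch below reads a few rows of s and lists the contents of h. It assumes the output order [s, h] and that boxplot is provided by the statistics package. The field names of h are not listed in this section, so the code inspects them with fieldnames() instead of assuming any particular name.

```octave
pkg load statistics   # assumption: boxplot is provided by the statistics package

randn ("seed", 4);    # for reproducibility
data = randn (50, 3); # three data sets, one per column

[s, h] = boxplot (data);

# Rows of s follow the table above; each column belongs to one data set.
medians  = s(3, :);
box_iqr  = s(4, :) - s(2, :);   # 3rd quartile minus 1st quartile
ci_width = s(7, :) - s(6, :);   # width of the median confidence interval

# List the groups of graphics handles stored in h.
names = fieldnames (h);
for k = 1:numel (names)
  printf ("%-12s %d handle(s)\n", names{k}, numel (h.(names{k})));
endfor
```

Once a field of interest has been identified in the listing, its properties can be read with get and changed with set, for example the line width of a group of line objects.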
Example
title ("Grade 3 heights");
axis ([0,3]);
set (gca (), "xtick", [1 2], "xticklabel", {"girls", "boys"});
boxplot ({randn(10,1)*5+140, randn(13,1)*8+135});
axis ([0, 3]);
randn ("seed", 1);    # for reproducibility
girls = randn (10, 1) * 5 + 140;
randn ("seed", 2);    # for reproducibility
boys = randn (13, 1) * 8 + 135;
boxplot ({girls, boys});
set (gca (), "xtick", [1 2], "xticklabel", {"girls", "boys"})
title ("Grade 3 heights");

randn ("seed", 7);    # for reproducibility
A = randn (10, 1) * 5 + 140;
randn ("seed", 8);    # for reproducibility
B = randn (25, 1) * 8 + 135;
randn ("seed", 9);    # for reproducibility
C = randn (20, 1) * 6 + 165;
data = [A; B; C];
groups = [(ones (10, 1)); (ones (25, 1) * 2); (ones (20, 1) * 3)];
labels = {"Team A", "Team B", "Team C"};
pos = [2, 1, 3];
boxplot (data, groups, "Notch", "on", "Labels", labels, "Positions", pos, ...
         "OutlierTags", "on", "BoxStyle", "filled");
title ("Example of Group splitting with paired vectors");

randn ("seed", 1);    # for reproducibility
data = randn (100, 9);
boxplot (data, "notch", "on", "boxstyle", "filled", ...
         "colors", "ygcwkmb", "whisker", 1.2);
title ("Example of different colors specified with characters");

randn ("seed", 5);    # for reproducibility
data = randn (100, 13);
colors = [0.7 0.7 0.7; ...
          0.0 0.4 0.9; ...
          0.7 0.4 0.3; ...
          0.7 0.1 0.7; ...
          0.8 0.7 0.4; ...
          0.1 0.8 0.5; ...
          0.9 0.9 0.2];
boxplot (data, "notch", "on", "boxstyle", "filled", ...
         "colors", colors, "whisker", 1.3, "boxwidth", "proportional");
title ("Example of different colors specified as RGB values");