Plot functional boxplot.
A functional boxplot is the analog of a boxplot for functional data. Functional data is any type of data that varies over a continuum, e.g. curves, probability distributions, seasonal data, etc.
The data is first ordered, using band depth as the order statistic. The plot then shows the median curve, the envelope of the 50% central region, the maximum non-outlying envelope and the outlier curves.
Parameters:
data : sequence of ndarrays or 2-D ndarray
xdata : ndarray, optional
labels : sequence of scalar or str, optional
depth : ndarray, optional
method : {'MBD', 'BD2'}, optional
wfactor : float, optional
ax : Matplotlib AxesSubplot instance, optional
plot_opts : dict, optional
Returns:
fig : Matplotlib figure instance
depth : ndarray
ix_depth : ndarray
ix_outliers : ndarray
See also
Notes
The median curve is the curve with the highest band depth.
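To make the ordering concrete, the following is a minimal sketch of a modified band depth computed directly from its definition, with bands formed by pairs of curves. The function name `modified_band_depth` and the toy data are hypothetical and chosen for illustration only; this is not the statsmodels implementation, which may compute the depth differently.

```python
from itertools import combinations

import numpy as np


def modified_band_depth(curves):
    """Illustrative modified band depth using bands spanned by pairs of curves.

    Hypothetical helper for explanation only, not the statsmodels code.
    """
    curves = np.asarray(curves, dtype=float)
    n_curves = curves.shape[0]
    depth = np.zeros(n_curves)
    pairs = list(combinations(range(n_curves), 2))
    for i, j in pairs:
        # Pointwise band spanned by curves i and j.
        lower = np.minimum(curves[i], curves[j])
        upper = np.maximum(curves[i], curves[j])
        # Fraction of the domain where each curve lies inside that band.
        inside = (curves >= lower) & (curves <= upper)
        depth += inside.mean(axis=1)
    return depth / len(pairs)


# Toy data: five flat curves at levels 0 to 4, sampled at three points.
curves = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
    [2.0, 2.0, 2.0],
    [3.0, 3.0, 3.0],
    [4.0, 4.0, 4.0],
])
depth = modified_band_depth(curves)
median_index = int(np.argmax(depth))  # index of the deepest curve
```

In this toy example the middle curve is contained in the most bands, so it has the highest depth and plays the role of the median curve.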
Outliers are defined as curves that fall outside the band created by multiplying the central region by wfactor. The range over which a curve falls outside this band does not matter: a single data point outside the band is enough. If the data is noisy, smoothing may therefore be required.
The non-outlying region is defined as the band made up of all the non-outlying curves.
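The outlier rule described in these notes can be sketched as follows. This is an illustration under stated assumptions, not the library code: the helper `flag_outliers` is hypothetical, the depth values are made up by hand, and the central region is assumed to be inflated about the pointwise median of the central curves.

```python
import numpy as np


def flag_outliers(curves, depth, wfactor=1.5):
    """Illustrative outlier rule for a functional boxplot (hypothetical helper)."""
    curves = np.asarray(curves, dtype=float)
    order = np.argsort(depth)[::-1]            # deepest curves first
    n_central = curves.shape[0] // 2           # the 50% deepest curves
    central = curves[order[:n_central]]
    lower, upper = central.min(axis=0), central.max(axis=0)
    # Assumption: inflate the central region about its pointwise median.
    center = np.median(central, axis=0)
    lower_fence = center - wfactor * (center - lower)
    upper_fence = center + wfactor * (upper - center)
    # A single point outside the fences is enough to flag a curve.
    outside = (curves < lower_fence) | (curves > upper_fence)
    is_outlier = outside.any(axis=1)
    inliers = curves[~is_outlier]
    envelope = (inliers.min(axis=0), inliers.max(axis=0))
    return np.flatnonzero(is_outlier), envelope


# Toy data: curve 4 matches the others except for one spike.
curves = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [2.0, 2.0, 2.0, 2.0],
    [3.0, 3.0, 3.0, 3.0],
    [2.0, 2.0, 9.0, 2.0],
    [1.5, 1.5, 1.5, 1.5],
])
depth = np.array([0.2, 0.6, 0.7, 0.3, 0.1, 0.65])  # made-up depth values
outlier_idx, envelope = flag_outliers(curves, depth, wfactor=4.0)
```

Here only the spiked curve is flagged, even though it agrees with the central curves everywhere else, which is why smoothing noisy data first can matter.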
References
Examples
Load the El Nino dataset, which consists of 60 years' worth of Pacific Ocean sea surface temperature data.
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> import statsmodels.api as sm
>>> data = sm.datasets.elnino.load()
Create a functional boxplot. We see that the years 1982-83 and 1997-98 are outliers; these are the years where El Nino (a climate pattern characterized by warming up of the sea surface and higher air pressures) occurred with unusual intensity.
>>> fig = plt.figure()
>>> ax = fig.add_subplot(111)
>>> res = sm.graphics.fboxplot(data.raw_data[:, 1:], wfactor=2.58,
...                            labels=data.raw_data[:, 0].astype(int),
...                            ax=ax)
>>> ax.set_xlabel("Month of the year")
>>> ax.set_ylabel("Sea surface temperature (C)")
>>> ax.set_xticks(np.arange(13, step=3) - 1)
>>> ax.set_xticklabels(["", "Mar", "Jun", "Sep", "Dec"])
>>> ax.set_xlim([-0.2, 11.2])
>>> plt.show()