Statistics - Maple Programming Help


Statistics

 Lowess
 produce lowess smoothed functions

Calling Sequence

 Lowess(XY, options)
 Lowess(X, Y, options)
 L := Lowess(...)
 L(x1, x2, ..., xN)
 L(M)

Parameters

 XY - Matrix, Array, or listlist of data convertible to numeric. If XY is a Matrix or Array, it must have at least two columns. If XY is a listlist, each inner list is one row, the $k$th element of each inner list belongs to the $k$th column, and the same requirement of at least two columns applies. Each row of XY is interpreted as one data point. If XY has $m$ columns, the first $m-1$ columns contain the values of the $m-1$ independent variables, and the last column contains the corresponding values of the dependent variable.
 X - Vector, list, Matrix, Array, or listlist of data convertible to numeric, with the number of rows equal to the length of Y. The columns are interpreted as the values of the independent variables.
 Y - Vector, list, or Array of data convertible to numeric, with length equal to the number of rows in X. The values in Y are the corresponding values of the dependent variable.
 options - (optional) equation(s) of the form option=value, where option is one of fitorder, bandwidth, or iters
 x1, x2, ..., xN - the point (x1, x2, ..., xN) at which L is evaluated, where N is equal to $m-1$ or the number of columns in X
 M - Matrix with number of columns equal to $m-1$ or the number of columns in X. L(M) returns a Vector whose $i$th element is L evaluated with the $i$th row of M as arguments.
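The two data layouts and the ways of calling the returned function can be sketched as follows. The data (21 points on a sine curve) and the names L1, L2, and M are illustrative assumptions, not part of the original examples.

```maple
with(Statistics):

# Illustrative data (an assumption): 21 points on y = sin(x).
X := Vector([seq(0.1*i, i = 0 .. 20)], datatype = float[8]):
Y := map(sin, X):

# Form 1: independent and dependent values passed separately.
L1 := Lowess(X, Y):

# Form 2: one two-column Matrix; the last column is the dependent variable.
XY := <X | Y>:
L2 := Lowess(XY):

# Evaluate at a single point, one argument per independent variable.
L1(1.0);
L2(1.0);

# A symbolic argument cannot be converted to a numeric value,
# so the call returns unevaluated.
L1(t);

# Evaluate at several points at once: each row of M is one argument list.
M := Matrix([[0.5], [1.0], [1.5]], datatype = float[8]):
L1(M);
```

Both forms describe the same data, so L1 and L2 should give the same smoothed values.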

Options

 • fitorder=nonnegint
 The degree of the polynomial used in each local regression. The default value is $1$.
 • bandwidth=Range(0, 1)
 The proportion of the input data points used in each local regression. The default value depends on fitorder and the number of input data points.
 • iters=nonnegint
 The number of robustness iterations performed when smoothing data with one independent variable. Each iteration reweights the data points to reduce the influence of outliers, which makes the fit more robust. This option has no effect when the data has more than one independent variable. The default is $2$.
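As a minimal sketch of how the options are passed, the following compares two smoothers built from the same data. The data set and the names La and Lb are assumptions chosen for illustration; the option values are arbitrary and not recommended settings.

```maple
with(Statistics):

# Illustrative noisy data (an assumption): 200 points near y = sin(x).
X := Sample(Uniform(0, Pi), 200):
Y := map(sin, X) + Sample(Normal(0, 0.1), 200):

# Local linear fits, each using 30% of the points,
# with the default number of robustness iterations.
La := Lowess(X, Y, fitorder = 1, bandwidth = 0.3):

# Local quadratic fits over a wider window,
# with the robustness iterations switched off.
Lb := Lowess(X, Y, fitorder = 2, bandwidth = 0.6, iters = 0):

# Compare the two smoothed curves.
plot([La(x), Lb(x)], x = 0 .. Pi,
     legend = ["fitorder=1, bandwidth=0.3", "fitorder=2, bandwidth=0.6, iters=0"]);
```

A larger bandwidth generally gives a smoother curve, because each local regression uses more of the data.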

Description

 • The Lowess command creates a function whose values represent the input data smoothed with the lowess algorithm.
 • Suppose the input data set $\mathrm{XY}$ has $m$ independent variables and $n$ data points (here $m$ counts only the independent variables, so $\mathrm{XY}$ has $m+1$ columns). The lowess smoothed value at $x=\left({x}_{1},{x}_{2},\dots ,{x}_{m}\right)$ is computed as follows.
 – Take the $n\cdot \mathrm{bandwidth}$ points in $\mathrm{XY}$ that are closest to $x$.
 – Fit a polynomial $P$ of $m$ variables and degree $\mathrm{fitorder}$ to the points using weighted linear least squares, where the weight for a point $w$ is computed by applying the tri-cube weight function to the distance between $x$ and $w$.
 – Evaluate $P\left(x\right)$.
 • Running one or more iterations, as specified by the iters option, will produce a set of weights to reduce the influence of outliers (that is, make the computation more robust). At each iteration, the weight of a point depends on the residual of the Lowess curve at that point in the previous iteration. These weights are combined with the weights given by the distance, as described previously.
 • L returns unevaluated if its arguments cannot be converted to numeric values. However, if the only argument is a Matrix whose number of columns equals the number of parameters of L, a Vector is returned whose $i$th element is L applied to the $i$th row of the Matrix.
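The steps above name the tri-cube weight function without writing it out. In the usual formulation of lowess, the distance is scaled by $d_{\max}$, the distance from $x$ to the farthest of the selected points, before the weight is applied. That scaling, the multiplicative combination with the robustness weights, and the symbols $T$, $u$, $d_{\max}$, $p_i$, $y_i$, and $r_i$ are assumptions introduced here for illustration; this page does not state exactly how Maple implements them.

```latex
T(u) =
\begin{cases}
\left(1-|u|^{3}\right)^{3}, & |u| < 1 \\
0, & |u| \ge 1
\end{cases}
\qquad
P = \operatorname*{arg\,min}_{\deg Q \,\le\, \mathrm{fitorder}}
\sum_{i} r_i \, T\!\left(\frac{\lVert x - p_i \rVert}{d_{\max}}\right)
\bigl(y_i - Q(p_i)\bigr)^{2}
```

Here $p_i$ are the selected data points, $y_i$ the corresponding values of the dependent variable, and $r_i$ the robustness weights from the previous iteration (all equal to $1$ when no iterations are run).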

Examples

 > $\mathrm{with}\left(\mathrm{Statistics}\right):$

Create a data sample and add some random error to it.

 > $X≔\mathrm{Sample}\left(\mathrm{Uniform}\left(-2,2\right),200\right):$
 > $Y≔\mathrm{Sample}\left(\mathrm{Uniform}\left(-2,2\right),200\right):$
 > $\mathrm{Zerror}≔\mathrm{Sample}\left(\mathrm{Normal}\left(0,0.1\right),200\right):$
 > $Z≔X\mathrm{*~}\mathrm{map}\left(\mathrm{exp},-X\mathrm{^~}2-Y\mathrm{^~}2\right)+\mathrm{Zerror}:$
 > $\mathrm{XYZ}≔{\mathrm{Matrix}\left(\left[\left[X\right],\left[Y\right],\left[Z\right]\right],\mathrm{datatype}=\mathrm{float}[8]\right)}^{\mathrm{%T}}$
 ${\mathrm{XYZ}}{≔}\left[\begin{array}{c}{\mathrm{200 x 3}}{\mathrm{Matrix}}\\ {\mathrm{Data Type:}}{\mathrm{float}}{[}{8}{]}\\ {\mathrm{Storage:}}{\mathrm{rectangular}}\\ {\mathrm{Order:}}{\mathrm{Fortran_order}}\end{array}\right]$ (1)

Create the function whose graph is the smoothed surface.

 > $L≔\mathrm{Lowess}\left(\mathrm{XYZ},\mathrm{fitorder}=2,\mathrm{bandwidth}=0.3\right):$

Plot the data sample, smoothed surface, and the region between the plane $z=0.4$ and the surface for $-1.5\le x\le -0.5$ and $-1\le y\le 1$.

 > $P≔\mathrm{ScatterPlot3D}\left(\mathrm{XYZ}\right):$
 > $Q≔\mathrm{plot3d}\left(L,-2..2,-2..2,\mathrm{grid}=\left[25,25\right]\right):$
 > $R≔\mathrm{plots}:-\mathrm{shadebetween}\left(L\left(x,y\right),0.4,x=-1.5..-0.5,y=-1..1,\mathrm{showboundary}=\mathrm{false},\mathrm{negativeonly}\right):$
 > $\mathrm{plots}:-\mathrm{display}\left(P,Q,R,\mathrm{orientation}=\left[100,70,0\right],\mathrm{lightmodel}=\mathrm{none}\right)$

Find the volume of the shaded region.

 > $\mathrm{int}\left(0.4-L\left(x,y\right),x=-1.5..-0.5,y=-1..1,\mathrm{numeric},\mathrm{ε}=0.01,\mathrm{method}=\mathrm{_CubaSuave}\right)$
 ${1.30321733303695}$ (2)
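As a rough sanity check, the same integral can be worked out for the noise-free surface $z=x{e}^{-{x}^{2}-{y}^{2}}$ from which the sample was generated. The sample is random and the smoother only approximates the surface, so the numeric result varies from run to run. The closed form below is a reference value, not something the Lowess command computes.

```latex
\int_{-1}^{1}\int_{-1.5}^{-0.5}\left(0.4 - x\,e^{-x^{2}-y^{2}}\right)dx\,dy
= 0.8 - \left[-\tfrac{1}{2}e^{-x^{2}}\right]_{-1.5}^{-0.5}\sqrt{\pi}\,\operatorname{erf}(1)
= 0.8 + \tfrac{1}{2}\left(e^{-0.25}-e^{-2.25}\right)\sqrt{\pi}\,\operatorname{erf}(1)
\approx 0.8 + 0.3367 \times 1.4936
\approx 1.3029
```

The value obtained from the smoothed surface, about $1.3032$, is close to this reference.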

For a two-dimensional example, create another data sample.

 > $X≔\mathrm{Sample}\left(\mathrm{Uniform}\left(0,\mathrm{π}\right),200\right)$
 ${X}{≔}\left[\begin{array}{c}{\mathrm{1 .. 200}}{\mathrm{Vector}}{[}{\mathrm{row}}{]}\\ {\mathrm{Data Type:}}{\mathrm{float}}{[}{8}{]}\\ {\mathrm{Storage:}}{\mathrm{rectangular}}\\ {\mathrm{Order:}}{\mathrm{Fortran_order}}\end{array}\right]$ (3)
 > $\mathrm{Yerror}≔\mathrm{Sample}\left(\mathrm{Normal}\left(0,0.1\right),200\right)$
 ${\mathrm{Yerror}}{≔}\left[\begin{array}{c}{\mathrm{1 .. 200}}{\mathrm{Vector}}{[}{\mathrm{row}}{]}\\ {\mathrm{Data Type:}}{\mathrm{float}}{[}{8}{]}\\ {\mathrm{Storage:}}{\mathrm{rectangular}}\\ {\mathrm{Order:}}{\mathrm{Fortran_order}}\end{array}\right]$ (4)
 > $Y≔\mathrm{map}\left(\mathrm{sin},X\right)+\mathrm{Yerror}$
 ${Y}{≔}\left[\begin{array}{c}{\mathrm{1 .. 200}}{\mathrm{Vector}}{[}{\mathrm{row}}{]}\\ {\mathrm{Data Type:}}{\mathrm{float}}{[}{8}{]}\\ {\mathrm{Storage:}}{\mathrm{rectangular}}\\ {\mathrm{Order:}}{\mathrm{Fortran_order}}\end{array}\right]$ (5)

Create the function whose graph is the smoothed curve.

 > $L≔\mathrm{CurveFitting}:-\mathrm{Lowess}\left(X,Y,\mathrm{fitorder}=1,\mathrm{bandwidth}=0.3\right):$

Plot the data sample, smoothed curve, and the region between the $x$-axis and the curve for $\frac{\mathrm{π}}{8}\le x\le \frac{3\mathrm{π}}{8}$.

 > $P≔\mathrm{ScatterPlot}\left(X,Y\right)$
 > $Q≔\mathrm{plot}\left(L\left(x\right),x=0..\mathrm{π}\right)$
 > $R≔\mathrm{plots}:-\mathrm{shadebetween}\left(L\left(x\right),0,x=\frac{\mathrm{π}}{8}..\frac{3\mathrm{π}}{8},\mathrm{showboundary}=\mathrm{false},\mathrm{positiveonly}\right)$
 > $\mathrm{plots}:-\mathrm{display}\left(P,Q,R\right)$

Find the area of the shaded region.

 > $\mathrm{int}\left(L,\frac{\mathrm{π}}{8}..\frac{3\mathrm{π}}{8},\mathrm{numeric},\mathrm{ε}=0.01\right)$
 ${0.526726498845341}$ (6)
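For comparison, the area under the noise-free curve $y=\mathrm{sin}\left(x\right)$ over the same interval has a simple closed form. This is only a reference value: the smoothed curve depends on the random sample, so the numeric result differs somewhat from run to run.

```latex
\int_{\pi/8}^{3\pi/8} \sin x \, dx
= \cos\frac{\pi}{8} - \cos\frac{3\pi}{8}
\approx 0.9239 - 0.3827
= 0.5412
```

The value obtained from the smoothed curve, about $0.527$, differs from this reference by roughly $0.015$.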

Find the maximum of the smoothed curve on $0\le x\le \mathrm{π}$.

 > $\mathrm{Optimization}:-\mathrm{Maximize}\left(L,\mathrm{map}\left(\mathrm{unapply},\left\{-x,x-\mathrm{π}\right\},x\right),\mathrm{optimalitytolerance}=0.001\right)$
 $\left[{0.991236282390020151}{,}\left[\begin{array}{c}{1.50201622806699}\end{array}\right]\right]$ (7)

Compatibility

 • The Statistics[Lowess] command was introduced in Maple 2015.