 Statistics - Maple Programming Help

Statistics

 NonlinearFit
 fit a nonlinear model function to data

Calling Sequence

 NonlinearFit(falg, X, Y, v, options)
 NonlinearFit(falg, XY, v, options)
 NonlinearFit(fop, X, Y, options)
 NonlinearFit(fop, XY, options)

Parameters

 falg - algebraic; model function in algebraic form
 X - Vector or Matrix; values of independent variable(s)
 Y - Vector; values of dependent variable
 XY - Matrix; values of independent and dependent variables
 v - name or list(names); name(s) of independent variables in the model function
 fop - procedure; model function in operator form
 options - (optional) equation(s) of the form option=value where option is one of initialvalues, output, parameternames, parameterranges and weights; specify options for the NonlinearFit command

Description

 • The NonlinearFit command fits a model function that is nonlinear in the model parameters to data by minimizing the least-squares error.  If you are unsure whether the model function is linear in the parameters, use the Statistics[Fit] command, which calls this command or Statistics[LinearFit] depending on the linearity of the model.
 • The NonlinearFit command minimizes the error in a local sense; see the Notes section below, in particular the advice regarding the initialvalues option.
 • This help page describes how to use the NonlinearFit command with algebraic-form and operator-form input.  An advanced form of the command is described in the Statistics/NonlinearFitMatrixForm help page. For more information about the input forms, see the Input Forms help page.
 • Consider the model $y=f\left({x}_{1},{x}_{2},\mathrm{...},{x}_{n};{a}_{1},{a}_{2},\mathrm{...},{a}_{m}\right)$, where y is the dependent variable and f is the model function of n independent variables ${x}_{1},{x}_{2},\mathrm{...},{x}_{n}$, and m model parameters ${a}_{1},{a}_{2},\mathrm{...},{a}_{m}$. Given k data points, where each data point is an (n+1)-tuple of numerical values for $\left({x}_{1},{x}_{2},\mathrm{...},{x}_{n},y\right)$, the NonlinearFit command finds values of the model parameters such that the sum of the k residuals squared is minimized.  The ith residual is the value of $y-f\left({x}_{1},{x}_{2},\mathrm{...},{x}_{n};{a}_{1},{a}_{2},\mathrm{...},{a}_{m}\right)$ evaluated at the ith data point.
 • In the first two calling sequences, the first parameter falg is an algebraic expression in the independent variables ${x}_{1},{x}_{2},\mathrm{...},{x}_{n}$ and the model parameters ${a}_{1},{a}_{2},\mathrm{...},{a}_{m}$. In the last two calling sequences, the first parameter fop is a procedure having n input parameters representing the independent variables ${x}_{1},{x}_{2},\mathrm{...},{x}_{n}$ followed by m input parameters representing the model parameters ${a}_{1},{a}_{2},\mathrm{...},{a}_{m}$ and returning the single value $f\left({x}_{1},{x}_{2},\mathrm{...},{x}_{n};{a}_{1},{a}_{2},\mathrm{...},{a}_{m}\right)$.
 • The parameter X is a Matrix containing the values of the independent variables.  Row i in the Matrix contains the n values for the ith data point while column j contains all values of the single variable ${x}_{j}$.  If there is only one independent variable, X can be either a Vector or a k-by-1 Matrix.  The parameter Y is a Vector containing the k values of the dependent variable y. The parameter XY is a Matrix consisting of the n columns of X and, as last column, Y. For X, Y, and XY, one can also use lists or Arrays; for details, see the Input Forms help page.
 • The parameter v is a list of the independent variable names used in falg.  If there is only one independent variable, then v can be a single name.  The order of the names in the list must match exactly the order in which the independent variable values are placed in the columns of X.
 • By default, either the model function with the final parameter values or a Vector containing the parameter values is returned, depending on the input form.  Additional results or a solution module that allows you to query for various settings and results can be obtained with the output option.  For more information, see the Statistics/Regression/Solution help page.
 • Weights for the data points can be supplied through the weights option.
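
The input forms described above can be compared on a small data set. The following is a minimal sketch, not a definitive recipe: the data values and the name fop are chosen for illustration only, and no output is shown because the result depends on the solver's starting point. Under the default output setting, the algebraic calls are expected to return the fitted expression and the operator-form call a Vector of parameter values.

```maple
with(Statistics):

# Illustrative data: one independent variable, six data points.
X := Vector([1, 2, 3, 4, 5, 6], datatype = float):
Y := Vector([2.2, 3, 4.8, 10.2, 24.5, 75.0], datatype = float):

# Algebraic form: the independent variable name v is passed explicitly.
NonlinearFit(a + b*v + exp(c*v), X, Y, v);

# Operator form: the independent variable comes first, followed by the
# model parameters a, b, c.
fop := proc(v, a, b, c) a + b*v + exp(c*v) end proc:
NonlinearFit(fop, X, Y);

# XY form: the column(s) of X followed by Y as the last column.
XY := <X | Y>:
NonlinearFit(a + b*v + exp(c*v), XY, v);
```

In the operator form no list of variable names is needed, because the position of each argument in fop determines its role.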

Options

 The options argument can contain one or more of the options shown below.  These options are described in more detail on the Statistics/Regression/Options help page.
 • initialvalues = set(equation), list(equation), list(realcons) or Vector(realcons) -- Provide initial values for the parameters.
 • output = name or string --  Specify the form of the solution.  The output option can take as a value the name solutionmodule, or one of the following names (or a list of these names): degreesoffreedom, leastsquaresfunction, parametervalues, parametervector, residuals, residualmeansquare, residualstandarddeviation, residualsumofsquares.  For more information, see the Statistics/Regression/Solution help page.
 • parameternames = list(name) -- Specify the order of parameter names in the model function.  This determines the order in which values are placed in Vector-valued results.
 • parameterranges = list(name=range), list(range) -- Specify the allowable range for each parameter.
 • weights = Vector -- Provide weights for the data points.
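
As a rough illustration of how these options combine, the sketch below supplies a starting point, requests two results at once, and then repeats the fit with weights. The data, the initial values, and the weight Vector W are hypothetical choices made for this example.

```maple
with(Statistics):

X := Vector([1, 2, 3, 4, 5, 6], datatype = float):
Y := Vector([2.2, 3, 4.8, 10.2, 24.5, 75.0], datatype = float):

# Start the local search from a user-supplied guess and request two results.
NonlinearFit(a + b*v + exp(c*v), X, Y, v,
    initialvalues = [a = 6, b = -4, c = 0.7],
    output = [parametervalues, residualsumofsquares]);

# Give the last data point half the weight of the others (hypothetical weights).
W := Vector([1, 1, 1, 1, 1, 0.5], datatype = float):
NonlinearFit(a + b*v + exp(c*v), X, Y, v, weights = W);
```

When output is given as a list of names, the results are returned as a list in the same order.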

Notes

 • The underlying computation is done in floating-point; therefore, all data points must have type realcons and all returned solutions are floating-point, even if the problem is specified with exact values.  For more information about numeric computation in the Statistics package, see the Statistics/Computation help page.
 • The NonlinearFit command relies on the Matrix-form version of the Optimization[LSSolve] command, which in turn uses various methods implemented in a built-in library provided by the Numerical Algorithms Group (NAG).  More information is available on the Optimization[LSSolveMatrixForm] help page.  Additional options listed on that page may be provided to the NonlinearFit command and are passed directly to the LSSolve command.
 • The Optimization[LSSolve] command computes only local solutions to nonlinear least-squares problems.  The parameter values returned by NonlinearFit minimize the sum of the residuals squared in a local sense.  When the results returned are unexpected, it is highly recommended that you provide initial values for the parameters using the initialvalues option.
 • To obtain more details as the least-squares problem is being solved, set infolevel[Statistics] to 2 or higher.  To obtain details about the progress of the Optimization solver, set infolevel[Optimization] to 1 or higher.
 • For fitting a data sample to a distribution, see MaximumLikelihoodEstimate.
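
The infolevel settings mentioned in these notes can be switched on around a single fit, as in the following sketch. The data are illustrative, and the messages printed will vary with the Maple version, so none are reproduced here.

```maple
with(Statistics):

X := Vector([1, 2, 3, 4, 5, 6], datatype = float):
Y := Vector([2.2, 3, 4.8, 10.2, 24.5, 75.0], datatype = float):

infolevel[Statistics] := 2:     # details about the least-squares problem
infolevel[Optimization] := 1:   # progress of the Optimization solver

NonlinearFit(a + b*v + exp(c*v), X, Y, v);

# Restore quiet operation afterwards.
infolevel[Statistics] := 0:
infolevel[Optimization] := 0:
```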

Examples

 > $\mathrm{with}\left(\mathrm{Statistics}\right):$
 > $X≔\mathrm{Vector}\left(\left[1,2,3,4,5,6\right],\mathrm{datatype}=\mathrm{float}\right):$
 > $Y≔\mathrm{Vector}\left(\left[2.2,3,4.8,10.2,24.5,75.0\right],\mathrm{datatype}=\mathrm{float}\right):$
 > $\mathrm{NonlinearFit}\left(a+bv+\mathrm{exp}\left(cv\right),X,Y,v\right)$
 ${6.49670407384335}{-}{4.54643947753519}{}{v}{+}{{ⅇ}}^{{0.758384141232909}{}{v}}$ (1)

Fit a multivariate nonlinear model.

 > $\mathrm{XY}≔\mathrm{Matrix}\left(\left[\left[2.2467,5.2219,6.5622\right],\left[2.0083,6.0656,6.3261\right],\left[5.8386,1.1084,11.942\right],\left[7.7071,5.9855,32.096\right],\left[4.6193,4.6921,15.297\right]\right]\right)$
 ${\mathrm{XY}}{≔}\left[\begin{array}{ccc}{2.2467}& {5.2219}& {6.5622}\\ {2.0083}& {6.0656}& {6.3261}\\ {5.8386}& {1.1084}& {11.942}\\ {7.7071}& {5.9855}& {32.096}\\ {4.6193}& {4.6921}& {15.297}\end{array}\right]$ (2)
 > $\mathrm{NonlinearFit}\left(a{x}^{b}{y}^{c},\mathrm{XY},\left[x,y\right]\right)$
 ${1.28883204859941}{}{{x}}^{{1.23745132612891}}{}{{y}}^{{0.383635147679040}}$ (3)

Consider an experiment where quantities $x$, $y$, and $z$ influence a quantity $w$ according to an approximate relationship

$w={x}^{a}+\frac{b{x}^{2}}{y}+cyz$

with unknown parameters $a$, $b$, and $c$. Six data points are given by the following matrix, with respective columns for $x$, $y$, $z$, and $w$.

 > $\mathrm{ExperimentalData}≔⟨⟨1,1,1,2,2,2⟩|⟨1,2,3,1,2,3⟩|⟨1,2,3,4,5,6⟩|⟨0.531,0.341,0.163,0.641,0.713,-0.040⟩⟩$
 ${\mathrm{ExperimentalData}}{≔}\left[\begin{array}{cccc}{1}& {1}& {1}& {0.531}\\ {1}& {2}& {2}& {0.341}\\ {1}& {3}& {3}& {0.163}\\ {2}& {1}& {4}& {0.641}\\ {2}& {2}& {5}& {0.713}\\ {2}& {3}& {6}& {-0.040}\end{array}\right]$ (4)

We take an initial guess that the first term will be approximately quadratic in $x$, that $b$ will be approximately $1$, and for $c$ we don't even know whether it's going to be positive or negative, so we guess $c=0$. We compute both the model function and the residuals.

 > $\mathrm{NonlinearFit}\left({x}^{a}+\frac{b{x}^{2}}{y}+cyz,\mathrm{ExperimentalData},\left[x,y,z\right],\mathrm{initialvalues}=\left[a=2,b=1,c=0\right],\mathrm{output}=\left[\mathrm{leastsquaresfunction},\mathrm{residuals}\right]\right)$
 $\left[{{x}}^{{1.14701973996968}}{-}\frac{{0.298041864889394}{}{{x}}^{{2}}}{{y}}{-}{0.0982511893429762}{}{y}{}{z}{,}\left[\begin{array}{cccccc}{0.0727069457676300}& {0.116974310183398}& {-0.146607992383251}& {-0.0116127470057686}& {-0.0770361532848388}& {0.0886489085642805}\end{array}\right]\right]$ (5)

We see that the exponent on $x$ is only about $1.15$, and the other guesses were not very good either: the fitted values of $b$ and $c$ are both negative. However, this problem is conditioned well enough that Maple finds a good fit anyway.
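
One way to check such a fit by hand is to recompute the sum of squared residuals from the returned model and compare it with the value reported by the residualsumofsquares output. The sketch below assumes the same data and initial values as above; the names model and fitted are introduced only for this illustration.

```maple
with(Statistics):

ExperimentalData := <<1, 1, 1, 2, 2, 2> | <1, 2, 3, 1, 2, 3> |
                     <1, 2, 3, 4, 5, 6> |
                     <0.531, 0.341, 0.163, 0.641, 0.713, -0.040>>:

model := NonlinearFit(x^a + b*x^2/y + c*y*z, ExperimentalData, [x, y, z],
    initialvalues = [a = 2, b = 1, c = 0]):

# Evaluate the fitted model at data point i (columns 1..3 hold x, y, z).
fitted := i -> eval(model, [x = ExperimentalData[i, 1],
                            y = ExperimentalData[i, 2],
                            z = ExperimentalData[i, 3]]):

# Sum of squared differences between observed w (column 4) and fitted values.
add((ExperimentalData[i, 4] - fitted(i))^2, i = 1 .. 6);

# The same quantity as reported by NonlinearFit itself.
NonlinearFit(x^a + b*x^2/y + c*y*z, ExperimentalData, [x, y, z],
    initialvalues = [a = 2, b = 1, c = 0], output = residualsumofsquares);
```

The two values should agree up to floating-point rounding, since squaring removes any dependence on the sign convention used for the residuals.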

Finally, consider a situation where an ordinary differential equation leads to results that need to be fitted. The system is given by

$\left[x\left(0\right)=-a,\frac{ⅆ}{ⅆt}\phantom{\rule[-0.0ex]{0.4em}{0.0ex}}x\left(t\right)=z{x\left(t\right)}^{-b}+1\right]$

where $a$ and $b$ are parameters that we want to find, $z$ is a variable that we can vary between experiments, and $x\left(t\right)$ is a quantity that we can measure at $t=1$. We perform 10 experiments at $z=0.1,0.2,\mathrm{...},1.0$, and the results are as follows.

 > $\mathrm{Input}≔\left[\mathrm{seq}\left(0.1..1,0.1\right)\right]$
 ${\mathrm{Input}}{≔}\left[{0.1}{,}{0.2}{,}{0.3}{,}{0.4}{,}{0.5}{,}{0.6}{,}{0.7}{,}{0.8}{,}{0.9}{,}{1.0}\right]$ (6)
 > $\mathrm{Output}≔\left[1.932,2.092,2.090,2.416,2.544,2.638,2.894,3.188,3.533,3.822\right]$
 ${\mathrm{Output}}{≔}\left[{1.932}{,}{2.092}{,}{2.090}{,}{2.416}{,}{2.544}{,}{2.638}{,}{2.894}{,}{3.188}{,}{3.533}{,}{3.822}\right]$ (7)

We now need to set up the procedure (called fop in the calling sequences above) that NonlinearFit can call to obtain the value for a given input value $z$ and a given pair of parameters $a$ and $b$. We do this using dsolve/numeric.

 > $\mathrm{ODE}≔\left[x\left(0\right)=-a,\mathrm{diff}\left(x\left(t\right),t\right)=z{x\left(t\right)}^{-b}+1\right]$
 ${\mathrm{ODE}}{≔}\left[{x}{}\left({0}\right){=}{-}{a}{,}\frac{{ⅆ}}{{ⅆ}{t}}\phantom{\rule[-0.0ex]{0.4em}{0.0ex}}{x}{}\left({t}\right){=}{z}{}{{x}{}\left({t}\right)}^{{-}{b}}{+}{1}\right]$ (8)
 > $\mathrm{ODE_Solution}≔\mathrm{dsolve}\left(\mathrm{ODE},\mathrm{numeric},\mathrm{parameters}=\left[a,b,z\right]\right)$
 ${\mathrm{ODE_Solution}}{:=}{\mathbf{proc}}\left({\mathrm{x_rkf45}}\right)\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{...}\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{\mathbf{end proc}}$ (9)

We now have a procedure ODE_Solution that can compute the correct value, but we cannot call that with the values for z, a, and b, and expect to get the correct answer. We first need to call it once to set the parameters, then another time to obtain the value of $x\left(t\right)$ at $t=1$, and then return this value (for more information about how this works, see dsolve/numeric). By hand, we can do this as follows:

 > $\mathrm{ODE_Solution}\left(\mathrm{parameters}=\left[a=-1,b=-0.5,z=1\right]\right)$
 $\left[{a}{=}{-1.}{,}{b}{=}{-0.5}{,}{z}{=}{1.}\right]$ (10)
 > $\mathrm{ODE_Solution}\left(1\right)$
 $\left[{t}{=}{1.}{,}{x}{}\left({t}\right){=}{3.44630585135012}\right]$ (11)
 > $\mathrm{ODE_Solution}\left(\mathrm{parameters}=\left[a=1,b=1,z=1\right]\right)$
 $\left[{a}{=}{1.}{,}{b}{=}{1.}{,}{z}{=}{1.}\right]$ (12)
 > $\mathrm{ODE_Solution}\left(1\right)$

Note that for some settings of the parameters, we cannot obtain a solution. We need to take care of this in the fop procedure we create (which we call f), by returning a value that is very far from all output points, leading to a very bad fit for these erroneous parameter values.

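 > f := proc(zValue, aValue, bValue)
       global ODE_Solution, a, b, z, x, t;
       # Set the ODE parameters for this evaluation.
       ODE_Solution('parameters' = [a = aValue, b = bValue, z = zValue]);
       try
           # Return the value of x(t) at t = 1.
           return eval(x(t), ODE_Solution(1));
       catch:
           # No solution for these parameters: return a value far from the data.
           return 100;
       end try;
   end proc;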
 ${f}{:=}{\mathbf{proc}}\left({\mathrm{zValue}}{,}{\mathrm{aValue}}{,}{\mathrm{bValue}}\right)\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{\mathbf{global}}\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{\mathrm{ODE_Solution}}{,}{a}{,}{b}{,}{z}{,}{x}{,}{t}{;}\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{\mathrm{ODE_Solution}}{}\left({'}{\mathrm{parameters}}{'}{=}\left[{a}{=}{\mathrm{aValue}}{,}{b}{=}{\mathrm{bValue}}{,}{z}{=}{\mathrm{zValue}}\right]\right){;}\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{\mathbf{try}}\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{\mathbf{return}}\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{\mathrm{eval}}{}\left({x}{}\left({t}\right){,}{\mathrm{ODE_Solution}}{}\left({1}\right)\right)\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{\mathbf{catch}}{:}\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{\mathbf{return}}\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{100}\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{\mathbf{end try}}\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{\mathbf{end proc}}$ (13)
 > $f\left(1,-1,-0.5\right)$
 ${3.44630585135012}$ (14)

With this setup, we can perform an initial attempt at fitting the data.

 > $\mathrm{NonlinearFit}\left(f,\mathrm{Input},\mathrm{Output},\mathrm{output}=\left[\mathrm{parametervector},\mathrm{residualstandarddeviation}\right]\right)$
 $\left[\left[\begin{array}{c}{1.}\\ {1.00000002108371}\end{array}\right]{,}{108.7701554}\right]$ (15)

That is an extremely bad fit, and indeed this is a case where the solution to the ODE is undefined. Since NonlinearFit does local optimization, it never finds a point where the result is anything other than the error return value, $100$. The solution is to provide an initial estimate. We will go with the values that provided a solution above: $a=-1,b=-0.5$.

 > $\mathrm{NonlinearFit}\left(f,\mathrm{Input},\mathrm{Output},\mathrm{output}=\left[\mathrm{parametervector},\mathrm{residualstandarddeviation}\right],\mathrm{initialvalues}=\left[-1,-0.5\right]\right)$
 $\left[\left[\begin{array}{c}{-0.739406921546561}\\ {-1.03081992749769}\end{array}\right]{,}{0.07109676248}\right]$ (16)

This is a much better fit.
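
To see how close the fitted model comes to each measurement, the fitted parameters can be passed back to f. The following sketch repeats the setup so that it runs on its own; the names params and differences are introduced here for illustration, and no output is shown.

```maple
with(Statistics):

ODE := [x(0) = -a, diff(x(t), t) = z*x(t)^(-b) + 1]:
ODE_Solution := dsolve(ODE, numeric, parameters = [a, b, z]):

f := proc(zValue, aValue, bValue)
    global ODE_Solution, a, b, z, x, t;
    ODE_Solution('parameters' = [a = aValue, b = bValue, z = zValue]);
    try
        return eval(x(t), ODE_Solution(1));
    catch:
        return 100;
    end try;
end proc:

Input := [seq(0.1 .. 1, 0.1)]:
Output := [1.932, 2.092, 2.090, 2.416, 2.544, 2.638, 2.894, 3.188, 3.533, 3.822]:

# Request the parameter Vector explicitly and take it out of the result list.
params := NonlinearFit(f, Input, Output, output = [parametervector],
    initialvalues = [-1, -0.5])[1]:

# Difference between the model value and the measurement at each input z.
differences := [seq(f(Input[i], params[1], params[2]) - Output[i], i = 1 .. 10)];
```

If any entry of differences is close to 100 minus the corresponding measurement, the ODE could not be solved at the fitted parameters for that input and f returned its error value.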

Compatibility

 • The XY parameter was introduced in Maple 15.