CurveFitting - Maple Programming Help


CurveFitting

 ArrayInterpolation
 n-dimensional data interpolation (table lookup)

 Calling Sequence

 ArrayInterpolation(xdata, ydata, xvalues, options)
 ArrayInterpolation(xydata, xvalues, options)

Parameters

 xdata - a list, Array, DataFrame, DataSeries, Vector, or Matrix containing the independent coordinate(s) of each of the data points, given in one of several possible forms
 ydata - a list, Array, DataFrame, DataSeries, or Vector containing the dependent coordinate of each of the data points
 xydata - alternate input; a list, Array, DataFrame, or Matrix containing both the dependent and independent coordinates of each of the data points
 xvalues - a numeric value, list, Vector, or Array containing the independent coordinate(s) of one or more points whose dependent coordinate will be approximated using interpolation
 options - (optional) equation(s) of the form keyword = value, where keyword is one of method, degree, endpoints, knots, uniform, verify, extrapolate, or container

Description

 • The ArrayInterpolation command takes a finite set of distinct data points given by xdata and ydata (or xydata), and interpolates to approximate the y-values corresponding to the points given in xvalues.  It considers an interpolant function $f$ such that $f\left(x\right)=y$ for all respective pairs $\left(x,y\right)$ in xdata and ydata (or xydata). Such a function can be constructed using one of various methods (see below).  It then computes and returns $f\left({x}_{i}\right)$ for all ${x}_{i}$ in xvalues.
 • The focus of the ArrayInterpolation command is quick, efficient data resampling and table lookup.  To compute and return the interpolants themselves, use functions such as CurveFitting[Spline] and CurveFitting[RationalInterpolation] instead.
 • The ArrayInterpolation function can interpolate numeric data in $n$ dimensions, where $n$ is any positive integer.
 • The list of independent coordinates of the data points, given in xdata, can be input in a number of different ways.  xdata can be:
 – (preferred if $n=1$) a Vector, DataSeries, list, or one-dimensional Array of strictly increasing x-coordinates. The data set will then have size ${a}_{1}$, where ${a}_{1}$ is the length of xdata.
 – (preferred if $n>1$) a list of $n$ Vectors, lists, or one-dimensional Arrays, one for each dimension of the data.  The $j$th Vector, list, or Array in the input must contain, in increasing order, all of the possible $j$th coordinates of the data points.  In this case, the block of data points will be assumed to lie on an ${a}_{1}$ by ${a}_{2}$ by ... by ${a}_{n}$ grid, where ${a}_{j}$ is the length of the $j$th Vector or Array in the input.  The $p$th coordinate of the data point at index $[{j}_{1},{j}_{2}, ...,{j}_{n}]$ (where $1\le {j}_{i}\le {a}_{i}$) will be equal to the ${j}_{p}$th element of the $p$th Vector, list, or Array in the input.
 – an Array of size ${a}_{1}$ by ${a}_{2}$ by ... by ${a}_{n}$ by $n$, giving the independent coordinate(s) of each of ${a}_{1}$ by ${a}_{2}$ by ... by ${a}_{n}$ data points as an ordered $n$-tuple. These coordinates must form a proper "grid" of values, and must be sorted in strictly increasing order along each dimension.  More formally, $\mathrm{xdata}[{j}_{1},{j}_{2}, ...,{j}_{n},p]-\mathrm{xdata}[{k}_{1},{k}_{2}, ...,{k}_{n},p]$ must be zero if ${j}_{p}={k}_{p}$, and must be positive if ${k}_{p}<{j}_{p}$.
 – a list of $n$ Arrays of size ${a}_{1}$ by ${a}_{2}$ by ... by ${a}_{n}$, where the $j$th Array contains the $j$th independent coordinate of each of the ${a}_{1}$ by ${a}_{2}$ by ... by ${a}_{n}$ data points.  The coordinates must form a proper "grid" of values, and must be sorted in strictly increasing order along each dimension.  More formally, $\mathrm{op}\left(p,\mathrm{xdata}\right)[{j}_{1},{j}_{2}, ...,{j}_{n}]-\mathrm{op}\left(p,\mathrm{xdata}\right)[{k}_{1},{k}_{2}, ...,{k}_{n}]$ must be zero if ${j}_{p}={k}_{p}$, and must be positive if ${k}_{p}<{j}_{p}$.
 The preferred methods minimize memory usage and execution time by avoiding unnecessary storage and verification of redundant data.  In all cases, xdata must contain real values of type numeric.
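 For example, a 3 by 2 grid of two-dimensional data can be supplied in the preferred list-of-Vectors form as follows (illustrative values; the third argument lists the possible coordinates to interpolate at in each dimension):

 > xdata := [Vector([0., 1., 2.]), Vector([0., 1.])]:
 > ydata := Matrix(3, 2, (i,j) -> evalf(i + j)):
 > ArrayInterpolation(xdata, ydata, [[0.5, 1.5], [0.25, 0.75]]);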
 • The list of dependent coordinates of the data points, given in ydata, must be input as an Array (or a Matrix, Vector, or list for appropriate values of $n$) of size ${a}_{1}$ by ${a}_{2}$ by ... by ${a}_{n}$, so that the value of $\mathrm{ydata}[{j}_{1},{j}_{2}, ...,{j}_{n}]$ corresponds to the element in xdata of index $[{j}_{1},{j}_{2}, ...,{j}_{n}]$. Values in ydata must be real numbers of type numeric.
 • As an alternate form of input, a single structure xydata containing all coordinates of the data points can be entered.  It can be formatted in one of the following ways:
 – an Array, DataFrame, or Matrix of size ${a}_{1}$ by ${a}_{2}$ by ... by ${a}_{n}$ by ($n+1$), giving the independent and dependent coordinate(s) of each of ${a}_{1}$ by ${a}_{2}$ by ... by ${a}_{n}$ data points as an ordered ($n+1$)-tuple.  The first $n$ elements in each ($n+1$)-tuple represent the independent coordinates of each point, and must adhere to the same restrictions as above (a proper "grid" must be formed, and the independent coordinates must be sorted in strictly increasing order along each dimension).  The ($n+1$)st element of each tuple then represents the dependent coordinate of the respective data point.
 – a list of $n+1$ Arrays, Vectors, Matrices, or lists of size ${a}_{1}$ by ${a}_{2}$ by ... by ${a}_{n}$, where the $j$th array contains the $j$th independent coordinate of each of the ${a}_{1}$ by ${a}_{2}$ by ... by ${a}_{n}$ data points for $1\le j\le n$, and the $n+1$st Array contains the dependent coordinates of each point.  As above, the independent coordinates must adhere to certain restrictions (a proper "grid" must be formed, and the independent coordinates must be sorted in strictly increasing order along each dimension).
 For multidimensional data, these input forms are not recommended, since space is wasted storing the full grid of independent coordinates instead of a list of the possible coordinates in each dimension.  In both cases, the coordinates must be real values of type numeric.
 • The list of values to interpolate at, given in xvalues, may be input in one of the following formats:
 – for one-dimensional data, a single numeric value, or a Vector, list, or one-dimensional Array of numeric values can be input.  The output will be returned in a format matching the format of the input.
 – for multidimensional data, an Array or Matrix of size ${u}_{1}$ by ${u}_{2}$ by ... by ${u}_{k}$ by $n$ of numeric values can be input.  It must contain the $n$ coordinates of each of ${u}_{1}$ by ${u}_{2}$ by ... by ${u}_{k}$ values to interpolate at, with the value of $\mathrm{xvalues}[{j}_{1},{j}_{2}, ...,{j}_{k},p]$ giving the $p$th coordinate of the respective point.  The output will be returned in an array of size ${u}_{1}$ by ${u}_{2}$ by ... by ${u}_{k}$ containing the interpolated results.
 – alternatively, a list of $n$ Vectors, lists, or one-dimensional Arrays can be input. The $j$th Vector, list, or Array in the input will be assumed to contain all of the possible $j$th coordinates of the values to interpolate at.  In this case, interpolation will be performed on an ${a}_{1}$ by ${a}_{2}$ by ... by ${a}_{n}$ block of points, where ${a}_{j}$ is the length of the $j$th Vector or Array in the input.  The output will then be returned in a Vector, Matrix, list, or Array of size ${a}_{1}$ by ${a}_{2}$ by ... by ${a}_{n}$.
 • If any of the points in xvalues lie outside the rectangular bounding box specified by the input, then extrapolation will be performed to approximate their corresponding y-values.  The method by which extrapolation is performed can be controlled using the extrapolate option; see below.
 • This routine has separate numeric methods for handling hardware and software floats.  The decision about which routine to use can be controlled by setting the UseHardwareFloats environment variable.  If UseHardwareFloats remains unset, then hardware floats are used if and only if $\mathrm{Digits}\le \mathrm{evalhf}\left(\mathrm{Digits}\right)$, in which case all software floats in the input will be converted to hardware floats.
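 For example, assigning to the environment variable forces the software-float routines to be used regardless of the current setting of Digits (illustrative data):

 > UseHardwareFloats := false:
 > ArrayInterpolation([0., 1., 2.], [0., 1., 4.], 1.5);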
 • Only computations involving numeric floating-point data are supported by this routine. If the input does not contain floating-point data, an error will be thrown.
 • For optimal performance, all rtables in the input should be Fortran order with rectangular storage (the default). Otherwise, a conversion will take place.  All rtables in the output will be Fortran order rtables with rectangular storage.
 • This function is part of the CurveFitting package, so it can be used in the short form ArrayInterpolation(..) only after executing the command with(CurveFitting).  However, it can always be accessed through the long form of the command by using CurveFitting[ArrayInterpolation](..).

Options

 • If the option method=m is given, where m is one of the names below, then the corresponding interpolation method is used to compute the interpolant $f$ and evaluate $f\left({x}_{i}\right)$ for each point ${x}_{i}$ in xvalues:
 – method = nearest: Perform nearest neighbor interpolation.  Given a point ${x}_{i}$ in xvalues, $f\left({x}_{i}\right)$ is defined to be $y$, where $\left(x,y\right)$ is the data point such that the Euclidean distance $‖x-{x}_{i}‖$ is minimized.
 – method = lowest: Perform lowest neighbor interpolation.  Given a point ${x}_{i}$ in xvalues, $f\left({x}_{i}\right)$ is defined to be $y$, where $\left(x,y\right)$ is the data point such that ${x}_{i}-x$ is non-negative in all coordinates, but the Euclidean distance $‖x-{x}_{i}‖$ is minimized.
 – method = highest: Perform highest neighbor interpolation.  Given a point ${x}_{i}$ in xvalues, $f\left({x}_{i}\right)$ is defined to be $y$, where $\left(x,y\right)$ is the data point such that $x-{x}_{i}$ is non-negative in all coordinates, but the Euclidean distance $‖x-{x}_{i}‖$ is minimized.
 – method = linear: Perform n-dimensional linear interpolation (lerping).  In the one-dimensional case, $f$ is a piecewise-linear function passing through each data point $\left(x,y\right)$ in the input.  In the multidimensional case, $f$ is the tensor product of $n$ such piecewise linear functions, one for each dimension.  $f\left({x}_{i}\right)$ is computed by performing linear interpolation along the first dimension, then along the second dimension, and so on.
 – method = cubic: Perform piecewise cubic Hermite interpolation.  In the one-dimensional case, $f$ is a piecewise-cubic function passing through each data point $\left(x,y\right)$ in the input.  In this case, $f\left(x\right)={f}_{i}\left(x\right)$ if $x$ lies in the interval $\left[{x}_{i},{x}_{i+1}\right]$, where each ${f}_{i}$ is a cubic polynomial such that ${f}_{i}\left({x}_{i}\right)={y}_{i}$ and ${f}_{i}\left({x}_{i+1}\right)={y}_{i+1}$ for all data points $\left({x}_{i},{y}_{i}\right)$ in the input (where $i$ ranges from $0$ to $k$).  The coefficients of the functions ${f}_{i}$ are determined locally by assigning a slope ${s}_{i}$ to each data point ${x}_{i}$ and solving for the unique cubic function ${f}_{i}\left(x\right)$ determined by the additional constraints ${f}_{i}\text{'}\left({x}_{i}\right)={s}_{i}$ and ${f}_{i}\text{'}\left({x}_{i+1}\right)={s}_{i+1}$.  This forces $f$ to be continuously differentiable ($\mathrm{C1}$).  The ${s}_{i}$ themselves are computed using Bessel's method: ${s}_{i}$ is the slope at ${x}_{i}$ of the parabola passing through $\left({x}_{i-1},{y}_{i-1}\right)$, $\left({x}_{i},{y}_{i}\right)$, and $\left({x}_{i+1},{y}_{i+1}\right)$.  In the multidimensional case, $f$ is the tensor product of $n$ such piecewise-cubic functions, one for each dimension.
 – method = spline: Perform spline interpolation.  By default, natural cubic spline interpolation is used.  In the one-dimensional case, $f$ is a piecewise-cubic function passing through each data point $\left(x,y\right)$ in the input.  In this case, $f\left(x\right)={f}_{i}\left(x\right)$ if $x$ lies in the interval $\left[{x}_{i},{x}_{i+1}\right]$, where each ${f}_{i}$ is a cubic polynomial such that ${f}_{i}\left({x}_{i}\right)={y}_{i}$ and ${f}_{i}\left({x}_{i+1}\right)={y}_{i+1}$ for all data points $\left({x}_{i},{y}_{i}\right)$ in the input (where $i$ ranges from $0$ to $k$).  The coefficients of the functions ${f}_{i}$ are selected such that $f$ is twice continuously differentiable ($\mathrm{C2}$), that is, ${f}_{i}\text{'}\left({x}_{i+1}\right)={f}_{i+1}\text{'}\left({x}_{i+1}\right)$ and ${f}_{i}\text{'}\text{'}\left({x}_{i+1}\right)={f}_{i+1}\text{'}\text{'}\left({x}_{i+1}\right)$.  In addition, the "natural" condition of the spline specifies that $f\text{'}\text{'}\left({x}_{0}\right)=0$ and $f\text{'}\text{'}\left({x}_{k}\right)=0$.  In the multidimensional case, $f$ is the tensor product of $n$ such spline functions, one for each dimension.  Using method=spline produces a smoother interpolant than method=cubic ($\mathrm{C2}$ instead of $\mathrm{C1}$), but is more expensive to set up and more prone to numerical instability, because each segment of the spline is determined globally by the positions of all other points in the data set.
 method=linear is used by default.
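 The differences between the methods can be seen by interpolating the same one-dimensional data at a single point with each method in turn (illustrative values):

 > X := [0., 1., 2., 3.]:
 > Y := [0., 1., 0., 1.]:
 > seq(ArrayInterpolation(X, Y, 1.25, method = m), m in [nearest, lowest, highest, linear, cubic, spline]);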
 • If the options degree=d and endpoints=e are given, where d is a positive integer and e is one of natural, notaknot, or periodic, then spline interpolation will be performed using the provided degree and endpoint conditions.  See Spline Continuity and End Conditions for details. These options only affect the result if method=spline is used.  In the multidimensional case, the same degree and endpoint conditions are used for the splines generated in each dimension. The defaults are degree=3 and endpoints=natural, in which case natural cubic spline interpolation will be performed.
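 For example, the following requests a quintic spline with not-a-knot end conditions (illustrative values):

 > X := [0., 1., 2., 3., 4., 5.]:
 > Y := [0., 1., 0., 1., 0., 1.]:
 > ArrayInterpolation(X, Y, 2.5, method = spline, degree = 5, endpoints = notaknot);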
 • If splines of an even degree are being used, the option knots=data forces the use of a spline function whose knots are positioned on the nodes.  See Spline Continuity and End Conditions for details.  The default, knots=default, places the spline knots at the midpoints of the nodes when even-degree splines are used.  This option has no effect when other methods are used.
 • If the option uniform=true is given, then ArrayInterpolation assumes that the data points are sampled over a grid of uniformly spaced points in each dimension. In other words, if ${a}_{i,j}$ is the $j$th possible coordinate in the $i$th dimension, then ${d}_{i}={a}_{i,j}-{a}_{i,j-1}$ is assumed to be constant over all possible $j$, given any fixed value of $i$. This gives a considerable speedup when the input contains uniform data, since it allows ArrayInterpolation to use a fast lookup algorithm when evaluating the interpolant at the specified points. The default is uniform=false, in which case ArrayInterpolation uses a slower but more general binary search algorithm to perform interpolation.  Using the uniform=true option with non-uniform data may produce incorrect results.
 • If the option verify=false is given, then ArrayInterpolation skips the various checks it performs to ensure correctly formatted input.  This can decrease the time required to solve large problems, but will prevent the function from detecting any errors in the input.  If the input is improperly sorted, contains Arrays indexed from values other than 1, contains non-rectangular or C order rtables, or is otherwise formatted incorrectly, ArrayInterpolation may return incorrect results or throw an unexpected error.
 • If the option extrapolate=e is given, where e is of type extended_numeric or truefalse, then one of the following possible extrapolation methods will be used to compute $f\left(x\right)$ if $x$ lies outside the bounding box specified by the input:
 – extrapolate = true: Perform extrapolation using the closest valid branch of the interpolating function.  In the case of method=lowest and method=highest, the extrapolant may not be defined at some points, in which case undefined will be returned.
 – extrapolate = false: Do not extrapolate.  An error will be thrown if any point in xvalues lies outside the bounding box specified by the input.
 – extrapolate = e, where e is of type extended_numeric: Define $f\left(x\right)$ to be e if $x$ lies outside the bounding box specified by the input.  e is commonly zero or undefined.
 extrapolate=true is used by default.
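 For example, with one-dimensional data given on the interval $\left[0,2\right]$, the following evaluates the interpolant at points outside the bounding box under different extrapolation settings (illustrative values):

 > X := [0., 1., 2.]:
 > Y := [1., 2., 5.]:
 > ArrayInterpolation(X, Y, [-1., 3.]);                           # extrapolate=true (default)
 > ArrayInterpolation(X, Y, [-1., 3.], extrapolate = 0.);         # replace outside values with 0
 > ArrayInterpolation(X, Y, [-1., 3.], extrapolate = undefined);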
 • If the option container=c is given, where c is an appropriately sized rtable, then the computation is performed in-place and the result is returned in c.  c must be of the correct size and datatype to match the output of the routine.  With this option, no additional memory is allocated to store the result; this is a programmer-level feature that can be used to reduce memory usage and decrease the time spent by Maple's garbage collector.  The default is container=false, in which case Maple creates and returns a new rtable containing the result.
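 For example, a preallocated hardware-float Vector can receive the result of a one-dimensional interpolation in-place (illustrative values):

 > X := Vector([0., 1., 2.], datatype = float[8]):
 > Y := Vector([0., 1., 4.], datatype = float[8]):
 > xv := Vector([0., 0.5, 1., 1.5, 2.], datatype = float[8]):
 > c := Vector(5, datatype = float[8]):        # container matching the output size and datatype
 > ArrayInterpolation(X, Y, xv, container = c):
 > c;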

Examples

 > with(CurveFitting):
 > with(plots):

An introductory example.  Suppose a signal is sampled several times over a given interval of time:

 > Times := [0.00,0.01,0.02,0.03,0.04,0.05,0.06,0.07,0.08,0.09, 0.10]:
 > Amplitudes := [-0.6, 0.0, 0.4, 0.6, 0.3, -0.1, -0.2, 0.0, 0.1, -0.3, -0.6]:
 > pointplot(Times,Amplitudes);

Use ArrayInterpolation to resample the data at a higher sampling frequency:

 > NewTimes := [seq(0.001*i,i=0..100)]:
 > NewAmplitudes := ArrayInterpolation(Times,Amplitudes,NewTimes):
 > pointplot(NewTimes,NewAmplitudes);

Use a cubic spline to achieve a smoother, more realistic resampling of the data:

 > NewAmplitudes := ArrayInterpolation(Times,Amplitudes,NewTimes,method=spline):
 > pointplot(NewTimes,NewAmplitudes);

Try again, using a spline that assumes the data is sampled from a periodic waveform:

 > NewAmplitudes := ArrayInterpolation(Times,Amplitudes,NewTimes,method=spline,endpoints=periodic):
 > pointplot(NewTimes,NewAmplitudes);

A two-dimensional example: a tiny grayscale image stored in a Matrix:

 > Ranges := [[seq(1..10)],[seq(1..16)]]:
 > Img := evalf(Matrix([[255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255], [255, 255, 208, 93, 22, 0, 16, 98, 210, 255, 255, 255, 255, 255, 255, 255], [255, 196, 16, 0, 0, 0, 0, 0, 19, 200, 255, 255, 54, 54, 212, 255], [255, 61, 34, 156, 231, 255, 230, 154, 26, 58, 255, 255, 61, 0, 59, 255], [255, 5, 205, 255, 255, 255, 255, 255, 196, 7, 255, 255, 230, 91, 7, 255], [255, 20, 235, 255, 255, 255, 255, 255, 235, 36, 255, 255, 255, 233, 32, 255], [255, 125, 129, 255, 255, 255, 255, 255, 133, 186, 255, 255, 217, 89, 147, 255], [255, 255, 125, 50, 91, 99, 93, 40, 22, 121, 93, 58, 10, 140, 255, 255], [255, 255, 255, 216, 131, 75, 14, 5, 7, 12, 59, 121, 214, 255, 255, 255], [255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255]])):
 > listdensityplot(Img,style=PATCHNOGRID,axes=none,range=0..255);

Upsample it to a larger image using bilinear interpolation:

 > NewRanges := [[seq(3..52)],[seq(3..82)]]/5.0:
 > NewImg := ArrayInterpolation(Ranges,Img,NewRanges,extrapolate=255):
 > listdensityplot(NewImg,style=PATCHNOGRID,axes=none,range=0..255);

Try again, using bicubic interpolation instead for a smoother fit:

 > NewImg := ArrayInterpolation(Ranges,Img,NewRanges,extrapolate=255,method=cubic):
 > listdensityplot(NewImg,style=PATCHNOGRID,axes=none,range=0..255);

A non-uniform multidimensional example.  Create a 3-D mesh passing through a set of points sampled from a mathematical function:

 > f := (i,j) -> (3-sin(i))^2-(3-j)^2:
 > plot3d(f,0..10,0..10,axes=normal);

Define a non-uniform grid of points, and sample f over them:

 > v := Array([0,1.5,3.5,5,6,8,9]):
 > w := Array([0,3,5,6,6.5,7.5]):
 > y := Matrix(7,6,(a,b)->evalf(f(v[a],w[b]))):

Plot the data so far:

 > pointplot3d([seq(seq([v[i],w[j],y[i,j]],j=1..6),i=1..7)],axes=normal,symbol=sphere);

Create a finer mesh to interpolate over:

 > a1 := Matrix(50,50,(i,j)->i/5):
 > a2 := Matrix(50,50,(i,j)->j/5):
 > A := ArrayTools[Concatenate](3,a1,a2):

Linear interpolation produces a quick approximation to f:

 > B := ArrayInterpolation([v,w],y,A,method=linear):
 > matrixplot(Matrix(B),axes=normal);

Nearest-neighbor interpolation can also be used for quick lookup purposes:

 > B := ArrayInterpolation([v,w],y,A,method=nearest):
 > matrixplot(Matrix(B),axes=normal);

Spline interpolation produces a smoother approximation to the original function f:

 > B := ArrayInterpolation([v,w],y,A,method=spline):
 > matrixplot(Matrix(B),axes=normal);

Increasing the degree of the spline approximation can increase the smoothness of the result, but results in a longer computation time, greater numerical instability, and can cause large oscillations around the edges of a data set:

 > B := ArrayInterpolation([v,w],y,A,method=spline,degree=5):
 > matrixplot(Matrix(B),axes=normal);

Finally, a large example to illustrate a few tips for increasing the speed of computations:

 > A := evalf(Vector([seq(0..50000)])):
 > B := evalf(Vector([seq(i^(2),i=0..50000)])):
 > C := evalf(Vector([seq(0..200000)]/4)):
 > time(ArrayInterpolation(A,B,C));
 ${0.098}$ (1)

On such a large one-dimensional example, a significant portion of the execution time is spent verifying the integrity of the input data.  Disabling this verification will produce a significant speedup in the execution time of the routine, but will produce incorrect results if the input is not correctly formatted or sorted:

 > time(ArrayInterpolation(A,B,C,verify=false));
 ${0.061}$ (2)

Asserting that the data is uniform allows a faster lookup method to be used:

 > time(ArrayInterpolation(A,B,C,verify=false,uniform=true));
 ${0.050}$ (3)

Cubic interpolation takes longer than the default, linear method:

 > time(ArrayInterpolation(A,B,C,verify=false,uniform=true,method=cubic));
 ${0.055}$ (4)