Understanding Fortran_order and C_order 'order' Options for Matrices and Arrays - Maple Programming Help

Understanding Fortran_order and C_order 'order' Options for Matrices and Arrays

Description

 • In Maple, construction of a Matrix or Array actually constructs two main objects, an rtable object, and an associated piece of one dimensional (1-D) memory (data space) used to store the contents of the rtable. The $\mathrm{order}=\mathrm{Fortran_order}$ or $\mathrm{order}=\mathrm{C_order}$ option describes how the rtable entries are laid out in the 1-D data space. The default is Fortran_order.
 • For Matrices, Fortran_order is usually referred to as column-major order.  For an $nxm$ Matrix $A$, the entries are stored as $A[1,1],A[2,1],...,A[n,1],A[1,2],...,A[1,m],A[2,m],...,A[n,m]$, so each complete column of the Matrix is stored before the next column.
 For direct access to an element of the Fortran_order Matrix $A$ in the data space, one must know the number of rows in $A$, as the entry ${A}_{i,j}$ is easily seen to be at an offset of $i-1+n\left(j-1\right)$ units from the start of the data space.
 • C_order is usually referred to as row-major order, so each complete row of the Matrix is stored before the next row. For an $nxm$ Matrix $A$, the entries are stored as $A[1,1],A[1,2],...,A[1,m],A[2,1],...,A[n,1],A[n,2],...,A[n,m]$.
 For direct access to an element of the C_order Matrix $A$ in the data space, one must know the number of columns in $A$, as the entry ${A}_{i,j}$ is easily seen to be at an offset of $m\left(i-1\right)+j-1$ units from the start of the data space.
 Note: For the special case of square matrices, the data storage of a C_order Matrix resembles the transpose of the same Matrix stored in Fortran_order, and vice versa.
 • The extension of these concepts to higher dimensional Arrays is the expected one, namely that Fortran_order stores data by the last dimension first, and that C_order stores data by the first dimension first.
 • For example, a three-dimensional C_order Array $A$, with dimensions $1..n,1..m,1..o$ stores the element ${A}_{i,j,k}$ at an offset of $mo\left(i-1\right)+o\left(j-1\right)+k-1$ from the start of the data space, while for the same Array stored in Fortran_order, the offset would be given by $i-1+n\left(j-1\right)+nm\left(k-1\right)$.
 • Knowledge of the storage arrangement for rtable data can be essential in the design of algorithms that are efficient for large Matrices and Arrays. For efficient algorithms, you must access data in a way that consecutive data access is close to prior data access. This prevents the computer from having to swap out data to/from the memory of the system, instead allowing it to access data in the processor cache, which can be orders of magnitude faster.
 Note: Use of the ArrayTools package requires knowledge of the storage arrangement, as its primary purpose is to provide methods of accessing, copying, or viewing the rtable data in an efficient and flexible manner.

Examples

As a simple example, consider the following two approaches to computing the Frobenius norm of a 2000 x 2000 Fortran_order Matrix $A$.

 > $n≔2000:$
 > $A≔\mathrm{LinearAlgebra}:-\mathrm{RandomMatrix}\left(n,\mathrm{generator}=-100...100.,\mathrm{outputoptions}=\left[\mathrm{datatype}=\mathrm{float}[8],\mathrm{order}=\mathrm{Fortran_order}\right]\right):$
 > $\mathrm{rtable_dims}\left(A\right)$
 ${1}{..}{2000}{,}{1}{..}{2000}$ (1)

In the first case, arrange the loop so that it accesses consecutive elements in the data space in order.

 > $\mathrm{f1}≔\left(A,n\right)→\sqrt{\mathrm{add}\left(\mathrm{add}\left({A}_{i,j}^{2},i=1..n\right),j=1..n\right)}$
 ${\mathrm{f1}}{≔}\left({A}{,}{n}\right){→}\sqrt{{\mathrm{add}}{}\left({\mathrm{add}}{}\left({{A}}_{{i}{,}{j}}^{{2}}{,}{i}{=}{1}{..}{n}\right){,}{j}{=}{1}{..}{n}\right)}$ (2)
 > $\mathrm{t1}≔\mathrm{time}\left(\mathrm{evalhf}\left(\mathrm{f1}\left(A,n\right)\right)\right)$
 ${\mathrm{t1}}{≔}{0.426}$ (3)

In the second case, arrange the loop so that it accesses elements out of order.

 > $\mathrm{f2}≔\left(A,n\right)→\sqrt{\mathrm{add}\left(\mathrm{add}\left({A}_{i,j}^{2},j=1..n\right),i=1..n\right)}$
 ${\mathrm{f2}}{≔}\left({A}{,}{n}\right){→}\sqrt{{\mathrm{add}}{}\left({\mathrm{add}}{}\left({{A}}_{{i}{,}{j}}^{{2}}{,}{j}{=}{1}{..}{n}\right){,}{i}{=}{1}{..}{n}\right)}$ (4)
 > $\mathrm{t2}≔\mathrm{time}\left(\mathrm{evalhf}\left(\mathrm{f2}\left(A,n\right)\right)\right)$
 ${\mathrm{t2}}{≔}{0.496}$ (5)
 > $\mathrm{percent_savings}≔\frac{100\left(\mathrm{t2}-\mathrm{t1}\right)}{\mathrm{t1}}$
 ${\mathrm{percent_savings}}{≔}{16.43192488}$ (6)

Though the time savings of accessing the data in the correct order is not substantial, it can be more pronounced for other algorithms, or if the algorithm is coded in a compiled language.