Efficiency - Maple Help

All Products Maple MapleSim

Home : Support : Online Help : System : Information : Updates : Maple 13 : Efficiency

Efficiency Improvements in Maple 13

Maple 13 includes numerous efficiency improvements. Some highlights where dramatic speed-ups were achieved are described below.

Linear systems over Q and over cyclotomic fields

More LinearAlgebra efficiency improvements

Modular multivariate polynomial multiplication and division

New sparse modular algorithm for Greatest Common Divisors over algebraic number and function fields

Efficiency improvements to lcoeff and tcoeff

heap[extract] is more efficient

Multithreaded performance

Element-wise operations

Improvements to GetSubImage in the ImageTools package

Decreased garbage collection frequency

Linear systems over Q and over cyclotomic fields

•	A new dense p-adic lift algorithm has improved the efficiency of LinearAlgebra[LinearSolve] and LinearAlgebra[Modular][IntegerLinearSolve]. See Enhancements to LinearAlgebra for details.

•	Maple 13 includes a fast modular algorithm for solving linear systems over cyclotomic fields. For example, if a linear system involves $RootOf (f, x)$ where $f (x)$ is a cyclotomic polynomial of degree > 1, then the fast code will be used. See also numtheory[cyclotomic].

As a comparison, consider the following example:

>	r := RootOf(x^4+x^3+x^2+x+1);

$r ≔ RootOf ({_Z}^{4} + {_Z}^{3} + {_Z}^{2} +_Z + 1)$

(1)

>	cf := ()->randpoly(r, degree = 3):

nv := 50:

>	vars := seq(x[i],i=1..nv):

>	sys := {seq(randpoly([vars],degree=1,coeffs=cf),i=1..nv)}:

>	tt := time():

>	solve(sys,{vars}):

>	time()-tt;

$0.212$

(2)

In Maple 12, this same problem took almost 7 times as long.

More LinearAlgebra efficiency improvements

Copying of a Matrix to a result with a shape (indexing function) is faster. The following Cholesky decomposition operation is approximately twenty times as fast as in Maple 12.

>	with(LinearAlgebra):

>	N := 1000: M := RandomMatrix(N,N,generator=-0.5..0.5, outputoptions=[datatype=float[8]]):

>	L := Matrix(M,shape=triangular[lower]):

>	M := Matrix(L.Transpose(L)):

>	for i from 1 to N do M[i,i]:=i; end do:

>	for i from 1 to N do M[i,i] := i; end do:

>	st := time():

>	LUDecomposition(M,method=Cholesky):

Warning, Matrix may be nonsymmetric; using only data from lower triangle

>	time() - st;

$0.02$

(3)

The following LU decomposition operation is approximately ten times as fast and requires only 80% of the memory allocation as compared to Maple 12.

>	with(LinearAlgebra):

>	N := 1000: M := RandomMatrix(N,N,generator=-0.5..0.5, outputoptions=[datatype=float[8]]):

>	st := time():

>	LUDecomposition(M):

>	time()-st;

$0.03$

(4)

The new thin option to SingularValues gives a result with only $\min (m, n)$ right and left singular vectors. This produces a marked speedup for LeastSquares solving when the number of rows is much larger than the number of columns. The following singular values decomposition operation is approximately thirty times as fast and requires only 4% of the memory allocation as compared to the non-thin computation available in Maple 12.

>	with(LinearAlgebra):

>	m := 10000: n := 100: M := RandomMatrix(m,n,generator=-1.0..1.0, outputoptions=[datatype=float[8]]):

>	st := time():

>	SingularValues(M,output=[U,S,Vt],thin=true):

>	time()-st;

$1.78$

(5)

Modular multivariate polynomial multiplication and division

New heap-based modular multivariate polynomial multiplication and division algorithms have been added to Maple which are faster than the prior multiplication and division implementations.

This expansion has sped up by approximately 50%:

>	p1 := randpoly([x,y,z],degree=20,terms=50):

>	tt := time():

>	p2 := modp(Expand(p1 ^ 3),13):

>	time()-tt;

$0.013$

(6)

This division has sped up by approximately 60%:

>	tt := time():

>	modp(Divide(p2,p1),13);

$true$

(7)

>	time()-tt;

$0.014$

(8)

For more information about these commands, see Expand and Divide.

New sparse modular algorithm for Greatest Common Divisors over algebraic number and function fields

A new algorithm has been added for evaluating Greatest Common Divisors over algebraic number and function fields, which is particularly effective for sparse polynomials. For example, this Gcd computation is about 95% faster:

>	alias(p=RootOf(z^3+tz^2+st,z)):

>	alias(q=RootOf(z^2+2)):

>	g := (y+pt+q)(tx+sy*p+q):

>	a := tx+sy+pst+p^2+q:

>	b := x+sxy-tp^2+p+t+qs:

>	A := evala(Expand(g*a)):

>	B := evala(Expand(g*b)):

>	tt := time():

>	evala(Gcd(A, B));

>	time()-tt;

$0.5$

(9)

Efficiency improvements to lcoeff and tcoeff

The efficiency of lcoeff and tcoeff has been improved. The following computation of the leading and trailing coefficients took more than twice as long and used considerably more memory in Maple 12:

>	vars := [seq(x[i],i=0..19)];

$vars ≔ [x_{0}, x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{6}, x_{7}, x_{8}, x_{9}, x_{10}, x_{11}, x_{12}, x_{13}, x_{14}, x_{15}, x_{16}, x_{17}, x_{18}, x_{19}]$

(10)

>	f := add(i*vars[i], i=1..20);

$f ≔ x_{0} + 2 x_{1} + 3 x_{2} + 4 x_{3} + 5 x_{4} + 6 x_{5} + 7 x_{6} + 8 x_{7} + 9 x_{8} + 10 x_{9} + 11 x_{10} + 12 x_{11} + 13 x_{12} + 14 x_{13} + 15 x_{14} + 16 x_{15} + 17 x_{16} + 18 x_{17} + 19 x_{18} + 20 x_{19}$

(11)

>	g := expand(f^6):

>	(nops,length)(g);

$177100, 10915266$

(12)

>	t := time(): (lcoeff,tcoeff)(g, vars); time() - t;

$1, 64000000$

$0.005$

(13)

Other improvements to lcoeff and tcoeff are described in Enhancements to Symbolic Capabilities in Maple 13.

heap[extract] is more efficient

The heap[extract] function now performs fewer comparisons than in prior Maple versions. The number of comparisons performed in extracting all elements from an $n$ element heap for Maple 13 versus Maple 12 is provided in the following table:

Number of
Elements	Maple 12	Maple 13
10	27	26
20	89	68
30	158	116
40	239	170
50	338	242
60	416	295
70	517	345
80	642	416
90	737	483
100	841	554

Multithreaded performance

Many parts of the Maple evaluation engine have been updated to improve the performance when executing in parallel. Although there is still more work to be done, the performance in parallel is significantly faster.

p := proc(A::Vector,B::Vector,C::Vector,m::integer,n::integer)
    option hfloat;
    local i;

    for i from m to n
    do
        C[i] := A[i]^2+B[i]^2;
    end do;
end proc:
n := 10^7:
m := n/2:
A := LinearAlgebra:-RandomVector( n, outputoptions=[datatype=hfloat] ):
B := LinearAlgebra:-RandomVector( n, outputoptions=[datatype=hfloat] ):
C := Vector( 1..n, datatype=hfloat ):

>	tt := time[real](): p(A,B,C,1,m): p(A,B,C,m+1,n): time[real]() - tt;

$33.089$

(14)

>	tt := time[real](): id1 := Threads:-Create( p(A,B,C,1,m) ): p(A,B,C,m+1,n): Threads:-Wait( id1 ): time[real]() - tt;

$22.873$

(15)

In Maple 12, the parallel case takes 35.568 seconds, slightly longer than the single threaded case.

Element-wise operations

The elementwise operator syntax can be more efficient than zip or map in some cases. Specific operations have been optimized compared to the general code used by zip or map. In other cases, depending on the input, built-in hardware floating-point routines can be used directly.

>	A := LinearAlgebra:-RandomMatrix(1000):

>	time(zip(`*`,A,2));

$0.608$

(16)

>	time( A *~ 2 );

$0.096$

(17)

>	B := LinearAlgebra:-RandomMatrix(1000,outputoptions=[datatype=float[8]]):

>	time(zip(`*`,B,2));

$1.684$

(18)

>	time( B *~ 2 );

$0.017$

(19)

>	time(map(`sin`,B));

$5.456$

(20)

>	time( sin~(B));

$0.116$

(21)

Improvements to GetSubImage in the ImageTools package

The GetSubImage routine of the ImageTools package is faster for the in-place case where the output is specified. The following repeated GetSubImage operation is approximately twice as fast as in Maple 12.

N := 500:

n := 100:

>	image := Array(1..N,1..N,datatype=float[8],order=C_order):

>	container := Array(1..n,1..n,datatype=float[8],order=C_order):

>	st := time():

>	for i from 1 to 1000 do ImageTools:-GetSubImage(image,10,10,n,n,output=container); end do:

>	time() - st;

$0.220$

(22)

Decreased garbage collection frequency

In previous versions of Maple, a garbage collection cycle was initiated after allocating approximately 10^6 words of memory. Given the increase in the amount of memory available on current computers the frequency of garbage collection can be reduced without placing a significant burden on the memory subsystem. In Maple 13, the garbage collector is activated after allocating a minimum of 10^7 words. On average, this yields an improvement of 19% on 32-bit processors and about 10% for 64-bit processors with a modest increase in memory usage. The frequency of garbage collection can be tuned to achieve the best performance for your application by adjusting the gcfreq variable using the kernelopts mechanism. For example,

>	with(LinearAlgebra):

>	gcTest := proc() local A, n, resultList; for n to 30 do A := RandomMatrix(n,n); resultList:= LUDecomposition(A,output=['P','L','U1','R']); end do; end proc:

>	kernelopts(gcfreq=10^6);

$[10000000, 0.1]$

(23)

>	gc(): start := time(): gcTest(): t1 := time() - start;

$t1 ≔ 0.924$

(24)

>	kernelopts(gcfreq=10^7);

$1000000$

(25)

>	gc(): start := time(): gcTest(): t2:= time() - start;

$t2 ≔ 0.860$

(26)

Maple

Erweiterungen für Maple

Student Success Platform

Verbesserung der Studienerfolgsquote

Maple Flow

MapleSim

Beratende Dienstleistungen

Produkte zur Online-Ausbildung

Ausbildung

Industrie

Automobil und Luftfahrt

Robotik

Maschinendesign & industrielle Automation

Andere

Lösungen für die Industrie

Kaufen: Einzelheiten und Preise

Kaufen

Institutionelle Studentenlizenzierung

Elite-Wartungsprogramm

Support

Produktschulung

Online Hilfe zu Produkten

Webinare und Veranstaltungen

Veröffentlichungen

Content Hubs

Anwendungsbeispiele

Community

Über Maplesoft

Pressezentrum

User Community

Kontakt

Online Help

All Products Maple MapleSim

Maple

Leistungsfähige intuitive Mathematiksoftware

Erweiterungen für Maple

Student Success Platform

Verbesserung der Studienerfolgsquote

Maple Flow

Engineering calculations & documentation

MapleSim

Modellierung auf Systemebene

Beratende Dienstleistungen

Produkte zur Online-Ausbildung

Ausbildung

Industrie

Automobil und Luftfahrt

Robotik

Maschinendesign & industrielle Automation

Andere

Lösungen für die Industrie

Kaufen: Einzelheiten und Preise

Kaufen

Institutionelle Studentenlizenzierung

Elite-Wartungsprogramm

Support

Produktschulung

Online Hilfe zu Produkten

Webinare und Veranstaltungen

Veröffentlichungen

Content Hubs

Anwendungsbeispiele

Community

Über Maplesoft

Pressezentrum

User Community

Kontakt

Online Help

All Products Maple MapleSim