UPDATE (November 17, 2016): All timings in the table have been updated to reflect speed improvements in the new version of the toolbox (4.3.0). The toolbox now computes elementary functions using multi-core parallelism. We have also included timings for the latest version of
UPDATE (June 1, 2016): The initial version of this post stated that the newest version of MATLAB, R2016a, uses the MAPLE engine for variable precision arithmetic (instead of MuPAD, as in previous versions). More detailed checks showed that this is not true. As it turned out, MAPLE 2016 silently replaced the VPA functionality of MATLAB during installation. Thus we unknowingly tested the MAPLE Toolbox for MATLAB instead of the MathWorks Symbolic Math Toolbox. We apologize for the misinformation. The post now provides correct comparison results with Symbolic Math Toolbox/VPA.
Thanks to Nick Higham, Massimiliano Fasi and Samuel Relton for their help in finding this mistake!
From the very beginning we have focused on improving the performance of matrix computations, linear algebra, solvers and other high-level algorithms (see e.g. the 3.8.0 release notes).
Over time, as the speed of these advanced algorithms increased, elementary functions started to show up near the top of the list of hot-spots more frequently. For example, the main bottleneck of the multiquadric collocation method in extended precision was the coefficient-wise power function (
Thus we decided to polish our library for computing elementary functions. Here we present intermediate results of this work and the traditional comparison with the latest MATLAB R2016b (Symbolic Math Toolbox/Variable Precision Arithmetic) and
Timing of logarithmic and power functions in
    >> mp.Digits(34);
    >> A = mp(rand(2000)-0.5);
    >> B = mp(rand(2000)-0.5);
    >> tic; C = A.^B; toc;
    Elapsed time is 67.199782 seconds.
    >> tic; C = log(A); toc;
    Elapsed time is 22.570701 seconds.
Speed of the same functions after optimization, in
    >> mp.Digits(34);
    >> A = mp(rand(2000)-0.5);
    >> B = mp(rand(2000)-0.5);
    >> tic; C = A.^B; toc; % 130 times faster
    Elapsed time is 0.514553 seconds.
    >> tic; C = log(A); toc; % 95 times faster
    Elapsed time is 0.238416 seconds.
The toolbox now computes 4 million logarithms in quadruple precision (including negative arguments) in less than a second!
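The two sessions above can be folded into a small reusable timing loop. The following is only a sketch, not the authors' actual test script: it assumes the Advanpix toolbox is on the search path (so that `mp` and `mp.Digits` are available), and the variable names `names`, `funcs` and `elapsed` are introduced here purely for illustration.

```matlab
% Sketch of a timing harness (assumes the Advanpix toolbox is on the path).
mp.Digits(34);                      % quadruple precision, as in the sessions above
n = 2000;                           % matrix size used in the post
A = mp(rand(n) - 0.5);              % entries in (-0.5, 0.5), so about half are negative
B = mp(rand(n) - 0.5);

names = {'A.^B', 'log(A)'};         % labels for the report (illustrative)
funcs = cell(size(names));
funcs{1} = @() A.^B;                % element-wise power
funcs{2} = @() log(A);              % element-wise logarithm

elapsed = zeros(size(names));       % wall-clock seconds per function
for k = 1:numel(funcs)
    f = funcs{k};
    tic;
    C = f();                        % result is discarded, only the time matters
    elapsed(k) = toc;
    fprintf('%-8s %10.6f sec (%d elements)\n', names{k}, elapsed(k), n*n);
end
```

The speed-up figures quoted in the session comments are the ratio of the old timing to the new one: 67.199782 / 0.514553 is roughly 130.6 for the power function, and 22.570701 / 0.238416 is roughly 94.7 for the logarithm.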
Inspired by this result, we applied the same ideas to speed up some other elementary functions. The summary table below shows timings and a comparison against MATLAB R2016b (VPA) and MAPLE 2016 on a Core i7 990x / Windows 7 64-bit:
|Function|MATLAB R2016b (VPA), sec|Maple 2016, sec|Advanpix 4.3.0, sec|Speed-up over VPA (times)|Speed-up over Maple (times)|
|Power & exponential:|
On average, the Advanpix toolbox is 5000 times faster than MATLAB/VPA and 6766 times faster than MAPLE. Test scripts are available for download: timing_elementary_advanpix to test the Advanpix toolbox, and timing_elementary_vpa to test VPA. Don’t forget to add the toolbox directory to the search path before running the toolbox tests!
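Adding the toolbox to the search path and launching the test might look like the sketch below. The directory is a placeholder rather than a real install location, and the sketch assumes the downloaded script is saved as `timing_elementary_advanpix.m` in the current folder or elsewhere on the path.

```matlab
% Placeholder path: replace with the folder where the toolbox is installed.
toolboxDir = 'C:\path\to\toolbox';      % hypothetical location
addpath(toolboxDir);

% Stop early with a readable message if the mp class is still not visible.
assert(~isempty(which('mp')), 'Toolbox not found on the search path.');

% Assumes the downloaded script is available as timing_elementary_advanpix.m.
timing_elementary_advanpix
```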
†Toolbox timings are higher on GNU Linux and Apple Mac OS X. We can do deeper performance optimization on Windows since we have a full license for the Intel developer tools on that platform.