**UPDATE (April 22, 2017)**: Timings for Mathematica 11.1 have been added to the table, thanks to a test script contributed by Bor Plestenjak. I suggest taking a look at his excellent toolbox for multiparameter eigenvalue problems – MultiParEig.

**UPDATE (November 17, 2016)**: All timings in the table have been updated to reflect speed improvements in the new version of the toolbox (`4.3.0`). The toolbox now computes elementary functions using multi-core parallelism. We have also included timings for the latest version of `MATLAB` – `2016b`.

**UPDATE (June 1, 2016)**: The initial version of this post stated that the newest `MATLAB R2016a` uses the `MAPLE` engine for variable precision arithmetic (instead of `MuPAD`, as in previous versions). More detailed checks showed that this is not true. As it turned out, `MAPLE 2016` silently replaced the `VPA` functionality of `MATLAB` during installation. Thus, without knowing it, we tested `MAPLE Toolbox for MATLAB` instead of `MathWorks Symbolic Math Toolbox`. We apologize for the misinformation. The post now provides correct comparison results with `Symbolic Math Toolbox/VPA`.

Thanks to Nick Higham, Massimiliano Fasi and Samuel Relton for their help in finding this mistake!

From the very beginning we have been focusing on improving the performance of matrix computations, linear algebra, solvers and other high-level algorithms (e.g. 3.8.0 release notes).

Over time, as the advanced algorithms became faster, elementary functions started to show up among the top hot-spots more frequently. For example, the main bottleneck of the multiquadric collocation method in extended precision was the coefficient-wise power function (`.^`).

Thus we decided to polish our library for computing elementary functions. Here we present intermediate results of this work, together with our traditional comparison against the latest `MATLAB R2016b` (Symbolic Math Toolbox/Variable Precision Arithmetic), `MAPLE 2016` and `Wolfram Mathematica 11.1.0.0`.
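For orientation, a measurement of the same kind on the VPA side might look like the sketch below. It is not the contents of the actual test script. It assumes the Symbolic Math Toolbox is installed, and the matrix size `n` is an illustrative value chosen to keep the run short.

```matlab
% Sketch only: requires MathWorks Symbolic Math Toolbox.
n = 200;                  % illustrative size (assumption), small enough to finish quickly
digits(34);               % 34 significant decimal digits, i.e. the quadruple-precision setting
A = vpa(rand(n) - 0.5);   % random input in (-0.5, 0.5), converted to variable precision
tic; C = log(A); toc;     % time the element-wise logarithm (negative arguments included)
```

Since the input contains negative values, the result `C` is complex-valued.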

Timing of logarithmic and power functions in `3.9.4.10481`:

    >> mp.Digits(34);
    >> A = mp(rand(2000)-0.5);
    >> B = mp(rand(2000)-0.5);
    >> tic; C = A.^B; toc;
    Elapsed time is 67.199782 seconds.
    >> tic; C = log(A); toc;
    Elapsed time is 22.570701 seconds.

Speed of the same functions after optimization, in `4.3.0.12057`:

    >> mp.Digits(34);
    >> A = mp(rand(2000)-0.5);
    >> B = mp(rand(2000)-0.5);
    >> tic; C = A.^B; toc;     % 130 times faster
    Elapsed time is 0.514553 seconds.
    >> tic; C = log(A); toc;   % 95 times faster
    Elapsed time is 0.238416 seconds.

The toolbox now computes 4 million logarithms in quadruple precision (including negative arguments) in less than a second!
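The quoted speed-up factors are simply the ratios of the elapsed times before and after optimization. A minimal sketch of that arithmetic, using only the numbers printed in the two sessions above (the variable names are illustrative):

```matlab
% Elapsed times copied from the two sessions above, in seconds.
t_old = [67.199782, 22.570701];   % A.^B and log(A) in 3.9.4.10481
t_new = [0.514553,  0.238416];    % the same calls in 4.3.0.12057

speedup = t_old ./ t_new;         % element-wise ratio old/new
fprintf('A.^B: %.1f times, log: %.1f times\n', speedup(1), speedup(2));
```

The ratios come out at roughly 130 and 95, matching the comments in the second session.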

Inspired by this result, we applied the same ideas to speed up other elementary functions. The summary table below shows timings and a comparison against `MATLAB R2016b (VPA)`, `MAPLE 2016` and `Wolfram Mathematica 11.1.0.0` on `Core i7 990x / Windows 7` 64-bit:

| Function | MATLAB (VPA), sec | Maple, sec | Mathematica, sec | Advanpix, sec | Speed-up over VPA (times) | Speed-up over Maple (times) | Speed-up over Mathematica (times) |
|---|---|---|---|---|---|---|---|
| **Power & exponential:** | | | | | | | |
| EXP | 107.34 | 756.14 | 4.54 | 0.12 | 886.34 | 6243.90 | 37.49 |
| LOG | 1161.18 | 593.98 | 6.61 | 0.23 | 5133.40 | 2625.91 | 29.21 |
| LOG10 | 1438.91 | 639.46 | 11.13 | 0.24 | 5958.23 | 2647.88 | 46.09 |
| LOG2 | 1442.71 | 643.17 | 11.08 | 0.25 | 5789.35 | 2580.94 | 44.48 |
| SQRT | 28.75 | 427.40 | 2.60 | 0.27 | 105.74 | 1571.90 | 9.55 |
| **Trigonometric:** | | | | | | | |
| SIN | 85.28 | 736.89 | 6.07 | 0.15 | 570.80 | 4932.33 | 40.62 |
| COS | 78.96 | 513.73 | 6.10 | 0.15 | 516.44 | 3359.92 | 39.89 |
| TAN | 1261.92 | 844.05 | 8.91 | 0.17 | 7277.51 | 4867.64 | 51.37 |
| ASIN | 105.12 | 1181.83 | 12.39 | 0.39 | 266.40 | 2995.01 | 31.39 |
| ACOS | 100.49 | 1330.99 | 23.10 | 0.39 | 257.55 | 3411.03 | 59.19 |
| ATAN | 131.92 | 1039.55 | 5.71 | 0.14 | 974.28 | 7677.64 | 42.17 |
| SEC | 1466.09 | 778.14 | 8.00 | 0.18 | 8199.59 | 4352.01 | 44.76 |
| CSC | 1503.75 | 793.87 | 8.35 | 0.18 | 8490.95 | 4482.60 | 47.13 |
| COT | 1511.67 | 1014.76 | 10.46 | 0.20 | 7728.36 | 5187.95 | 53.48 |
| ASEC | 1610.29 | 1962.87 | 18.45 | 0.28 | 5815.44 | 7088.72 | 66.62 |
| ACSC | 1648.31 | 1720.76 | 21.96 | 0.28 | 5965.65 | 6227.86 | 79.47 |
| ACOT | 140.37 | 1179.84 | 16.61 | 0.16 | 867.58 | 7291.96 | 102.63 |
| SINH | 117.85 | 781.78 | 6.88 | 0.13 | 910.78 | 6041.59 | 53.17 |
| COSH | 117.73 | 795.34 | 7.00 | 0.13 | 924.06 | 6242.87 | 54.92 |
| TANH | 121.37 | 976.78 | 9.20 | 0.10 | 1198.14 | 9642.45 | 90.78 |
| ASINH | 92.55 | 778.46 | 13.51 | 0.14 | 656.38 | 5521.02 | 95.81 |
| ACOSH | 103.78 | 1349.79 | 20.51 | 0.31 | 332.10 | 4319.31 | 65.65 |
| ATANH | 121.46 | 2287.94 | 11.60 | 0.32 | 378.49 | 7129.76 | 36.14 |
| SECH | 1922.54 | 978.91 | 9.10 | 0.17 | 11602.53 | 5907.73 | 54.93 |
| CSCH | 1947.11 | 960.78 | 8.96 | 0.17 | 11652.35 | 5749.72 | 53.63 |
| COTH | 1958.51 | 1268.98 | 10.90 | 0.12 | 16266.72 | 10539.72 | 90.502 |
| ASECH | 2378.24 | 2921.78 | 18.75 | 0.43 | 5476.04 | 6727.56 | 43.18 |
| ACSCH | 2087.72 | 1188.18 | 17.78 | 0.16 | 12831.71 | 7302.87 | 109.26 |
| ACOTH | 2117.19 | 2335.23 | 19.77 | 0.26 | 8083.95 | 8916.49 | 75.47 |
| **Selected special:** | | | | | | | |
| gamma | 2491.81 | 7734.53 | 228.35 | 0.76 | 3266.23 | 13018.78 | 299.31 |
| erf | 104.11 | 321.20 | 125.88 | 0.16 | 669.96 | 2163.26 | 810.02 |
| bessely(0,x) | 7855.70 | 14923.53 | 250.38 | 0.83 | 9482.98 | 18014.89 | 302.25 |
| bessely(1,x) | 7302.29 | 14964.26 | 267.94 | 0.83 | 8786.29 | 18005.36 | 322.39 |
| besselj(0,x) | 7273.29 | 9998.60 | 90.54 | 0.75 | 9684.81 | 13313.72 | 120.55 |
| besselj(1,x) | 5987.67 | 10153.13 | 91.89 | 0.74 | 8077.25 | 13696.38 | 123.96 |
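
One way to condense a speed-up column into a single figure is to average it. The sketch below does this for the "Over VPA" column, assuming a plain arithmetic mean over all 35 rows; the post does not say which kind of average was used, so treat this as an illustration only. The values are copied from the table and the variable names are illustrative.

```matlab
% "Over VPA" speed-up column copied from the table, in row order.
over_vpa = [ ...
    886.34  5133.40  5958.23  5789.35   105.74 ...                 % power & exponential
    570.80   516.44  7277.51   266.40   257.55   974.28 ...        % SIN .. ATAN
   8199.59  8490.95  7728.36  5815.44  5965.65   867.58 ...        % SEC .. ACOT
    910.78   924.06  1198.14   656.38   332.10   378.49 ...        % SINH .. ATANH
  11602.53 11652.35 16266.72  5476.04 12831.71  8083.95 ...        % SECH .. ACOTH
   3266.23   669.96  9482.98  8786.29  9684.81  8077.25 ];         % selected special

avg = mean(over_vpa);             % arithmetic mean (assumption)
fprintf('%d functions, mean speed-up over VPA: %.0f times\n', numel(over_vpa), avg);
```

Under that assumption the mean is close to 5000.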

On average, the Advanpix toolbox is faster than MATLAB/VPA by a factor of **5000**, faster than MAPLE by a factor of **6766** and faster than Wolfram Mathematica by a factor of **100**. Test scripts are available for download:

Run `timing_elementary_advanpix` to test the Advanpix toolbox, and `timing_elementary_vpa` to test VPA. Don’t forget to add the toolbox directory to the search path before running the toolbox tests!
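A minimal sketch of such a session is shown below. The directory is a hypothetical placeholder, not the toolbox's actual install location, and the test scripts are assumed to be in the current folder or already on the path.

```matlab
% Hypothetical location (assumption): replace with your own toolbox directory.
toolbox_dir = 'C:\path\to\toolbox';
addpath(toolbox_dir);            % make the toolbox visible to MATLAB

timing_elementary_advanpix       % timings for the Advanpix toolbox
timing_elementary_vpa            % timings for Symbolic Math Toolbox/VPA
```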

***

† Toolbox timings are higher (slower) on GNU/Linux and Apple Mac OS X. We can do deeper performance optimization on Windows since we have a full license for the Intel developer tools on that platform.
