Version History
August 20, 2024 – 5.3.1.15613
- Speed-up of the
FFT
, especially for the lengths with high prime factors. Now:Before:>> A = rand(1771561,1,'mp'); % 11^6 >> tic; X = fft(A); toc; Elapsed time is 0.885680 seconds. >> A = rand(4826809,1,'mp'); % 13^6 >> tic; X = fft(A); toc; Elapsed time is 2.212597 seconds. >> A = rand(4757297,1,'mp'); % 17*23^4 >> tic; X = fft(A); toc; Elapsed time is 3.453519 seconds.
Performance improvement of the power-of-two lengths are ~15%.>> A = rand(1771561,1,'mp'); % 11^6 >> tic; X = fft(A); toc; Elapsed time is 5.108077 seconds. >> A = rand(4826809,1,'mp'); % 13^6 >> tic; X = fft(A); toc; Elapsed time is 12.924675 seconds. >> A = rand(4757297,1,'mp'); % 17*23^4 >> tic; X = fft(A); toc; Elapsed time is 13.426163 seconds.
July 23, 2024 – 5.3.0.15577
- Fixed incompatibility with MATLAB ≥R2024a in ODE-related functions. Thanks to Dariche Nguyen for the report.
- Fixed incompatibility with x86-64 emulator on macOS Apple Silicon platforms.
June 26, 2024 – 5.2.9.15553
- Added functions to compute zeros of the Bessel functions:
BESSELJZERO
,BESSELYZERO
.Please check help for more details (e.g.>> besseljzero(mp('1/3'),[0:5]') ans = 0 2.902586248416952480224261953123814 6.032747057265841959367811512637093 9.170506669463887768090003068219286 12.31019377164492861130228997423933 15.45064896781712201939712813318986 >> besselyzero(mp('-2.5'),[0:5]') ans = 0 5.763459196894549791406466653952735 9.095011330476355156337698327989695 12.32294097056658205196956792532973 15.51460301088674823044142932724595 18.68903635536282220199816551630472
help besseljzero
) - Minor bug fixes and improvements.
May 24, 2024 – 5.2.8.15537
- Fixed
FSOLVE
incompatibility with MATLAB versions ≥R2023b. - Performance improvement of all arbitrary precision operations (non-quadruple) on x86-64 platforms.
April 30, 2024 – 5.2.7.15522
- Minor bug fixes and small performance improvements.
- Functions
prevprime
andnextprime
have been updated for better compatibility with new versions of MATLAB (e.g. more permissive on input arguments, etc.).
March 27, 2024 – 5.2.7.15512
- Performance improvement of all arbitrary precision operations (non-quadruple) on x86-64 GNU Linux and macOS. Now the minimum required microarchitecture on these platforms is Core 2 and Sandy Bridge respectively.
February 29, 2024 – 5.2.6.15487
- Added upper incomplete gamma function
IGAMMA
. - Fixed minor bugs in
ROUND
andSPDIAGS
functions. Thanks to Michal Outrata and Sergio Zarantonello for the reports.
January 5, 2024 – 5.2.5.15470
- Native support for Apple Silicon chips on macOS. Minimum requirement is MATLAB R2023b and macOS Monterey (12.6).
- Speed boost in all arbitrary precision operations (non-quadruple) on GNU Linux and macOS.
- New special functions – Lerch transcendent, dilogarithm and polylogarithm (
LerchPhi
,DILOG
andPOLYLOG
). - Fixed minor bugs and incompatibilities with newest MATLAB and macOS.
August 13-15, 2023 – 5.1.0.15432
- Added new algorithm/code for Cholesky factorization for sparse matrices. All variants supported, including preordering and flag:
R = chol(A) R = chol(A,triangle) [R,flag] = chol(___) [R,flag,P] = chol(S) [R,flag,P] = chol(___,outputForm)
New CHOL extensively uses multi-core parallelism. Expected speed-up is ~10 times depending on matrix structure and number of CPU cores:% Now bcsstk15.mtx 3948 x 3948 nnz = 117816 time = 0.3485 sec error = 6.04686e-44 bcsstk16.mtx 4884 x 4884 nnz = 290378 time = 0.4825 sec error = 4.18599e-43 bcsstk17.mtx 10974 x 10974 nnz = 428650 time = 0.5528 sec error = 3.6355e-44 bcsstk18.mtx 11948 x 11948 nnz = 149090 time = 0.4497 sec error = 3.29082e-45 s1rmq4m1.mtx 5489 x 5489 nnz = 262411 time = 0.4612 sec error = 1.34992e-38 s2rmq4m1.mtx 5489 x 5489 nnz = 263351 time = 0.5213 sec error = 1.28795e-37 s3rmq4m1.mtx 5489 x 5489 nnz = 262943 time = 0.4547 sec error = 1.22854e-36 s3dkq4m2.mtx 90449 x 90449 nnz = 4427725 time = 25.9230 sec error = 7.68671e-36 s3dkt3m2.mtx 90449 x 90449 nnz = 3686223 time = 15.2523 sec error = 5.58745e-36 bmw7st_1.mtx 141347 x 141347 nnz = 7318399 time = 24.4653 sec error = 3.54576e-50 % Before bcsstk15.mtx 3948 x 3948 nnz = 117816 time = 3.7215 sec error = 1.05308e-43 bcsstk16.mtx 4884 x 4884 nnz = 290378 time = 4.6282 sec error = 8.0446e-43 bcsstk17.mtx 10974 x 10974 nnz = 428650 time = 3.9506 sec error = 4.20629e-44 bcsstk18.mtx 11948 x 11948 nnz = 149090 time = 3.1897 sec error = 5.16123e-45 s1rmq4m1.mtx 5489 x 5489 nnz = 262411 time = 4.1850 sec error = 2.22199e-38 s2rmq4m1.mtx 5489 x 5489 nnz = 263351 time = 4.8899 sec error = 2.12803e-37 s3rmq4m1.mtx 5489 x 5489 nnz = 262943 time = 4.3369 sec error = 2.16643e-36 s3dkq4m2.mtx 90449 x 90449 nnz = 4427725 time = 395.1866 sec error = 1.49489e-35 s3dkt3m2.mtx 90449 x 90449 nnz = 3686223 time = 207.7296 sec error = 1.02667e-35 bmw7st_1.mtx 141347 x 141347 nnz = 7318399 time = 348.0016 sec error = 3.60662e-50
- Speed boost in all operations involving sparse matrices (unary/binary, multiplication, etc.).
- Updated
MLDIVIDE
,INV
andNORM
for sparse inputs (speed-up). - Fixed bug in
FILTER(a,b,x)
when numerator coefficientsb
is shorter thana
. Thanks to Karthick Manivannan for the bug report.
July 19, 2023 – 5.0.0.15222
- Added new algorithm/code for LU decomposition for sparse matrices. New LU supports all variants, including scaling:
[L,U,P,Q] = lu(S) [L,U,P,Q,D] = lu(S) [___] = lu(S,thresh) [___] = lu(___,outputForm)
More importantly, new LU heavily relies on multi-core parallelism and it is much faster compared to the previous algorithm we used. Expected speed-up is 5-10 times depending on matrix structure and number of CPU cores. As a consequence, all functions based on LU decomposition are faster now. For example, current solver timings (inspired by the benchmark):% Now sherman2.mtx 1080 x 1080 nnz = 23094 time = 0.1051 sec error = 4.01902e-44 e20r0500.mtx 4241 x 4241 nnz = 131556 time = 0.3023 sec error = 3.62548e-35 mark3jac020.mtx 9129 x 9129 nnz = 56175 time = 1.8941 sec error = 5.87223e-42 mark3jac020sc.mtx 9129 x 9129 nnz = 56175 time = 2.0599 sec error = 6.58749e-41 psmigr_2.mtx 3140 x 3140 nnz = 540022 time = 18.6376 sec error = 2.69133e-34 psmigr_3.mtx 3140 x 3140 nnz = 543162 time = 21.3257 sec error = 7.35174e-36 sinc12.mtx 7500 x 7500 nnz = 294986 time = 28.6756 sec error = 3.599e-39 % Before sherman2.mtx 1080 x 1080 nnz = 23094 time = 1.2338 sec error = 1.84168e-43 e20r0500.mtx 4241 x 4241 nnz = 131556 time = 3.9836 sec error = 1.81892e-34 mark3jac020.mtx 9129 x 9129 nnz = 56175 time = 17.0367 sec error = 7.01368e-41 mark3jac020sc.mtx 9129 x 9129 nnz = 56175 time = 18.5480 sec error = 6.0482e-40 psmigr_2.mtx 3140 x 3140 nnz = 540022 time = 113.7182 sec error = 2.87561e-33 psmigr_3.mtx 3140 x 3140 nnz = 543162 time = 116.8226 sec error = 8.0958e-34 sinc12.mtx 7500 x 7500 nnz = 294986 time = 172.6728 sec error = 2.57939e-39
- Added support for complex inputs in
LYAP
. - Updated
DET
andTRACE
functions for sparse inputs. - Fixed convergence criteria of
MRRR
algorithm inEIG
. - Fixed sign of imaginary part of
ASEC/ACSC
for negative arguments. - Fixed bug in
MLDIVIDE
when right-hand side is sparse.
April 13, 2023 – 4.9.3.15018
- Speed-up of FFT with improved parallelism and reduced memory footprint. Effective for all platforms, quadruple and arbitrary precision cases are covered.
>> mp.Digits(34) >> x = randn(67108864,1,'mp'); % Now >> tic; y=fft(x); toc; Elapsed time is 10.556954 seconds. >> tic; z=ifft(y,'symmetric'); toc; Elapsed time is 10.055833 seconds. % Before >> tic; y=fft(x); toc; Elapsed time is 16.559021 seconds. >> tic; z=ifft(y,'symmetric'); toc; Elapsed time is 16.845096 seconds.
February 10, 2023 – 4.9.2.14905
- Speed-up of all (full) matrix operations (including eigensolvers) on many-core CPUs. Effective for all platforms, quadruple and arbitrary precision cases are covered.
December 6, 2022 – 4.9.1.14792
- Added 2-argument variant of floating-point relative accuracy (machine epsilon) function:
EPS('like',p)
. Thanks to Matthew Turner for the request. - Added function
RESIDUEZ
(z-transform partial-fraction expansion). Thanks to Karthick Manivannan for the request. - Improved performance of O(n3) basic matrix operations on many-core CPUs.
August 9, 2022 – 4.9.0.14753
- QR decomposition has been updated with proper support for empty matrices. Thanks to Bor Plestenjak for the report.
- Added function
FILTER
(1-D digital filter). The function is enabled with multi-core parallelism and can process several arrays simultaneously. Thanks to Jon Dattorro for the request.
May 1, 2022 – 4.8.7.14677
- Added support for large arrays, with more than 2^32-1 elements. Requested by Karthick Manivannan.
- Singular value decomposition
SVD(A)
has been updated with proper support for empty matrices. Thanks to Bor Plestenjak for the report.
February 21, 2022 – 4.8.6.14636
- Generalized eigenvalue problem solver
EIG(A,B)
has been updated for the case of complex matrices. Convergence criteria and computation of shifts in QZ decomposition have been completely revised and improved. Thanks to Bor Plestenjak for detailed tests and reports. - Added support for scaled Airy functions. Requested by Jon Vegard Venås.
- Fixed bug in
REALMIN
function. Now it returns “safe” minimum so that1/REALMIN
doesn’t overflow. - Updated multiprecision libraries to the newest versions – MPIR, MPFR and MPC.
September 26, 2021 – 4.8.5.14569
- Extended subscripted assignment with the support of automatic reshaping and dimension expansion for empty left-hand side:Thanks to Manolis Chatzis for the suggestion.
>> B = eye(3,'mp') >> A(1,:,:) = B A(:,:,1) = 1 0 0 A(:,:,2) = 0 1 0 A(:,:,3) = 0 0 1
- Sparse solver (for general matrices) has been updated for better parallelism.
May 17, 2021 – 4.8.4.14478
- Added function
NCHOOSEK
. Thanks to Jon Dattorro for the request. - The
BESSELI
function has been updated for compatibility with x86 emulation on Apple M1 chip. Thanks to Yasutaka Hanada and Larry Stotts for reports and tests. - Small improvements in
CHOL
andINV
. Thanks to Mark Gockenbach for the bug report. - Speed-up of expression parsing and evalutation. Thanks to Jacob for the suggestion. Conversion of 10 million digits of
pi
:>> mp.Digits(10000000); >> x = mp('pi'); >> s = num2str(x); % Now: >> tic; y = mp(s); toc; Elapsed time is 1.447296 seconds. % Before >> tic; y = mp(s); toc; Elapsed time is 30.175458 seconds.
January 20, 2021 – 4.8.3.14440
- Computation of matrix exponential
EXPM
has been revised and greatly improved. Now we use on-the-fly backward error estimation and scaling reduction using Gelfand upper bound for matrix spectral radius.As a result we have ~4 times speed improvement in real and complex cases:>> mp.Digits(100); >> M=100; >> A=rand(M,'mp')*10; >> B=(randn(M,'mp')+1i*randn(M,'mp'))*10; >> tic; expm(A); toc; Elapsed time is 0.260661 seconds. >> tic; expm(B); toc; Elapsed time is 1.045113 seconds.
Before:
>> mp.Digits(100); >> M=100; >> A=rand(M,'mp')*10; >> B=(randn(M,'mp')+1i*randn(M,'mp'))*10; >> tic; B = expm(A); toc; Elapsed time is 0.855695 seconds. >> tic; B = expm(A); toc; Elapsed time is 3.843145 seconds.
- The
NTHROOT
function has been extended with specialized code for quadruple case. - Fixed minor bugs in
EIGS
and inEIG(A,B)
functions. Thanks to Yury Grabovsky for bug report.
December 12, 2020 – 4.8.2.14203
- Added function
LSQNONNEG
to solve nonnegative linear least-squares problem. - Added (new) function for cosine-sine decomposition (
CSD
). Both full (2×2) and thin (2×1) matrix partitions are supported. Overall we follow the LAPACK semantic:*** Full (2 x 2): [C,S,U1,V1,U2,V2] = CSD(X11,X12,X21,X22) [ I 0 0 | 0 0 0 ] Q M-Q [ 0 C 0 | 0 -S 0 ] [ X11 | X12 ] P [ U1 | ] [ 0 0 0 | 0 0 -I ] [ V1 | ]^T X = [-----------] = [---------] [---------------------] [---------] [ X21 | X22 ] M-P [ | U2 ] [ 0 0 0 | I 0 0 ] [ | V2 ] [ 0 S 0 | 0 C 0 ] [ 0 0 I | 0 0 0 ] *** Thin (2 x 1): [C,S,U1,V1,U2] = CSD(X11,X21) [ I 0 0 ] Q [ 0 C 0 ] [ X11 ] P [ U1 | ] [ 0 0 0 ] X = [-----] = [---------] [----------] V1^T [ X21 ] M-P [ | U2 ] [ 0 0 0 ] [ 0 S 0 ] [ 0 0 I ]
The
CSD
has been implemented in C++ with multi-core optimizations for all cases – real, complex, quadruple and arbitrary precisions.>> mp.Digits(34); >> M=500;P=M/2;Q=M/2; >> X=rando(M,'mp'); >> X11 = X(1:P,1:Q); >> X12 = X(1:P,(Q+1):end); >> X21 = X((P+1):end,1:Q); >> X22 = X((P+1):end,(Q+1):end); >> tic; [C1, S1, U1, V1, U2, V2] = csd(X11, X12, X21, X22); toc; Elapsed time is 3.275797 seconds. >> mp.Digits(100); >> M=500;P=M/2;Q=M/2; >> X=rando(M,'mp'); >> X11 = X(1:P,1:Q); >> X12 = X(1:P,(Q+1):end); >> X21 = X((P+1):end,1:Q); >> X22 = X((P+1):end,(Q+1):end); >> tic; [C1, S1, U1, V1, U2, V2] = csd(X11, X12, X21, X22); toc; Elapsed time is 16.315847 seconds.
- Most of the linear algebra functions have been switched to multiprecision fused multiply–add operation with only one rounding:
FMA(A,B,C) = A+B*C
. This might lead to higher accuracy especially in ill-conditioned cases.
September 22, 2020 – 4.8.0.14105
- Fixed bug in functions for for low-level access to quadruple variables –
MP.GETWORDS64/MP.SETWORDS64
. Thanks to Kirill Serkh for bug report.
June 11, 2020 – 4.8.0.14100
Speed has been improved in all O(n3)
matrix computations. All cases are covered – real, complex, quadruple and arbitrary precision. Expected speed-up depends on particular function and on number of hardware cores of the target CPU.
As an example, using 16-core Core i9-7960X we have got following timings:
% 4.7.1.13688 (before) >> mp.Digits(34); >> A = randn(500,'mp'); >> B = randn(500,'mp'); >> tic; [V,D,W] = eig(A); toc; Elapsed time is 41.566000 seconds. >> tic; [V,D,W] = eig(A,B); toc; Elapsed time is 134.004006 seconds. >> tic; [U,S,V] = svd(A); toc; Elapsed time is 10.305934 seconds. >> mp.Digits(100); >> A = randn(500,'mp'); >> B = randn(500,'mp'); >> tic; [V,D,W] = eig(A); toc; Elapsed time is 133.028512 seconds. >> tic; [V,D,W] = eig(A,B); toc; Elapsed time is 495.003402 seconds. >> tic; [U,S,V] = svd(A); toc; Elapsed time is 27.809539 seconds.
% 4.8.0.14100 (now) >> mp.Digits(34); >> A = randn(500,'mp'); >> B = randn(500,'mp'); >> tic; [V,D,W] = eig(A); toc; Elapsed time is 11.165819 seconds. <-- 3.7 times speed-up >> tic; [V,D,W] = eig(A,B); toc; Elapsed time is 22.996163 seconds. <-- 5.8 times speed-up >> tic; [U,S,V] = svd(A); toc; Elapsed time is 2.582599 seconds. <-- 3.9 times speed-up >> mp.Digits(100); >> A = randn(500,'mp'); >> B = randn(500,'mp'); >> tic; [V,D,W] = eig(A); toc; Elapsed time is 34.167268 seconds. <-- 3.9 times speed-up >> tic; [V,D,W] = eig(A,B); toc; Elapsed time is 76.882287 seconds. <-- 6.4 times speed-up >> tic; [U,S,V] = svd(A); toc; Elapsed time is 7.226432 seconds. <-- 3.8 times speed-up
Similar performance improvement can be observed for all other matrix functions: QR, LU, CHOL,
etc.
One of the toolbox users has published comprehensive speed benchmark against Julia, Wolfram Mathematica and several Python libraries (mpmath, flint+arb
) for extended precision computations. The page is in Japanese but plots and tables are easy to understand.
Thank you Yasutaka Hanada for all your efforts and time making the benchmark!
March 28, 2020 – 4.7.1.13688
- The code of Krylov subspace expansion in
EIGS
has been converted to C++ with various optimizations including multi-core parallelism. To achieve the maximum speed, we had to switch from modified to classic Gram–Schmidt with bulk re-orthogonalization. Classic Gramm-Shmidt is based on level-2BLAS
operations (xGEMV
), which allow nice parallel execution.
March 22, 2020 – 4.7.0.13642
- The
EIGS
has been improved for the case when orthogonal Krylov subspace cannot be expanded. We added few extra steps of re-orthogonalization and last-resort attempt of explicit restart with few random vectors. Thanks to Joseph Benzaken and Gaoyong Sun for tests and reports.
January 21, 2020 – 4.7.0.13619
- Fixed crash in FFT/IFFT for the case when one of the dimensions of input argument is empty. Thanks to Dr. Luo Guo for the bug report.
November 6, 2019 – 4.7.0.13560
- Comprehensive update of quadruple precision computations library on Windows with emphasis on speed. Most of the functions (arithmetic, elementary and special functions, etc.) have became faster and more accurate.
- Code of all special functions have been revised (all platforms, including arbitrary precision mode). Now all of them are significantly faster (e.g.
besselK
is ~10 times faster for complex arguments) and more accurate. - Great number of special functions have been added: generalized hypergeometric pFq,
kummerM, kummerU, FresnelS, FresnelC, psi, gammainc, betainc, zeta, hurwitzZeta, erfi, airy, ellipke, sinc
and many others. See Function Reference for full list of special functions provided in toolbox.
August 1, 2019 – 4.6.4.13348
- Functions
ISHERMITIAN, ISBANDED, BANDWIDTH
and similar have been improved. Now all are (much) faster especially for large matrices. - Speed of all functions related to sparse matrices have been increased. This is because we switched to our new engine for computations with sparse matrices.
The most notable speed-up is in sparse LU – decomposition is 2-3 times faster now, solver is up to 10 times depending on matrix size & structure. - New functions
MP.GETWORDS64
andMP.SETWORDS64
for low-level access to quadruple number/array have been added.
Toolbox represents quadruple numbers in 128-bit IEEE-754 format which can be considered as 2 sequentially stored 64-bit unsigned integers (high and low part). FunctionMP.GETWORDS64
allows to extract these integers, whereasMP.SETWORDS64
generates quadruple number/array from integer parts:>> x = mp('pi') x = 3.141592653589793238462643383279503 >> format hex >> [h,l] = mp.getwords64(x) % Get the high & low integer parts of quadruple. h = 4000921fb54442d1 l = 8469898cc51701b8 >> y = mp.setwords64(h,l) % Generate quadruple number from integer parts. y = 3.141592653589793238462643383279503
These functions can be used for cross-platform data exchange or to interface with other quadruple-enabled codes.
- Added function to compute
KOORNWINDER
orthogonal polynomials on triangle.
March 28, 2019 – 4.6.2.13225
- Functions
MAX
andMIN
have been completely re-implemented. Now both are faster and support extended set of options –'omitnan', 'includenan'
and'all'
. Summary on currently supported variants:C = max(A,B) C = max(A,B,nanflag) M = max(A) M = max(A,[],dim) M = max(A,[],nanflag) M = max(A,[],dim,nanflag) [M,I] = max(___) M = max(A,[],'all') M = max(A,[],'all',nanflag)
At this moment, option
'vecdim'
is not yet supported. - Added
CUMMAX
andCUMMIN
functions. - All cumulative functions
CUMPROD, CUMSUM, CUMMAX
andCUMMIN
have been enabled with the support for'omitnan', 'includenan', 'forward'
and'reverse'
options. Thanks to Abe Ellison for request. - Fixed incompatibility with older syntax of
RAND
andRANDN
. Although it is now considered obsolete, some old code is still using it. Thanks to Nick Higham for requesting this feature.
February 25, 2019 – 4.6.0.13135
- Added implicit expansion/broadcasting of dimensions to following operations:
+ - .* ./ .\ .^ < <= > >= == ~= | & xor min max mod rem hypot atan2
This feature can be enabled/disabled by special command
mp.EnableImplicitExpansion
:>> mp.EnableImplicitExpansion(true); >> mp([1 2 3]).*mp([1;2;3]) ans = 1 2 3 2 4 6 3 6 9 >> mp.EnableImplicitExpansion(false); >> mp([1 2 3]).*mp([1;2;3]) Error using .* (line 1677) Matrix dimensions must agree
- Greatly improved load balancing among computational threads. Expected speed-up is up to
25%
in all “heavy” matrix operations (depending on number of CPU cores, CPU load and operation). - Numerous smaller fixes and improvements.
January 17, 2019 – 4.5.3.12859
- Version
4.5.3
is available for all platforms now. Update is strongly recommended.
December 7, 2018 – 4.5.3.12856
- Numerical stability of
EXPM
has been improved (=higher accuracy in working with highly ill-conditioned matrices). - All service routines has been reviewed and improved:
mp.NumberOfThreads, mp.GuardDigits, mp.ExtendConstAccuracy, mp.FollowMatlabNumericFormat
andmp.OverrideDoubleBasicArrays
. - Added support for multidimensional slice assignments with same number of elements but different shapes:
a = zeros(10,10,10,'mp'); b = ones(10,10,10,'mp'); a(:,:,1) = b(:,1,:);
Thanks to Taras Plakhotnik for requesting this feature.
November 1, 2018 – 4.5.2.12841
- Version
4.5.2
is available for all platforms now. Update is strongly recommended.
October 23, 2018 – 4.5.2.12840
- Stability and convergence of divide & conquer algorithm in
SVD
has been improved. Thanks to Tao Meng from Peking University for reporting the issue.
September 21, 2018 – 4.5.1.12833
- Major update of toolbox including new optimized code for operations in small to medium precision range.
Now speed difference between true quadruple(34) and other precision of comparable level is reduced.Timings of
EIG(A)
, with random unsymmetric matrix of 200×200 size:% 4.4.7.12740 (before) 20 digits, time = 6.01 sec 25 digits, time = 6.26 sec 30 digits, time = 6.82 sec 34 digits, time = 3.39 sec 35 digits, time = 7.07 sec 40 digits, time = 9.35 sec 45 digits, time = 9.33 sec 50 digits, time = 9.74 sec 55 digits, time = 10.3 sec
% 4.5.1.12833 (now) 20 digits, time = 4.51 sec 25 digits, time = 4.77 sec 30 digits, time = 5.06 sec 34 digits, time = 3.20 sec 35 digits, time = 5.24 sec 40 digits, time = 7.03 sec 45 digits, time = 7.01 sec 50 digits, time = 7.27 sec 55 digits, time = 7.81 sec
The code to reproduce the timings:
rng(0); for p=[20:5:30 34 35:5:55] mp.Digits(p); A = randn(200,'mp'); s = tic; [V,D] = eig(A); e = toc(s); fprintf('%d digits, time = %.3g sec\trel.error = %g\n', p, e, norm(A*V - V*D,1)/norm(A,1)); clear A V D e; end
Similar performance gain can be seen in all operations in arbitrary precision mode.
July 25, 2018 – 4.4.7.12740
- The
CIRCSHIFT
has been updated to support 3rd argument. This syntax was introduced in recent MATLAB versions. Thanks to Patrick Dorey for requesting this update.
April 18, 2018 – 4.4.7.12739
- Now
SORT
treatsNaN
as the largest value in array. This is needed for one-to-one compatibility withMATLAB
. Thanks to Massimiliano Fasi for reporting this issue.
March 27, 2018 – 4.4.7.12736
- Fixed incompatibility with
R2018a
on all platforms. Our deepest thanks to Andre Van Moer, Michal Kvasnicka, Enrico Onofri and Siegfried Rump for their help with tests on various systems and MATLAB versions. - Improved
QZ
decomposition to avoid premature exit in case of extremely high precision requested. Thanks to Bor Plestenjak. - Switched to new TLS model on GNU Linux. This might help for some GNU Linux installations where famous TLS-issue still shows up. Thanks to Denis Tkachenko for detailed tests.
March 15, 2018 – 4.4.6.12719
- Added ERFINV, ERFCINV and NORMINV routines in quadruple precision.
- Improved
ORDQZ
to avoid false positive alarm on numerical instability. Thanks to Denis Tkachenko.
December 25, 2017 – 4.4.5.12711
- Fast Fourier Transform (FFT) routines have been upated with various speed optimizations, including multi-core parallelism, extended set of small FFT with minimum number of arithmetic operations, etc.Now toolbox uses split-radix for power-of-two, mixed-radix Stockham framework for composite and Bluestein algorithm for prime lengths FFT. All algorithms have been optimized for parallel execution and quadruple/multi-precision modes.
Before:>> mp.Digits(34); >> x = rand(2^23,1,'mp'); % n = 8M/quadruple >> tic; y = fft(x); toc; Elapsed time is 16.759009 seconds. >> tic; z = ifft(y); toc; Elapsed time is 29.263380 seconds. >> mp.Digits(100); >> x = rand(2^20,1,'mp'); % n = 1M/100-digits >> tic; y = fft(x); toc; Elapsed time is 14.243092 seconds. >> tic; z = ifft(y); toc; Elapsed time is 21.368476 seconds.
Now:
>> mp.Digits(34); >> x = rand(2^23,1,'mp'); % n = 8M/quadruple >> tic; y = fft(x); toc; % ~5 times faster Elapsed time is 3.068695 seconds. >> tic; z = ifft(y); toc; % ~6 times faster Elapsed time is 4.953689 seconds. >> mp.Digits(100); >> x = rand(2^20,1,'mp'); % n = 1M/100-digits >> tic; y = fft(x); toc; % ~8 times faster Elapsed time is 1.753420 seconds. >> tic; z = ifft(y); toc; % ~9 times faster Elapsed time is 2.293286 seconds.
- The
COSINT
andSININT
have been extended for negative arguments. Thanks to Tomoaki Okayama and Ryota Hara. - Fixed
AIRY
for pure imaginary arguments. Thanks to Tom Wallace for detailed bug report. - Fixed bug in
SUBSAGN
reported by Jon Vegard.
October 24, 2017 – 4.4.4.12666
- Improved storage format for cross-platform compatibility. Multiprecision variables stored in file using standard
SAVE/LOAD
commands can now be transferred to any platform without loss in accuracy (e.g. from GNU Linux to Windows). Please note, quadruple precision data were always cross-platform. - Fixed numerical instability in
SYLVESTER_TRI
. Thanks to Massimiliano Fasi. - Fixed accuracy loss in
ELLIPKE
andELLIPJ
. Thanks to Denis Silantyev. - Fixed spurious complex output for extreme arguments in
BESSELJ/Y
. Thanks to Taras Plakhotnik. - Fixed premature overflow/underflow in
BESSELI
for corner case arguments.
October 3, 2017 – 4.4.3.12624
- Added extended set of routines to compute orthogonal polynomials:
chebyshevV
Chebyshev polynomials of the third kind chebyshevW
Chebyshev polynomials of the fourth kind laguerreL
Generalized Laguerre polynomials gegenbauerC
Gegenbauer (ultraspherical) polynomials jacobiP
Jacobi polynomials zernikeR
Radial Zernike polynomials All routines are optimized for parallel execution, quadruple and multi-precision modes, e.g.:
>> x = randn(500); >> tic; laguerreL(10,vpa(x)); toc; Elapsed time is 308.504995 seconds. >> tic; laguerreL(10,mp(x)); toc; % ~ 2500 times faster Elapsed time is 0.120173 seconds.
- Improved stability of divide & conquer algorithm for SVD. Thanks to Bartosz Protas for detailed bug report.
- Various small changes for better compatibility with plotting engine of MATLAB.
August 22, 2017 – 4.4.1.12580
- Parallelism. Modern processors usually run two logical threads per one physical CPU core (hyper-threading). This improves performance in situations when user runs several independent tasks (e.g. usual desktop applications). However this leads to sub-optimal results in case of heavy-load scientific computations, when multi-threaded algorithms are specifically designed to be well-balanced and exhaust all available resources of each core.To alleviate the issue, now toolbox binds its threads one-to-one to physical cores on CPU, ignoring hyper-threaded (logical) cores. You might see performance boost up to 15% depending on algorithm and your CPU:
>> mp.Digits(100); >> A = randn(500,'mp'); % 4.3.6 >> tic; [S,V,D] = svd(A); toc; Elapsed time is 33.177931 seconds. % 4.4.1 >> tic; [S,V,D] = svd(A); toc; Elapsed time is 28.520263 seconds.
This is experimental feature, implemented only in Windows version of toolbox. Other platforms will follow.
- Bessel functions. Whole family of Bessel and Hankel functions have been revised. Majority of improvements are related to the case of non-integer orders, but other cases have been polished too:
- Fixed accuracy loss for half-integer orders when |v| >> |z|. Thanks to Mert Kalfa for detailed tests and reports.
- Fixed accuracy loss in modified Bessel functions in case of pure imaginary arguments and large orders. Now we apply several different asymptotic approximations depending on parameters (one form is specifically tuned for imaginary arguments).
- Added fast implementation for Hankel functions for real arguments and integer orders.
- Added fast quadruple implementation for modified Bessel functions (real arguments and orders).
Besides, now all Bessel functions rely on parallel execution, for all kinds of arguments.
- Miscellaneous.
- Now toolbox includes pre-computed coefficients of Gauss-type quadrature for up to 512 order and up to 250 digits of precision. Which means retrieving quadrature nodes & weights now costs
O(1)
! All usual quadrature are covered – Legendre, Hermite, Gegenbauer, Jacobi and Laguerre.% 4.3.6 >> mp.Digits(34); >> tic; [x,w] = mp.GaussJacobi(256); toc; Elapsed time is 3.265601 seconds. >> mp.Digits(100); >> tic; [x,w] = mp.GaussJacobi(256); toc; Elapsed time is 14.518643 seconds. % 4.4.1 >> mp.Digits(34); >> tic; [x,w] = mp.GaussJacobi(256); toc; Elapsed time is 0.000994 seconds. >> mp.Digits(100); >> tic; [x,w] = mp.GaussJacobi(256); toc; Elapsed time is 0.002819 seconds.
This improvement was inspired by communication with Massimiliano Fasi (Padé approximation for
LOGM
) and Enrico Onofri. We hope this will be helpful for various spectral methods and approximations. - New platform-independent storage format for multiprecision arrays of non-quadruple precision. Multiprecision variables stored in file using standard
SAVE/LOAD
commands can now be transferred to any platform without loss in accuracy (e.g. from GNU Linux to Windows). Please note, quadruple precision data were always cross-platform. - Fixed critical bug in computing Bernoulli coefficients (widely used for evaluation of some special functions). The bug caused race conditions in multi-threaded computations with possible MATLAB crash.
- Now toolbox includes pre-computed coefficients of Gauss-type quadrature for up to 512 order and up to 250 digits of precision. Which means retrieving quadrature nodes & weights now costs
- Parallelism. Modern processors usually run two logical threads per one physical CPU core (hyper-threading). This improves performance in situations when user runs several independent tasks (e.g. usual desktop applications). However this leads to sub-optimal results in case of heavy-load scientific computations, when multi-threaded algorithms are specifically designed to be well-balanced and exhaust all available resources of each core.To alleviate the issue, now toolbox binds its threads one-to-one to physical cores on CPU, ignoring hyper-threaded (logical) cores. You might see performance boost up to 15% depending on algorithm and your CPU:
The large number of changes allows us to bump the version directly to 4.4.1 bypassing smaller revisions.
June 6, 2017 – 4.3.6.12354
- Improved accuracy of matrix logarithm –
LOGM
. Now it has better handling of near-singular matrices and refined Pade approximation. Thanks to Massimiliano Fasi who reported the issues and provided test cases. - The
SORT
function in toolbox has been enabled with undocumented features of MATLAB’s built-inSORT
. Thanks to Daniele Prada from “Istituto di Matematica Applicata e Tecnologie Informatiche”, in Pavia, Italy.
May 12, 2017 – 4.3.5.12344
- Speed-up of
EIGS
routine. The heaviest part (modified Gram-Schmidt) has been moved to C++. Performance improvement is 4-8 times depending on a problem. - Various bug fixes in
EIGS
,ORSCHUR
and in mixed-precision computations (especially with sparse matrices).
March 28, 2017 – 4.3.3.12232
- Added
mp.NumberOfThreads(N)
function to control how many threads toolbox can use in computations enabled with multi-core parallelism. Be default, toolbox uses all available CPU cores, which might not be optimal for all cases. Now user can adjust this flexibly. Requested by Clay Thompson.>> A = rand(1000,'mp'); >> mp.NumberOfThreads(1); >> tic; exp(A); toc; Elapsed time is 1.297683 seconds. >> mp.NumberOfThreads(4); >> tic; exp(A); toc; Elapsed time is 0.425246 seconds.
- Added support for second input argument in
ROUND
function. Requested by Michael Klanner. - Added support for second output argument in
LINSOLVE
function. Requested by Zhaolong Hu. - Added optimization to
BESSELK
for half-integer orders. - Fixed “freeze” bug in
BESSELJN
for high-orders and real quadruple arguments, only Windows version was affected. Reported by Maxime Dana. - Improved integration with warning system in MATLAB. In some cases toolbox showed warnings ignoring the fact that they are disabled in MATLAB. This is ongoing work.
January 25, 2017 – 4.3.3.12177
- Added
ACCUMARRAY, CELL2MAT
andMP.CELLFUN
routines. - Fixed bug related to empty arrays in
BSXFUN
, reported by Vladimír Šedenka. - Added special functions-overloads to retrieve machine epsilon, smallest/largest floating numbers for a given precision:
% Before >> mp.eps >> mp.eps(mp(1)) >> mp.realmin >> mp.realmax % Now / 4.3.3.12177 >> eps('mp') >> eps(mp(1)) >> realmin('mp') >> realmax('mp')
We have replaced
MP.EPS, MP.REALMIN
andMP.REALMAX
with functions without'MP.'
prefix. This is important change needed to write precision independent code.
January 4, 2017 – 4.3.2.12144
- Added
EIGS
routine for computing subset of eigenvalues and eigenvectors. All features are supported: standard and generalized eigenproblems, matrix and function handle inputs and various parts of the spectrum –'SM', 'LM', 'SA', 'LA', 'SI', 'LI', 'SR'
and'LR'
. - Added
FFTN/IFFTN
– routines for multi-dimensional Fourier transformation. - All Fourier-related routines have been updated with specific optimizations for quadruple precision. Expected speed-up is
2-4
times. - Fixed several bugs related to empty arrays in assignment and relational operators, reported by Clay Thompson.
December 6, 2016 – 4.3.0.12070
- Load balancing in multi-threading pool has been improved (especially on Windows).
ODE15S
has been updated to include support forJPattern
option, requested by Nicola Courtier.- Fixed crash in case when inputs to
ORDSCHUR
andORDQZ
have different complexity.
November 8, 2016 – 4.3.0.12050
- All elementary and special mathematical functions (
exp, sqrt, sin, cos, gamma, bessel,
etc. ) have been updated with parallel execution on multi-core CPUs. Speed-up is proportional to number of CPU cores.
For example, timing reduction for Bessel Y0 on Core i7 990x (6 hardware cores):% Before >> A = rand(2000,'mp'); >> tic; bessely(0,A); toc; Elapsed time is 4.278553 seconds. % Now / 4.3.0 >> A = rand(2000,'mp'); >> tic; bessely(0,A); toc; Elapsed time is 0.678716 seconds.
- Bug fixes: memory leak in complex arithmetic when used in multi-threading environments. Quadruple precision mode wasn’t affected by the bug.
November 2, 2016 – 4.2.3.12019
- The right matrix division (
MRDIVIDE
) has been updated to match the speed of the left matrix division (MLDIVIDE
). Reported by Massimiliano Fasi.
October 26, 2016 – 4.2.3.12016
- The eigensolver
EIG(A[,B])
has been updated and optimized further. One of the important changes is that now we have special ultra-fast solver for (k,p)-diagonal matrices in case of generalized eigenproblem. In some cases our quadruple precision solver is faster than built-in double precision solver in MATLAB! Please see the bottom of the article for example – Architecture of eigenproblem solver. - Least square linear solver has been speeded up by at least
2
times (2000x1500
). Actual speed-up grows with matrix size and number of CPU cores. Please note that now we use Rank-Revealing QR (RRQR
) instead ofSVD
. - Bug fixes: memory leaks in generalized eigensolver (complex inputs), SVD for scalar inputs (reported by Denis Tkachenko) and MIN/MAX now ignore NaN values for sure (reported by Gerard Ketefian).
October 5, 2016 – 4.2.2.11893
- The linear system solver
MLDIVIDE
has been updated and optimized further. Now every decomposition in the solver (LU, LDLT and Cholesky) run faster by10-25%
. The final results and algorithm are outlined in our recent article Architecture of linear systems solver. - Computation of eigenvalues of generalized symmetric eigenproblem has been speeded up by
2
times. This is the result of refined parallel algorithm for our D&C solver. - Also we have added MRRR solver for generalized symmetric eigenproblem
[V,D] = eig(A,B)
. It is slightly faster than D&C for computing eigenvectors, but most importantly – it provides better accuracy in some cases. - We have added special algorithm to solve n-diagonal/banded standard eigenproblems –
eig(A)
. We are preparing article with detailed outline of our poly-algorithm for standard eigenproblems (similar toMLDIVIDE
).
September 11, 2016 – 4.2.0.11601
MLDIVIDE
(linear system solver) has been updated with specialized solver for banded matrices.
Timings have been substantially improved for such matrices, e.g.5Kx5K
pentadiagonal matrix:% Before: >> n = 5000; >> A = mp(diag(1:n,0)+diag(1:n-1,1)+diag(1:n-2,2)+diag(-(1:n-1),-1)+diag(-(1:n-2),-2)); >> b = mp(rand(n,1)); >> tic; x = A\b; toc; % LU+pivoting, O(n^3) Elapsed time is 257.355805 seconds. >> norm(A*x-b,1)/norm(A,1) ans = 2.981084158350920545602126981830846e-35
% Now: >> n = 5000; >> A = mp(diag(1:n,0)+diag(1:n-1,1)+diag(1:n-2,2)+diag(-(1:n-1),-1)+diag(-(1:n-2),-2)); >> b = mp(rand(n,1)); >> tic; x = A\b; toc; % Specialized LU+pivoting within bandwidth, O(n*p*q) Elapsed time is 0.233104 seconds. % ~x1000 times faster >> norm(A*x-b,1)/norm(A,1) ans = 2.981084158350920545602126981830846e-35
The speed-up comes from the fact that now algorithms works with only few non-zero diagonals, instead of crunching full matrix.
- Positive definite banded matrices have received specialized solver as well (only superdiagonals are used = less computations).
MLDIVIDE
is a poly-algorithm which selects the best solver depending on matrix properties, basically we have rewritten it from scratch lately. Now it has (much)faster analysis and full set of specialized solvers + rcond estimators for dense matrices.
August 27, 2016 – 4.1.0.11461
MLDIVIDE
(linear system solver) has been updated with specialized solver for tridiagonal matrices.
Now computation complexity for the case is justO(n)
instead ofO(n^3)
:% Before: >> n = 2000; >> A = mp(diag(1:n,0)+diag(1:n-1,1)+diag(1:n-1,-1)); >> b = mp(rand(n,1)); >> tic; x = A\b; toc; % LU+pivoting, O(n^3) Elapsed time is 27.767122 seconds. >> norm(A*x-b,1)/norm(A,1) ans = 2.817286843369241138991794285028987e-34
% Now: >> n = 2000; >> A = mp(diag(1:n,0)+diag(1:n-1,1)+diag(1:n-1,-1)); >> b = mp(rand(n,1)); >> tic; x = A\b; toc; % Specialized tridiagonal LU+pivoting, O(n) Elapsed time is 0.079523 seconds. % ~350 times faster >> norm(A*x-b,1)/norm(A,1) ans = 2.769447752126383577306583358607416e-34
SUBSASGN
andDIAG
have been improved to be compatible with MATLAB in special cases:% Column to row assignment: >> A = magic(3,'mp'); >> x = [1;2;3]; >> A(1,:) = x A = 1 2 3 3 5 7 4 9 2 % Building matrix from empty diagonal: >> diag(rand(1,0,'mp'),0) ans = [] >> diag(rand(1,0,'mp'),1) ans = 0 >> diag(rand(1,0,'mp'),2) ans = 0 0 0 0
- Added functions for orthogonal polynomials:
legendreP, chebyshevT, chebyshevU
andhermiteH
.Variable Precision Arithmetic (VPA)/MATLAB 2016a:>> N = 1000; % polynomial degree >> M = 5000; % number of points >> digits(25); >> x = vpa(rand(M,1)); >> tic; v = chebyshevT(N,x); toc; Elapsed time is 2.600266 seconds. >> tic; v = chebyshevU(N,x); toc; Elapsed time is 1.908803 seconds. >> tic; v = hermiteH(N,x); toc; Elapsed time is 34.765995 seconds. >> tic; v = legendreP(N,x); toc; Elapsed time is 44.810674 seconds.
Advanpix Multiprecision Toolbox:
>> N = 1000; % polynomial degree >> M = 5000; % number of points >> mp.Digits(34); >> x = mp(rand(M,1)); >> tic; v = chebyshevT(N,x); toc; Elapsed time is 0.426981 seconds. % x6 times faster >> tic; v = chebyshevU(N,x); toc; % x4 times faster Elapsed time is 0.450537 seconds. >> tic; v = hermiteH(N,x); toc; % x53 times faster Elapsed time is 0.653591 seconds. >> tic; v = legendreP(N,x); toc; % x43 times faster Elapsed time is 1.035238 seconds.
Please note, our routines are not multi-core optimized (yet). In due course timings will be divided by number of CPU cores.
August 24, 2016 – 4.1.0.11420
- Added following new functions (by categories):Linear Equations
SYLVESTER, LYAP, DLYAP
– solvers for linear matrix equations of Lyapunov and Sylvester.
Continuous, discrete-time and generalized forms are covered.Matrix Decomposition
CHOLUPDATE, QRDELETE, QRINSERT, QRUPDATE
andPLANEROT
.
AlsoGSVD
for generalized singular value decomposition.Matrix Analysis
ISBANDED, BANDWIDTH, ISSCHUR, hasInfNaN
andSUBSPACE
.All new functions (with few exceptions) are implemented in toolbox core using C++ for better performance.
- Following functions have been revised and improved:
CROSS
andDOT
now support multi-dimensional arrays.QR, CHOL, SVD
andKRON
now have better handling of corner cases (e.g. empty inputs, error reports, etc.) - Added support for differential algebraic equations (DAEs) to
ODE15S
.
The number of new functions and changes allow us to bump the version directly to 4.1
bypassing smaller revisions.
August 11, 2016 – 4.0.1.11324
- Improvements to linear least-squares solver. Instead of SVD, now we use rank-revealing QR decomposition (RRQR), heavily optimized for parallel execution on multi-core CPUs:
% Before >> mp.Digits(34); >> A = rand(1000,500,'mp')-0.5; >> tic; x = A\A; toc; Elapsed time is 45.334321 seconds. % Now - 4.0.1.11324 >> mp.Digits(34); >> A = rand(1000,500,'mp')-0.5; >> tic; x = A\A; toc; Elapsed time is 12.059790 seconds.
All cases are covered – real, complex, quadruple and arbitrary precision. Speed-up ratio scales with matrix size and number of CPU cores.
July 6, 2016 – 4.0.0.11272
- New functions –
NEXTABOVE
andNEXTBELOW
have been added. They generate the next representable floating-point value towards positive/negative infinity:>> mp.Digits(34); >> nextabove(mp(1)) ans = 1.00000000000000000000000000000000019 >> nextbelow(mp(1)) ans = 0.999999999999999999999999999999999904 >> nextabove(mp(-1)) ans = -0.999999999999999999999999999999999904 >> nextbelow(mp(-1)) ans = -1.00000000000000000000000000000000019
The routines are able to generate all/any floating-point numbers representable in given precision, thus they are indispensable for accuracy checks of various algorithms. In particular, to investigate quality of approximation in close vicinity of roots or singularities. As an example, here is quick accuracy check of
MATLAB's
built-ingamma
function:>> mp.Digits(34); >> mp.FollowMatlabNumericFormat(true); >> format longE; >> x = nextabove(-1) % closest value to the pole x = -9.999999999999999e-01 >> gamma(mp(x)) % correct function value in quadruple precision ans = -9.00719925474099242278433509846728884e+15 >> gamma(x) % MATLAB gives no correct digits at all! ans = -5.545090608933970e+15
- Occasional crashes on Mac OSX have been fixed. As it turned out, sometimes
MATLAB
(especially older versions) forget to delete MEX module from memory on “clear” command, even though at-exit handlers were called. This leaves MEX in uninitialized and unusable state! Next attempt to use any command from such MEX results in crash.Now we do the unloading procedure manually on all platforms to avoid this from happening again. - Prediction on maximum number of iterations needed to reach target accuracy level in Schur & SVD algorithms have been revisited. Now we rely on more pessimistic assumption to make sure algorithms converge even if it requires much higher number of iterations.
June 21, 2016 – 3.9.9.11199
- Matrix exponential –
EXPM
has been switched to use classic scaling and squaring algorithm. Although Schur-Parlett[Higham2003] might give more accurate results in some cases [Higham2009], it is approx. 5-10 times slower. Thus we decided to use fast scaling & squaring. - Eigen-decomposition re-ordering functions:
ORDSCHUR, ORDEIG
andORDQZ
have been updated with proper error handling in corner cases.
June 3, 2016 – 3.9.9.11161
- Whole set of numerical integration routines has been refreshed:
INTEGRAL, INTEGRAL2, INTEGRAL3, QUAD2D, TRAPZ
andCUMTRAPZ
. TheINTEGRALx
routines are based on nested Gauss-Kronrod quadrature rule or its product for2-3D
.For some reason,MATLAB's
built-inINTEGRAL
doesn’t return the error bound. We consider this unacceptable and our version fixes the flaw:>>mp.Digits(34); >>f = @(x)sin((1:5)*x); >>[q, errbnd] = integral(f,mp(0),mp(1),'ArrayValued',true,'RelTol',mp('eps*100')); >>[q' errbnd'] ans = 0.4596976941318602825990633925570232 2.422586612556771991014735044455665e-34 0.7080734182735711934987841147503806 3.709251321227161864242960518419781e-34 0.6633308322001484857571909315770862 3.510290962167396105950983616082283e-34 0.4134109052159029786597920457744374 2.247449605195361383728053642397239e-34 0.1432675629073547471066721656972884 9.201635067043439859020812453968865e-35
Integration of vector-valued function with error bound for every component.
May 19, 2016 – 3.9.9.11048
- Added
X = SYLVESTER_TRI(A,B,C)
function for solving Sylvester equationA*X + X*B = C
.
The most important case is whenA
andB
are upper quasi-triangular (real Schur canonical form) – needed for computing matrix functions (SQRTM, LOGM,
etc.). - Speed-up of
HYPOT
andATAN2
, approx. 10-50 times each:% 3.9.8.11023 >> A = 100*mp(rand(1000,1000)-0.5); >> B = 100*mp(rand(1000,1000)-0.5); >> tic; atan2(A,B); toc; Elapsed time is 18.056319 seconds. % 3.9.9.11048 >> A = 100*mp(rand(1000,1000)-0.5); >> B = 100*mp(rand(1000,1000)-0.5); >> tic; atan2(A,B); toc; Elapsed time is 0.262123 seconds. % ~70 times faster
May 17, 2016 – 3.9.8.11023
- Speed-up of
SQRTM_TRI
andINTERP1
, approx. 50-100 times each:>> A = triu(100*mp(rand(100,100)-0.5)); >> tic; sqrtm_tri(A); toc; Elapsed time is 0.058902 seconds. >> tic; old_sqrtm_tri(A); toc; Elapsed time is 3.936160 seconds.
New
SQRTM_TRI
is implemented in C++, old one was implemented using MATLAB:function R = old_sqrtm_tri(T) %SQRTM_TRI Square root of upper triangular matrix. n = length(T); R = mp(zeros(n)); for j=1:n R(j,j) = sqrt(T(j,j)); for i=j-1:-1:1 R(i,j) = (T(i,j) - R(i,i+1:j-1)*R(i+1:j-1,j))/(R(i,i) + R(j,j)); end end end
Square root of upper triangular matrix is important for
SQRTM
andLOGM
functions.
May 3, 2016 – 3.9.8.10986
- Speed-up of some (basic) special functions:
gamma, gammaln, erf
anderfc
:% Before/3.9.8.10946 >> mp.Digits(34); >> A = mp(10*(rand(1000)-0.5)); >> tic; B = gamma(A); toc; Elapsed time is 73.195803 seconds. >> tic; B = erf(A); toc; Elapsed time is 31.863047 seconds.
% Now/3.9.8.10986 >> mp.Digits(34); >> A = mp(10*(rand(1000)-0.5)); >> tic; B = gamma(A); toc; Elapsed time is 1.695979 seconds. % x43 times faster >> tic; B = erf(A); toc; Elapsed time is 1.769617 seconds. % x18 times faster
Please note, in contrast to MATLAB, these functions support all kinds of input arguments (real negative, complex, sparse, etc.):
% MATLAB >> gammaln(-0.25) 'Error using gammaln' 'Input must be nonnegative.' % Advanpix >> gammaln(mp(-0.25)) ans = 1.589575312551185990315897214778783 + 3.141592653589793238462643383279503i
>> A = mp(sprand(5,5,0.25)); >> [x,y,s] = find(A); >> s = s + (rand(size(s))-0.5)*1i; % Add imaginary part >> A = sparse(x,y,s,5,5) A = (2,1) 0.1066527701805843886262437081313692 + 0.3173032206534329713321085364441387i (4,1) 0.004634224134067443934270613681292161 + 0.3686947053635096782642222024151124i (3,3) 0.9618980808550536831802446613437496 - 0.4155641544890896765807042356755119i (1,5) 0.4426782697754463313799533352721483 - 0.1002173509011035079652174317743629i (4,5) 0.774910464711502378065688390051946 - 0.2401295971493457859224918138352223i % Advanpix >> erf(A) ans = (2,1) 0.1324883666365941115619861448158018 + 0.3659492942681403578764987541731807i (4,1) 0.005990516950461289561597280148562875 + 0.4356625058347425700172815442490943i (3,3) 0.903034046175742903946764521185423 - 0.1758911959897686907097927081692329i (1,5) 0.472854450257120712106056672396968 - 0.09314858177203069504581523930686257i (4,5) 0.7550126399539313384996685128575232 - 0.1480116547266936627042332666803371i % MATLAB >> erf(double(A)) 'Error using erf' 'Input must be real and full.'
April 26, 2016 – 3.9.8.10946
- We have finished several months of optimization work for elementary functions.
Please see the results and comparisons: Performance of Elementary Functions. - Added support for signed zeros as imaginary part of complex numbers to expression evaluator. As a note, we have been supporting signed zeros from the start, with proper handling of branch cuts, etc. Some additional information: Branch Cuts and Signed Zeros in MATLAB.
April 12, 2016 – 3.9.7.10723
- Added element-wise relational & logical operations for sparse matrices:
>> A = mp(sprand(5,5,0.2)) A = (1,1) 0.4387443596563982417535498825600371 (2,2) 0.7951999011370631809114684074302204 (1,4) 0.3815584570930083962991830048849806 (1,5) 0.7655167881490023695789659541333094 (4,5) 0.1868726045543785962976812697888818 >> A(A<0.5) = 0 A = (2,2) 0.7951999011370631809114684074302204 (1,5) 0.7655167881490023695789659541333094
April 6, 2016 – 3.9.7.10708
- Performance of power and related functions has been improved:
% Before: >> mp.Digits(34); >> A = mp(rand(2000)-0.5); >> B = mp(rand(2000)-0.5); >> tic; C = A.^B; toc; Elapsed time is 82.506463 seconds. >> tic; C = log(A); toc; Elapsed time is 31.506463 seconds.
% Now: >> mp.Digits(34); >> A = mp(rand(2000)-0.5); >> B = mp(rand(2000)-0.5); >> tic; C = A.^B; toc; Elapsed time is 5.363670 seconds. % x15 times >> tic; C = log(A); toc; Elapsed time is 0.843535 seconds. % x37 times
Similarly for all related functions (
log2, log10, sqrt,
etc.) - Added support for sparse logical indices.
- Improved support for sparse matrices in
isequal, isnequal, isnan, isinf, isfinite, sqrt, expm1, log1p
and many other basic functions. - Fixed
spfun
in case when function handle has nested calls to another functions.
March 23, 2016 – 3.9.5.10580
- Added
sortrows
, few minor issues has been fixed. - Fixed small incompatibilities with
R2016a
.
March 15, 2016 – 3.9.5.10558
- Added solver for systems of nonlinear equations –
fsolve
:
x = fsolve(fun,x0)
x = fsolve(fun,x0,options)
x = fsolve(problem)
[x,fval] = fsolve(fun,x0)
[x,fval,exitflag] = fsolve(...)
[x,fval,exitflag,output] = fsolve(...)
[x,fval,exitflag,output,jacobian] = fsolve(...)
All features are supported, including sparse formulation, Jacobian and three algorithms of optimization:trust-region-dogleg, trust-region-reflective
andlevenberg-marquardt
.function test_fsolve x0 = mp([-5; -5]); options = optimset('Display','iter', 'TolFun', mp.eps,... 'TolX', mp.eps,... 'Algorithm','levenberg-marquardt'); [x,fval,exitflag,output] = fsolve(@myfun,x0,options) function F = myfun(x) F = [2*x(1) - x(2) - exp(-x(1)); -x(1) + 2*x(2) - exp(-x(2))]; end end
Output:
>> mp.Digits(34); >> test_fsolve First-Order Norm of Iteration Func-count Residual optimality Lambda step 0 3 47071.2 2.29e+04 0.01 1 6 6527.47 3.09e+03 0.001 1.45207 2 9 918.374 418 0.0001 1.49186 3 12 127.741 57.3 1e-05 1.55326 4 15 14.9153 8.26 1e-06 1.57591 5 18 0.779056 1.14 1e-07 1.27662 6 21 0.00372457 0.0683 1e-08 0.484659 7 24 9.2164e-08 0.000336 1e-09 0.0385554 8 27 5.66104e-17 8.34e-09 1e-10 0.000193709 9 30 2.35895e-34 1.7e-17 1e-11 4.80109e-09 10 33 1.70984e-51 4.58e-26 1e-12 9.80056e-18 11 36 1.8546e-68 1.51e-34 1e-13 2.63857e-26 Equation solved. fsolve completed because the vector of function values is near zero as measured by the selected value of the function tolerance, and the problem appears regular as measured by the gradient. x = 0.56714 0.56714 fval = 9.6296e-35 9.6296e-35 exitflag = 1 output = iterations: 11 funcCount: 36 stepsize: [1x1 mp] cgiterations: [] firstorderopt: [1x1 mp] algorithm: 'levenberg-marquardt' message: 'Equation solved. Equation solved. The sum of squared function values, r = 1.854603e-68, is less than sqrt(options.TolFun) = 1.387779e-17. The relative norm of the gradient of r, 6.583665e-39, is less than 1e-4*options.TolFun = 1.925930e-38. Optimization Metric Options relative norm(grad r) = 6.58e-39 1e-4*TolFun = 2e-38 (selected) r = 1.85e-68 sqrt(TolFun) = 1.4e-17 (selected)
- Fixes for various minor bugs and other small improvements.
March 8, 2016 – 3.9.4.10498
- Added light wrappers for
text
andhistogram
routines. Now both accept mp-parameters without errors. - We have upgraded to new Intel C++ compiler, MPIR, MPFR and MPC. Any compatibility tests and reports would be highly appreciated.
March 3, 2016 – 3.9.4.10481
- Basic test matrices are enabled with
classname='mp'
support:hadamard, hilb, invhilb, magic, pascal, rosser
andwilkinson
.function [c,r] = inverthilb(n, classname) % INVERTHILB Class-independent inversion of Hilbert matrix A = hilb(n,classname); c = cond(A); r = norm(eye(n,classname)-A*invhilb(n,classname),1)/norm(A,1); end
Invert Hilbert matrix using different levels of precision:
>> [c,r] = inverthilb(20,'double') c = 1.5316e+18 r = 2.1436e+11 >> mp.Digits(50); >> [c,r] = inverthilb(20,'mp') c = 2.4522e+28 r = 9.4122e-25 >> mp.Digits(100); >> [c,r] = inverthilb(20,'mp') c = 2.4522e+28 r = 2.2689e-74
- Added
cast
function for conversion of mp-entities to/from different data types.>> cast(magic(3),'like',mp('pi')) ans = 8 1 6 3 5 7 4 9 2 >> format longE >> cast(rand(2,'double'),'mp') ans = 6.9907672265668596711662985399016179e-01 9.5929142520544430361439935950329527e-01 8.9090325253579849551499592053005472e-01 5.4721552996380307121171426842920482e-01 >> cast(rand(2,'mp'),'double') ans = 1.386244428286791e-01 2.575082541237365e-01 1.492940055590575e-01 8.407172559836625e-01
- Added solvers for Riccati equations:
care, gcare, dare
andgdare
.
February 29, 2016 – 3.9.4.10458
- Added
cplxpair
andunwrap
functions. - Performance of multiplication has been improved for the precision levels <= 385 of decimal digits.
>> mp.Digits(350); >> A = mp(rand(500)-0.5); % 3.9.4.10443 >> tic; A*A; toc; Elapsed time is 22.412486 seconds. % 3.9.4.10458 >> tic; A*A; toc; % x4 times faster Elapsed time is 5.212486 seconds.
Parameters of our arithmetic engine were tuned for old CPUs. Now it is updated for modern architectures.
Thanks to Massimiliano Fasi who noticed and reported to us performance drop for small precisions. - Conjugate pairs computed by generalized eigen-solver are now guaranteed to match exactly (bit-to-bit).
Before they matched up to machine epsilon in any precision. Reported by Stefan Güttel. - Now toolbox overloads global built-in functions for array creation:
zeros, ones, eye, nan, inf, rand, randn
andrandi
.The main goal is to embed support for ‘mp’ arrays in MATLAB out-of-the-box, without the need formp(zeros(...))
wrapper:% 3.9.4.10443 >> A = zeros(3,3,'mp'); % didn't work since built-in zeros doesn't know what 'mp' is. >> A = mp(zeros(3,3)); % manual modification was needed. % 3.9.4.10458 >> A = zeros(3,3,'mp'); % now works as expected >> whos A Name Size Bytes Class Attributes A 3x3 376 mp
This change is very important, as it allows much easier conversion of existing scripts to multiprecision.
In fact, we have added option to override behavior of such basic functions to always produce mp-outputs for floating-point types:
double
andsingle
. Be careful with such powerful option (it is turned off by default).Please run
doc mp.OverrideDoubleBasicArrays
for more information on its usage, pros and cons.
February 22, 2016 – 3.9.4.10443
- Added arbitrary-precision overloads for the following functions:
orth, rref, airy, beta, betaln, ellipj, ellipke
andlegendre
. - The
svd, null
andnorm
have been updated to support more special cases.>> A = [1 2 3; 1 2 3; 1 2 3]; >> Z = null(mp(A)) Z = -0.8728715609439695250643899416624781 0.4082482904638630163662140124509814 -0.2182178902359923812660974854156192 -0.8164965809277260327324280249019639 0.4364357804719847625321949708312385 0.4082482904638630163662140124509821 >> norm(A*Z,1) ans = 1.733336949948512267750380148326435e-33 >> null(mp(A),'r') ans = -2 -3 1 0 0 1
While working on MATLAB’s internal scripts we found several cases of undocumented syntax:
[Q,Z] = svd(...) % two-outputs only norm(A,'inf') % Inf is passed as a string [US,TS, Success] = ordschur(...) % three-outputs, with status as last one.
Now
svd
andnorm
in toolbox support these special cases for compatibility. - Basic matrix manipulation & analysis functions have been optimized for sparse matrices:
diag, triu, tril, norm, max
andmin
.>> A = mp(sprand(10000,10000,0.01)); >> nnz(A) ans = 994960 >> tic; norm(A,1); toc; Elapsed time is 0.0531895 seconds. >> tic; max(sum(abs(A))); toc; Elapsed time is 0.103413 seconds.
Not too bad for handling
1M
of nonzero quadruples in sparse format!
February 15, 2016 – 3.9.3.10265
- Added LDLT decomposition for Hermitian indefinite matrices (dense). Real and complex cases are supported and optimized for multi-core CPUs.
>> A = mp(rand(1000)-0.5); >> A = A+A'; >> tic; [L,D,p] = ldl(A,'vector'); toc; Elapsed time is 7.412486 seconds. >> norm(A(p,p) - L*D*L',1)/norm(A,1) ans = 9.900420499644941245191320505579399e-33
Still we consider this as a draft version – since there is further potential for speed-up.
February 9, 2016 – 3.9.2.10193
- Added iterative methods for large/sparse systems:
bicg, bicgstab[l], cgs, gmres, minres
andpcg
.
All special cases are supported including the function handle inputs and preconditioners. Usage examples can be found in updated posts of Krylov Subspace Methods in Arbitrary Precision and Arbitrary Precision Iterative Solver – GMRES. - Added ODE solver for stiff problems –
ode15s
. - Many basic functions have been enabled with sparse matrices support.
January 30, 2016 – 3.9.1.10015
- Added
spline, pchip, ppval, mkpp, unmkpp
andinterp1
routines. - Fixed accuracy loss in
expm
and crash inordschur
when U is real but T is complex matrix. Thanks to Massimiliano Fasi for tests and reports! - Indexing engine
subsref
has been enabled with all the ad-hoc rules of MATLAB in case when the first (and only) index is semi-empty matrix. This is needed to match the MATLAB behavior in rare cases, e.g. when empty matrices are used as indices in operands of arithmetic operations.
January 19, 2016 – 3.9.0.9998
- Added
norm
computation for sparse matrices and vectors. All norms are supported except the 2-norm for sparse matrices (since it needssvds
). Please use1, Inf
or recommended'fro'
norm for matrices. - Added
mp.GaussKronrod
function for computation of Gauss-Kronrod nodes and weights. - Improved accuracy of
eig
in computing eigenvectors of real symmetric tridiagonal matrices.The method we used previously (inverse iteration) suffers from numerical instability for very ill-conditioned eigenvectors. Now we have switched to implicit QR.
December 12, 2015 – 3.9.0.9970
- Added concatenation for sparse matrices:
cat, vertcat
andhorzcat
. - Fixed
find
in case when no named output parameters is provided. - Changed error messages to be shorter and more informative. Default function from MEX API –
mexErrMsgTxt
shows intimidating messages with little information:% mexErrMsgTxt (Before): >> A = mp(magic(3)); >> A(-1,0) = 10 'Error using mpimpl' 'Subscript indices must either be real positive integers or logicals.' 'Error in mp/subsasgn (line 871)' ' [varargout{1:nargout}] = mpimpl(171, varargin{:});' % Using our custom workaround: >> A = mp(magic(3)); >> A(-1,0) = 10 'Error using mp/subsasgn (line 871)' 'Subscript indices must either be real positive integers or logicals.'
Now error messages two times shorter with all the necessary information.
December 2, 2015 – 3.9.0.9938
- Fixed
rcond
for better handling of singular matrices. - Improved generalized eigen-solver (unsymetric case). Now it tries to recover converged eigenvalues even if QZ has failed. Thanks to Mohammad Rahmanian for reporting and helping to reproduce the issue!
December 1, 2015 – 3.9.0.9935
- Added
linsolve
function as a simple wrapper overmldivide
. Toolbox already has mature solver implemented inmldivide
which analyses input matrix properties and applies the best suitable algorithm automatically. No need to re-implementlinsolve
separately. - Improved performance and fixed bug in power functions:
.^
and^
. - Added functions for saving/loading mp-objects to/from text file:
mp.Digits(34); A = mp(rand(25)); mp.write(A,'matrix.txt'); % Write mp-matrix to the text file B = mp.read('matrix.txt'); % Read it back norm(A-B,1) % check accuracy - difference should be 0 0
Function
mp.write
converts mp-matrix to text format and stores it to file. To load the saved matrix back to MATLAB usemp.read
. Data is saved with enough precision to be restored without loss of accuracy.
November 6, 2015 – 3.8.9.9901
- Special functions – hypergeometric, gamma and whole family of Bessel functions have been updated. In particular, Bessel functions have been rewritten from scratch to support arbitrary orders and arguments, avoid instability regions and accuracy loss in some cases. Now we are not only the fastest in MATLAB world:
% Symbolic Math Toolbox - R2015a >> digits(34); >> z = 150*vpa((rand(50)-0.5)+(rand(50)-0.5)*1i); >> tic; hypergeom([],vpa(1000-1000*i),z); toc; Elapsed time is 6.208366 seconds. >> tic; besseli(0,z); toc; Elapsed time is 8.635691 seconds. >> tic; besselk(0,z); toc; Elapsed time is 9.938277 seconds. >> tic; bessely(0,z); toc; Elapsed time is 17.444061 seconds. >> tic; besselj(0,z); toc; Elapsed time is 10.570827 seconds. % Advanpix Multiprecision Toolbox - 3.8.9.9901 >> mp.Digits(34); >> Z = sym2mp(z); >> tic; hypergeom([],mp('1000-1000*i'),Z); toc; Elapsed time is 0.155974 seconds. >> tic; besseli(0,Z); toc; Elapsed time is 2.131102 seconds. >> tic; besselk(0,Z); toc; Elapsed time is 2.163481 seconds. >> tic; bessely(0,Z); toc; Elapsed time is 4.301771 seconds. >> tic; besselj(0,Z); toc; Elapsed time is 3.228625 seconds.
but also the most accurate:
% Symbolic Math Toolbox - R2015a >> digits(34); >> z = 8-43*1i; >> besseli(1,vpa(z)) ans = -17.2937_918159785... + 178.7197_1803657...i % only 6-7 digits are correct!! % Advanpix Multiprecision Toolbox - 3.8.9.9901 >> mp.Digits(34); >> besseli(1,mp(z)) ans = -17.29378620503294288287975364153844 + 178.7197375141310577798644760813918i % full precision Maple: -17.29378620503294288287975364153844 + 178.7197375141310577798644760813918*I Mathematica: -17.29378620503294288287975364153844 + 178.7197375141310577798644760813918*I % One more example: >> bessely(0,vpa(1000*i)) ans = -2.54099907...376797872e381 + 2.485686096075864174562771484145676e432i % huge real part? >> bessely(0,mp(1000*i)) ans = -1.28057169...387105221e-436 + 2.485686096075864174562771484145677e+432i % not really
This is just simplest examples, we have logged many other cases.
IMPORTANT. Bessel functions from MathWorks Symbolic Math Toolbox do suffer from accuracy loss. Please avoid using it if you need high accuracy.
October 19, 2015 – 3.8.9.9541
- Added/improved following functions:
gradient, linspace, logspace, meshgrid
andndgrid
. - Fixed bugs in
svd(X,0)
and indiff
. - Fixed issue with complex division with infinite components:
% Before >> 1/mp(Inf + 0.1*1i) ans = NaN - NaNi % Now (correct) >> 1/mp(Inf + 0.1*1i) ans = 0
Thanks to Ciprian Necula for bug reports!
October 16, 2015 – 3.8.9.9528
- Thread manager has been improved with better load balancing. Now majority of dense matrix operations are faster by
15-20%
(even the matrix multiplication). - Fixed two issues reported here and here.
- Added support for MATLAB’s line spacing setting (
format compact/loose
):>> mp.Digits(34) >> mp.FollowMatlabNumericFormat(true); >> A = mp(rand(2)); >> format loose >> A A = 0.4853756487228412241918817926489282 0.1418863386272153359612957501667552 0.8002804688888001116708892368478701 0.4217612826262749914363325842714403 >> format compact >> A A = 0.4853756487228412241918817926489282 0.1418863386272153359612957501667552 0.8002804688888001116708892368478701 0.4217612826262749914363325842714403 >> format longE >> A A = 4.8537564872284122419188179264892824e-01 1.4188633862721533596129575016675517e-01 8.0028046888880011167088923684787005e-01 4.2176128262627499143633258427144028e-01 >> format shortE >> A A = 4.8538e-01 1.4189e-01 8.0028e-01 4.2176e-01
October 8, 2015 – 3.8.9.9464
- Speed of symmetric eigen-decompositions has been boosted up. All cases are covered: standard and generalized problems in quadruple and arbitrary precision modes and both complexities.Actual speed-up factor depends on number of cores and matrix size. On our 1st-gen Core i7 we see ~3 times ratio for
1Kx1K
matrix:mp.Digits(34) A = mp(rand(1000)-0.5); A = A+A'; % 3.8.8.9254 tic; eig(A); toc; Elapsed time is 48.154 seconds. % 3.8.9.9464 tic; eig(A); toc; % x3.4 times faster Elapsed time is 14.212 seconds.
Speed-up is even higher for other precision levels:
mp.Digits(50) A = mp(rand(1000)-0.5); A = A+A'; % 3.8.8.9254 tic; eig(A); toc; Elapsed time is 204.431 seconds. % 3.8.9.9464 tic; eig(A); toc; % x5.3 times faster Elapsed time is 38.534 seconds.
- In fact, all operations with dense symmetric/Hermitian and SPD/HPD matrices have became faster:
chol, inv, det, solvers,
etc. Speed-up ratio forinv
is up to 10 times depending on matrix structure and number of cores.
Full comparison table is in preparation.
September 22, 2015 – 3.8.8.9254
- Estimation of reciprocal of the condition number has been updated for all matrix types in solver (
\
) and inversion(inv
). Now it is much faster and always produce accurate values (before we encountered occasional0
‘s).Just for fun:A = magic(5); A(3,:) = A(2,:)+10*eps; % make problem ill-conditioned in double precision b = rand(5,1); x = A\b; Warning: Matrix is close to singular or badly scaled. RCOND = 2.899200e-18. norm(A*x-b,1) % computed solution is useless (as expected) ans = 4.344157433375132e+00 mp.Digits(34); x = mp(A)\mp(b); % problem is not ill-conditioned when we use 34-digits norm(A*x-b,1) ans = 4.336808689942017736029811203479767e-18
Although half of digits is lost (see the RCOND magnitude), our solver still gives us ~17 digits of accuracy.
September 21, 2015 – 3.8.8.9242
- Architecture of multi-core parallelism has been revised and improved. You may notice better timings in various dense operations, depending on number of cores and matrix size.For example, this is very beneficent for computations of eigenvectors in unsymmetric case. Which is faster up to
45%
on Core i7 (4 hardware cores):A = mp(rand(300)-0.5 + 1i*(rand(300)-0.5)); % 3.8.6.9165 tic; [V,D]=eig(A); toc; Elapsed time is 71.154 seconds. % 3.8.8.9242 tic; [V,D]=eig(A); toc; Elapsed time is 38.5 seconds.
Both modes (quadruple & multiprecision) and complexities (real & complex) are improved.
September 13, 2015 – 3.8.6.9165
- Following functions have been updated:
cond, svd, rank, norm, det, inv, trace, eps
andnull
. Performance optimization, stability, special cases.
September 6, 2015 – 3.8.6.9106
- Improved matrix analysis stage in dense matrix solver.
- Improved m-wrapper for
conv
andcolon
. - Improved compatibility of
inv
with MATLAB. Now it returns fullInf
matrix in case of singular input matrix, estimates RCOND and provides warning if problem is near-singular. - Added check for
NaN/Inf
elements in input matrix ineig
.
August 6, 2015 – 3.8.5.9081
- Added option
'valid'
forconv
.
August 2, 2015 – 3.8.5.9059
- Fixed critical issue in
subsasgn
in multiple-precision mode (when digits<>34
). - Added workaround for occasional Matlab crashes due to incorrect unload of toolbox (after
clear all
).
Update is strongly recommended!
July 24, 2015 – 3.8.5.8939
- Added two-output variants of Cholesky decomposition:
[R,p] = chol(A)
[L,p] = chol(A,'lower')
[R,p] = chol(A,'upper')
- Added support for
'vector'/'matrix'
option inLU
decomposition.
July 22, 2015 – 3.8.4.8915
- Fixed bugs in
chol
and in auto-detection of numeric constants.>> mp.Digits(50); >> mp.ExtendConstAccuracy(false); % auto-detection is disabled by default >> mp(1/3) ans = 0.33333333333333331482961625624739099293947219848633 >> mp.ExtendConstAccuracy(true); >> mp(1/3) ans = 0.33333333333333333333333333333333333333333333333333
July 6, 2015 – 3.8.4.8901
- Added
spdiags
for sparse matrices. - Improved multi-core parallelism in operations with really large matrices (N > 5000). Different thread-scheduling is required (and has been implemented) for the case.
June 4, 2015 – 3.8.3.8882
- Improved performance of
pinv
andnull
with optimized SVD code. - Fixed
norm
to prevent crashes in some cases when input matrix is empty.
May 28, 2015 – 3.8.3.8861
- Fixes in multi-dimensional array slicing and indexing of complex matrices.
Thanks to Thomas Rinder, Stefan Güttel and Maxime Pigou for reports and help. - Improved performance of arithmetic operations when one of the arguments is complex scalar.
This is maintenance release. Meanwhile we continue focusing on adding sparse matrices functionality in development branch.
April 20, 2015 – 3.8.3.8819
- Changes in architecture for faster work with sparse matrices.
- Ordering functions for sparse matrices (to reduce fill-in prior direct solvers) have been added:
amd, colamd, symamd
andsymrcm
.Efficient direct solvers are needed for spectral transformation in generalized eigenvalue solver for large matrices. - Improved performance of
find
for sparse matrices.
April 1, 2015 – 3.8.2.8775
- Minor speed-up in dense linear algebra (especially in SVD) due to more efficient memory management.
- New analysis functions have been added:
issymmetric, ishermitian, isdiag, istriu
andistril
.
Syntax and semantic are fully compatible with MATLAB’s built-in functions. - Fixed issues with Not-a-Number (NaN) handling in relational operators.
March 19, 2015 – 3.8.1.8728
- Restrictions on maximum iterations in SVD has been changed to be dependent on precision (needed for the high-precision settings, e.g. 1000 digits or more).
- Minor fixes in sub-scripted assignment operator,
sum
andprod
. - Stability and accuracy of
quadgk
has been improved. Now it can be used in arbitrary precision mode:mp.Digits(50); f = (@(x)sin(x)); [q,errbnd] = quadgk(f,mp(0),mp(1),'RelTol',100*mp.eps,'AbsTol',100*mp.eps,'MaxIntervalCount',2000) q = 0.45969769413186028259906339255702339626768957938206 errbnd = 7.872028207137840807482477381844110449433981844857e-51
March 1, 2015 – 3.8.1.8676
- System solver (
\, mldivide
) has been updated with all the recent optimizations.Solver is a meta-algorithm which automatically selects decomposition to use (LU, CHOL or SVD) depending on matrix structure and problem type/size. Now it is optimized for multi-core architectures and shows better performance overall (especially for large problems). - Minor updates to arbitrary precision arithmetic engine.
February 23, 2015 – 3.8.1.8617
- Arbitrary precision Cholesky decomposition, LU and QR have been updated with more optimizations (including multi-core). Speed-up depends on matrix size, larger matrix = higher speed-up.For instance,
1Kx1K
shows x3 times better speed whereas100x100
only 30%.
February 17, 2015 – 3.8.0.8477
- Fixed memory leak in coefficient-wise operations reported by Ito Kazuho (Thanks!).
February 10, 2015 – 3.8.0.8447
- Eigenvalue decomposition routines are switched to use “Small Bulge Multi-shift QR Algorithm with Aggressive Early Deflation” (Braman, Byers and Mathias).Speed gain is x2-x3 times for large dense matrices (>
500
) and decreases with smaller matrix size. The highest speed-up is achieved in multiprecision mode.
January 23, 2015 – 3.7.9.8323
- Further optimization of large matrix computations by using parallel algorithms on multi-core architectures. Now speed scales better with number of cores:
>> A = mp(rand(2000)); % 1 - core CPU: >> tic; lu(A); toc; Elapsed time is 24.14 seconds. % 4 - core CPU: >> tic; lu(A); toc; Elapsed time is 8.9 seconds.
Solvers, decompositions and eigen-routines benefit from the update (quadruple & multiprecision mode).
January 15, 2015 – 3.7.8.8309
- Memory manager for multiprecision objects has been completely re-written with the focus on fast allocation and minimizing memory fragmentation for large matrices. Speed gain depends on matrix size and operation. For example, addition of two
3Kx3K
complex matrices shows x3 times improvement:% 3.7.8.8309 - new version >> mp.Digits(50); >> A = mp(rand(3000)-0.5 + 1i*(rand(3000)-0.5)); >> tic; A+A; toc; Elapsed time is 3.88 seconds. % 3.7.7.8234 - previous >> mp.Digits(50); >> A = mp(rand(3000)-0.5 + 1i*(rand(3000)-0.5)); >> tic; A+A; toc; Elapsed time is 11.9 seconds.
Speed-up is higher for larger matrices.
- Several changes to formatted input/output, including new feature of whole-matrix conversion to mp-object:
>> A = mp('[1/2, sqrt(-1); (-1)^(1/2) pi]') A = 0.5 + 0i 0 + 1i 0 + 1i 3.141592653589793238462643383279503 + 0i
This applies only for matrices with constant elements – useful for scripts which store constant parameters in a matrix.
January 7, 2015 – 3.7.7.8234
- Scripts with high volume of small scale computations (scalars, small matrices, etc.) are faster by up to 2 times.Usually in such computations the most time was spent in communicating with MATLAB through MEX API functions, which are very slow and impose heavy overhead (e.g. it is not possible to access variable’s memory directly, only after making the deep copy).Today we have found long-awaited workaround for one of such issues – slow creation of ‘mp’-objects in MATLAB. It was affecting every single routine and now it is faster by 50%.
- Linux version of toolbox has been updated with all the recent changes and improvements.
January 5, 2015 – 3.7.6.8202
- Multiple precision generalized eigenproblem and SVD functions are 1.5-3 times faster (
eig(A,B), qz, svd
). Speed-up is higher for larger matrices, e.g. for 2000 x 2000 we can expect 5 times improvement or more. - Matrix multiplication is faster by 1.5-2 times – in quadruple and multiprecision mode. Recent versions of MATLAB restrict number of threads allowed to be used in parallel computations. Now we choose this independently from MATLAB. As a result, we can use more cores.
- Overall, we have been working on multi-core optimizations and you might see noticeable speed-up in large matrix computations.
December 26, 2014 – 3.7.5.7900
- Multiple precision complex computations are faster by up to 2 times. This affects all the routines including matrix computations –
eig, svd, qr, lu,
etc. - Optimized Kronecker product function
kron
has been added.
December 24, 2014 – 3.7.4.7853
- The functions
schur, ordschur, qz, ordqz, ordeig, balance
andrcond
have been extended and optimized. Now we fully support'real'/'complex'
standard and generalized Schur decomposition together with re-ordering in arbitrary and quadruple precision.In a last two weeks the singular/eigenvalues module of the toolbox has been completely changed, 80% of code refactored for improved functionality, stability, speed and feature support. This is part of the work needed for addingEIGS
function in next versions.
December 17, 2014 – 3.7.3.7605
- Eigen-decomposition routines for full matrices have been completely revised. Improved processing of symmetric, Hermitian and symmetric tridiagonal matrices by using MRRR and Divide&Conquer algorithms. Higher speed, better functionality.
- Routine for generalized eigen-problem,
eig(A,B)
, has been enabled with full multiprecision support without restrictions (before we had it in quadruple precision).
December 12, 2014 – 3.7.2.7508
- Implemented arbitrary precision divide & conquer SVD algorithm (we had it only for quadruple precision).
Overall speed-up factor is 7 times for real and complex matrices.
December 3, 2014 – 3.7.2.7464
- Added element-wise arithmetic operations with scalar for sparse matrices.
November 25, 2014 – 3.7.2.7422
- Improved compatibility with Parallel Computing Toolbox (removed intra-process blocks).
- Fixed issue in indexed assignment when LHS is empty matrix.
November 19, 2014 – 3.7.2.7355
- Added workaround for malfunctioning
mxDuplicateArray
. - Matrix inversion is sped-up in quadruple mode.
- Small fixes and improvements.
November 14, 2014 – 3.7.2.7314
- Speed-up of array manipulation operations. All platforms are covered – Windows, Linux and Mac OSX.
- Fixed bug in matrix power function (for complex matrices).
November 11, 2014 – 3.7.1.7230
- Optimized memory usage for
mp
objects after on-the-fly precision change. - Fixed minor bugs in
colon
and matrix power functions.
September 16, 2014 – 3.7.1.7217
- Performance of matrix computations are boosted up by additional 25-30% (Windows only).
September 9, 2014 – 3.7.0.7002
- Speed of matrix computations (solvers, decompositions, etc.) is boosted up by 30% for real dense, by 40% for complex dense and by 50% for sparse cases. The improvement is for pure multiple precision computations (not quadruple).
August 28, 2014 – 3.6.7.6340
- Added indexed assignment capability for sparse matrices (
subsasgn
). - Added
nextprime, prevprime, sym2mp, mp2sym, isequal, isequaln, logical
andislogical
functions. - Fixed issues with empty sparse matrices support.
August 21, 2014 – 3.6.5.6104
- Improved code for generation of nodes and weights for Gaussian-family quadrature.
August 15, 2014 – 3.6.5.6102
- Added
primes, factor, gcd, lcm
andisprime
functions in arbitrary precision.
August 8, 2014 – 3.6.5.6048
- Added
condeig
function. - Refined
eig
to produce matrix of left eigenvectors:[V,D,W] = eig(...)
.
August 6, 2014 – 3.6.5.6015
- Added polynomial functions:
poly, polyeig, residue, polyder, polyval, polyvalm, deconv, polyfit
andpolyint
. - Refined
eig
to support special flags ('vector'/'matrix'
, etc.) .
July 31, 2014 – 3.6.4.5174
- Added fast
permute, ipermute, shiftdim, blkdiag, squeeze, circshift
androt90
functions.
July 25, 2014 – 3.6.4.5128
- Added
bsxfun
function (andkron
as a consequence). - Enabled
mp
-objects to be used as indices in referencing operations.
July 24, 2014 – 3.6.4.5101
- Further speed-up of data exchange with MATLAB – total speed-up in all operations is 15-20%.
July 21, 2014 – 3.6.4.5051
- Speed-up of data exchange with MATLAB – all operations are faster now.
- Fixed bugs related to memory leaks and memory alignment.
May 2, 2014 – 3.6.3.4945
- Updated
hankel
andtoeplitz
to make it compatible with recent changes to toolbox architecture. Thanks to Ilya Tyuryukanov for reporting the bug.
April 14, 2014 – 3.6.3.4941
- Fixed bug when scalar is passed to
diag
function.
March 3, 2014 – 3.6.3.4931
- Speed of sparse matrices computations is boosted up by 3-5 times on Linux platform.
January 22, 2014 – 3.6.3.4889
- Fixed incompatible exception handling with MATLAB (on Linux).
Toolbox was catching all exceptions coming from withing the code. However some of theMEX
functions throw their own exceptions of hidden (and unknown) type to toolbox. Any attempts to catch them led to MATLAB crash. Now this is fixed – thanks to Takashi Nishikawa for sending the crash dumps.
November 22, 2013 – 3.6.3.4872
- Improved performance of dense and sparse solvers thanks to re-designed memory layout & access patterns.
- Fixed triangular solvers, improved matrix analysis routines (positive-definite, etc.) in solvers.
November 8, 2013 – 3.6.2.4812
- Added
precomputeLU
function targeted for use in iterative schemes for sparse matrices (e.g. Arnoldi process for computing eigenvalues).New function computes and stores LU factorization of a given sparse matrix directly in toolbox’s core. Then pre-computed LU can be used to solve system of equations with different right-hand side by standard"\"
. Here is simple example:mp.Digits(34); A = rand(1000); A = mp(sparse((A>0.5).*A)); F = precomputeLU(A); for k=1:10 b = mp.rand(1000,1); x = F\b; % re-use of LU with different b end;
This approach has advantage over usual
x = U\(L\(P*b))
by avoiding overhead of transferring data between MATLAB and toolbox. - Most trigonometric & exponential functions are sped up in favor to quadruple precision.
- Fixed minor bug in
sort
.
November 1, 2013 – 3.6.1.4792
- Greatly improved performance of the dense matrix solver (both real & complex) in quadruple precision mode.Now we tear up famous competitors by even greater margin: Advanpix vs. VPA vs. Maple – Dense Solvers and Factorization.This improvement gives us right to claim that now we have the fastest quadruple precision on the planet 🙂
- Lots of small performance-oriented improvements in toolbox core – avoiding temporaries where possible, better caching, etc.
October 20, 2013 – 3.5.6.4760
- Added routine for generalized Schur decomposition –
qz
. Syntax & functionality are equivalent to MATLAB’s routine with the exception that we do not support'real'
flag. Results are computed in'complex
‘ mode (default in MATLAB).mp.Digits(34); A = mp(rand(30)); B = mp(rand(30)); [AA,BB,Q,Z,V,W] = qz(A,B); a = diag(AA); b = diag(BB); l = a./b; % Check absolute error norm(Q*A*Z-AA,1) ans = 4.023065828938653331292726401171631e-32 norm(Q*B*Z-BB,1) ans = 4.06883050821433466532161841735413e-32 norm(A*V - B*V*diag(l),1) ans = 1.522296748911918101325723347183125e-31 norm(W'*A - diag(l)*W'*B,1) ans = 6.646702710471966018101416671748383e-32
- Added new function
mp.FollowMatlabNumericFormat(true | false)
. It allows user to choose numeric formatting preferences for the toolbox. Iftrue
, toolbox will obey numeric format settings in MATLAB (show limited number of digits) or display all digits of precision otherwise (by default).Default settings can be adjusted inmpstartup.m
script as usual.>> mp.Digits(30); >> mp.FollowMatlabNumericFormat(false); >> mp('pi') ans = 3.14159265358979323846264338328 >> mp.FollowMatlabNumericFormat(true); % Fixed-point formats >> format short >> mp('pi') ans = 3.1416 >> format long >> mp('pi') ans = 3.141592653589793238462643383280 % Scientific floating-point formats >> format shortE >> mp('pi') ans = 3.1416e+00 >> format longE >> mp('pi') ans = 3.141592653589793238462643383280e+00 % Fixed or scientific formats >> format shortG >> mp('pi') ans = 3.1416 >> format longG >> mp('pi') ans = 3.14159265358979323846264338328 % C99 hex float format >> format hex >> mp('pi') ans = 0x3.243f6a8885a308d313198a2e037080p+0
October 16, 2013 – 3.5.5.4731
- (Preliminary) Added adaptive Gauss-Kronrod numerical integration routine –
quadgk
. It is completely compatible with default MATLAB function including options and all arguments.>> mp.Digits(34); >> f = @(x) exp(-x.^2).*log(x).^2; >> Q = quadgk(f,mp(0),mp(inf),'RelTol',mp('1e-15'),'AbsTol',mp('1e-30')) Q = 1.947522180300781587359083284072062
- Fixed bug in
mp
constructor to handle the case with recursive precision downgrade, e.g.:mp(mp(pi,50),10)
. Both cases – dense and sparse are covered.
September 26, 2013 – 3.5.5.4665
- New sparse matrices serialization, now it is much more efficient and supports
NZMAX
parameter to reserve space / lower chances for costly re-allocations during computations. - Indexed referencing (
subsref
) is now supported for sparse matrices. For example:A(:), A(2,3), A(1,:)
, etc. Please note – indexed assignment for sparse matrices is not yet implemented. - Fixed bug in
diff
. MATLAB ignores indexation overloads and calls built-in functions in methods of the class. This is quite an unpleasant surprise (another violation of classic OOP design). Anyway ODE routines now work correctly.
September 17, 2013 – 3.5.5.4629
- All direct solvers for sparse matrices (LDLT, SuperLU and QR) are optimized for quadruple precision. Speed gain is approx.
x10
times. Tables with new timings can be found here. - Sparse QR has been fixed and improved, now we can solve undetermined systems as well.
- Added new functions:
conv
andpoly
. - Fixed bug in
subsref
, related to special case of vector indexing.
September 5, 2013 – 3.5.4.4586
- Added direct solver for sparse matrices, operator
"\"
.Solver includes sparse LDLT, SuperLU and QR decomposition enabled with arbitrary precision support. The most appropriate method is chosen automatically depending on matrix properties. - Added arithmetic operations for sparse matrices (
+,-,*
) and new functions:sparse
andtranspose
. Usage syntax & semantic is completely compatible with MATLAB’s rules. - Improved support of empty matrices when used as indices.
- Speed-up of sparse matrices handling, mixed complexity matrix multiplication, conversion to double, formatted output, etc.
- Fixed bugs in
subsasgn, subsref, norm
androots
.
August 13, 2013 – 3.5.2.4293
- Completely re-designed arbitrary precision arithmetic engine with emphasis on performance. Now real arithmetic operations are up to
20%
faster. Complex operations arex2
times faster.
This improvement has major impact on overall performance of the toolbox. Thus, Fourier transform has became x2
times faster (as direct consequence of faster complex arithmetic). Even matrix multiplication receives benefit of 20%-50%
increase in performance.
August 9, 2013 – 3.5.1.4260
- Added functions for sparse matrix manipulations:
nonzeros, spfun, spones, full, nnz, nzmax
. - Improved support of empty matrices and fixed minor bug in
numel
(case of sparse matrices).
August 6, 2013 – 3.5.1.4193
- We have re-fined common array operations with improved multi-dimensional support:
sum, prod, cumsum, cumprod, max, min, sort, fft, ifft
. - Fourier transform speed-up by 25%.
July 31, 2013 – 3.5.0.4112
- Very basic support of sparse matrices in multiple precision. Will extend it in upcoming versions. See Rudimentary Support for Sparse Matrices for more details.
- Custom implementation of indexing, referencing and similar operations (
subsref
,subsasgn
,cat
, etc.). - Matrix computations
x2-3
times speed up on Windows 64-bit. - Display & formatted output improvement
July 12, 2013 – 3.4.4.3840
- We have added
mp.randn
function for generating normally distributed pseudorandom numbers with arbitrary accuracy. All special cases are supported for full compatibility with MATLAB:r = mp.randn(n) r = mp.randn(m,n) r = mp.randn([m,n]) r = mp.randn(m,n,p,...) r = mp.randn([m,n,p,...]) r = mp.randn r = mp.randn(size(A)) r = mp.randn(..., 'double') % 'double' is ignored r = mp.randn(..., 'single') % 'single' is ignored
Also we have refined
mp.rand
, now it is faster and able to generate multidimensional arrays as well.
July 3, 2013 – 3.4.4.3828
- To resolve the problem with mixed usage of limited accuracy
double
precision constants in expressions withmp
entities, we have introduced new global setting in the toolbox:mp.ExtendConstAccuracy()
.
It enables/disables auto-detection and recomputing of the constants with higher precision to match toolbox’s settings, e.g.:>> mp.Digits(34); >> mp.ExtendConstAccuracy(false); >> sin(mp(pi)) 1.224646799147353177226065932274998e-16 >> mp.ExtendConstAccuracy(true); >> sin(mp(pi)) 8.671810130123781024797044026043352e-35
In the first example toolbox uses
pi
as it is, withdouble
precision accuracy (at most 16 correct digits). In the second example, toolbox recognizes thepi
constant and re-computes it with required precision of 34 digits.Be default (can be changed in
mpstartup.m
) we usemp.ExtendConstAccuracy(true)
.
June 26, 2013 – 3.4.3.3818
- After few reports from confused users we have removed auto-detection and recomputing of commonly used constants (
pi
,eps
, etc). Now alldouble
precision constants are converted tomp
as it is – with at most 16 correct digits. Before toolbox was trying to recompute them in higher precision.
May 20, 2013 – 3.4.3.3806
- Improved support for empty arrays/matrices.
- Optimized and improved
rcond
(including quadruple precision). - Speed up of operations with multidimensional arrays.
- Fixed minor bugs of quadruple precision mode.
April 21, 2013 – 3.4.3.3481
- Fixed bugs in incomplete gamma function computation.
- Speed up of determinant computation.
April 15, 2013 – 3.4.3.3452
- Real & complex Schur decomposition has been improved with more intelligent restriction on allowable number of Francis QR iterations. Thanks to Gokhan Tekeli from Bogazici University!
March 21, 2013 – 3.4.3.3426
- Fixed bug in
expm
. Thanks to Dmitry Smelov from Stanford! - Fixed bug and improved performance of
colon
.
March 8, 2013 – 3.4.3.3389
- Added support for multidimensional arrays and operations with them. We support coefficient-wise arithmetic operators, mathematical functions, basic information, array manipulation and other functions. Example of array manipulation:
>> x = mp(eye(2)); >> a = cat(3,x,2*x,3*x) (:,:,1) = 1 0 0 1 (:,:,2) = 2 0 0 2 (:,:,3) = 3 0 0 3 >> B = permute(a,[3 2 1]); >> C = ipermute(B,[3 2 1]); >> isequal(a,C) ans = 1
February 13, 2013 – 3.4.2.3292
- 20% performance increase in multiprecision linear algebra.
- Fixed bug in Cholesky decomposition in quadruple precision mode.
February 6, 2013 – 3.4.2.3257
New features:
- We have re-implemented QR decomposition with optimizations for quadruple precision. This resulted in significant speed-up by
10-20
times. We support following variants ofqr
function:X = qr(A)
X = qr(A,0)
[Q,R] = qr(A)
[Q,R] = qr(A,0)
[Q,R,E] = qr(A)
[Q,R,E] = qr(A,0)
Few tests with timing:
>> mp.Digits(34); % Quadruple precision >> mp.GuardDigits(0); >> A = rand(100,50); % 100 x 50 test matrix >> X = mp(A); >> tic; [Q,R] = qr(X); toc; % Full QR Elapsed time is 0.152928 seconds. >> norm(X-Q*R,1) 3.670100250272926322479797966838402e-32 >> tic; [Q,R] = qr(X,0); toc; % Economic QR Elapsed time is 0.110708 seconds. >> norm(X-Q*R,1) 3.670100250272926322479797966838402e-32
January 20, 2013 – 3.4.2.3222
New features:
- This version introduces very important feature we have been working for a few months. In order to boost performance we have made thorough speed optimization of our core engine for computations in quadruple precision (34 decimal digits).As a result we have
20-50
times better performance for matrix computations done in quadruple precision. Here is example of computing singular values of a 100×100 matrix:>> mp.Digits(34); % Switch to quadruple mode by using 34 decimal digits >> mp.GuardDigits(0); >> A = mp(rand(100)); % Use 100 x 100 matrix >> tic; svd(A); toc; Elapsed time is 0.133788 seconds. >> tic; [U,S,V] = svd(A); toc; Elapsed time is 0.668525 seconds. >> norm(A-U*S*V',1) % Accuracy check 1.605428118251356861603279613248978e-31
Basic matrix operations, eigensolvers and some decomposition routines are already updated to be much faster in quadruple precision. Further extensions are under development.
- Routine
eig()
is extended to support arbitrary matricesA,B
when computing solution of generalized eigenvalue problem with quadruple precision.>> mp.Digits(34); >> mp.GuardDigits(0); >> A = mp(rand(100)); >> B = mp(rand(100)); >> [V,D] = eig(A,B); >> norm(A*V - B*V*D,1) 2.827686455535796940129031614889614e-30
January 9, 2013 – 3.4.0.3172
New features:
- Added new solver for ordinary differential equations –
ode113
. - Both ODE solvers
ode45
andode113
are enabled with the support of event functions.
Bug fix:
- Fixed minor bug in operations involving mp-objects with different precisions.
December 28, 2012 – 3.4.0.3116
Improvements:
- Fixed incompatibility with the latest OpenMP version included in
MATLAB R2012b
. Our sincere gratitude goes to Kazuho Ito from University of Yamanashi for his excellent help in finding and investigating the problem.Unfortunately it comes with the price of dropping support of oldR2008b
. Now toolbox supports all versions ofMATLAB
starting fromR2009b
.
December 19, 2012 – 3.4.0.3041
Improvements:
- Performance has been improved by 20% – 50% in most of mathematical operations.
We have switched to Intel C++/Fortran Compilers and their new OpenMP runtime provides this gain.Side effect is that Windows & MATLAB R2008b users need to define special environment variable
KMP_DUPLICATE_LIB_OK=TRUE
in order to enable compatibility with older Intel Compilers (used to build R2008b).
December 1, 2012 – 3.4.0.3028
Improvements:
- Due to growing number of customers using Windows 8 we have added support for the OS.
November 7, 2012 – 3.4.0.3017
Improvements:
- Few months ago we have started re-implementation of linear algebra algorithms to benefit from multi-core parallelism. Basically this means that performance of the toolbox will be better on systems with more cores/CPUs.Today’s release is the first version with enabled parallelism in basic matrix operations, like multiplication.
Bug fix:
- Fixed (similar)bugs in matrix left and right divide. We have used in-proper speed optimization for a special case when one matrix is real and other has complex elements.
October 31, 2012 – 3.4.0.2947
- Performance is increased by
2.5-3
times in majority of linear algebra functions including:qr, svd, eig, schur, hess
. - Algorithm for automatic detection and re-calculation of common constants is improved to avoid false-positive errors.
October 24, 2012 – 3.3.9.2842
Bug fix:
- Fixed bug in matrix properties analysis stage of
eig()
function. Imaginary part of elements were erroneously stripped off in case of complex diagonal matrices.
Improvement:
- Speed up of multi-precision arithmetic engine by
5-10%
. - Dynamic memory manager is tuned to gain more speed on
Windows 7
.
October 3, 2012 – 3.3.8.2794
- Added new function,
balance
aimed to improve accuracy of computation of eigenvalues and/or eigenvectors. It applies similarity transformation (permutation & scaling) to a matrix so that row and column norms becomes approximately equal.Algorithm is based onxGEBAL
,xGEBAK
fromLAPACK
library. All MATLAB’s features are supported:[T,B] = balance(A)
[S,P,B] = balance(A)
B = balance(A)
B = balance(A,'noperm')
October 2, 2012 – 3.3.8.2785
- Added new function,
rcond
to compute the 1-norm estimate of the reciprocal condition number.
Algorithm is based onxGECON
routines fromLAPACK
, both complex and real matrices are supported. Usage syntax is compatible withMATLAB
:c = rcond(A)
September 25, 2012 – 3.3.8.2776
- Added new functions,
mod
andidivide
. They are needed for some built-in functions to work correctly with multi-precision entities (e.g.unwrap
).
Improvement:
- Mathematical expression parsing & evaluation have been completely re-written to be more error-robust and flexible.
August 8, 2012 – 3.3.8.2725
- Now display format of mp-entities are controlled by MATLAB’s formatted output settings. We support the following formats:
short, long, shortE, longE, shortG, longG, shortEng, longEng
andhex
.Short formats show only 4 digits after the decimal point. Long formats show full precision.>> mp.Digits(30); % Fixed-point formats >> format short >> mp('pi') 3.1416 >> format long >> mp('pi') 3.141592653589793238462643383280 % Scientific floating-point formats >> format shortE >> mp('pi') 3.1416e+00 >> format longE >> mp('pi') 3.141592653589793238462643383280e+00 % Fixed or scientific formats >> format shortG >> mp('pi') 3.1416 >> format longG >> mp('pi') 3.14159265358979323846264338328 % C99 hex float format >> format hex >> mp('pi') 0x3.243f6a8885a308d313198a2e037080p+0
July 26, 2012 – 3.3.8.2715
- Added functions for conversion to integers and vice versa:
int8, uint8, int16, uint16, int32, uint32, int64
anduint64
:>> x = mp(int64(magic(3))) 8 1 6 3 5 7 4 9 2 >> int64(x) ans = 8 1 6 3 5 7 4 9 2 >> x = mp(intmax('uint64')) 18446744073709551615 >> uint64(x) ans = 18446744073709551615
An interesting fact is that MATLAB rounds floating point numbers to nearest integer in conversion:
>> int32(1.2) ans = 1 >> int32(1.5) ans = 2
This is quite unexpected and contradicts the majority of other programming languages, where decimal parts are just truncated.
July 25, 2012 – 3.3.8.2702
- Added functions for specialized matrices computing:
compan, hankel, vander, toeplitz, mp.hilb
andmp.invhilb
.Hilbert matrix inversion test:% Multiprecision Computing Toolbox: >> mp.Digits(100); >> norm(mp.invhilb(20) - inv(mp.hilb(20)),1) 9.3295712880......440476004039710173e-51 % MATLAB: >> norm(invhilb(20) - inv(hilb(20)),1) Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 1.155429e-019. ans = 1.74653540359445e+028
Please note, integer-valued special matrices can be converted to
mp
objects directly, e.g.:mp(eye(10)) mp(ones(15)) mp(zeros(7)) mp(magic(3)) mp(rosser,100) mp(wilkinson(15)) mp(hadamard(8)) mp(pascal(5))
Bug fixes:
- Fixed bug in element-wise
power
function whenNaN
was returned for small negative arguments.
July 19, 2012 – 3.3.8.2684
- Code for data interchange between toolbox and MATLAB has been re-written to be more generic. This is the next step towards sparse matrices support. This part is greatly optimized thanks to reduced number of heap memory (de-)allocations.
July 10, 2012 – 3.3.8.2651
- We have implemented adaptive computational load balancing between Pade approximation and ISS (inverse scaling and squaring) in matrix logarithm. Now
logm()
is more stable and much faster.
July 6, 2012 – 3.3.8.2637
- Added functions for 2-D fast Fourier transform,
fft2
andifft2
. All features and special cases are supported to provide full compatibility with standard functions:
Y = fft2(X)
Y = fft2(X,m,n)
Y = ifft2(X)
Y = ifft2(X,m,n)
y = ifft2(..., 'symmetric')
y = ifft2(..., 'nonsymmetric')
- Added
subsindex
function for smooth usage ofmp
objects as indexes to matrix coefficients.
Bug fixes:
- Fixed bug in
ifft
related to special case when a'symmetric'
option is supplied.
June 22, 2012 – 3.3.7.2611
- Added
fprintf()
function. Nowmp
numbers can be printed the same way as standard floating point numbers:>> fprintf('double pi = %f and \n50-digits pi = %.50f\n', pi, mp(pi,50)) double pi = 3.141593 and 50-digits pi = 3.14159265358979323846264338327950288419716939937511
This combined with auto-recomputing of commonly used constants makes existing scripts porting to arbitrary precision even easier. There are much less modifications needed in code than before.
Improvements:
- We have implemented new heap memory management for entities of the
mp
type. Memory of “disposed” objects is not freed immediately but can be re-used for new objects without slow system calls (allocation/deallocation). This gives up to two times a performance boost in all linear algebra functions.
June 20, 2012 – 3.3.6.2600
- Automatic detection and re-calculation of commonly used
double
constants if they are encountered in expressions with multi-precision numbers. Usage of limited precisiondouble
constants (likepi
,exp(1)
,sqrt(2)
, etc.) in arbitrary precision computations has always been one of the main source of low accuracy final results.General rule is that floating-point constants should be re-computed in high precision from the beginning:mp('pi') mp('sqrt(2)') mp('exp(1)') mp('log(2)') mp('catalan') mp('euler') ...
See More on Existing Code Porting for details.
***
The new version of toolbox automatically detects and re-calculates common constants encountered in computations with arbitrary precision, includingpi
,sqrt(2)
,exp(1)
,log(2)
,eps
:>> pi ans = 3.141592653589793 >> mp(pi,50) % pi is automatically re-computed to have 50 correct digits 3.14159265358979323846264338327950288419716939937511 >> mp(eps,50) % eps is automatically recognized and adjusted to supplied precision 1.069105884036878258456214586860592751526078752042e-50 >> mp(10*sqrt(2)/571,50) % fractions with constants are also supported 0.024767312826148774935230975905598915561640488185236
Additionally toolbox correctly recognizes constants if they are used with fractional coefficients, e.g:
7*pi/3
,10*eps
,sqrt(2)/2
.Hopefully this new feature will make porting of existing programs to arbitrary precision even easier.
Bug fixes:
- Extended
dot
product to support vectors of different shapes, e.g. row – column. - Fixed incompatibility of
funm
with older versions of MATLAB.
June 14, 2012 – 3.3.5.2519
Matrix functions (logm
in particular) have been completely revised thanks to feedback of Numerical Linear Algebra Group from The University of Manchester. Now we use the state-of-the-art Schur-Parlett algorithm for computing general matrix function described in the following references:
- P. I. Davies and N. J. Higham, A Schur-Parlett algorithm for computing matrix functions. SIAM J. Matrix Anal. Appl., 25(2):464-485, 2003.
- N. J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008.
June 12, 2012 – 3.3.5.2511
- Added
ordschur()
– eigenvalues reordering in complex Schur factorization. All special cases are supported:
[US,TS] = ordschur(U,T,select)
[US,TS] = ordschur(U,T,keyword)
[US,TS] = ordschur(U,T,clusters)
In some situations, order of eigenvalues within clusters might not coincide with what is generated by MATLAB. This is because we do additional minimization of permutations to gain more speed. Otherwise functionality is identical to MATLAB’s built-inordschur()
. - Added solver for triangular Sylvester equation,
A*X + X*B = C
. Syntax istrisylv(A,B,C)
.
June 5, 2012 – 3.3.4.2487
- Added matrix power function,
mpower()
. We use binary squaring for integer powers and combination ofexpm
,logm
for other cases (preliminary). - Added support for logical variables, now can be used with
mp
numbers in comparisons, conversions, and expressions:
mp('1')==true
mp(false)
mp('pi') + true
Bug fixes:
- Fixed premature stop in Schur factorization due to restriction on maximum iteration count in Francis QR step algorithm.
- Fixed program crash when a user without administrator rights tries to install toolbox in restricted folders (e.g. Program Files)
Improvements:
- Implemented workaround for
libstdc++
& dynamicstd::string
problem on Mac OS X. No more segfaults because of this! - Improved performance in
mp
output routines thanks to removed dependency onboost::format
which was extremely sloooow and buggy on Mac OS X.
***
‘Version History’ above doesn’t reflect development prior June 2012.
Idea for Multiprecision Computing Toolbox was born on September 13, 2010. Development started in February 2011 (one of the first commits was done during the Great East Japan Earthquake on March 11, 2011 in shaken building midst falling shelves, computers, books and panicked people). Public version was released on September 16, 2011. Major new versions are released twice a month, with minor updates – almost every day.
ry 20, 2021 – 4.8.3.14440
- Computation of matrix exponential
EXPM
has been revised and greatly improved. Now we use on-the-fly backward error estimation and scaling reduction using Gelfand upper bound for matrix spectral radius.As a result we have ~4 times speed improvement in real and complex cases:>> mp.Digits(34); >> mp.Digits(100); >> M=100; >> A=rand(M,'mp')*10; >> B=(randn(M,'mp')+1i*randn(M,'mp'))*10; >> tic; expm(A); toc; Elapsed time is 0.260661 seconds. >> tic; expm(B); toc; Elapsed time is 1.045113 seconds.
Before:
>> mp.Digits(34); >> mp.Digits(100); >> M=100; >> A=rand(M,'mp')*10; >> B=(randn(M,'mp')+1i*randn(M,'mp'))*10; >> tic; B = expm(A); toc; Elapsed time is 0.855695 seconds. >> tic; B = expm(A); toc; Elapsed time is 3.843145 seconds.
- Fixed minor bugs in
EIGS
and inEIG(A,B)
functions. Thanks to Yury Grabovsky for bug report.
December 12, 2020 – 4.8.2.14203
- Added function
LSQNONNEG
to solve nonnegative linear least-squares problem. - Added (new) function for cosine-sine decomposition (
CSD
). Both full (2×2) and thin (2×1) matrix partitions are supported. Overall we follow the LAPACK semantic:*** Full (2 x 2): [C,S,U1,V1,U2,V2] = CSD(X11,X12,X21,X22) [ I 0 0 | 0 0 0 ] Q M-Q [ 0 C 0 | 0 -S 0 ] [ X11 | X12 ] P [ U1 | ] [ 0 0 0 | 0 0 -I ] [ V1 | ]^T X = [-----------] = [---------] [---------------------] [---------] [ X21 | X22 ] M-P [ | U2 ] [ 0 0 0 | I 0 0 ] [ | V2 ] [ 0 S 0 | 0 C 0 ] [ 0 0 I | 0 0 0 ] *** Thin (2 x 1): [C,S,U1,V1,U2] = CSD(X11,X21) [ I 0 0 ] Q [ 0 C 0 ] [ X11 ] P [ U1 | ] [ 0 0 0 ] X = [-----] = [---------] [----------] V1^T [ X21 ] M-P [ | U2 ] [ 0 0 0 ] [ 0 S 0 ] [ 0 0 I ]
The
CSD
has been implemented in C++ with multi-core optimizations for all cases – real, complex, quadruple and arbitrary precisions.>> mp.Digits(34); >> M=500;P=M/2;Q=M/2; >> X=rando(M,'mp'); >> X11 = X(1:P,1:Q); >> X12 = X(1:P,(Q+1):end); >> X21 = X((P+1):end,1:Q); >> X22 = X((P+1):end,(Q+1):end); >> tic; [C1, S1, U1, V1, U2, V2] = csd(X11, X12, X21, X22); toc; Elapsed time is 3.275797 seconds. >> mp.Digits(100); >> M=500;P=M/2;Q=M/2; >> X=rando(M,'mp'); >> X11 = X(1:P,1:Q); >> X12 = X(1:P,(Q+1):end); >> X21 = X((P+1):end,1:Q); >> X22 = X((P+1):end,(Q+1):end); >> tic; [C1, S1, U1, V1, U2, V2] = csd(X11, X12, X21, X22); toc; Elapsed time is 16.315847 seconds.
- Most of the linear algebra functions have been switched to multiprecision fused multiply–add operation with only one rounding:
FMA(A,B,C) = A+B*C
. This might lead to higher accuracy especially in ill-conditioned cases.
September 22, 2020 – 4.8.0.14105
- Fixed bug in functions for for low-level access to quadruple variables –
MP.GETWORDS64/MP.SETWORDS64
. Thanks to Kirill Serkh for bug report.
June 11, 2020 – 4.8.0.14100
Speed has been improved in all O(n3)
matrix computations. All cases are covered – real, complex, quadruple and arbitrary precision. Expected speed-up depends on particular function and on number of hardware cores of the target CPU.
As an example, using 16-core Core i9-7960X we have got following timings:
% 4.7.1.13688 (before) >> mp.Digits(34); >> A = randn(500,'mp'); >> B = randn(500,'mp'); >> tic; [V,D,W] = eig(A); toc; Elapsed time is 41.566000 seconds. >> tic; [V,D,W] = eig(A,B); toc; Elapsed time is 134.004006 seconds. >> tic; [U,S,V] = svd(A); toc; Elapsed time is 10.305934 seconds. >> mp.Digits(100); >> A = randn(500,'mp'); >> B = randn(500,'mp'); >> tic; [V,D,W] = eig(A); toc; Elapsed time is 133.028512 seconds. >> tic; [V,D,W] = eig(A,B); toc; Elapsed time is 495.003402 seconds. >> tic; [U,S,V] = svd(A); toc; Elapsed time is 27.809539 seconds.
% 4.8.0.14100 (now) >> mp.Digits(34); >> A = randn(500,'mp'); >> B = randn(500,'mp'); >> tic; [V,D,W] = eig(A); toc; Elapsed time is 11.165819 seconds. <-- 3.7 times speed-up >> tic; [V,D,W] = eig(A,B); toc; Elapsed time is 22.996163 seconds. <-- 5.8 times speed-up >> tic; [U,S,V] = svd(A); toc; Elapsed time is 2.582599 seconds. <-- 3.9 times speed-up >> mp.Digits(100); >> A = randn(500,'mp'); >> B = randn(500,'mp'); >> tic; [V,D,W] = eig(A); toc; Elapsed time is 34.167268 seconds. <-- 3.9 times speed-up >> tic; [V,D,W] = eig(A,B); toc; Elapsed time is 76.882287 seconds. <-- 6.4 times speed-up >> tic; [U,S,V] = svd(A); toc; Elapsed time is 7.226432 seconds. <-- 3.8 times speed-up
Similar performance improvement can be observed for all other matrix functions: QR, LU, CHOL,
etc.
One of the toolbox users has published comprehensive speed benchmark against Julia, Wolfram Mathematica and several Python libraries (mpmath, flint+arb
) for extended precision computations. The page is in Japanese but plots and tables are easy to understand.
Thank you Yasutaka Hanada for all your efforts and time making the benchmark!
March 28, 2020 – 4.7.1.13688
- The code of Krylov subspace expansion in
EIGS
has been converted to C++ with various optimizations including multi-core parallelism. To achieve the maximum speed, we had to switch from modified to classic Gram–Schmidt with bulk re-orthogonalization. Classic Gramm-Shmidt is based on level-2BLAS
operations (xGEMV
), which allow nice parallel execution.
March 22, 2020 – 4.7.0.13642
- The
EIGS
has been improved for the case when orthogonal Krylov subspace cannot be expanded. We added few extra steps of re-orthogonalization and last-resort attempt of explicit restart with few random vectors. Thanks to Joseph Benzaken and Gaoyong Sun for tests and reports.
January 21, 2020 – 4.7.0.13619
- Fixed crash in FFT/IFFT for the case when one of the dimensions of input argument is empty. Thanks to Dr. Luo Guo for the bug report.
November 6, 2019 – 4.7.0.13560
- Comprehensive update of quadruple precision computations library on Windows with emphasis on speed. Most of the functions (arithmetic, elementary and special functions, etc.) have became faster and more accurate.
- Code of all special functions have been revised (all platforms, including arbitrary precision mode). Now all of them are significantly faster (e.g.
besselK
is ~10 times faster for complex arguments) and more accurate. - Great number of special functions have been added: generalized hypergeometric pFq,
kummerM, kummerU, FresnelS, FresnelC, psi, gammainc, betainc, zeta, hurwitzZeta, erfi, airy, ellipke, sinc
and many others. See Function Reference for full list of special functions provided in toolbox.
August 1, 2019 – 4.6.4.13348
- Functions
ISHERMITIAN, ISBANDED, BANDWIDTH
and similar have been improved. Now all are (much) faster especially for large matrices. - Speed of all functions related to sparse matrices have been increased. This is because we switched to our new engine for computations with sparse matrices.
The most notable speed-up is in sparse LU – decomposition is 2-3 times faster now, solver is up to 10 times depending on matrix size & structure. - New functions
MP.GETWORDS64
andMP.SETWORDS64
for low-level access to quadruple number/array have been added.
Toolbox represents quadruple numbers in 128-bit IEEE-754 format which can be considered as 2 sequentially stored 64-bit unsigned integers (high and low part). FunctionMP.GETWORDS64
allows to extract these integers, whereasMP.SETWORDS64
generates quadruple number/array from integer parts:>> x = mp('pi') x = 3.141592653589793238462643383279503 >> format hex >> [h,l] = mp.getwords64(x) % Get the high & low integer parts of quadruple. h = 4000921fb54442d1 l = 8469898cc51701b8 >> y = mp.setwords64(h,l) % Generate quadruple number from integer parts. y = 3.141592653589793238462643383279503
These functions can be used for cross-platform data exchange or to interface with other quadruple-enabled codes.
- Added function to compute
KOORNWINDER
orthogonal polynomials on triangle.
March 28, 2019 – 4.6.2.13225
- Functions
MAX
andMIN
have been completely re-implemented. Now both are faster and support extended set of options –'omitnan', 'includenan'
and'all'
. Summary on currently supported variants:C = max(A,B) C = max(A,B,nanflag) M = max(A) M = max(A,[],dim) M = max(A,[],nanflag) M = max(A,[],dim,nanflag) [M,I] = max(___) M = max(A,[],'all') M = max(A,[],'all',nanflag)
At this moment, option
'vecdim'
is not yet supported. - Added
CUMMAX
andCUMMIN
functions. - All cumulative functions
CUMPROD, CUMSUM, CUMMAX
andCUMMIN
have been enabled with the support for'omitnan', 'includenan', 'forward'
and'reverse'
options. Thanks to Abe Ellison for request. - Fixed incompatibility with older syntax of
RAND
andRANDN
. Although it is now considered obsolete, some old code is still using it. Thanks to Nick Higham for requesting this feature.
February 25, 2019 – 4.6.0.13135
- Added implicit expansion/broadcasting of dimensions to following operations:
+ - .* ./ .\ .^ < <= > >= == ~= | & xor min max mod rem hypot atan2
This feature can be enabled/disabled by special command
mp.EnableImplicitExpansion
:>> mp.EnableImplicitExpansion(true); >> mp([1 2 3]).*mp([1;2;3]) ans = 1 2 3 2 4 6 3 6 9 >> mp.EnableImplicitExpansion(false); >> mp([1 2 3]).*mp([1;2;3]) Error using .* (line 1677) Matrix dimensions must agree
- Greatly improved load balancing among computational threads. Expected speed-up is up to
25%
in all “heavy” matrix operations (depending on number of CPU cores, CPU load and operation). - Numerous smaller fixes and improvements.
January 17, 2019 – 4.5.3.12859
- Version
4.5.3
is available for all platforms now. Update is strongly recommended.
December 7, 2018 – 4.5.3.12856
- Numerical stability of
EXPM
has been improved (=higher accuracy in working with highly ill-conditioned matrices). - All service routines has been reviewed and improved:
mp.NumberOfThreads, mp.GuardDigits, mp.ExtendConstAccuracy, mp.FollowMatlabNumericFormat
andmp.OverrideDoubleBasicArrays
. - Added support for multidimensional slice assignments with same number of elements but different shapes:
a = zeros(10,10,10,'mp'); b = ones(10,10,10,'mp'); a(:,:,1) = b(:,1,:);
Thanks to Taras Plakhotnik for requesting this feature.
November 1, 2018 – 4.5.2.12841
- Version
4.5.2
is available for all platforms now. Update is strongly recommended.
October 23, 2018 – 4.5.2.12840
- Stability and convergence of divide & conquer algorithm in
SVD
has been improved. Thanks to Tao Meng from Peking University for reporting the issue.
September 21, 2018 – 4.5.1.12833
- Major update of toolbox including new optimized code for operations in small to medium precision range.
Now speed difference between true quadruple(34) and other precision of comparable level is reduced.Timings of
EIG(A)
, with random unsymmetric matrix of 200×200 size:% 4.4.7.12740 (before) 20 digits, time = 6.01 sec 25 digits, time = 6.26 sec 30 digits, time = 6.82 sec 34 digits, time = 3.39 sec 35 digits, time = 7.07 sec 40 digits, time = 9.35 sec 45 digits, time = 9.33 sec 50 digits, time = 9.74 sec 55 digits, time = 10.3 sec
% 4.5.1.12833 (now) 20 digits, time = 4.51 sec 25 digits, time = 4.77 sec 30 digits, time = 5.06 sec 34 digits, time = 3.20 sec 35 digits, time = 5.24 sec 40 digits, time = 7.03 sec 45 digits, time = 7.01 sec 50 digits, time = 7.27 sec 55 digits, time = 7.81 sec
The code to reproduce the timings:
rng(0); for p=[20:5:30 34 35:5:55] mp.Digits(p); A = randn(200,'mp'); s = tic; [V,D] = eig(A); e = toc(s); fprintf('%d digits, time = %.3g sec\trel.error = %g\n', p, e, norm(A*V - V*D,1)/norm(A,1)); clear A V D e; end
Similar performance gain can be seen in all operations in arbitrary precision mode.
July 25, 2018 – 4.4.7.12740
- The
CIRCSHIFT
has been updated to support 3rd argument. This syntax was introduced in recent MATLAB versions. Thanks to Patrick Dorey for requesting this update.
April 18, 2018 – 4.4.7.12739
- Now
SORT
treatsNaN
as the largest value in array. This is needed for one-to-one compatibility withMATLAB
. Thanks to Massimiliano Fasi for reporting this issue.
March 27, 2018 – 4.4.7.12736
- Fixed incompatibility with
R2018a
on all platforms. Our deepest thanks to Andre Van Moer, Michal Kvasnicka, Enrico Onofri and Siegfried Rump for their help with tests on various systems and MATLAB versions. - Improved
QZ
decomposition to avoid premature exit in case of extremely high precision requested. Thanks to Bor Plestenjak. - Switched to new TLS model on GNU Linux. This might help for some GNU Linux installations where famous TLS-issue still shows up. Thanks to Denis Tkachenko for detailed tests.
March 15, 2018 – 4.4.6.12719
- Added ERFINV, ERFCINV and NORMINV routines in quadruple precision.
- Improved
ORDQZ
to avoid false positive alarm on numerical instability. Thanks to Denis Tkachenko.
December 25, 2017 – 4.4.5.12711
- Fast Fourier Transform (FFT) routines have been upated with various speed optimizations, including multi-core parallelism, extended set of small FFT with minimum number of arithmetic operations, etc.Now toolbox uses split-radix for power-of-two, mixed-radix Stockham framework for composite and Bluestein algorithm for prime lengths FFT. All algorithms have been optimized for parallel execution and quadruple/multi-precision modes.
Before:>> mp.Digits(34); >> x = rand(2^23,1,'mp'); % n = 8M/quadruple >> tic; y = fft(x); toc; Elapsed time is 16.759009 seconds. >> tic; z = ifft(y); toc; Elapsed time is 29.263380 seconds. >> mp.Digits(100); >> x = rand(2^20,1,'mp'); % n = 1M/100-digits >> tic; y = fft(x); toc; Elapsed time is 14.243092 seconds. >> tic; z = ifft(y); toc; Elapsed time is 21.368476 seconds.
Now:
>> mp.Digits(34); >> x = rand(2^23,1,'mp'); % n = 8M/quadruple >> tic; y = fft(x); toc; % ~5 times faster Elapsed time is 3.068695 seconds. >> tic; z = ifft(y); toc; % ~6 times faster Elapsed time is 4.953689 seconds. >> mp.Digits(100); >> x = rand(2^20,1,'mp'); % n = 1M/100-digits >> tic; y = fft(x); toc; % ~8 times faster Elapsed time is 1.753420 seconds. >> tic; z = ifft(y); toc; % ~9 times faster Elapsed time is 2.293286 seconds.
- The
COSINT
andSININT
have been extended for negative arguments. Thanks to Tomoaki Okayama and Ryota Hara. - Fixed
AIRY
for pure imaginary arguments. Thanks to Tom Wallace for detailed bug report. - Fixed bug in
SUBSAGN
reported by Jon Vegard.
October 24, 2017 – 4.4.4.12666
- Improved storage format for cross-platform compatibility. Multiprecision variables stored in file using standard
SAVE/LOAD
commands can now be transferred to any platform without loss in accuracy (e.g. from GNU Linux to Windows). Please note, quadruple precision data were always cross-platform. - Fixed numerical instability in
SYLVESTER_TRI
. Thanks to Massimiliano Fasi. - Fixed accuracy loss in
ELLIPKE
andELLIPJ
. Thanks to Denis Silantyev. - Fixed spurious complex output for extreme arguments in
BESSELJ/Y
. Thanks to Taras Plakhotnik. - Fixed premature overflow/underflow in
BESSELI
for corner case arguments.
October 3, 2017 – 4.4.3.12624
- Added extended set of routines to compute orthogonal polynomials:
chebyshevV
Chebyshev polynomials of the third kind chebyshevW
Chebyshev polynomials of the fourth kind laguerreL
Generalized Laguerre polynomials gegenbauerC
Gegenbauer (ultraspherical) polynomials jacobiP
Jacobi polynomials zernikeR
Radial Zernike polynomials All routines are optimized for parallel execution, quadruple and multi-precision modes, e.g.:
>> x = randn(500); >> tic; laguerreL(10,vpa(x)); toc; Elapsed time is 308.504995 seconds. >> tic; laguerreL(10,mp(x)); toc; % ~ 2500 times faster Elapsed time is 0.120173 seconds.
- Improved stability of divide & conquer algorithm for SVD. Thanks to Bartosz Protas for detailed bug report.
- Various small changes for better compatibility with plotting engine of MATLAB.
August 22, 2017 – 4.4.1.12580
- Parallelism. Modern processors usually run two logical threads per one physical CPU core (hyper-threading). This improves performance in situations when user runs several independent tasks (e.g. usual desktop applications). However this leads to sub-optimal results in case of heavy-load scientific computations, when multi-threaded algorithms are specifically designed to be well-balanced and exhaust all available resources of each core.To alleviate the issue, now toolbox binds its threads one-to-one to physical cores on CPU, ignoring hyper-threaded (logical) cores. You might see performance boost up to 15% depending on algorithm and your CPU:
>> mp.Digits(100); >> A = randn(500,'mp'); % 4.3.6 >> tic; [S,V,D] = svd(A); toc; Elapsed time is 33.177931 seconds. % 4.4.1 >> tic; [S,V,D] = svd(A); toc; Elapsed time is 28.520263 seconds.
This is experimental feature, implemented only in Windows version of toolbox. Other platforms will follow.
- Bessel functions. Whole family of Bessel and Hankel functions have been revised. Majority of improvements are related to the case of non-integer orders, but other cases have been polished too:
- Fixed accuracy loss for half-integer orders when |v| >> |z|. Thanks to Mert Kalfa for detailed tests and reports.
- Fixed accuracy loss in modified Bessel functions in case of pure imaginary arguments and large orders. Now we apply several different asymptotic approximations depending on parameters (one form is specifically tuned for imaginary arguments).
- Added fast implementation for Hankel functions for real arguments and integer orders.
- Added fast quadruple implementation for modified Bessel functions (real arguments and orders).
Besides, now all Bessel functions rely on parallel execution, for all kinds of arguments.
- Miscellaneous.
- Now toolbox includes pre-computed coefficients of Gauss-type quadrature for up to 512 order and up to 250 digits of precision. Which means retrieving quadrature nodes & weights now costs
O(1)
! All usual quadrature are covered – Legendre, Hermite, Gegenbauer, Jacobi and Laguerre.% 4.3.6 >> mp.Digits(34); >> tic; [x,w] = mp.GaussJacobi(256); toc; Elapsed time is 3.265601 seconds. >> mp.Digits(100); >> tic; [x,w] = mp.GaussJacobi(256); toc; Elapsed time is 14.518643 seconds. % 4.4.1 >> mp.Digits(34); >> tic; [x,w] = mp.GaussJacobi(256); toc; Elapsed time is 0.000994 seconds. >> mp.Digits(100); >> tic; [x,w] = mp.GaussJacobi(256); toc; Elapsed time is 0.002819 seconds.
This improvement was inspired by communication with Massimiliano Fasi (Padé approximation for
LOGM
) and Enrico Onofri. We hope this will be helpful for various spectral methods and approximations. - New platform-independent storage format for multiprecision arrays of non-quadruple precision. Multiprecision variables stored in file using standard
SAVE/LOAD
commands can now be transferred to any platform without loss in accuracy (e.g. from GNU Linux to Windows). Please note, quadruple precision data were always cross-platform. - Fixed critical bug in computing Bernoulli coefficients (widely used for evaluation of some special functions). The bug caused race conditions in multi-threaded computations with possible MATLAB crash.
- Now toolbox includes pre-computed coefficients of Gauss-type quadrature for up to 512 order and up to 250 digits of precision. Which means retrieving quadrature nodes & weights now costs
- Parallelism. Modern processors usually run two logical threads per one physical CPU core (hyper-threading). This improves performance in situations when user runs several independent tasks (e.g. usual desktop applications). However this leads to sub-optimal results in case of heavy-load scientific computations, when multi-threaded algorithms are specifically designed to be well-balanced and exhaust all available resources of each core.To alleviate the issue, now toolbox binds its threads one-to-one to physical cores on CPU, ignoring hyper-threaded (logical) cores. You might see performance boost up to 15% depending on algorithm and your CPU:
The large number of changes allows us to bump the version directly to 4.4.1 bypassing smaller revisions.
June 6, 2017 – 4.3.6.12354
- Improved accuracy of matrix logarithm –
LOGM
. Now it has better handling of near-singular matrices and refined Pade approximation. Thanks to Massimiliano Fasi who reported the issues and provided test cases. - The
SORT
function in toolbox has been enabled with undocumented features of MATLAB’s built-inSORT
. Thanks to Daniele Prada from “Istituto di Matematica Applicata e Tecnologie Informatiche”, in Pavia, Italy.
May 12, 2017 – 4.3.5.12344
- Speed-up of
EIGS
routine. The heaviest part (modified Gram-Schmidt) has been moved to C++. Performance improvement is 4-8 times depending on a problem. - Various bug fixes in
EIGS
,ORSCHUR
and in mixed-precision computations (especially with sparse matrices).
March 28, 2017 – 4.3.3.12232
- Added
mp.NumberOfThreads(N)
function to control how many threads toolbox can use in computations enabled with multi-core parallelism. Be default, toolbox uses all available CPU cores, which might not be optimal for all cases. Now user can adjust this flexibly. Requested by Clay Thompson.>> A = rand(1000,'mp'); >> mp.NumberOfThreads(1); >> tic; exp(A); toc; Elapsed time is 1.297683 seconds. >> mp.NumberOfThreads(4); >> tic; exp(A); toc; Elapsed time is 0.425246 seconds.
- Added support for second input argument in
ROUND
function. Requested by Michael Klanner. - Added support for second output argument in
LINSOLVE
function. Requested by Zhaolong Hu. - Added optimization to
BESSELK
for half-integer orders. - Fixed “freeze” bug in
BESSELJN
for high-orders and real quadruple arguments, only Windows version was affected. Reported by Maxime Dana. - Improved integration with warning system in MATLAB. In some cases toolbox showed warnings ignoring the fact that they are disabled in MATLAB. This is ongoing work.
January 25, 2017 – 4.3.3.12177
- Added
ACCUMARRAY, CELL2MAT
andMP.CELLFUN
routines. - Fixed bug related to empty arrays in
BSXFUN
, reported by Vladimír Šedenka. - Added special functions-overloads to retrieve machine epsilon, smallest/largest floating numbers for a given precision:
% Before >> mp.eps >> mp.eps(mp(1)) >> mp.realmin >> mp.realmax % Now / 4.3.3.12177 >> eps('mp') >> eps(mp(1)) >> realmin('mp') >> realmax('mp')
We have replaced
MP.EPS, MP.REALMIN
andMP.REALMAX
with functions without'MP.'
prefix. This is important change needed to write precision independent code.
January 4, 2017 – 4.3.2.12144
- Added
EIGS
routine for computing subset of eigenvalues and eigenvectors. All features are supported: standard and generalized eigenproblems, matrix and function handle inputs and various parts of the spectrum –'SM', 'LM', 'SA', 'LA', 'SI', 'LI', 'SR'
and'LR'
. - Added
FFTN/IFFTN
– routines for multi-dimensional Fourier transformation. - All Fourier-related routines have been updated with specific optimizations for quadruple precision. Expected speed-up is
2-4
times. - Fixed several bugs related to empty arrays in assignment and relational operators, reported by Clay Thompson.
December 6, 2016 – 4.3.0.12070
- Load balancing in multi-threading pool has been improved (especially on Windows).
ODE15S
has been updated to include support forJPattern
option, requested by Nicola Courtier.- Fixed crash in case when inputs to
ORDSCHUR
andORDQZ
have different complexity.
November 8, 2016 – 4.3.0.12050
- All elementary and special mathematical functions (
exp, sqrt, sin, cos, gamma, bessel,
etc. ) have been updated with parallel execution on multi-core CPUs. Speed-up is proportional to number of CPU cores.
For example, timing reduction for Bessel Y0 on Core i7 990x (6 hardware cores):% Before >> A = rand(2000,'mp'); >> tic; bessely(0,A); toc; Elapsed time is 4.278553 seconds. % Now / 4.3.0 >> A = rand(2000,'mp'); >> tic; bessely(0,A); toc; Elapsed time is 0.678716 seconds.
- Bug fixes: memory leak in complex arithmetic when used in multi-threading environments. Quadruple precision mode wasn’t affected by the bug.
November 2, 2016 – 4.2.3.12019
- The right matrix division (
MRDIVIDE
) has been updated to match the speed of the left matrix division (MLDIVIDE
). Reported by Massimiliano Fasi.
October 26, 2016 – 4.2.3.12016
- The eigensolver
EIG(A[,B])
has been updated and optimized further. One of the important changes is that now we have special ultra-fast solver for (k,p)-diagonal matrices in case of generalized eigenproblem. In some cases our quadruple precision solver is faster than built-in double precision solver in MATLAB! Please see the bottom of the article for example – Architecture of eigenproblem solver. - Least square linear solver has been speeded up by at least
2
times (2000x1500
). Actual speed-up grows with matrix size and number of CPU cores. Please note that now we use Rank-Revealing QR (RRQR
) instead ofSVD
. - Bug fixes: memory leaks in generalized eigensolver (complex inputs), SVD for scalar inputs (reported by Denis Tkachenko) and MIN/MAX now ignore NaN values for sure (reported by Gerard Ketefian).
October 5, 2016 – 4.2.2.11893
- The linear system solver
MLDIVIDE
has been updated and optimized further. Now every decomposition in the solver (LU, LDLT and Cholesky) run faster by10-25%
. The final results and algorithm are outlined in our recent article Architecture of linear systems solver. - Computation of eigenvalues of generalized symmetric eigenproblem has been speeded up by
2
times. This is the result of refined parallel algorithm for our D&C solver. - Also we have added MRRR solver for generalized symmetric eigenproblem
[V,D] = eig(A,B)
. It is slightly faster than D&C for computing eigenvectors, but most importantly – it provides better accuracy in some cases. - We have added special algorithm to solve n-diagonal/banded standard eigenproblems –
eig(A)
. We are preparing article with detailed outline of our poly-algorithm for standard eigenproblems (similar toMLDIVIDE
).
September 11, 2016 – 4.2.0.11601
MLDIVIDE
(linear system solver) has been updated with specialized solver for banded matrices.
Timings have been substantially improved for such matrices, e.g.5Kx5K
pentadiagonal matrix:% Before: >> n = 5000; >> A = mp(diag(1:n,0)+diag(1:n-1,1)+diag(1:n-2,2)+diag(-(1:n-1),-1)+diag(-(1:n-2),-2)); >> b = mp(rand(n,1)); >> tic; x = A\b; toc; % LU+pivoting, O(n^3) Elapsed time is 257.355805 seconds. >> norm(A*x-b,1)/norm(A,1) ans = 2.981084158350920545602126981830846e-35
% Now: >> n = 5000; >> A = mp(diag(1:n,0)+diag(1:n-1,1)+diag(1:n-2,2)+diag(-(1:n-1),-1)+diag(-(1:n-2),-2)); >> b = mp(rand(n,1)); >> tic; x = A\b; toc; % Specialized LU+pivoting within bandwidth, O(n*p*q) Elapsed time is 0.233104 seconds. % ~x1000 times faster >> norm(A*x-b,1)/norm(A,1) ans = 2.981084158350920545602126981830846e-35
The speed-up comes from the fact that now algorithms works with only few non-zero diagonals, instead of crunching full matrix.
- Positive definite banded matrices have received specialized solver as well (only superdiagonals are used = less computations).
MLDIVIDE
is a poly-algorithm which selects the best solver depending on matrix properties, basically we have rewritten it from scratch lately. Now it has (much)faster analysis and full set of specialized solvers + rcond estimators for dense matrices.
August 27, 2016 – 4.1.0.11461
MLDIVIDE
(linear system solver) has been updated with specialized solver for tridiagonal matrices.
Now computation complexity for the case is justO(n)
instead ofO(n^3)
:% Before: >> n = 2000; >> A = mp(diag(1:n,0)+diag(1:n-1,1)+diag(1:n-1,-1)); >> b = mp(rand(n,1)); >> tic; x = A\b; toc; % LU+pivoting, O(n^3) Elapsed time is 27.767122 seconds. >> norm(A*x-b,1)/norm(A,1) ans = 2.817286843369241138991794285028987e-34
% Now: >> n = 2000; >> A = mp(diag(1:n,0)+diag(1:n-1,1)+diag(1:n-1,-1)); >> b = mp(rand(n,1)); >> tic; x = A\b; toc; % Specialized tridiagonal LU+pivoting, O(n) Elapsed time is 0.079523 seconds. % ~350 times faster >> norm(A*x-b,1)/norm(A,1) ans = 2.769447752126383577306583358607416e-34
SUBSASGN
andDIAG
have been improved to be compatible with MATLAB in special cases:% Column to row assignment: >> A = magic(3,'mp'); >> x = [1;2;3]; >> A(1,:) = x A = 1 2 3 3 5 7 4 9 2 % Building matrix from empty diagonal: >> diag(rand(1,0,'mp'),0) ans = [] >> diag(rand(1,0,'mp'),1) ans = 0 >> diag(rand(1,0,'mp'),2) ans = 0 0 0 0
- Added functions for orthogonal polynomials:
legendreP, chebyshevT, chebyshevU
andhermiteH
.Variable Precision Arithmetic (VPA)/MATLAB 2016a:>> N = 1000; % polynomial degree >> M = 5000; % number of points >> digits(25); >> x = vpa(rand(M,1)); >> tic; v = chebyshevT(N,x); toc; Elapsed time is 2.600266 seconds. >> tic; v = chebyshevU(N,x); toc; Elapsed time is 1.908803 seconds. >> tic; v = hermiteH(N,x); toc; Elapsed time is 34.765995 seconds. >> tic; v = legendreP(N,x); toc; Elapsed time is 44.810674 seconds.
Advanpix Multiprecision Toolbox:
>> N = 1000; % polynomial degree >> M = 5000; % number of points >> mp.Digits(34); >> x = mp(rand(M,1)); >> tic; v = chebyshevT(N,x); toc; Elapsed time is 0.426981 seconds. % x6 times faster >> tic; v = chebyshevU(N,x); toc; % x4 times faster Elapsed time is 0.450537 seconds. >> tic; v = hermiteH(N,x); toc; % x53 times faster Elapsed time is 0.653591 seconds. >> tic; v = legendreP(N,x); toc; % x43 times faster Elapsed time is 1.035238 seconds.
Please note, our routines are not multi-core optimized (yet). In due course timings will be divided by number of CPU cores.
August 24, 2016 – 4.1.0.11420
- Added following new functions (by categories):Linear Equations
SYLVESTER, LYAP, DLYAP
– solvers for linear matrix equations of Lyapunov and Sylvester.
Continuous, discrete-time and generalized forms are covered.Matrix Decomposition
CHOLUPDATE, QRDELETE, QRINSERT, QRUPDATE
andPLANEROT
.
AlsoGSVD
for generalized singular value decomposition.Matrix Analysis
ISBANDED, BANDWIDTH, ISSCHUR, hasInfNaN
andSUBSPACE
.All new functions (with few exceptions) are implemented in toolbox core using C++ for better performance.
- Following functions have been revised and improved:
CROSS
andDOT
now support multi-dimensional arrays.QR, CHOL, SVD
andKRON
now have better handling of corner cases (e.g. empty inputs, error reports, etc.) - Added support for differential algebraic equations (DAEs) to
ODE15S
.
The number of new functions and changes allow us to bump the version directly to 4.1
bypassing smaller revisions.
August 11, 2016 – 4.0.1.11324
- Improvements to linear least-squares solver. Instead of SVD, now we use rank-revealing QR decomposition (RRQR), heavily optimized for parallel execution on multi-core CPUs:
% Before >> mp.Digits(34); >> A = rand(1000,500,'mp')-0.5; >> tic; x = A\A; toc; Elapsed time is 45.334321 seconds. % Now - 4.0.1.11324 >> mp.Digits(34); >> A = rand(1000,500,'mp')-0.5; >> tic; x = A\A; toc; Elapsed time is 12.059790 seconds.
All cases are covered – real, complex, quadruple and arbitrary precision. Speed-up ratio scales with matrix size and number of CPU cores.
July 6, 2016 – 4.0.0.11272
- New functions –
NEXTABOVE
andNEXTBELOW
have been added. They generate the next representable floating-point value towards positive/negative infinity:>> mp.Digits(34); >> nextabove(mp(1)) ans = 1.00000000000000000000000000000000019 >> nextbelow(mp(1)) ans = 0.999999999999999999999999999999999904 >> nextabove(mp(-1)) ans = -0.999999999999999999999999999999999904 >> nextbelow(mp(-1)) ans = -1.00000000000000000000000000000000019
The routines are able to generate all/any floating-point numbers representable in given precision, thus they are indispensable for accuracy checks of various algorithms. In particular, to investigate quality of approximation in close vicinity of roots or singularities. As an example, here is quick accuracy check of
MATLAB's
built-ingamma
function:>> mp.Digits(34); >> mp.FollowMatlabNumericFormat(true); >> format longE; >> x = nextabove(-1) % closest value to the pole x = -9.999999999999999e-01 >> gamma(mp(x)) % correct function value in quadruple precision ans = -9.00719925474099242278433509846728884e+15 >> gamma(x) % MATLAB gives no correct digits at all! ans = -5.545090608933970e+15
- Occasional crashes on Mac OSX have been fixed. As it turned out, sometimes
MATLAB
(especially older versions) forget to delete MEX module from memory on “clear” command, even though at-exit handlers were called. This leaves MEX in uninitialized and unusable state! Next attempt to use any command from such MEX results in crash.Now we do the unloading procedure manually on all platforms to avoid this from happening again. - Prediction on maximum number of iterations needed to reach target accuracy level in Schur & SVD algorithms have been revisited. Now we rely on more pessimistic assumption to make sure algorithms converge even if it requires much higher number of iterations.
June 21, 2016 – 3.9.9.11199
- Matrix exponential –
EXPM
has been switched to use classic scaling and squaring algorithm. Although Schur-Parlett[Higham2003] might give more accurate results in some cases [Higham2009], it is approx. 5-10 times slower. Thus we decided to use fast scaling & squaring. - Eigen-decomposition re-ordering functions:
ORDSCHUR, ORDEIG
andORDQZ
have been updated with proper error handling in corner cases.
June 3, 2016 – 3.9.9.11161
- Whole set of numerical integration routines has been refreshed:
INTEGRAL, INTEGRAL2, INTEGRAL3, QUAD2D, TRAPZ
andCUMTRAPZ
. TheINTEGRALx
routines are based on nested Gauss-Kronrod quadrature rule or its product for2-3D
.For some reason,MATLAB's
built-inINTEGRAL
doesn’t return the error bound. We consider this unacceptable and our version fixes the flaw:>>mp.Digits(34); >>f = @(x)sin((1:5)*x); >>[q, errbnd] = integral(f,mp(0),mp(1),'ArrayValued',true,'RelTol',mp('eps*100')); >>[q' errbnd'] ans = 0.4596976941318602825990633925570232 2.422586612556771991014735044455665e-34 0.7080734182735711934987841147503806 3.709251321227161864242960518419781e-34 0.6633308322001484857571909315770862 3.510290962167396105950983616082283e-34 0.4134109052159029786597920457744374 2.247449605195361383728053642397239e-34 0.1432675629073547471066721656972884 9.201635067043439859020812453968865e-35
Integration of vector-valued function with error bound for every component.
May 19, 2016 – 3.9.9.11048
- Added
X = SYLVESTER_TRI(A,B,C)
function for solving Sylvester equationA*X + X*B = C
.
The most important case is whenA
andB
are upper quasi-triangular (real Schur canonical form) – needed for computing matrix functions (SQRTM, LOGM,
etc.). - Speed-up of
HYPOT
andATAN2
, approx. 10-50 times each:% 3.9.8.11023 >> A = 100*mp(rand(1000,1000)-0.5); >> B = 100*mp(rand(1000,1000)-0.5); >> tic; atan2(A,B); toc; Elapsed time is 18.056319 seconds. % 3.9.9.11048 >> A = 100*mp(rand(1000,1000)-0.5); >> B = 100*mp(rand(1000,1000)-0.5); >> tic; atan2(A,B); toc; Elapsed time is 0.262123 seconds. % ~70 times faster
May 17, 2016 – 3.9.8.11023
- Speed-up of
SQRTM_TRI
andINTERP1
, approx. 50-100 times each:>> A = triu(100*mp(rand(100,100)-0.5)); >> tic; sqrtm_tri(A); toc; Elapsed time is 0.058902 seconds. >> tic; old_sqrtm_tri(A); toc; Elapsed time is 3.936160 seconds.
New
SQRTM_TRI
is implemented in C++, old one was implemented using MATLAB:function R = old_sqrtm_tri(T) %SQRTM_TRI Square root of upper triangular matrix. n = length(T); R = mp(zeros(n)); for j=1:n R(j,j) = sqrt(T(j,j)); for i=j-1:-1:1 R(i,j) = (T(i,j) - R(i,i+1:j-1)*R(i+1:j-1,j))/(R(i,i) + R(j,j)); end end end
Square root of upper triangular matrix is important for
SQRTM
andLOGM
functions.
May 3, 2016 – 3.9.8.10986
- Speed-up of some (basic) special functions:
gamma, gammaln, erf
anderfc
:% Before/3.9.8.10946 >> mp.Digits(34); >> A = mp(10*(rand(1000)-0.5)); >> tic; B = gamma(A); toc; Elapsed time is 73.195803 seconds. >> tic; B = erf(A); toc; Elapsed time is 31.863047 seconds.
% Now/3.9.8.10986 >> mp.Digits(34); >> A = mp(10*(rand(1000)-0.5)); >> tic; B = gamma(A); toc; Elapsed time is 1.695979 seconds. % x43 times faster >> tic; B = erf(A); toc; Elapsed time is 1.769617 seconds. % x18 times faster
Please note, in contrast to MATLAB, these functions support all kinds of input arguments (real negative, complex, sparse, etc.):
% MATLAB >> gammaln(-0.25) 'Error using gammaln' 'Input must be nonnegative.' % Advanpix >> gammaln(mp(-0.25)) ans = 1.589575312551185990315897214778783 + 3.141592653589793238462643383279503i
>> A = mp(sprand(5,5,0.25)); >> [x,y,s] = find(A); >> s = s + (rand(size(s))-0.5)*1i; % Add imaginary part >> A = sparse(x,y,s,5,5) A = (2,1) 0.1066527701805843886262437081313692 + 0.3173032206534329713321085364441387i (4,1) 0.004634224134067443934270613681292161 + 0.3686947053635096782642222024151124i (3,3) 0.9618980808550536831802446613437496 - 0.4155641544890896765807042356755119i (1,5) 0.4426782697754463313799533352721483 - 0.1002173509011035079652174317743629i (4,5) 0.774910464711502378065688390051946 - 0.2401295971493457859224918138352223i % Advanpix >> erf(A) ans = (2,1) 0.1324883666365941115619861448158018 + 0.3659492942681403578764987541731807i (4,1) 0.005990516950461289561597280148562875 + 0.4356625058347425700172815442490943i (3,3) 0.903034046175742903946764521185423 - 0.1758911959897686907097927081692329i (1,5) 0.472854450257120712106056672396968 - 0.09314858177203069504581523930686257i (4,5) 0.7550126399539313384996685128575232 - 0.1480116547266936627042332666803371i % MATLAB >> erf(double(A)) 'Error using erf' 'Input must be real and full.'
April 26, 2016 – 3.9.8.10946
- We have finished several months of optimization work for elementary functions.
Please see the results and comparisons: Performance of Elementary Functions. - Added support for signed zeros as imaginary part of complex numbers to expression evaluator. As a note, we have been supporting signed zeros from the start, with proper handling of branch cuts, etc. Some additional information: Branch Cuts and Signed Zeros in MATLAB.
April 12, 2016 – 3.9.7.10723
- Added element-wise relational & logical operations for sparse matrices:
>> A = mp(sprand(5,5,0.2)) A = (1,1) 0.4387443596563982417535498825600371 (2,2) 0.7951999011370631809114684074302204 (1,4) 0.3815584570930083962991830048849806 (1,5) 0.7655167881490023695789659541333094 (4,5) 0.1868726045543785962976812697888818 >> A(A<0.5) = 0 A = (2,2) 0.7951999011370631809114684074302204 (1,5) 0.7655167881490023695789659541333094
April 6, 2016 – 3.9.7.10708
- Performance of power and related functions has been improved:
% Before: >> mp.Digits(34); >> A = mp(rand(2000)-0.5); >> B = mp(rand(2000)-0.5); >> tic; C = A.^B; toc; Elapsed time is 82.506463 seconds. >> tic; C = log(A); toc; Elapsed time is 31.506463 seconds.
% Now: >> mp.Digits(34); >> A = mp(rand(2000)-0.5); >> B = mp(rand(2000)-0.5); >> tic; C = A.^B; toc; Elapsed time is 5.363670 seconds. % x15 times >> tic; C = log(A); toc; Elapsed time is 0.843535 seconds. % x37 times
Similarly for all related functions (
log2, log10, sqrt,
etc.) - Added support for sparse logical indices.
- Improved support for sparse matrices in
isequal, isnequal, isnan, isinf, isfinite, sqrt, expm1, log1p
and many other basic functions. - Fixed
spfun
in case when function handle has nested calls to another functions.
March 23, 2016 – 3.9.5.10580
- Added
sortrows
, few minor issues has been fixed. - Fixed small incompatibilities with
R2016a
.
March 15, 2016 – 3.9.5.10558
- Added solver for systems of nonlinear equations –
fsolve
:
x = fsolve(fun,x0)
x = fsolve(fun,x0,options)
x = fsolve(problem)
[x,fval] = fsolve(fun,x0)
[x,fval,exitflag] = fsolve(...)
[x,fval,exitflag,output] = fsolve(...)
[x,fval,exitflag,output,jacobian] = fsolve(...)
All features are supported, including sparse formulation, Jacobian and three algorithms of optimization:trust-region-dogleg, trust-region-reflective
andlevenberg-marquardt
.function test_fsolve x0 = mp([-5; -5]); options = optimset('Display','iter', 'TolFun', mp.eps,... 'TolX', mp.eps,... 'Algorithm','levenberg-marquardt'); [x,fval,exitflag,output] = fsolve(@myfun,x0,options) function F = myfun(x) F = [2*x(1) - x(2) - exp(-x(1)); -x(1) + 2*x(2) - exp(-x(2))]; end end
Output:
>> mp.Digits(34); >> test_fsolve First-Order Norm of Iteration Func-count Residual optimality Lambda step 0 3 47071.2 2.29e+04 0.01 1 6 6527.47 3.09e+03 0.001 1.45207 2 9 918.374 418 0.0001 1.49186 3 12 127.741 57.3 1e-05 1.55326 4 15 14.9153 8.26 1e-06 1.57591 5 18 0.779056 1.14 1e-07 1.27662 6 21 0.00372457 0.0683 1e-08 0.484659 7 24 9.2164e-08 0.000336 1e-09 0.0385554 8 27 5.66104e-17 8.34e-09 1e-10 0.000193709 9 30 2.35895e-34 1.7e-17 1e-11 4.80109e-09 10 33 1.70984e-51 4.58e-26 1e-12 9.80056e-18 11 36 1.8546e-68 1.51e-34 1e-13 2.63857e-26 Equation solved. fsolve completed because the vector of function values is near zero as measured by the selected value of the function tolerance, and the problem appears regular as measured by the gradient. x = 0.56714 0.56714 fval = 9.6296e-35 9.6296e-35 exitflag = 1 output = iterations: 11 funcCount: 36 stepsize: [1x1 mp] cgiterations: [] firstorderopt: [1x1 mp] algorithm: 'levenberg-marquardt' message: 'Equation solved. Equation solved. The sum of squared function values, r = 1.854603e-68, is less than sqrt(options.TolFun) = 1.387779e-17. The relative norm of the gradient of r, 6.583665e-39, is less than 1e-4*options.TolFun = 1.925930e-38. Optimization Metric Options relative norm(grad r) = 6.58e-39 1e-4*TolFun = 2e-38 (selected) r = 1.85e-68 sqrt(TolFun) = 1.4e-17 (selected)
- Fixes for various minor bugs and other small improvements.
March 8, 2016 – 3.9.4.10498
- Added light wrappers for
text
andhistogram
routines. Now both accept mp-parameters without errors. - We have upgraded to new Intel C++ compiler, MPIR, MPFR and MPC. Any compatibility tests and reports would be highly appreciated.
March 3, 2016 – 3.9.4.10481
- Basic test matrices are enabled with
classname='mp'
support:hadamard, hilb, invhilb, magic, pascal, rosser
andwilkinson
.function [c,r] = inverthilb(n, classname) % INVERTHILB Class-independent inversion of Hilbert matrix A = hilb(n,classname); c = cond(A); r = norm(eye(n,classname)-A*invhilb(n,classname),1)/norm(A,1); end
Invert Hilbert matrix using different levels of precision:
>> [c,r] = inverthilb(20,'double') c = 1.5316e+18 r = 2.1436e+11 >> mp.Digits(50); >> [c,r] = inverthilb(20,'mp') c = 2.4522e+28 r = 9.4122e-25 >> mp.Digits(100); >> [c,r] = inverthilb(20,'mp') c = 2.4522e+28 r = 2.2689e-74
- Added
cast
function for conversion of mp-entities to/from different data types.>> cast(magic(3),'like',mp('pi')) ans = 8 1 6 3 5 7 4 9 2 >> format longE >> cast(rand(2,'double'),'mp') ans = 6.9907672265668596711662985399016179e-01 9.5929142520544430361439935950329527e-01 8.9090325253579849551499592053005472e-01 5.4721552996380307121171426842920482e-01 >> cast(rand(2,'mp'),'double') ans = 1.386244428286791e-01 2.575082541237365e-01 1.492940055590575e-01 8.407172559836625e-01
- Added solvers for Riccati equations:
care, gcare, dare
andgdare
.
February 29, 2016 – 3.9.4.10458
- Added
cplxpair
andunwrap
functions. - Performance of multiplication has been improved for the precision levels <= 385 of decimal digits.
>> mp.Digits(350); >> A = mp(rand(500)-0.5); % 3.9.4.10443 >> tic; A*A; toc; Elapsed time is 22.412486 seconds. % 3.9.4.10458 >> tic; A*A; toc; % x4 times faster Elapsed time is 5.212486 seconds.
Parameters of our arithmetic engine were tuned for old CPUs. Now it is updated for modern architectures.
Thanks to Massimiliano Fasi who noticed and reported to us performance drop for small precisions. - Conjugate pairs computed by generalized eigen-solver are now guaranteed to match exactly (bit-to-bit).
Before they matched up to machine epsilon in any precision. Reported by Stefan Güttel. - Now toolbox overloads global built-in functions for array creation:
zeros, ones, eye, nan, inf, rand, randn
andrandi
.The main goal is to embed support for ‘mp’ arrays in MATLAB out-of-the-box, without the need formp(zeros(...))
wrapper:% 3.9.4.10443 >> A = zeros(3,3,'mp'); % didn't work since built-in zeros doesn't know what 'mp' is. >> A = mp(zeros(3,3)); % manual modification was needed. % 3.9.4.10458 >> A = zeros(3,3,'mp'); % now works as expected >> whos A Name Size Bytes Class Attributes A 3x3 376 mp
This change is very important, as it allows much easier conversion of existing scripts to multiprecision.
In fact, we have added option to override behavior of such basic functions to always produce mp-outputs for floating-point types:
double
andsingle
. Be careful with such powerful option (it is turned off by default).Please run
doc mp.OverrideDoubleBasicArrays
for more information on its usage, pros and cons.
February 22, 2016 – 3.9.4.10443
- Added arbitrary-precision overloads for the following functions:
orth, rref, airy, beta, betaln, ellipj, ellipke
andlegendre
. - The
svd, null
andnorm
have been updated to support more special cases.>> A = [1 2 3; 1 2 3; 1 2 3]; >> Z = null(mp(A)) Z = -0.8728715609439695250643899416624781 0.4082482904638630163662140124509814 -0.2182178902359923812660974854156192 -0.8164965809277260327324280249019639 0.4364357804719847625321949708312385 0.4082482904638630163662140124509821 >> norm(A*Z,1) ans = 1.733336949948512267750380148326435e-33 >> null(mp(A),'r') ans = -2 -3 1 0 0 1
While working on MATLAB’s internal scripts we found several cases of undocumented syntax:
[Q,Z] = svd(...) % two-outputs only norm(A,'inf') % Inf is passed as a string [US,TS, Success] = ordschur(...) % three-outputs, with status as last one.
Now
svd
andnorm
in toolbox support these special cases for compatibility. - Basic matrix manipulation & analysis functions have been optimized for sparse matrices:
diag, triu, tril, norm, max
andmin
.>> A = mp(sprand(10000,10000,0.01)); >> nnz(A) ans = 994960 >> tic; norm(A,1); toc; Elapsed time is 0.0531895 seconds. >> tic; max(sum(abs(A))); toc; Elapsed time is 0.103413 seconds.
Not too bad for handling
1M
of nonzero quadruples in sparse format!
February 15, 2016 – 3.9.3.10265
- Added LDLT decomposition for Hermitian indefinite matrices (dense). Real and complex cases are supported and optimized for multi-core CPUs.
>> A = mp(rand(1000)-0.5); >> A = A+A'; >> tic; [L,D,p] = ldl(A,'vector'); toc; Elapsed time is 7.412486 seconds. >> norm(A(p,p) - L*D*L',1)/norm(A,1) ans = 9.900420499644941245191320505579399e-33
Still we consider this as a draft version – since there is further potential for speed-up.
February 9, 2016 – 3.9.2.10193
- Added iterative methods for large/sparse systems:
bicg, bicgstab[l], cgs, gmres, minres
andpcg
.
All special cases are supported including the function handle inputs and preconditioners. Usage examples can be found in updated posts of Krylov Subspace Methods in Arbitrary Precision and Arbitrary Precision Iterative Solver – GMRES. - Added ODE solver for stiff problems –
ode15s
. - Many basic functions have been enabled with sparse matrices support.
January 30, 2016 – 3.9.1.10015
- Added
spline, pchip, ppval, mkpp, unmkpp
andinterp1
routines. - Fixed accuracy loss in
expm
and crash inordschur
when U is real but T is complex matrix. Thanks to Massimiliano Fasi for tests and reports! - Indexing engine
subsref
has been enabled with all the ad-hoc rules of MATLAB in case when the first (and only) index is semi-empty matrix. This is needed to match the MATLAB behavior in rare cases, e.g. when empty matrices are used as indices in operands of arithmetic operations.
January 19, 2016 – 3.9.0.9998
- Added
norm
computation for sparse matrices and vectors. All norms are supported except the 2-norm for sparse matrices (since it needssvds
). Please use1, Inf
or recommended'fro'
norm for matrices. - Added
mp.GaussKronrod
function for computation of Gauss-Kronrod nodes and weights. - Improved accuracy of
eig
in computing eigenvectors of real symmetric tridiagonal matrices.The method we used previously (inverse iteration) suffers from numerical instability for very ill-conditioned eigenvectors. Now we have switched to implicit QR.
December 12, 2015 – 3.9.0.9970
- Added concatenation for sparse matrices:
cat, vertcat
andhorzcat
. - Fixed
find
in case when no named output parameters is provided. - Changed error messages to be shorter and more informative. Default function from MEX API –
mexErrMsgTxt
shows intimidating messages with little information:% mexErrMsgTxt (Before): >> A = mp(magic(3)); >> A(-1,0) = 10 'Error using mpimpl' 'Subscript indices must either be real positive integers or logicals.' 'Error in mp/subsasgn (line 871)' ' [varargout{1:nargout}] = mpimpl(171, varargin{:});' % Using our custom workaround: >> A = mp(magic(3)); >> A(-1,0) = 10 'Error using mp/subsasgn (line 871)' 'Subscript indices must either be real positive integers or logicals.'
Now error messages two times shorter with all the necessary information.
December 2, 2015 – 3.9.0.9938
- Fixed
rcond
for better handling of singular matrices. - Improved generalized eigen-solver (unsymetric case). Now it tries to recover converged eigenvalues even if QZ has failed. Thanks to Mohammad Rahmanian for reporting and helping to reproduce the issue!
December 1, 2015 – 3.9.0.9935
- Added
linsolve
function as a simple wrapper overmldivide
. Toolbox already has mature solver implemented inmldivide
which analyses input matrix properties and applies the best suitable algorithm automatically. No need to re-implementlinsolve
separately. - Improved performance and fixed bug in power functions:
.^
and^
. - Added functions for saving/loading mp-objects to/from text file:
mp.Digits(34); A = mp(rand(25)); mp.write(A,'matrix.txt'); % Write mp-matrix to the text file B = mp.read('matrix.txt'); % Read it back norm(A-B,1) % check accuracy - difference should be 0 0
Function
mp.write
converts mp-matrix to text format and stores it to file. To load the saved matrix back to MATLAB usemp.read
. Data is saved with enough precision to be restored without loss of accuracy.
November 6, 2015 – 3.8.9.9901
- Special functions – hypergeometric, gamma and whole family of Bessel functions have been updated. In particular, Bessel functions have been rewritten from scratch to support arbitrary orders and arguments, avoid instability regions and accuracy loss in some cases. Now we are not only the fastest in MATLAB world:
% Symbolic Math Toolbox - R2015a >> digits(34); >> z = 150*vpa((rand(50)-0.5)+(rand(50)-0.5)*1i); >> tic; hypergeom([],vpa(1000-1000*i),z); toc; Elapsed time is 6.208366 seconds. >> tic; besseli(0,z); toc; Elapsed time is 8.635691 seconds. >> tic; besselk(0,z); toc; Elapsed time is 9.938277 seconds. >> tic; bessely(0,z); toc; Elapsed time is 17.444061 seconds. >> tic; besselj(0,z); toc; Elapsed time is 10.570827 seconds. % Advanpix Multiprecision Toolbox - 3.8.9.9901 >> mp.Digits(34); >> Z = sym2mp(z); >> tic; hypergeom([],mp('1000-1000*i'),Z); toc; Elapsed time is 0.155974 seconds. >> tic; besseli(0,Z); toc; Elapsed time is 2.131102 seconds. >> tic; besselk(0,Z); toc; Elapsed time is 2.163481 seconds. >> tic; bessely(0,Z); toc; Elapsed time is 4.301771 seconds. >> tic; besselj(0,Z); toc; Elapsed time is 3.228625 seconds.
but also the most accurate:
% Symbolic Math Toolbox - R2015a >> digits(34); >> z = 8-43*1i; >> besseli(1,vpa(z)) ans = -17.2937_918159785... + 178.7197_1803657...i % only 6-7 digits are correct!! % Advanpix Multiprecision Toolbox - 3.8.9.9901 >> mp.Digits(34); >> besseli(1,mp(z)) ans = -17.29378620503294288287975364153844 + 178.7197375141310577798644760813918i % full precision Maple: -17.29378620503294288287975364153844 + 178.7197375141310577798644760813918*I Mathematica: -17.29378620503294288287975364153844 + 178.7197375141310577798644760813918*I % One more example: >> bessely(0,vpa(1000*i)) ans = -2.54099907...376797872e381 + 2.485686096075864174562771484145676e432i % huge real part? >> bessely(0,mp(1000*i)) ans = -1.28057169...387105221e-436 + 2.485686096075864174562771484145677e+432i % not really
This is just simplest examples, we have logged many other cases.
IMPORTANT. Bessel functions from MathWorks Symbolic Math Toolbox do suffer from accuracy loss. Please avoid using it if you need high accuracy.
October 19, 2015 – 3.8.9.9541
- Added/improved following functions:
gradient, linspace, logspace, meshgrid
andndgrid
. - Fixed bugs in
svd(X,0)
and indiff
. - Fixed issue with complex division with infinite components:
% Before >> 1/mp(Inf + 0.1*1i) ans = NaN - NaNi % Now (correct) >> 1/mp(Inf + 0.1*1i) ans = 0
Thanks to Ciprian Necula for bug reports!
October 16, 2015 – 3.8.9.9528
- Thread manager has been improved with better load balancing. Now majority of dense matrix operations are faster by
15-20%
(even the matrix multiplication). - Fixed two issues reported here and here.
- Added support for MATLAB’s line spacing setting (
format compact/loose
):>> mp.Digits(34) >> mp.FollowMatlabNumericFormat(true); >> A = mp(rand(2)); >> format loose >> A A = 0.4853756487228412241918817926489282 0.1418863386272153359612957501667552 0.8002804688888001116708892368478701 0.4217612826262749914363325842714403 >> format compact >> A A = 0.4853756487228412241918817926489282 0.1418863386272153359612957501667552 0.8002804688888001116708892368478701 0.4217612826262749914363325842714403 >> format longE >> A A = 4.8537564872284122419188179264892824e-01 1.4188633862721533596129575016675517e-01 8.0028046888880011167088923684787005e-01 4.2176128262627499143633258427144028e-01 >> format shortE >> A A = 4.8538e-01 1.4189e-01 8.0028e-01 4.2176e-01
October 8, 2015 – 3.8.9.9464
- Speed of symmetric eigen-decompositions has been boosted up. All cases are covered: standard and generalized problems in quadruple and arbitrary precision modes and both complexities.Actual speed-up factor depends on number of cores and matrix size. On our 1st-gen Core i7 we see ~3 times ratio for
1Kx1K
matrix:mp.Digits(34) A = mp(rand(1000)-0.5); A = A+A'; % 3.8.8.9254 tic; eig(A); toc; Elapsed time is 48.154 seconds. % 3.8.9.9464 tic; eig(A); toc; % x3.4 times faster Elapsed time is 14.212 seconds.
Speed-up is even higher for other precision levels:
mp.Digits(50) A = mp(rand(1000)-0.5); A = A+A'; % 3.8.8.9254 tic; eig(A); toc; Elapsed time is 204.431 seconds. % 3.8.9.9464 tic; eig(A); toc; % x5.3 times faster Elapsed time is 38.534 seconds.
- In fact, all operations with dense symmetric/Hermitian and SPD/HPD matrices have became faster:
chol, inv, det, solvers,
etc. Speed-up ratio forinv
is up to 10 times depending on matrix structure and number of cores.
Full comparison table is in preparation.
September 22, 2015 – 3.8.8.9254
- Estimation of reciprocal of the condition number has been updated for all matrix types in solver (
\
) and inversion(inv
). Now it is much faster and always produce accurate values (before we encountered occasional0
‘s).Just for fun:A = magic(5); A(3,:) = A(2,:)+10*eps; % make problem ill-conditioned in double precision b = rand(5,1); x = A\b; Warning: Matrix is close to singular or badly scaled. RCOND = 2.899200e-18. norm(A*x-b,1) % computed solution is useless (as expected) ans = 4.344157433375132e+00 mp.Digits(34); x = mp(A)\mp(b); % problem is not ill-conditioned when we use 34-digits norm(A*x-b,1) ans = 4.336808689942017736029811203479767e-18
Although half of digits is lost (see the RCOND magnitude), our solver still gives us ~17 digits of accuracy.
September 21, 2015 – 3.8.8.9242
- Architecture of multi-core parallelism has been revised and improved. You may notice better timings in various dense operations, depending on number of cores and matrix size.For example, this is very beneficent for computations of eigenvectors in unsymmetric case. Which is faster up to
45%
on Core i7 (4 hardware cores):A = mp(rand(300)-0.5 + 1i*(rand(300)-0.5)); % 3.8.6.9165 tic; [V,D]=eig(A); toc; Elapsed time is 71.154 seconds. % 3.8.8.9242 tic; [V,D]=eig(A); toc; Elapsed time is 38.5 seconds.
Both modes (quadruple & multiprecision) and complexities (real & complex) are improved.
September 13, 2015 – 3.8.6.9165
- Following functions have been updated:
cond, svd, rank, norm, det, inv, trace, eps
andnull
. Performance optimization, stability, special cases.
September 6, 2015 – 3.8.6.9106
- Improved matrix analysis stage in dense matrix solver.
- Improved m-wrapper for
conv
andcolon
. - Improved compatibility of
inv
with MATLAB. Now it returns fullInf
matrix in case of singular input matrix, estimates RCOND and provides warning if problem is near-singular. - Added check for
NaN/Inf
elements in input matrix ineig
.
August 6, 2015 – 3.8.5.9081
- Added option
'valid'
forconv
.
August 2, 2015 – 3.8.5.9059
- Fixed critical issue in
subsasgn
in multiple-precision mode (when digits<>34
). - Added workaround for occasional Matlab crashes due to incorrect unload of toolbox (after
clear all
).
Update is strongly recommended!
July 24, 2015 – 3.8.5.8939
- Added two-output variants of Cholesky decomposition:
[R,p] = chol(A)
[L,p] = chol(A,'lower')
[R,p] = chol(A,'upper')
- Added support for
'vector'/'matrix'
option inLU
decomposition.
July 22, 2015 – 3.8.4.8915
- Fixed bugs in
chol
and in auto-detection of numeric constants.>> mp.Digits(50); >> mp.ExtendConstAccuracy(false); % auto-detection is disabled by default >> mp(1/3) ans = 0.33333333333333331482961625624739099293947219848633 >> mp.ExtendConstAccuracy(true); >> mp(1/3) ans = 0.33333333333333333333333333333333333333333333333333
July 6, 2015 – 3.8.4.8901
- Added
spdiags
for sparse matrices. - Improved multi-core parallelism in operations with really large matrices (N > 5000). Different thread-scheduling is required (and has been implemented) for the case.
June 4, 2015 – 3.8.3.8882
- Improved performance of
pinv
andnull
with optimized SVD code. - Fixed
norm
to prevent crashes in some cases when input matrix is empty.
May 28, 2015 – 3.8.3.8861
- Fixes in multi-dimensional array slicing and indexing of complex matrices.
Thanks to Thomas Rinder, Stefan Güttel and Maxime Pigou for reports and help. - Improved performance of arithmetic operations when one of the arguments is complex scalar.
This is maintenance release. Meanwhile we continue focusing on adding sparse matrices functionality in development branch.
April 20, 2015 – 3.8.3.8819
- Changes in architecture for faster work with sparse matrices.
- Ordering functions for sparse matrices (to reduce fill-in prior direct solvers) have been added:
amd, colamd, symamd
andsymrcm
.Efficient direct solvers are needed for spectral transformation in generalized eigenvalue solver for large matrices. - Improved performance of
find
for sparse matrices.
April 1, 2015 – 3.8.2.8775
- Minor speed-up in dense linear algebra (especially in SVD) due to more efficient memory management.
- New analysis functions have been added:
issymmetric, ishermitian, isdiag, istriu
andistril
.
Syntax and semantic are fully compatible with MATLAB’s built-in functions. - Fixed issues with Not-a-Number (NaN) handling in relational operators.
March 19, 2015 – 3.8.1.8728
- Restrictions on maximum iterations in SVD has been changed to be dependent on precision (needed for the high-precision settings, e.g. 1000 digits or more).
- Minor fixes in sub-scripted assignment operator,
sum
andprod
. - Stability and accuracy of
quadgk
has been improved. Now it can be used in arbitrary precision mode:mp.Digits(50); f = (@(x)sin(x)); [q,errbnd] = quadgk(f,mp(0),mp(1),'RelTol',100*mp.eps,'AbsTol',100*mp.eps,'MaxIntervalCount',2000) q = 0.45969769413186028259906339255702339626768957938206 errbnd = 7.872028207137840807482477381844110449433981844857e-51
March 1, 2015 – 3.8.1.8676
- System solver (
\, mldivide
) has been updated with all the recent optimizations.Solver is a meta-algorithm which automatically selects decomposition to use (LU, CHOL or SVD) depending on matrix structure and problem type/size. Now it is optimized for multi-core architectures and shows better performance overall (especially for large problems). - Minor updates to arbitrary precision arithmetic engine.
February 23, 2015 – 3.8.1.8617
- Arbitrary precision Cholesky decomposition, LU and QR have been updated with more optimizations (including multi-core). Speed-up depends on matrix size, larger matrix = higher speed-up.For instance,
1Kx1K
shows x3 times better speed whereas100x100
only 30%.
February 17, 2015 – 3.8.0.8477
- Fixed memory leak in coefficient-wise operations reported by Ito Kazuho (Thanks!).
February 10, 2015 – 3.8.0.8447
- Eigenvalue decomposition routines are switched to use “Small Bulge Multi-shift QR Algorithm with Aggressive Early Deflation” (Braman, Byers and Mathias).Speed gain is x2-x3 times for large dense matrices (>
500
) and decreases with smaller matrix size. The highest speed-up is achieved in multiprecision mode.
January 23, 2015 – 3.7.9.8323
- Further optimization of large matrix computations by using parallel algorithms on multi-core architectures. Now speed scales better with number of cores:
>> A = mp(rand(2000)); % 1 - core CPU: >> tic; lu(A); toc; Elapsed time is 24.14 seconds. % 4 - core CPU: >> tic; lu(A); toc; Elapsed time is 8.9 seconds.
Solvers, decompositions and eigen-routines benefit from the update (quadruple & multiprecision mode).
January 15, 2015 – 3.7.8.8309
- Memory manager for multiprecision objects has been completely re-written with the focus on fast allocation and minimizing memory fragmentation for large matrices. Speed gain depends on matrix size and operation. For example, addition of two
3Kx3K
complex matrices shows x3 times improvement:% 3.7.8.8309 - new version >> mp.Digits(50); >> A = mp(rand(3000)-0.5 + 1i*(rand(3000)-0.5)); >> tic; A+A; toc; Elapsed time is 3.88 seconds. % 3.7.7.8234 - previous >> mp.Digits(50); >> A = mp(rand(3000)-0.5 + 1i*(rand(3000)-0.5)); >> tic; A+A; toc; Elapsed time is 11.9 seconds.
Speed-up is higher for larger matrices.
- Several changes to formatted input/output, including new feature of whole-matrix conversion to mp-object:
>> A = mp('[1/2, sqrt(-1); (-1)^(1/2) pi]') A = 0.5 + 0i 0 + 1i 0 + 1i 3.141592653589793238462643383279503 + 0i
This applies only for matrices with constant elements – useful for scripts which store constant parameters in a matrix.
January 7, 2015 – 3.7.7.8234
- Scripts with high volume of small scale computations (scalars, small matrices, etc.) are faster by up to 2 times.Usually in such computations the most time was spent in communicating with MATLAB through MEX API functions, which are very slow and impose heavy overhead (e.g. it is not possible to access variable’s memory directly, only after making the deep copy).Today we have found long-awaited workaround for one of such issues – slow creation of ‘mp’-objects in MATLAB. It was affecting every single routine and now it is faster by 50%.
- Linux version of toolbox has been updated with all the recent changes and improvements.
January 5, 2015 – 3.7.6.8202
- Multiple precision generalized eigenproblem and SVD functions are 1.5-3 times faster (
eig(A,B), qz, svd
). Speed-up is higher for larger matrices, e.g. for 2000 x 2000 we can expect 5 times improvement or more. - Matrix multiplication is faster by 1.5-2 times – in quadruple and multiprecision mode. Recent versions of MATLAB restrict number of threads allowed to be used in parallel computations. Now we choose this independently from MATLAB. As a result, we can use more cores.
- Overall, we have been working on multi-core optimizations and you might see noticeable speed-up in large matrix computations.
December 26, 2014 – 3.7.5.7900
- Multiple precision complex computations are faster by up to 2 times. This affects all the routines including matrix computations –
eig, svd, qr, lu,
etc. - Optimized Kronecker product function
kron
has been added.
December 24, 2014 – 3.7.4.7853
- The functions
schur, ordschur, qz, ordqz, ordeig, balance
andrcond
have been extended and optimized. Now we fully support'real'/'complex'
standard and generalized Schur decomposition together with re-ordering in arbitrary and quadruple precision.In a last two weeks the singular/eigenvalues module of the toolbox has been completely changed, 80% of code refactored for improved functionality, stability, speed and feature support. This is part of the work needed for addingEIGS
function in next versions.
December 17, 2014 – 3.7.3.7605
- Eigen-decomposition routines for full matrices have been completely revised. Improved processing of symmetric, Hermitian and symmetric tridiagonal matrices by using MRRR and Divide&Conquer algorithms. Higher speed, better functionality.
- Routine for generalized eigen-problem,
eig(A,B)
, has been enabled with full multiprecision support without restrictions (before we had it in quadruple precision).
December 12, 2014 – 3.7.2.7508
- Implemented arbitrary precision divide & conquer SVD algorithm (we had it only for quadruple precision).
Overall speed-up factor is 7 times for real and complex matrices.
December 3, 2014 – 3.7.2.7464
- Added element-wise arithmetic operations with scalar for sparse matrices.
November 25, 2014 – 3.7.2.7422
- Improved compatibility with Parallel Computing Toolbox (removed intra-process blocks).
- Fixed issue in indexed assignment when LHS is empty matrix.
November 19, 2014 – 3.7.2.7355
- Added workaround for malfunctioning
mxDuplicateArray
. - Matrix inversion is sped-up in quadruple mode.
- Small fixes and improvements.
November 14, 2014 – 3.7.2.7314
- Speed-up of array manipulation operations. All platforms are covered – Windows, Linux and Mac OSX.
- Fixed bug in matrix power function (for complex matrices).
November 11, 2014 – 3.7.1.7230
- Optimized memory usage for
mp
objects after on-the-fly precision change. - Fixed minor bugs in
colon
and matrix power functions.
September 16, 2014 – 3.7.1.7217
- Performance of matrix computations are boosted up by additional 25-30% (Windows only).
September 9, 2014 – 3.7.0.7002
- Speed of matrix computations (solvers, decompositions, etc.) is boosted up by 30% for real dense, by 40% for complex dense and by 50% for sparse cases. The improvement is for pure multiple precision computations (not quadruple).
August 28, 2014 – 3.6.7.6340
- Added indexed assignment capability for sparse matrices (
subsasgn
). - Added
nextprime, prevprime, sym2mp, mp2sym, isequal, isequaln, logical
andislogical
functions. - Fixed issues with empty sparse matrices support.
August 21, 2014 – 3.6.5.6104
- Improved code for generation of nodes and weights for Gaussian-family quadrature.
August 15, 2014 – 3.6.5.6102
- Added
primes, factor, gcd, lcm
andisprime
functions in arbitrary precision.
August 8, 2014 – 3.6.5.6048
- Added
condeig
function. - Refined
eig
to produce matrix of left eigenvectors:[V,D,W] = eig(...)
.
August 6, 2014 – 3.6.5.6015
- Added polynomial functions:
poly, polyeig, residue, polyder, polyval, polyvalm, deconv, polyfit
andpolyint
. - Refined
eig
to support special flags ('vector'/'matrix'
, etc.) .
July 31, 2014 – 3.6.4.5174
- Added fast
permute, ipermute, shiftdim, blkdiag, squeeze, circshift
androt90
functions.
July 25, 2014 – 3.6.4.5128
- Added
bsxfun
function (andkron
as a consequence). - Enabled
mp
-objects to be used as indices in referencing operations.
July 24, 2014 – 3.6.4.5101
- Further speed-up of data exchange with MATLAB – total speed-up in all operations is 15-20%.
July 21, 2014 – 3.6.4.5051
- Speed-up of data exchange with MATLAB – all operations are faster now.
- Fixed bugs related to memory leaks and memory alignment.
May 2, 2014 – 3.6.3.4945
- Updated
hankel
andtoeplitz
to make it compatible with recent changes to toolbox architecture. Thanks to Ilya Tyuryukanov for reporting the bug.
April 14, 2014 – 3.6.3.4941
- Fixed bug when scalar is passed to
diag
function.
March 3, 2014 – 3.6.3.4931
- Speed of sparse matrices computations is boosted up by 3-5 times on Linux platform.
January 22, 2014 – 3.6.3.4889
- Fixed incompatible exception handling with MATLAB (on Linux).
Toolbox was catching all exceptions coming from withing the code. However some of theMEX
functions throw their own exceptions of hidden (and unknown) type to toolbox. Any attempts to catch them led to MATLAB crash. Now this is fixed – thanks to Takashi Nishikawa for sending the crash dumps.
November 22, 2013 – 3.6.3.4872
- Improved performance of dense and sparse solvers thanks to re-designed memory layout & access patterns.
- Fixed triangular solvers, improved matrix analysis routines (positive-definite, etc.) in solvers.
November 8, 2013 – 3.6.2.4812
- Added
precomputeLU
function targeted for use in iterative schemes for sparse matrices (e.g. Arnoldi process for computing eigenvalues).New function computes and stores LU factorization of a given sparse matrix directly in toolbox’s core. Then pre-computed LU can be used to solve system of equations with different right-hand side by standard"\"
. Here is simple example:mp.Digits(34); A = rand(1000); A = mp(sparse((A>0.5).*A)); F = precomputeLU(A); for k=1:10 b = mp.rand(1000,1); x = F\b; % re-use of LU with different b end;
This approach has advantage over usual
x = U\(L\(P*b))
by avoiding overhead of transferring data between MATLAB and toolbox. - Most trigonometric & exponential functions are sped up in favor to quadruple precision.
- Fixed minor bug in
sort
.
November 1, 2013 – 3.6.1.4792
- Greatly improved performance of the dense matrix solver (both real & complex) in quadruple precision mode.Now we tear up famous competitors by even greater margin: Advanpix vs. VPA vs. Maple – Dense Solvers and Factorization.This improvement gives us right to claim that now we have the fastest quadruple precision on the planet 🙂
- Lots of small performance-oriented improvements in toolbox core – avoiding temporaries where possible, better caching, etc.
October 20, 2013 – 3.5.6.4760
- Added routine for generalized Schur decomposition –
qz
. Syntax & functionality are equivalent to MATLAB’s routine with the exception that we do not support'real'
flag. Results are computed in'complex
‘ mode (default in MATLAB).mp.Digits(34); A = mp(rand(30)); B = mp(rand(30)); [AA,BB,Q,Z,V,W] = qz(A,B); a = diag(AA); b = diag(BB); l = a./b; % Check absolute error norm(Q*A*Z-AA,1) ans = 4.023065828938653331292726401171631e-32 norm(Q*B*Z-BB,1) ans = 4.06883050821433466532161841735413e-32 norm(A*V - B*V*diag(l),1) ans = 1.522296748911918101325723347183125e-31 norm(W'*A - diag(l)*W'*B,1) ans = 6.646702710471966018101416671748383e-32
- Added new function
mp.FollowMatlabNumericFormat(true | false)
. It allows user to choose numeric formatting preferences for the toolbox. Iftrue
, toolbox will obey numeric format settings in MATLAB (show limited number of digits) or display all digits of precision otherwise (by default).Default settings can be adjusted inmpstartup.m
script as usual.>> mp.Digits(30); >> mp.FollowMatlabNumericFormat(false); >> mp('pi') ans = 3.14159265358979323846264338328 >> mp.FollowMatlabNumericFormat(true); % Fixed-point formats >> format short >> mp('pi') ans = 3.1416 >> format long >> mp('pi') ans = 3.141592653589793238462643383280 % Scientific floating-point formats >> format shortE >> mp('pi') ans = 3.1416e+00 >> format longE >> mp('pi') ans = 3.141592653589793238462643383280e+00 % Fixed or scientific formats >> format shortG >> mp('pi') ans = 3.1416 >> format longG >> mp('pi') ans = 3.14159265358979323846264338328 % C99 hex float format >> format hex >> mp('pi') ans = 0x3.243f6a8885a308d313198a2e037080p+0
October 16, 2013 – 3.5.5.4731
- (Preliminary) Added adaptive Gauss-Kronrod numerical integration routine –
quadgk
. It is completely compatible with default MATLAB function including options and all arguments.>> mp.Digits(34); >> f = @(x) exp(-x.^2).*log(x).^2; >> Q = quadgk(f,mp(0),mp(inf),'RelTol',mp('1e-15'),'AbsTol',mp('1e-30')) Q = 1.947522180300781587359083284072062
- Fixed bug in
mp
constructor to handle the case with recursive precision downgrade, e.g.:mp(mp(pi,50),10)
. Both cases – dense and sparse are covered.
September 26, 2013 – 3.5.5.4665
- New sparse matrices serialization, now it is much more efficient and supports
NZMAX
parameter to reserve space / lower chances for costly re-allocations during computations. - Indexed referencing (
subsref
) is now supported for sparse matrices. For example:A(:), A(2,3), A(1,:)
, etc. Please note – indexed assignment for sparse matrices is not yet implemented. - Fixed bug in
diff
. MATLAB ignores indexation overloads and calls built-in functions in methods of the class. This is quite an unpleasant surprise (another violation of classic OOP design). Anyway ODE routines now work correctly.
September 17, 2013 – 3.5.5.4629
- All direct solvers for sparse matrices (LDLT, SuperLU and QR) are optimized for quadruple precision. Speed gain is approx.
x10
times. Tables with new timings can be found here. - Sparse QR has been fixed and improved, now we can solve undetermined systems as well.
- Added new functions:
conv
andpoly
. - Fixed bug in
subsref
, related to special case of vector indexing.
September 5, 2013 – 3.5.4.4586
- Added direct solver for sparse matrices, operator
"\"
.Solver includes sparse LDLT, SuperLU and QR decomposition enabled with arbitrary precision support. The most appropriate method is chosen automatically depending on matrix properties. - Added arithmetic operations for sparse matrices (
+,-,*
) and new functions:sparse
andtranspose
. Usage syntax & semantic is completely compatible with MATLAB’s rules. - Improved support of empty matrices when used as indices.
- Speed-up of sparse matrices handling, mixed complexity matrix multiplication, conversion to double, formatted output, etc.
- Fixed bugs in
subsasgn, subsref, norm
androots
.
August 13, 2013 – 3.5.2.4293
- Completely re-designed arbitrary precision arithmetic engine with emphasis on performance. Now real arithmetic operations are up to
20%
faster. Complex operations arex2
times faster.
This improvement has major impact on overall performance of the toolbox. Thus, Fourier transform has became x2
times faster (as direct consequence of faster complex arithmetic). Even matrix multiplication receives benefit of 20%-50%
increase in performance.
August 9, 2013 – 3.5.1.4260
- Added functions for sparse matrix manipulations:
nonzeros, spfun, spones, full, nnz, nzmax
. - Improved support of empty matrices and fixed minor bug in
numel
(case of sparse matrices).
August 6, 2013 – 3.5.1.4193
- We have re-fined common array operations with improved multi-dimensional support:
sum, prod, cumsum, cumprod, max, min, sort, fft, ifft
. - Fourier transform speed-up by 25%.
July 31, 2013 – 3.5.0.4112
- Very basic support of sparse matrices in multiple precision. Will extend it in upcoming versions. See Rudimentary Support for Sparse Matrices for more details.
- Custom implementation of indexing, referencing and similar operations (
subsref
,subsasgn
,cat
, etc.). - Matrix computations
x2-3
times speed up on Windows 64-bit. - Display & formatted output improvement
July 12, 2013 – 3.4.4.3840
- We have added
mp.randn
function for generating normally distributed pseudorandom numbers with arbitrary accuracy. All special cases are supported for full compatibility with MATLAB:r = mp.randn(n) r = mp.randn(m,n) r = mp.randn([m,n]) r = mp.randn(m,n,p,...) r = mp.randn([m,n,p,...]) r = mp.randn r = mp.randn(size(A)) r = mp.randn(..., 'double') % 'double' is ignored r = mp.randn(..., 'single') % 'single' is ignored
Also we have refined
mp.rand
, now it is faster and able to generate multidimensional arrays as well.
July 3, 2013 – 3.4.4.3828
- To resolve the problem with mixed usage of limited accuracy
double
precision constants in expressions withmp
entities, we have introduced new global setting in the toolbox:mp.ExtendConstAccuracy()
.
It enables/disables auto-detection and recomputing of the constants with higher precision to match toolbox’s settings, e.g.:>> mp.Digits(34); >> mp.ExtendConstAccuracy(false); >> sin(mp(pi)) 1.224646799147353177226065932274998e-16 >> mp.ExtendConstAccuracy(true); >> sin(mp(pi)) 8.671810130123781024797044026043352e-35
In the first example toolbox uses
pi
as it is, withdouble
precision accuracy (at most 16 correct digits). In the second example, toolbox recognizes thepi
constant and re-computes it with required precision of 34 digits.Be default (can be changed in
mpstartup.m
) we usemp.ExtendConstAccuracy(true)
.
June 26, 2013 – 3.4.3.3818
- After few reports from confused users we have removed auto-detection and recomputing of commonly used constants (
pi
,eps
, etc). Now alldouble
precision constants are converted tomp
as it is – with at most 16 correct digits. Before toolbox was trying to recompute them in higher precision.
May 20, 2013 – 3.4.3.3806
- Improved support for empty arrays/matrices.
- Optimized and improved
rcond
(including quadruple precision). - Speed up of operations with multidimensional arrays.
- Fixed minor bugs of quadruple precision mode.
April 21, 2013 – 3.4.3.3481
- Fixed bugs in incomplete gamma function computation.
- Speed up of determinant computation.
April 15, 2013 – 3.4.3.3452
- Real & complex Schur decomposition has been improved with more intelligent restriction on allowable number of Francis QR iterations. Thanks to Gokhan Tekeli from Bogazici University!
March 21, 2013 – 3.4.3.3426
- Fixed bug in
expm
. Thanks to Dmitry Smelov from Stanford! - Fixed bug and improved performance of
colon
.
March 8, 2013 – 3.4.3.3389
- Added support for multidimensional arrays and operations with them. We support coefficient-wise arithmetic operators, mathematical functions, basic information, array manipulation and other functions. Example of array manipulation:
>> x = mp(eye(2)); >> a = cat(3,x,2*x,3*x) (:,:,1) = 1 0 0 1 (:,:,2) = 2 0 0 2 (:,:,3) = 3 0 0 3 >> B = permute(a,[3 2 1]); >> C = ipermute(B,[3 2 1]); >> isequal(a,C) ans = 1
February 13, 2013 – 3.4.2.3292
- 20% performance increase in multiprecision linear algebra.
- Fixed bug in Cholesky decomposition in quadruple precision mode.
February 6, 2013 – 3.4.2.3257
New features:
- We have re-implemented QR decomposition with optimizations for quadruple precision. This resulted in significant speed-up by
10-20
times. We support following variants ofqr
function:X = qr(A)
X = qr(A,0)
[Q,R] = qr(A)
[Q,R] = qr(A,0)
[Q,R,E] = qr(A)
[Q,R,E] = qr(A,0)
Few tests with timing:
>> mp.Digits(34); % Quadruple precision >> mp.GuardDigits(0); >> A = rand(100,50); % 100 x 50 test matrix >> X = mp(A); >> tic; [Q,R] = qr(X); toc; % Full QR Elapsed time is 0.152928 seconds. >> norm(X-Q*R,1) 3.670100250272926322479797966838402e-32 >> tic; [Q,R] = qr(X,0); toc; % Economic QR Elapsed time is 0.110708 seconds. >> norm(X-Q*R,1) 3.670100250272926322479797966838402e-32
January 20, 2013 – 3.4.2.3222
New features:
- This version introduces very important feature we have been working for a few months. In order to boost performance we have made thorough speed optimization of our core engine for computations in quadruple precision (34 decimal digits).As a result we have
20-50
times better performance for matrix computations done in quadruple precision. Here is example of computing singular values of a 100×100 matrix:>> mp.Digits(34); % Switch to quadruple mode by using 34 decimal digits >> mp.GuardDigits(0); >> A = mp(rand(100)); % Use 100 x 100 matrix >> tic; svd(A); toc; Elapsed time is 0.133788 seconds. >> tic; [U,S,V] = svd(A); toc; Elapsed time is 0.668525 seconds. >> norm(A-U*S*V',1) % Accuracy check 1.605428118251356861603279613248978e-31
Basic matrix operations, eigensolvers and some decomposition routines are already updated to be much faster in quadruple precision. Further extensions are under development.
- Routine
eig()
is extended to support arbitrary matricesA,B
when computing solution of generalized eigenvalue problem with quadruple precision.>> mp.Digits(34); >> mp.GuardDigits(0); >> A = mp(rand(100)); >> B = mp(rand(100)); >> [V,D] = eig(A,B); >> norm(A*V - B*V*D,1) 2.827686455535796940129031614889614e-30
January 9, 2013 – 3.4.0.3172
New features:
- Added new solver for ordinary differential equations –
ode113
. - Both ODE solvers
ode45
andode113
are enabled with the support of event functions.
Bug fix:
- Fixed minor bug in operations involving mp-objects with different precisions.
December 28, 2012 – 3.4.0.3116
Improvements:
- Fixed incompatibility with the latest OpenMP version included in
MATLAB R2012b
. Our sincere gratitude goes to Kazuho Ito from University of Yamanashi for his excellent help in finding and investigating the problem.Unfortunately it comes with the price of dropping support of oldR2008b
. Now toolbox supports all versions ofMATLAB
starting fromR2009b
.
December 19, 2012 – 3.4.0.3041
Improvements:
- Performance has been improved by 20% – 50% in most of mathematical operations.
We have switched to Intel C++/Fortran Compilers and their new OpenMP runtime provides this gain.Side effect is that Windows & MATLAB R2008b users need to define special environment variable
KMP_DUPLICATE_LIB_OK=TRUE
in order to enable compatibility with older Intel Compilers (used to build R2008b).
December 1, 2012 – 3.4.0.3028
Improvements:
- Due to growing number of customers using Windows 8 we have added support for the OS.
November 7, 2012 – 3.4.0.3017
Improvements:
- Few months ago we have started re-implementation of linear algebra algorithms to benefit from multi-core parallelism. Basically this means that performance of the toolbox will be better on systems with more cores/CPUs.Today’s release is the first version with enabled parallelism in basic matrix operations, like multiplication.
Bug fix:
- Fixed (similar)bugs in matrix left and right divide. We have used in-proper speed optimization for a special case when one matrix is real and other has complex elements.
October 31, 2012 – 3.4.0.2947
- Performance is increased by
2.5-3
times in majority of linear algebra functions including:qr, svd, eig, schur, hess
. - Algorithm for automatic detection and re-calculation of common constants is improved to avoid false-positive errors.
October 24, 2012 – 3.3.9.2842
Bug fix:
- Fixed bug in matrix properties analysis stage of
eig()
function. Imaginary part of elements were erroneously stripped off in case of complex diagonal matrices.
Improvement:
- Speed up of multi-precision arithmetic engine by
5-10%
. - Dynamic memory manager is tuned to gain more speed on
Windows 7
.
October 3, 2012 – 3.3.8.2794
- Added new function,
balance
aimed to improve accuracy of computation of eigenvalues and/or eigenvectors. It applies similarity transformation (permutation & scaling) to a matrix so that row and column norms becomes approximately equal.Algorithm is based onxGEBAL
,xGEBAK
fromLAPACK
library. All MATLAB’s features are supported:[T,B] = balance(A)
[S,P,B] = balance(A)
B = balance(A)
B = balance(A,'noperm')
October 2, 2012 – 3.3.8.2785
- Added new function,
rcond
to compute the 1-norm estimate of the reciprocal condition number.
Algorithm is based onxGECON
routines fromLAPACK
, both complex and real matrices are supported. Usage syntax is compatible withMATLAB
:c = rcond(A)
September 25, 2012 – 3.3.8.2776
- Added new functions,
mod
andidivide
. They are needed for some built-in functions to work correctly with multi-precision entities (e.g.unwrap
).
Improvement:
- Mathematical expression parsing & evaluation have been completely re-written to be more error-robust and flexible.
August 8, 2012 – 3.3.8.2725
- Now display format of mp-entities are controlled by MATLAB’s formatted output settings. We support the following formats:
short, long, shortE, longE, shortG, longG, shortEng, longEng
andhex
.Short formats show only 4 digits after the decimal point. Long formats show full precision.>> mp.Digits(30); % Fixed-point formats >> format short >> mp('pi') 3.1416 >> format long >> mp('pi') 3.141592653589793238462643383280 % Scientific floating-point formats >> format shortE >> mp('pi') 3.1416e+00 >> format longE >> mp('pi') 3.141592653589793238462643383280e+00 % Fixed or scientific formats >> format shortG >> mp('pi') 3.1416 >> format longG >> mp('pi') 3.14159265358979323846264338328 % C99 hex float format >> format hex >> mp('pi') 0x3.243f6a8885a308d313198a2e037080p+0
July 26, 2012 – 3.3.8.2715
- Added functions for conversion to integers and vice versa:
int8, uint8, int16, uint16, int32, uint32, int64
anduint64
:>> x = mp(int64(magic(3))) 8 1 6 3 5 7 4 9 2 >> int64(x) ans = 8 1 6 3 5 7 4 9 2 >> x = mp(intmax('uint64')) 18446744073709551615 >> uint64(x) ans = 18446744073709551615
An interesting fact is that MATLAB rounds floating point numbers to nearest integer in conversion:
>> int32(1.2) ans = 1 >> int32(1.5) ans = 2
This is quite unexpected and contradicts the majority of other programming languages, where decimal parts are just truncated.
July 25, 2012 – 3.3.8.2702
- Added functions for specialized matrices computing:
compan, hankel, vander, toeplitz, mp.hilb
andmp.invhilb
.Hilbert matrix inversion test:% Multiprecision Computing Toolbox: >> mp.Digits(100); >> norm(mp.invhilb(20) - inv(mp.hilb(20)),1) 9.3295712880......440476004039710173e-51 % MATLAB: >> norm(invhilb(20) - inv(hilb(20)),1) Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 1.155429e-019. ans = 1.74653540359445e+028
Please note, integer-valued special matrices can be converted to
mp
objects directly, e.g.:mp(eye(10)) mp(ones(15)) mp(zeros(7)) mp(magic(3)) mp(rosser,100) mp(wilkinson(15)) mp(hadamard(8)) mp(pascal(5))
Bug fixes:
- Fixed bug in element-wise
power
function whenNaN
was returned for small negative arguments.
July 19, 2012 – 3.3.8.2684
- Code for data interchange between toolbox and MATLAB has been re-written to be more generic. This is the next step towards sparse matrices support. This part is greatly optimized thanks to reduced number of heap memory (de-)allocations.
July 10, 2012 – 3.3.8.2651
- We have implemented adaptive computational load balancing between Pade approximation and ISS (inverse scaling and squaring) in matrix logarithm. Now
logm()
is more stable and much faster.
July 6, 2012 – 3.3.8.2637
- Added functions for 2-D fast Fourier transform,
fft2
andifft2
. All features and special cases are supported to provide full compatibility with standard functions:
Y = fft2(X)
Y = fft2(X,m,n)
Y = ifft2(X)
Y = ifft2(X,m,n)
y = ifft2(..., 'symmetric')
y = ifft2(..., 'nonsymmetric')
- Added
subsindex
function for smooth usage ofmp
objects as indexes to matrix coefficients.
Bug fixes:
- Fixed bug in
ifft
related to special case when a'symmetric'
option is supplied.
June 22, 2012 – 3.3.7.2611
- Added
fprintf()
function. Nowmp
numbers can be printed the same way as standard floating point numbers:>> fprintf('double pi = %f and \n50-digits pi = %.50f\n', pi, mp(pi,50)) double pi = 3.141593 and 50-digits pi = 3.14159265358979323846264338327950288419716939937511
This combined with auto-recomputing of commonly used constants makes existing scripts porting to arbitrary precision even easier. There are much less modifications needed in code than before.
Improvements:
- We have implemented new heap memory management for entities of the
mp
type. Memory of “disposed” objects is not freed immediately but can be re-used for new objects without slow system calls (allocation/deallocation). This gives up to two times a performance boost in all linear algebra functions.
June 20, 2012 – 3.3.6.2600
- Automatic detection and re-calculation of commonly used
double
constants if they are encountered in expressions with multi-precision numbers. Usage of limited precisiondouble
constants (likepi
,exp(1)
,sqrt(2)
, etc.) in arbitrary precision computations has always been one of the main source of low accuracy final results.General rule is that floating-point constants should be re-computed in high precision from the beginning:mp('pi') mp('sqrt(2)') mp('exp(1)') mp('log(2)') mp('catalan') mp('euler') ...
See More on Existing Code Porting for details.
***
The new version of toolbox automatically detects and re-calculates common constants encountered in computations with arbitrary precision, includingpi
,sqrt(2)
,exp(1)
,log(2)
,eps
:>> pi ans = 3.141592653589793 >> mp(pi,50) % pi is automatically re-computed to have 50 correct digits 3.14159265358979323846264338327950288419716939937511 >> mp(eps,50) % eps is automatically recognized and adjusted to supplied precision 1.069105884036878258456214586860592751526078752042e-50 >> mp(10*sqrt(2)/571,50) % fractions with constants are also supported 0.024767312826148774935230975905598915561640488185236
Additionally toolbox correctly recognizes constants if they are used with fractional coefficients, e.g:
7*pi/3
,10*eps
,sqrt(2)/2
.Hopefully this new feature will make porting of existing programs to arbitrary precision even easier.
Bug fixes:
- Extended
dot
product to support vectors of different shapes, e.g. row – column. - Fixed incompatibility of
funm
with older versions of MATLAB.
June 14, 2012 – 3.3.5.2519
Matrix functions (logm
in particular) have been completely revised thanks to feedback of Numerical Linear Algebra Group from The University of Manchester. Now we use the state-of-the-art Schur-Parlett algorithm for computing general matrix function described in the following references:
- P. I. Davies and N. J. Higham, A Schur-Parlett algorithm for computing matrix functions. SIAM J. Matrix Anal. Appl., 25(2):464-485, 2003.
- N. J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008.
June 12, 2012 – 3.3.5.2511
- Added
ordschur()
– eigenvalues reordering in complex Schur factorization. All special cases are supported:
[US,TS] = ordschur(U,T,select)
[US,TS] = ordschur(U,T,keyword)
[US,TS] = ordschur(U,T,clusters)
In some situations, order of eigenvalues within clusters might not coincide with what is generated by MATLAB. This is because we do additional minimization of permutations to gain more speed. Otherwise functionality is identical to MATLAB’s built-inordschur()
. - Added solver for triangular Sylvester equation,
A*X + X*B = C
. Syntax istrisylv(A,B,C)
.
June 5, 2012 – 3.3.4.2487
- Added matrix power function,
mpower()
. We use binary squaring for integer powers and combination ofexpm
,logm
for other cases (preliminary). - Added support for logical variables, now can be used with
mp
numbers in comparisons, conversions, and expressions:
mp('1')==true
mp(false)
mp('pi') + true
Bug fixes:
- Fixed premature stop in Schur factorization due to restriction on maximum iteration count in Francis QR step algorithm.
- Fixed program crash when a user without administrator rights tries to install toolbox in restricted folders (e.g. Program Files)
Improvements:
- Implemented workaround for
libstdc++
& dynamicstd::string
problem on Mac OS X. No more segfaults because of this! - Improved performance in
mp
output routines thanks to removed dependency onboost::format
which was extremely sloooow and buggy on Mac OS X.
***
‘Version History’ above doesn’t reflect development prior June 2012.
Idea for Multiprecision Computing Toolbox was born on September 13, 2010. Development started in February 2011 (one of the first commits was done during the Great East Japan Earthquake on March 11, 2011 in shaken building midst falling shelves, computers, books and panicked people). Public version was released on September 16, 2011. Major new versions are released twice a month, with minor updates – almost every day.