Performance of Array Manipulation Operations

by Pavel Holoborodko on August 2, 2013

Besides sparse matrices support latest version of toolbox also includes significant changes in performance optimization of the matrix / array manipulation operations. This post is attempt to highlight some of the key points & reasons of the development.

As we noted before [1, 2], MATLAB has somewhat limited object oriented capabilities. The good feature that MATLAB has in this area – it allows elements of dense matrix / n-dim array to be of custom-type (e.g. multiprecision number). In this case MATLAB automatically takes care of basic manipulation operations (indexing, assigning, concatenation, deletion, etc.).

This nice feature removes huge amount of work from toolbox’s developers – so that we can concentrate on mathematical functionality rather than spending time re-inventing the wheel.

Although we have been using this functionality from the start, we decided to abandon it in the recent version. The reason is simple: MATLAB’s implementation of array manipulation operations (with custom objects as elements) suffers from enormous loss in performance!

Below we show few tests and timing comparisons between our own and MATLAB’s implementation. In our test we use 64-bit R2012b on Windows 7, with Core i7 CPU.

Last version of toolbox relying on MATLAB’s engine was 3.4.4.3831 (can be downloaded from here).
Version based on our own routines are 3.5.0.4112.
We will use both of versions interchangeably in our tests for comparison.

Timing tests

Here is simple script which tests basic matrix operations when elements are multiprecision numbers (mo_speed_test.m):

rng('default');                 % Make sure we use the same seed every time
 
mp.ExtendConstAccuracy(false);  % Disable constant auto-detection for clean tests
 
N = 100;                        % Test matrix size
n = 1000;                       % Test repetitions
 
A = rand(N);                     
B = mp(A);                      % Test matrix
 
Pi = mp(pi);                    % Test scalar
 
% Subscripted reference tests
s = 0;
for i=1:n 
    x0 = randi([1,N/2],1);
    y0 = randi([1,N/2],1);
    x1 = randi([N/2,N],1);
    y1 = randi([N/2,N],1);
 
    tic;    
    B(x0:x1,y0:y1);
    s = s + toc;    
end;
fprintf('\tsubsref:\t%.2f sec\n', s);
 
% Subscripted assignment tests (simple)
s = 0;
for i=1:n 
    x0 = randi([1,N/2],1);
    y0 = randi([1,N/2],1);
    x1 = randi([N/2,N],1);
    y1 = randi([N/2,N],1);
 
    tic;    
    B(x0:x1,y0:y1) = Pi;
    s = s + toc;    
end;
fprintf('\tsubsasgn:\t%.2f sec\n', s);
 
% Subsref & subsasgn combined
Q = B;
s = 0;
for i=1:n 
    x0 = randi([1,N/2],1);
    y0 = randi([1,N/2],1);
    x1 = randi([N/2,N],1);
    y1 = randi([N/2,N],1);
 
    tic;    
    Q(x0:x1,y0:y1) = B(x0:x1,y0:y1);
    s = s + toc;    
end;
fprintf('\tcombined:\t%.2f sec\n', s);
 
% Concatenation - 1
s = 0;
for i=1:n 
    tic;    
    cat(1, B, B);
    s = s + toc;    
end;
fprintf('\tcat(1):\t\t%.2f sec\n', s);
 
% Concatenation - 2
s = 0;
for i=1:n 
    tic;    
    cat(2, B, B);
    s = s + toc;    
end;
fprintf('\tcat(2):\t\t%.2f sec\n', s);
 
% Concatenation - 5
s = 0;
for i=1:n 
    tic;    
    cat(5, B, B);
    s = s + toc;    
end;
fprintf('\tcat(5):\t\t%.2f sec\n',s);
 
% Matrix Creation
tic;
for i=1:n
    mp(A);
end;
s = toc;
fprintf('\tctor:\t\t%.2f sec\n', s);

Now we can run tests for both engines using the script above:

 
% MATLAB's matrix / array manipulation engine 
>> addpath('D:\Multiprecision Computing Toolbox-3.4.4.3831\');
>> mo_speed_test
	subsref:	5.45 sec
	subsasgn:	6.40 sec
	combined:	11.25 sec
	cat(1):		41.91 sec
	cat(2):		42.66 sec
	cat(5):		42.68 sec
	ctor:		50.49 sec
 
% Advanpix's matrix / array manipulation engine 
>> clear all
>> addpath('D:\Multiprecision Computing Toolbox-3.5.0.4112\');
>> mo_speed_test
	subsref:	0.22 sec
	subsasgn:	0.30 sec
	combined:	0.52 sec
	cat(1):		0.55 sec
	cat(2):		0.55 sec
	cat(5):		0.55 sec    
	ctor:		0.65 sec

Now Advanpix engine outperforms MATLAB by 24 times at least (with maximum of 70 times). These results are quite surprising even for us, since our implementation is very far from being highly optimized. At the moment we use standard C++ features in our code which are considered to be “slow” in high performance world. With sufficient dedication engine from Advanpix can be optimized to run by 10-20 times faster.

Conclusion

I don’t want to comment MATLAB’s results in details. All I can say – MathWorks should do some serious changes to their development process. At least approaches / codes related to OOP should be completely re-considered.

As for latest version of toolbox – performance of existing scripts became much better now ;).

{ 0 comments… add one now }

Leave a Comment

Previous post:

Next post: