Undocumented MEX API in MATLAB

by Pavel Holoborodko on July 19, 2013

It is no secret that MATLAB has many undocumented (or deliberately hidden) features and commands. There is even an excellent website and book devoted specifically to this topic: Undocumented Matlab.

However, most of the findings relate to the MATLAB language itself, and investigations of the undocumented MEX API seem to be missing (or at least scarce).

During development of our toolbox we found many hidden functions that can help in creating speed-efficient MATLAB extensions in native languages.

Here we want to explain some of them in detail and provide a complete list of undocumented MEX API functions.

Please note that using these functions carries a risk: MathWorks can change or remove some of them in future versions. Staying up to date and updating toolboxes in time is an additional burden for the developer.

Reduced OOP capabilities of MATLAB

Starting from version R2008b, MATLAB allows users to define custom classes with the classdef keyword. MATLAB was late in adding object-oriented features. I can only imagine how hard it was for the developers at MathWorks to add OOP constructs to a purely procedural language that follows an entirely different philosophy. (Yes, objects could be created using structs and a special folder structure in earlier versions of MATLAB, but that was just ugly design, and MathWorks will not support it in the future.)

They still don’t have full support for OOP features though. The most important missing features are:

  • It is prohibited to have a destructor for custom non-handle classes.
  • It is not possible to overload the assignment-without-subscripts operator (e.g. A = B).

I don’t know why these fundamental OOP features are not implemented in MATLAB, but their absence prevents creating powerful virtual-machine-type toolboxes.

If they were available, MATLAB objects would need only one property field, ‘id’: the identifier of a variable stored in the MEX module acting as the virtual machine (e.g. a pointer to a C/C++ object). The MEX module would only need to know the ‘id’ of the objects and which operation to perform on them (+, -, *, etc.); all processing would be done in MEX. Heavy data exchange between MATLAB and MEX libraries would be completely unnecessary. MATLAB would act as just an interpreter in such a scenario. Moreover, the MEX API could be reduced to just a few functions.
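To make the idea concrete, here is a minimal C++ sketch of what the MEX side of such a virtual machine could look like. It is purely hypothetical: Registry, Id, Value, Op and all method names are invented for illustration, and a std::vector<double> stands in for a real high-precision matrix. Note that release() and copy() are exactly the hooks MATLAB gives us no way to call automatically, since value classes have no destructor and A = B cannot be overloaded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical "virtual machine" living inside the MEX module.
// MATLAB-side objects would carry nothing but an Id.
using Id = std::uint64_t;
using Value = std::vector<double>;  // stand-in for a high-precision matrix

enum class Op { Plus, Minus, Times };

class Registry {
public:
    // Store a value and hand back the identifier MATLAB would keep.
    Id create(Value v) {
        const Id id = next_++;
        auto inserted = objects_.emplace(id, std::move(v));
        assert(inserted.second && "ids are never reused");
        return id;
    }

    // Element-wise operation on two stored values; only ids cross the boundary.
    Id apply(Op op, Id a, Id b) {
        const Value& x = get(a);
        const Value& y = get(b);
        if (x.size() != y.size()) throw std::invalid_argument("size mismatch");
        Value r(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) {
            switch (op) {
                case Op::Plus:  r[i] = x[i] + y[i]; break;
                case Op::Minus: r[i] = x[i] - y[i]; break;
                case Op::Times: r[i] = x[i] * y[i]; break;
            }
        }
        return create(std::move(r));
    }

    // What an overloadable "A = B" would call: a new id owning its own copy.
    Id copy(Id src) { return create(get(src)); }

    // What a destructor would call when the MATLAB object goes away.
    void release(Id id) { objects_.erase(id); }

    const Value& get(Id id) const {
        auto it = objects_.find(id);
        if (it == objects_.end()) throw std::out_of_range("unknown id");
        return it->second;
    }

    std::size_t live() const { return objects_.size(); }

private:
    Id next_ = 1;
    std::unordered_map<Id, Value> objects_;
};
```

Without a destructor, nothing would ever call release(), so the registry would grow until the MEX module is unloaded.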

Deep-Copy access to object properties from MEX library (Official)

Unfortunately we are restricted to the current architecture, where all data are allocated and stored on the MATLAB side and must be transferred from MATLAB to the MEX library in order to work with them.

The official MEX API provides two functions to access object properties from within a MEX library: mxGetProperty and mxSetProperty.

Both functions share the same major problem: they create a deep copy of the data!

Imagine your object is a huge matrix with high-precision elements that occupies 800MB of RAM. If we want to process it in a MEX library (e.g. transpose it), we would call mxGetProperty, which makes an ENTIRE COPY of the object’s property, wasting another 800MB!

Obviously this is unacceptable, not to mention the severe performance penalty (copying that much data also takes a while).

Copy-Free access to object properties from MEX library (Undocumented)

In search of a remedy we found similar but undocumented functions that give shared access to object properties (signatures from the 32-bit build):

extern mxArray* mxGetPropertyShared(const mxArray* pa, 
                                    unsigned int index, 
                                    const char * propname);
 
extern void mxSetPropertyShared(mxArray* pa, 
                                unsigned int index, 
                                const char * propname, 
                                const mxArray* value);

These functions can be used as one-to-one replacements for the official ones. mxGetPropertyShared just returns a pointer to the existing property without any copying. mxDestroyArray can still be called on the returned pointer (thanks to James Tursa for the correction).

Full list of MEX API functions

We have extracted the full list of usable MEX functions from libmx.dll and libmex.dll (MATLAB 2012b 32-bit), the two main MEX API dynamic libraries. It includes functions from the official API as well as undocumented ones.

The distinctive feature of the list is that it provides demangled C++ names of the functions, with argument and return types (not published before, to the author’s knowledge). This makes the undocumented functions much easier to use.

Take a look at the function list. There are a lot of interesting ones, like mxEye, mxIsInf, mxFastZeros, mxGetReferenceCount and many others.

Moreover, it is possible to see the high-level C++ classes MathWorks developers work with. For example, it is now clear that the fundamental type mxArray_tag is not a plain-old-struct anymore: it has member functions and behaves more like a class. It even has custom new/delete heap management functions and an overloaded assignment operator=. Reverse-engineering these functions might reveal the exact and complete data structure of mxArray_tag.

Actually, with some effort the internal mxArray_tag class from MathWorks might be usable in third-party MEX files now. How much easier this would be than the clumsy mx**** functions!

Please feel free to leave your requests or comments below.

{ 8 comments }

Brad Stiritz July 25, 2013 at 1:25 am

Hi Pavel,

Thanks for your very interesting article. I didn’t follow the exact logic of how the two missing OOP features you listed enable the “virtual machine” functionality? Perhaps you could please add a footnote here or a URL if explained elsewhere? I would really like to understand via a code example how this would work. Thanks for your consideration, Brad

Pavel Holoborodko July 25, 2013 at 2:14 am

Hi Brad,

Thank you for your comment.

My conclusion was the opposite: the absence of these two features prevents us from creating “virtual machine”-type toolboxes. Sorry for my unclear statement, my Engrish needs improvement.

I spent a few days debugging and reverse-engineering MATLAB’s core in an attempt to find undocumented functions that would let us implement these features manually, unfortunately without any luck.

Cris January 22, 2017 at 10:02 pm

Correct signatures for these two functions are:

extern mxArray* mxGetPropertyShared( mxArray const* pa, 
                                     mwIndex index, 
                                     char const* propname );

extern void mxSetPropertyShared( mxArray* pa, 
                                 mwIndex index, 
                                 char const* propname, 
                                 mxArray const* value );

Note that mwIndex has a different basic type depending on the architecture.

Noteworthy is that the MATLAB code for the property setter and getter methods will be executed here, so these functions are not necessarily quick. It is sad that we cannot create class methods as MEX-files.

Pavel Holoborodko January 23, 2017 at 12:31 pm

Cris,

Thank you for the correction. As a note, since TMW is moving away from 32-bit platforms, we can safely assume that mwIndex is just an alias for size_t (64-bit unsigned int).

@ “It is sad that we cannot create class methods as MEX-files.”
In our toolbox, class methods are implemented in a single MEX file. Every method passes its ID to MEX so that MEX knows which functionality to execute.

Cris January 23, 2017 at 9:46 pm

Pavel,

But can you access private properties of the class? I have not yet tried this, but the documentation to mxSetProperty says that the property must have public write access. If so, the only way of having a MEX-file read or write private properties is by having an M-file for the method that calls a MEX-file and passes the contents of the relevant properties as individual inputs. You cannot have just a MEX-file that is called directly by MATLAB.

Another thing that I found out is that calling a class constructor through mexCallMATLAB in a MEX-file causes the input data to be copied (this is not the case when calling the constructor from an M-file). So I’m creating an empty object and then writing the relevant data into it through public properties (derived ones). This still means that M-file code is being executed from within the MEX-file. They really need to enhance their support for custom classes!

I’m confused in general that even the M-code for a class constructor cannot set a property directly if there is a set method defined for the property. I don’t really understand the design ideas behind the classes in MATLAB. It looks and smells like OOP, but it’s not fully there IMO.

I’d love to hear how you handled these issues, if you have the time. Thanks!

Pavel Holoborodko January 24, 2017 at 9:39 am

Dear Cris,

@ “I don’t really understand the design ideas behind the classes in MATLAB. It looks and smells like OOP, but it’s not fully there IMO.”
You nailed it, MATLAB has no real OOP, just some attempts to keep up with modern languages. The M-language was designed as procedural, and over the years they did a pretty good job optimizing its interpreter, lazy evaluation, etc. But all of this is hardcoded for double arrays. Introducing new concepts like OOP into such an ecosystem is a nightmare, and they do what they can…

The most painful shortcoming for me is that the simple assignment operator (A = B) and the destructor cannot be overloaded. Overloading them would allow me to implement the whole class in C++ and keep only its pointer in the M-class as a property, so that OOP would be implemented in C++ with the M-class as a thin wrapper around it.

In our case, the custom class is a numeric type that is frequently used in arithmetic operations, assignment, etc., where speed is crucial. All these operations are terribly slow in M-code [1], so we had to implement them in MEX. But the MEX API has no real support for custom classes.

In the end we came to the following solution. Our M-class definition has only one public property, which contains all the data (e.g. a serialized C++ class), to minimize the access overhead you mentioned. We also disable actual access to the property from M-code by overloading the SUBSREF operator, so that the property is hidden.

Every method of the M-class just calls the MEX function with the ID of the method to execute on the data. MEX gets the property data using mxGetPropertyShared, which is extremely fast since it just gives you a pointer to the memory where the data is actually stored. Thus we solved the fast access issue.

Another issue is creating objects from MEX. As you said, mexCallMATLAB is catastrophically slow and not applicable. We call it only once, to create an object in memory, and then make it persistent [2]. Every time we need a new object of our class, we just duplicate [3] the persistent object. Duplication is about 10 times faster than mexCallMATLAB. It is important not to forget to delete the persistent object upon MEX exit [4]; otherwise MATLAB crashes later on.

1. http://www.advanpix.com/2014/11/14/advanpix-vs-vpa-array-manipulation-operations/
2. mexMakeArrayPersistent
3. mxDuplicateArray
4. mexAtExit
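The create-once-then-duplicate pattern can be sketched in plain C++. This is an analogy rather than MEX code, and every name in it is hypothetical: construct_slowly stands in for the expensive mexCallMATLAB constructor call, g_prototype for the array kept alive with mexMakeArrayPersistent, the copy in new_object for mxDuplicateArray, and release_prototype for a callback registered with mexAtExit.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for a MATLAB object of the custom class.
struct Object {
    std::string class_name;
    double payload = 0.0;
};

int g_expensive_calls = 0;  // counts how often the slow path runs

// Stand-in for constructing the object through the slow interpreter call.
std::unique_ptr<Object> construct_slowly() {
    ++g_expensive_calls;
    std::unique_ptr<Object> obj(new Object());
    obj->class_name = "custom_class";  // hypothetical class name
    return obj;
}

// Cached prototype: the role of the persistent array.
std::unique_ptr<Object> g_prototype;

// Create a new object by duplicating the prototype instead of constructing it.
std::unique_ptr<Object> new_object(double payload) {
    if (!g_prototype) {
        g_prototype = construct_slowly();  // happens only on the first call
    }
    assert(g_prototype && "prototype must exist after lazy creation");
    std::unique_ptr<Object> copy(new Object(*g_prototype));
    copy->payload = payload;
    return copy;
}

// Cleanup hook: the role of the exit callback that frees the persistent object.
void release_prototype() { g_prototype.reset(); }
```

However many objects are requested, the slow constructor runs once; forgetting the cleanup hook corresponds to the crash scenario described above.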

Cris January 24, 2017 at 3:51 pm

Thanks, Pavel, for writing this down. I really appreciate it!

I didn’t realize that you can disable access to public properties through overloading SUBSREF. Very clever!

I found an alternative solution that involves a handle class storing a pointer to the C++ object. But apparently it breaks down on Windows when doing clear functions, so it requires locking the MEX file that created the C++ object. I’m still looking for a good solution that allows me to create objects in many different MEX-files. When you have hundreds of functions, you don’t want to create hundreds of M-file stubs that call the same MEX-file, which then has to dispatch to the right C++ function. It’s so much easier and cleaner to auto-generate a MEX-file for each C++ function. But maybe we’ll have to go that route anyway… 🙁

Pavel Holoborodko January 25, 2017 at 4:17 am

To be compatible with different versions of MATLAB, I use external development tools (Visual Studio on Windows and GCC on *NIX). Also, every class method needs quite a lot of common functionality and global settings (e.g. precision, formatting, cached/persistent objects for speed, etc.).

Under these conditions it is much easier to maintain only one code base and a single MEX file with all libraries linked in and available to all methods.

Besides, it is always better to minimize the number of MEX files, since MATLAB has severe issues when too many MEX files are loaded, at least on Linux:
http://www.advanpix.com/2016/02/22/matlab-error-cannot-load-any-more-object-with-static-tls/

Since R2008b MATLAB allows putting an entire class definition in one M-file, and it is easy to dispatch methods, e.g.:

        %% Arithmetic operators
        function r = minus(x,y)
           r = mpimpl(10,x,y);
        end

        function r = plus(x,y)
           r = mpimpl(11,x,y);
        end
        ...

‘mpimpl’ is the name of the single MEX file; the first argument is the method ID.
The MEX file has a table of {ID, PointerToFunction} pairs, which makes dispatching trivial at the C++ level.
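
Such a table could be sketched in C++ as follows. This is only an illustration, not the toolbox’s actual code: Method, Entry, kTable and dispatch are hypothetical names, and plain doubles replace the mxArray arguments a real MEX entry point would receive. The IDs 10 and 11 are taken from the M-code above.

```cpp
#include <array>
#include <cassert>
#include <stdexcept>

// Hypothetical method signature; a real MEX file would pass mxArray pointers.
using Method = double (*)(double, double);

double do_minus(double x, double y) { return x - y; }
double do_plus(double x, double y)  { return x + y; }

// One row of the {ID, PointerToFunction} table.
struct Entry {
    int id;
    Method fn;
};

// IDs match the first argument passed by the M-file stubs.
static const std::array<Entry, 2> kTable = {{
    {10, &do_minus},
    {11, &do_plus},
}};

// Find the entry for the requested method ID and call it.
double dispatch(int id, double x, double y) {
    for (const Entry& e : kTable) {
        if (e.id == id) {
            assert(e.fn != nullptr);
            return e.fn(x, y);
        }
    }
    throw std::invalid_argument("unknown method id");
}
```

A linear scan is enough for a sketch; with hundreds of methods, indexing an array directly by ID or using a hash map would avoid the search.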
