MATIO performance compared to MATLAB libs #65
Comments
I am very interested in such a performance benchmark. Do you think you can share your code and the MAT file you were using such that I can try to reproduce? |
Sure thing. No work has been put into making the code clean, however; it was just meant as a quick comparison between the two libraries. The code was written to work only in a Windows environment with Visual Studio. It is also good to know that I had the option "Character set" set to "Use Multi-Byte Character Set". I built two separate .exe files in Visual Studio, one that includes the MATIO libs and headers and one that includes the MATLAB libs and headers. The libs added to the project settings were libmx.lib and libmat.lib for the MATLAB project and libmatio.lib for the MATIO project. Include files used in both projects:
#include <iostream>
#include <windows.h>
#include <ctime>
And then also the header of the respective library. Then I had the following code in the main function, which chooses the folder, starts the timer and calls the ReadMatFile function. The main function was identical for the two executables.
int main()
{
    string pathToFiles = "c:\\rerun\\TwoLogs\\";
    string fileExtension = "*.mat";
    WIN32_FIND_DATA search_data;
    memset(&search_data, 0, sizeof(WIN32_FIND_DATA));
    HANDLE handle = FindFirstFile((pathToFiles+fileExtension).c_str(), &search_data);
    cout << "Tick!" << endl;
    clock_t startTime = clock();
    while (handle != INVALID_HANDLE_VALUE)
    {
        ReadMatFile((pathToFiles+search_data.cFileName).c_str());
        if (FindNextFile(handle, &search_data) == FALSE)
            break;
    }
    clock_t endTime = clock();
    cout << "Tock!" << endl;
    FindClose(handle);
    cout << "Done in: " << ((float)(endTime - startTime) / CLOCKS_PER_SEC) << endl;
    system("pause");
    return 0;
}
Then I had two different versions of ReadMatFile, one that uses the MATLAB functions and one that uses the MATIO functions. MATLAB:
void ReadMatFile(const char* file)
{
    MATFile *pmat;
    mxArray *pa;
    const char *name;
    int varCnt = 0;
    cout << "Try to read all variables in: " << file << endl;
    pmat = matOpen(file, "r");
    if (pmat == NULL)
    {
        cout << "Failed to open!" << endl;
        return;
    }
    while ((pa = matGetNextVariable(pmat, &name)) != NULL)
    {
        varCnt++;
        mxDestroyArray(pa);
    }
    if (matClose(pmat) != 0)
    {
        cout << "Failed to close! " << endl;
        return;
    }
    cout << varCnt << " variables found and read in file..." << endl;
    return;
}
And MATIO:
void ReadMatFile(const char* file)
{
    mat_t *pmat;
    matvar_t *pa;
    int varCnt = 0;
    cout << "Try to read all variables in: " << file << endl;
    pmat = Mat_Open(file, MAT_ACC_RDONLY);
    if (pmat == NULL)
    {
        cout << "Failed to open!" << endl;
        return;
    }
    while ((pa = Mat_VarReadNext(pmat)) != NULL)
    {
        varCnt++;
        Mat_VarFree(pa);
    }
    if (Mat_Close(pmat) != 0)
    {
        cout << "Failed to close! " << endl;
        return;
    }
    cout << varCnt << " variables found and read in file..." << endl;
    return;
} |
Thanks for the code snippets. Based on them I've created https://github.com/tbeu/matioPerformance, which compiles with VS 2012, the same VS version that the MATLAB R2015b libraries were built with. libmatio.dll was built from current master and tweaked to link with hdf5.lib v1.8.12 (the version required by MATLAB R2015b) and zlib1.lib v1.2.11. I observe that matPerf.exe crawls the MAT-files in the data folder in about 1.2 seconds, whereas matioPerf.exe needs about 8.4 seconds. |
Very interesting! Please let me know if I can do something to help. We have files with relatively complex structures inside; is that relevant for the performance difference? |
I am expecting the three performance bottle-necks
|
@emmenlau Well, you could run matPerf/matioPerf on your files and try to narrow it down to a single struct. On the other hand, I am not sure whether matGetNextVariable and Mat_VarReadNext are really comparable. We could also try matGetNextVariableInfo and Mat_VarReadNextInfo to ignore the data I/O. |
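To compare a full read against a header-only read with the same loop, the library calls can be injected. The following is only a sketch under that assumption: `countVariables` and the `Api` policy type are hypothetical names, chosen so the example compiles without MATIO or MATLAB installed.

```cpp
// Generic counting loop. `Api` is a hypothetical policy struct that supplies
// open/next/release/close, so the same loop can be timed with a full read
// or with a header-only read.
template <typename Api>
int countVariables(const char* file)
{
    auto* handle = Api::open(file);
    if (handle == nullptr) {
        return -1;  // could not open the file
    }
    int count = 0;
    while (auto* var = Api::next(handle)) {
        ++count;
        Api::release(var);
    }
    Api::close(handle);
    return count;
}

// With <matio.h> available, a header-only policy might look like this:
//
// struct MatioInfoOnly {
//     static mat_t*    open(const char* f) { return Mat_Open(f, MAT_ACC_RDONLY); }
//     static matvar_t* next(mat_t* m)      { return Mat_VarReadNextInfo(m); }
//     static void      release(matvar_t* v) { Mat_VarFree(v); }
//     static void      close(mat_t* m)      { Mat_Close(m); }
// };
```

Swapping `Mat_VarReadNextInfo` for `Mat_VarReadNext` in the policy gives the full-read variant, so the difference between the two timings approximates the cost of the data I/O alone.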
https://github.com/tbeu/matioPerformance was updated
|
@emmenlau Is there anything you figured out? |
As reported by tbeu#65
@emmenlau FYI I updated https://github.com/tbeu/matioPerformance to the upcoming libmatio v1.5.24. test_suites.zip from #157 (comment) still is a performance bottle-neck. |
* The performance gain is obtained by removing calls to the slow HDF5 API function H5Iget_name, which was the main bottleneck. Handles of HDF5 groups or datasets are now kept open for the lifetime of the matvar_t instance.
* As a side effect, hdf5_name could be removed from matvar_t.internal, too.
* Fix reference counting in Mat_VarDuplicate.
* As reported by #65 and #198.
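The idea of keeping a handle open for the lifetime of a variable, with duplicates sharing it through a reference count, can be illustrated with a toy model. This is not MATIO's implementation: `FakeHandle`, `openObject`, `Variable` and `duplicate` are hypothetical stand-ins, and a real version would hold an HDF5 identifier and close it through the HDF5 API.

```cpp
#include <memory>
#include <string>

// Stand-in for an open HDF5 group or dataset (hypothetical).
struct FakeHandle {
    std::string name;
};

// Counters that exist only to make open/close behaviour observable.
static int g_opens = 0;
static int g_closes = 0;

// Opens the object once and returns a shared owner that closes it when the
// last copy goes away.
std::shared_ptr<FakeHandle> openObject(const std::string& name)
{
    ++g_opens;
    return std::shared_ptr<FakeHandle>(
        new FakeHandle{name},
        [](FakeHandle* h) { ++g_closes; delete h; });
}

// A variable keeps its handle for its whole lifetime, so later reads do not
// need to look the object up by name again.
struct Variable {
    std::shared_ptr<FakeHandle> handle;
};

// Duplicating shares the already open handle and bumps the reference count
// instead of opening the object a second time.
Variable duplicate(const Variable& v)
{
    return Variable{v.handle};
}
```

In this model the object is opened once and closed exactly once, after both the original and its duplicate are gone, which is the property a correct reference count has to guarantee.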
Hi, I am wondering how the performance of MATIO compares to the standard MATLAB libraries. I need to read a lot of data from multiple .mat files and also write a lot of data, so it is very important that it can be done quickly. So I tried comparing MATIO with the MATLAB 2015b libraries. Unfortunately, MATIO was much slower (when the MATLAB libs took 60 s to read a bunch of files, MATIO took almost 400 s).
But I don't know whether I have compiled MATIO with settings that caused it to be much slower or whether my benchmarking software has some bugs in it. Is there a big performance difference, or have I just done something wrong on my end?