
Read .mat files in Python

By Nikita Barsukov · Nov 11, 2024
TLDR

The scipy.io.loadmat function reads a .mat file and returns its contents as a dictionary keyed by variable name:

    from scipy.io import loadmat

    data = loadmat('insert_your_file_path')['variable_name']

Replace 'insert_your_file_path' with the path to your .mat file and 'variable_name' with the name of the MATLAB variable you want to extract. Make sure SciPy is installed and is version 0.7.0 or later so that loadmat() is available.

Alternate solutions for different MATLAB versions

scipy.io.loadmat supports MAT-file formats up to v7.2 and raises NotImplementedError for files saved in the v7.3 format. Those files are stored as HDF5, so read them with the h5py package instead:

    import h5py

    with h5py.File('insert_your_file_path', 'r') as f:
        data = f['variable_name'][()]
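If you do not know which format a file uses, one option is to look at the descriptive text at the start of the file before choosing a reader. The sketch below is an illustration, not part of SciPy or h5py: mat_file_format is a hypothetical helper, and it assumes the file begins with the usual "MATLAB 5.0 MAT-file" or "MATLAB 7.3 MAT-file" text. The demo files are stand-ins that contain only that prefix.

```python
import os
import tempfile


def mat_file_format(path):
    """Guess the MAT-file flavour from the text at the start of the file.

    Returns "v7.3" (HDF5-based), "v5" (the format loadmat can read),
    or "unknown" (for example v4 files, which have no text header).
    """
    with open(path, "rb") as f:
        header = f.read(19)  # length of "MATLAB 7.3 MAT-file"
    if header.startswith(b"MATLAB 7.3 MAT-file"):
        return "v7.3"
    if header.startswith(b"MATLAB 5.0 MAT-file"):
        return "v5"
    return "unknown"


# Stand-in files holding only a header-like prefix. They are not valid
# MAT-files; they just exercise the check.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "new.mat": b"MATLAB 7.3 MAT-file, Platform: GLNXA64",
        "old.mat": b"MATLAB 5.0 MAT-file, Platform: GLNXA64",
        "other.mat": b"\x00\x00\x00\x00 no text header here",
    }
    detected = {}
    for name, prefix in samples.items():
        path = os.path.join(tmp, name)
        with open(path, "wb") as f:
            f.write(prefix)
        detected[name] = mat_file_format(path)

print(detected)
```

A result of "v7.3" points to h5py, "v5" to scipy.io.loadmat. Simply calling loadmat and catching NotImplementedError is a reasonable alternative.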

Another alternative is mat4py, which can be installed via pip. It reads and writes .mat files and represents their contents as plain Python dictionaries and lists:

    from mat4py import loadmat

    data = loadmat('insert_your_file_path')

Saving files in .mat format with Python

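Save your Python data to a .mat file with scipy.io.savemat. It writes the MATLAB 5 format by default (format='5'), which MATLAB 5 and later can read, so there is no equivalent of MATLAB's -v7 flag to set. The oned_as='row' argument stores 1-D arrays as row vectors: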

    import scipy.io

    # Make Matlab users less grumpy
    # your_data is a placeholder for the array (or other value) you want to save
    scipy.io.savemat('new_file.mat', {'variable_name': your_data}, oned_as='row')
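To make the effect of oned_as concrete, here is a small round trip that saves the same 1-D array both ways and reads it back. The file names and the samples variable are made up for the illustration, and the files live in a temporary directory.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

samples = np.array([1.0, 2.0, 3.0])  # 1-D array, shape (3,)

with tempfile.TemporaryDirectory() as tmp:
    row_path = os.path.join(tmp, "row.mat")
    col_path = os.path.join(tmp, "col.mat")

    # Same data, two different orientations on the MATLAB side.
    savemat(row_path, {"samples": samples}, oned_as="row")
    savemat(col_path, {"samples": samples}, oned_as="column")

    row_shape = loadmat(row_path)["samples"].shape
    col_shape = loadmat(col_path)["samples"].shape

print(row_shape, col_shape)  # (1, 3) (3, 1)
```

MATLAB has no true 1-D arrays, which is why the values come back as 2-D in both cases.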

Diving into the data

After a successful import, loadmat gives you a Python dictionary. Its keys are the top-level MATLAB variable names, plus the metadata entries __header__, __version__, and __globals__. Inspect this structure first so you can navigate the data efficiently.
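MATLAB structs are where navigation tends to get awkward. One commonly used approach, sketched here on a small file created on the spot, is to pass squeeze_me=True and struct_as_record=False to scipy.io.loadmat so that struct fields become attributes. The cfg variable and its rate and label fields are invented for the example.

```python
import os
import tempfile

from scipy.io import loadmat, savemat

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "settings.mat")
    # A nested dict is written as a MATLAB struct.
    savemat(path, {"cfg": {"rate": 10.0, "label": "abc"}})

    contents = loadmat(path, squeeze_me=True, struct_as_record=False)

# Keys that start with "__" are metadata, not MATLAB variables.
metadata_keys = sorted(k for k in contents if k.startswith("__"))

cfg = contents["cfg"]
rate = float(cfg.rate)    # struct fields are available as attributes
label = str(cfg.label)

print(metadata_keys)
print(rate, label)
```

Without those two options the same struct comes back as a NumPy structured array wrapped in extra dimensions, which needs more indexing to reach each field.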

Accessing variables

Extracting and working with the variables is straightforward:

    from scipy.io import loadmat

    variables = loadmat('data.mat')

    # Loop over variables like you loop over your problems
    for var_name in variables:
        if not var_name.startswith("__"):  # We don't deal with private matters here
            print(f"Variable Name: {var_name}\nContents: {variables[var_name]}")
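If you only need to know which variables a file contains, scipy.io.whosmat lists names, shapes, and data types without loading the arrays themselves. The sketch below builds a throwaway file first; the signal and count variables are placeholders.

```python
import os
import tempfile

import numpy as np
from scipy.io import savemat, whosmat

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.mat")
    savemat(path, {"signal": np.zeros((2, 5)), "count": np.arange(3)})

    # Each entry is a (name, shape, data type) tuple.
    listing = whosmat(path)

shapes = {name: tuple(shape) for name, shape, _ in listing}
print(shapes)
```

This can be handy for large files, where printing every variable's contents is impractical.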

Converting to NumPy array

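loadmat already returns numeric variables as NumPy arrays, so no conversion is strictly required. Wrapping a value in np.array() gives you an independent copy to work on:

    import numpy as np
    from scipy.io import loadmat

    mat_data = loadmat('data.mat')
    array_data = np.array(mat_data['variable_name'])  # copy of the loaded array
    # Now unleash the full power of NumPy (and your data)!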

Deep dive with the MATLAB engine

If you have MATLAB installed, the MATLAB Engine for Python allows you to call MATLAB functions:

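    import matlab.engine

    eng = matlab.engine.start_matlab()
    # It's like calling your friend MATLAB from Python
    data = eng.load('data.mat', nargout=1)
    eng.quit()  # shut down the MATLAB session when you are done

The nargout=1 argument requests one output from load, so the file's variables are returned to Python as a single struct, which arrives as a dictionary.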

Working with advanced .mat files using h5py

The h5py library is particularly useful when dealing with MATLAB v7.3 files:

    import h5py

    # Like a treasure hunt
    with h5py.File('data_v73.mat', 'r') as file:
        base_items = list(file.items())
        print(f"Base items in the .mat file: {base_items}")

        # Now you are the datasec (as in detective, but for datasets)
        group = file['/data_group']  # placeholder group name
        for name, dataset in group.items():
            print(name, dataset)

Don't forget, knowing the exact structure of your .mat file is critical here.

When more conversion is needed

mat4py can also convert data to JSON-compatible format:

    import mat4py
    from json import dumps

    # When you speak both Python and JSON
    data = mat4py.loadmat('data.mat')
    json_data = dumps(data, indent=2)

SciPy's loadmat also offers a squeezing option, squeeze_me=True, which removes length-1 dimensions from the loaded arrays:

    from scipy.io import loadmat

    # Squeeze the last drop out of your data
    data = loadmat('data.mat', squeeze_me=True)
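To see what squeezing changes, the following sketch loads the same file with and without squeeze_me, which is a keyword of scipy.io.loadmat. The vector and scalar variables are made up for the demonstration.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.mat")
    savemat(path, {"vector": np.array([1.0, 2.0, 3.0]), "scalar": 7.5})

    plain = loadmat(path)                       # MATLAB-style 2-D shapes
    squeezed = loadmat(path, squeeze_me=True)   # length-1 dimensions dropped

plain_shapes = (np.shape(plain["vector"]), np.shape(plain["scalar"]))
squeezed_shapes = (np.shape(squeezed["vector"]), np.shape(squeezed["scalar"]))

print(plain_shapes)     # ((1, 3), (1, 1))
print(squeezed_shapes)  # ((3,), ())
```

Squeezing is convenient for scalars and vectors, but it also means a variable's number of dimensions depends on its contents, so code that expects 2-D input may prefer the default.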