scipy.io.loadmat nested structures (i.e. dictionaries)

mer*_*gen 31 python dictionary nested structure scipy

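Using the given routines (How to load Matlab .mat files with scipy), I cannot access deeper nested structures to recover them as dictionaries.

To show the problem in more detail, here is a toy example: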

import scipy.io as spio
a = {'b':{'c':{'d': 3}}}
# my dictionary: a['b']['c']['d'] = 3
spio.savemat('xy.mat',a)

Now I want to read the mat-file back into Python. I tried the following:

vig=spio.loadmat('xy.mat',squeeze_me=True)

If I now try to access the fields, I get:

>> vig['b']
array(((array(3),),), dtype=[('c', '|O8')])
>> vig['b']['c']
array(array((3,), dtype=[('d', '|O8')]), dtype=object)
>> vig['b']['c']['d']
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/<ipython console> in <module>()

ValueError: field named d not found.

However, with the option struct_as_record=False the field can be accessed:

v=spio.loadmat('xy.mat',squeeze_me=True,struct_as_record=False)

Now the field can be reached by attribute access:

>> v['b'].c.d
array(3)

mer*_*gen 45

Here are functions that reconstruct the dictionaries; just use this loadmat instead of scipy.io's loadmat:

import scipy.io as spio

def loadmat(filename):
    '''
    this function should be called instead of direct spio.loadmat
    as it cures the problem of not properly recovering python dictionaries
    from mat files. It calls the function check keys to cure all entries
    which are still mat-objects
    '''
    data = spio.loadmat(filename, struct_as_record=False, squeeze_me=True)
    return _check_keys(data)

def _check_keys(d):
    '''
    checks if entries in dictionary are mat-objects. If yes
    todict is called to change them to nested dictionaries
    '''
    # "d" instead of "dict" so the builtin dict type is not shadowed
    for key in d:
        if isinstance(d[key], spio.matlab.mio5_params.mat_struct):
            d[key] = _todict(d[key])
    return d

def _todict(matobj):
    '''
    A recursive function which constructs from matobjects nested dictionaries
    '''
    d = {}
    for strg in matobj._fieldnames:
        elem = matobj.__dict__[strg]
        if isinstance(elem, spio.matlab.mio5_params.mat_struct):
            d[strg] = _todict(elem)
        else:
            d[strg] = elem
    return d
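To check the round trip from the question against this approach, here is a minimal, self-contained sketch. It is a compact variant of the functions above, not the answer's exact code: the names struct_to_dict and load_nested are made up for illustration, and the try/except import is an assumption about where mat_struct lives in different SciPy versions (the public scipy.io.matlab namespace in newer releases, mio5_params in older ones).

```python
import os
import tempfile

import scipy.io as spio

try:
    # Newer SciPy releases expose mat_struct in the public namespace.
    from scipy.io.matlab import mat_struct
except ImportError:
    # Older releases only have it in the mio5_params module.
    from scipy.io.matlab.mio5_params import mat_struct


def struct_to_dict(value):
    """Recursively turn mat_struct objects into plain dicts."""
    if isinstance(value, mat_struct):
        return {name: struct_to_dict(getattr(value, name))
                for name in value._fieldnames}
    return value


def load_nested(filename):
    """Hypothetical wrapper: load a .mat file and convert structs to dicts."""
    data = spio.loadmat(filename, struct_as_record=False, squeeze_me=True)
    return {key: struct_to_dict(val) for key, val in data.items()}


# Round trip of the toy dictionary from the question.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "xy.mat")
    spio.savemat(path, {"b": {"c": {"d": 3}}})
    restored = load_nested(path)

print(restored["b"])
```

Depending on the SciPy version, the innermost value may come back as a Python scalar or as a 0-d array, so compare it with == rather than relying on its exact type.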

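  • This needs to be better publicized. The current implementation of scipy's loadmat is a real pain to work with. Great work! (2 upvotes)
  • Actually, @jpapon's method below is better, and is necessary when dealing with arrays such as images. (2 upvotes)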

小智 18

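This is just an enhancement of mergen's answer, which unfortunately stops recursing when it reaches a cell array of objects. The following version turns cell arrays into lists instead and, where possible, keeps recursing into the cell array elements.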

import scipy.io
import scipy.io as spio  # the helpers below refer to scipy.io as spio
import numpy as np


def loadmat(filename):
    '''
    this function should be called instead of direct spio.loadmat
    as it cures the problem of not properly recovering python dictionaries
    from mat files. It calls the function check keys to cure all entries
    which are still mat-objects
    '''
    def _check_keys(d):
        '''
        checks if entries in dictionary are mat-objects. If yes
        todict is called to change them to nested dictionaries
        '''
        for key in d:
            if isinstance(d[key], spio.matlab.mio5_params.mat_struct):
                d[key] = _todict(d[key])
        return d

    def _todict(matobj):
        '''
        A recursive function which constructs from matobjects nested dictionaries
        '''
        d = {}
        for strg in matobj._fieldnames:
            elem = matobj.__dict__[strg]
            if isinstance(elem, spio.matlab.mio5_params.mat_struct):
                d[strg] = _todict(elem)
            elif isinstance(elem, np.ndarray):
                d[strg] = _tolist(elem)
            else:
                d[strg] = elem
        return d

    def _tolist(ndarray):
        '''
        A recursive function which constructs lists from cellarrays
        (which are loaded as numpy ndarrays), recursing into the elements
        if they contain matobjects.
        '''
        elem_list = []
        for sub_elem in ndarray:
            if isinstance(sub_elem, spio.matlab.mio5_params.mat_struct):
                elem_list.append(_todict(sub_elem))
            elif isinstance(sub_elem, np.ndarray):
                elem_list.append(_tolist(sub_elem))
            else:
                elem_list.append(sub_elem)
        return elem_list
    data = scipy.io.loadmat(filename, struct_as_record=False, squeeze_me=True)
    return _check_keys(data)

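  • Great work. It would be great if this were incorporated into scipy. (4 upvotes)
  • I suggest using the [improved version](https://stackoverflow.com/suggested-edits/4048476), which tests the array contents for structs before converting the ndarray to a list. (2 upvotes)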
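One way to read the last comment is to convert only object-dtype arrays (which is how cell arrays are loaded) and leave numeric arrays such as images untouched. The sketch below illustrates that idea under stated assumptions; it is not the code from the linked suggested edit. The helper names convert and load_nested are hypothetical, and the try/except import assumes mat_struct is in scipy.io.matlab for newer SciPy and in mio5_params for older versions.

```python
import os
import tempfile

import numpy as np
import scipy.io as spio

try:
    from scipy.io.matlab import mat_struct  # newer SciPy
except ImportError:
    from scipy.io.matlab.mio5_params import mat_struct  # older SciPy


def convert(value):
    """Recursively convert structs to dicts and cell arrays to lists."""
    if isinstance(value, mat_struct):
        return {name: convert(getattr(value, name))
                for name in value._fieldnames}
    # Cell arrays arrive as object-dtype ndarrays; numeric arrays are kept.
    if isinstance(value, np.ndarray) and value.dtype == object:
        if value.ndim == 0:
            return convert(value.item())
        return [convert(item) for item in value]
    return value


def load_nested(filename):
    """Hypothetical wrapper around scipy.io.loadmat."""
    data = spio.loadmat(filename, struct_as_record=False, squeeze_me=True)
    return {key: convert(val) for key, val in data.items()}


# Test data: a struct holding a cell array of structs and a numeric "image".
cells = np.array([{"x": 1}, {"x": 2}], dtype=object)
image = np.arange(6).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cells.mat")
    spio.savemat(path, {"s": {"cells": cells, "img": image}})
    result = load_nested(path)

print(type(result["s"]["cells"]), type(result["s"]["img"]))
```

Note that with squeeze_me=True a cell array holding a single element is squeezed away, so it comes back as the element itself rather than as a one-item list.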

Par*_*dhu 8

As of scipy >= 1.5.0, this functionality is built in via the simplify_cells argument:

from scipy.io import loadmat

mat_dict = loadmat(file_name, simplify_cells=True)
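As a quick check, the toy dictionary from the question can be round-tripped with this argument. This is a minimal sketch that assumes SciPy 1.5.0 or later; the temporary file path is only there to keep the example self-contained.

```python
import os
import tempfile

from scipy.io import loadmat, savemat

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "xy.mat")
    # Save the nested dictionary from the question.
    savemat(path, {"b": {"c": {"d": 3}}})
    # simplify_cells=True returns structs as nested dicts.
    mat_dict = loadmat(path, simplify_cells=True)

print(mat_dict["b"]["c"]["d"])
```

The returned dictionary also contains the usual metadata keys such as __header__, so select the variables you need by name.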