SimpleJSON and NumPy arrays

epo*_*och 41 python json numpy simplejson

What is the most efficient way to serialize a numpy array using simplejson?

小智 78

To preserve the dtype and the dimensions, try the following:

import base64
import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):

    def default(self, obj):
        """If input object is an ndarray it will be converted into a dict 
        holding dtype, shape and the data, base64 encoded.
        """
        if isinstance(obj, np.ndarray):
            if obj.flags['C_CONTIGUOUS']:
                obj_data = obj.data
            else:
                cont_obj = np.ascontiguousarray(obj)
                assert(cont_obj.flags['C_CONTIGUOUS'])
                obj_data = cont_obj.data
            # b64encode returns bytes on Python 3; decode so json can serialize it
            data_b64 = base64.b64encode(obj_data).decode('ascii')
            return dict(__ndarray__=data_b64,
                        dtype=str(obj.dtype),
                        shape=obj.shape)
        # Let the base class default method raise the TypeError
        return super(NumpyEncoder, self).default(obj)


def json_numpy_obj_hook(dct):
    """Decodes a previously encoded numpy ndarray with proper shape and dtype.

    :param dct: (dict) json encoded ndarray
    :return: (ndarray) if input was an encoded ndarray
    """
    if isinstance(dct, dict) and '__ndarray__' in dct:
        data = base64.b64decode(dct['__ndarray__'])
        return np.frombuffer(data, dct['dtype']).reshape(dct['shape'])
    return dct

expected = np.arange(100, dtype=np.float64)  # the np.float alias was removed in NumPy 1.24
dumped = json.dumps(expected, cls=NumpyEncoder)
result = json.loads(dumped, object_hook=json_numpy_obj_hook)


# None of the following assertions will be broken.
assert result.dtype == expected.dtype, "Wrong Type"
assert result.shape == expected.shape, "Wrong Shape"
assert np.allclose(expected, result), "Wrong Values"

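  • @Community This was edited for C_CONTIGUOUS, similar to my answer at http://stackoverflow.com/a/29853094/3571110. Looking at it, I believe np.ascontiguousarray() is a no-op for an array that is already C_CONTIGUOUS, which makes the if/else check unnecessary compared with simply always calling np.ascontiguousarray(). Am I right? (3 upvotes)
  • To fix an infinite recursion problem, I changed `return json.JSONEncoder(self,obj)` to `super(JsonNumpy,self).default(obj)` (3 upvotes)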
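
The first comment can be checked directly. The following is a minimal sketch rather than the answer author's code: it assumes Python 3, and the class name `ContiguousNumpyEncoder` is made up for illustration. It calls np.ascontiguousarray() unconditionally and verifies that no copy is made when the input is already C-contiguous.

```python
import base64
import json

import numpy as np


class ContiguousNumpyEncoder(json.JSONEncoder):  # hypothetical name
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            # Returns the same data without copying if obj is already C-contiguous.
            cont = np.ascontiguousarray(obj)
            return {
                "__ndarray__": base64.b64encode(cont.tobytes()).decode("ascii"),
                "dtype": str(obj.dtype),
                "shape": obj.shape,
            }
        return super().default(obj)


a = np.arange(12, dtype=np.float64).reshape(3, 4)
transposed = a.T  # a view that is not C-contiguous

# Already contiguous: the result still shares memory with the input.
shares_when_contiguous = np.shares_memory(np.ascontiguousarray(a), a)
# Not contiguous: a fresh copy is made.
shares_when_transposed = np.shares_memory(np.ascontiguousarray(transposed), a)

# Round trip of the non-contiguous view.
payload = json.loads(json.dumps(transposed, cls=ContiguousNumpyEncoder))
restored = np.frombuffer(
    base64.b64decode(payload["__ndarray__"]), dtype=payload["dtype"]
).reshape(payload["shape"])
```

Under these assumptions the if/else branch in the answer can be collapsed into a single call without changing the result.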

Ale*_*lli 28

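I would use simplejson.dumps(somearray.tolist()) as the most convenient approach (if I were still using simplejson at all, which implies being stuck with Python 2.5 or earlier; 2.6 and later have a standard library module json that works the same way, so of course I would use that if the Python release in use supported it ;-).

In a quest for greater efficiency, you could subclass json.JSONEncoder (in json; I don't know whether the older simplejson already offered such customization possibilities) and, in the default method, special-case instances of numpy.array by turning them into a list or tuple "just in time". I rather doubt that you would gain enough in performance from such an approach to justify the effort, though.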
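
The two approaches described above can be compared with a rough sketch like the one below. The encoder name `ListifyEncoder` and the array size are assumptions chosen for illustration, and the timings will vary by machine, so no particular result is claimed.

```python
import json
import timeit

import numpy as np


class ListifyEncoder(json.JSONEncoder):  # hypothetical name
    def default(self, obj):
        # Convert arrays to lists only when the encoder encounters them.
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)


arr = np.arange(1000, dtype=np.float64)

eager = json.dumps(arr.tolist())           # convert up front
lazy = json.dumps(arr, cls=ListifyEncoder)  # convert "just in time"

t_eager = timeit.timeit(lambda: json.dumps(arr.tolist()), number=200)
t_lazy = timeit.timeit(lambda: json.dumps(arr, cls=ListifyEncoder), number=200)
print("eager: %.4fs  lazy: %.4fs" % (t_eager, t_lazy))
```

Both calls produce the same JSON text; the subclass mainly saves you from calling tolist() by hand at every call site.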


小智 17

I found this json subclass code for serializing one-dimensional numpy arrays inside a dictionary. I tried it and it works for me.

import json
import numpy

class NumpyAwareJSONEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, numpy.ndarray) and obj.ndim == 1:
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)

My dictionary is 'results'. Here is how I write it to the file "data.json":

j = json.dumps(results, cls=NumpyAwareJSONEncoder)
# The with block closes the file even if the write fails
with open("data.json", "w") as f:
    f.write(j)

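  • This approach also works when you have a numpy array nested inside a dict. This answer (I think) implies what I just said, but it is an important point. (2 upvotes)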
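
To illustrate the nested case from the comment, here is a small sketch. The dictionary contents are invented for the example, and the encoder is repeated so that the snippet runs on its own. It also shows that the `obj.ndim == 1` check makes the encoder reject arrays with more than one dimension.

```python
import json

import numpy as np


class NumpyAwareJSONEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray) and obj.ndim == 1:
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)


# Hypothetical data: a 1-D array nested two levels deep.
results = {"run": 1, "metrics": {"loss": np.array([0.5, 0.25])}}
text = json.dumps(results, cls=NumpyAwareJSONEncoder)
print(text)  # {"run": 1, "metrics": {"loss": [0.5, 0.25]}}

# A 2-D array falls through to the base class, which raises TypeError.
try:
    json.dumps({"m": np.zeros((2, 2))}, cls=NumpyAwareJSONEncoder)
    raised = False
except TypeError:
    raised = True
```

Since tolist() handles any number of dimensions, dropping the ndim check would be one way to accept such arrays as well.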

unu*_*tbu 10

This shows how to convert a 1D NumPy array to JSON and back to an array:

try:
    import json
except ImportError:
    import simplejson as json
import numpy as np

def arr2json(arr):
    return json.dumps(arr.tolist())
def json2arr(astr,dtype):
    return np.fromiter(json.loads(astr),dtype)

arr=np.arange(10)
astr=arr2json(arr)
print(repr(astr))
# '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]'
dt=np.int32
arr=json2arr(astr,dt)
print(repr(arr))
# array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)  # the dtype suffix is omitted where int32 is the default integer
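
np.fromiter builds one-dimensional arrays only, so the helper above does not handle nested lists. A possible variant for arrays of any dimension, sketched here with a hypothetical helper name `json2arr_nd`, uses np.array instead:

```python
import json

import numpy as np


def arr2json(arr):
    return json.dumps(arr.tolist())


def json2arr_nd(astr, dtype):  # hypothetical helper, not part of the answer
    # np.array rebuilds the shape from the nesting of the decoded lists.
    return np.array(json.loads(astr), dtype=dtype)


grid = np.arange(6, dtype=np.int32).reshape(2, 3)
astr = arr2json(grid)
print(astr)  # [[0, 1, 2], [3, 4, 5]]
restored = json2arr_nd(astr, np.int32)
```

As with the 1D version, the dtype is not stored in the JSON text and has to be supplied by the caller.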

Building on tlausch's answer, here is a way to JSON-encode a NumPy array while preserving the shape and dtype of any NumPy array, including those with a complex dtype.

import base64
import io

class NDArrayEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            output = io.BytesIO()
            np.savez_compressed(output, obj=obj)
            # b64encode returns bytes on Python 3; decode so json can serialize it
            return {'b64npz' : base64.b64encode(output.getvalue()).decode('ascii')}
        return json.JSONEncoder.default(self, obj)


def ndarray_decoder(dct):
    if isinstance(dct, dict) and 'b64npz' in dct:
        output = io.BytesIO(base64.b64decode(dct['b64npz']))
        output.seek(0)
        return np.load(output)['obj']
    return dct

# Make expected non-contiguous structured array:
# (explicit 8-byte integers so the view below works on every platform)
expected = np.arange(10, dtype=np.int64)[::2]
expected = expected.view('<i4,<f4')

dumped = json.dumps(expected, cls=NDArrayEncoder)
result = json.loads(dumped, object_hook=ndarray_decoder)

assert result.dtype == expected.dtype, "Wrong Type"
assert result.shape == expected.shape, "Wrong Shape"
assert np.array_equal(expected, result), "Wrong Values"
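
Since the question asks about efficiency, the size of the encoded text is one thing that can be compared across the approaches in this thread. The sketch below is only a rough illustration. The array (1000 random float64 values with a fixed seed) is an assumption, and the results depend heavily on the data, so it prints the sizes rather than claiming a ranking.

```python
import base64
import io
import json

import numpy as np

rng = np.random.default_rng(0)
arr = rng.random(1000)  # 1000 float64 values, 8000 bytes of raw data

# Approach 1: plain JSON list of numbers.
as_list = json.dumps(arr.tolist())

# Approach 2: raw buffer, base64 encoded (dtype and shape stored separately).
as_raw_b64 = base64.b64encode(arr.tobytes()).decode("ascii")

# Approach 3: compressed .npz archive, base64 encoded.
buf = io.BytesIO()
np.savez_compressed(buf, obj=arr)
as_npz_b64 = base64.b64encode(buf.getvalue()).decode("ascii")

print("list: %d  raw base64: %d  npz base64: %d characters"
      % (len(as_list), len(as_raw_b64), len(as_npz_b64)))
```

The raw base64 size is fixed by the byte count (4 characters per 3 bytes), whereas the list size depends on how many digits each number needs and the npz size on how well the data compresses.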