Why is dumping with `pickle` so much faster than with `json`?

use*_*210 3 python benchmarking json pickle

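This is for Python 3.6.

Edited to remove a lot of irrelevant detail.

Based on other answers and comments on Stack Overflow, I expected json to be faster than pickle, and it looks like many other people believe the same.

Is my test valid? The gap is much larger than I expected. I get the same result when testing with very large objects.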

import json
import pickle
import timeit

file_name = 'foo'
num_tests = 100000

obj = {1: 1}

command = 'pickle.dumps(obj)'
setup = 'from __main__ import pickle, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("pickle: %f seconds" % result)

command = 'json.dumps(obj)'
setup = 'from __main__ import json, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("json:   %f seconds" % result)

And the output:

pickle: 0.054130 seconds
json:   0.467168 seconds

Ahm*_*akr 5

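I tried several approaches based on your snippet and found that using cPickle with the protocol argument of the dump call set to the highest protocol, i.e. cPickle.dumps(obj, protocol=cPickle.HIGHEST_PROTOCOL), is the fastest way to dump.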

import msgpack
import json
import pickle
import timeit
import cPickle  # Python 2 only; on Python 3 the plain pickle module takes its place
import numpy as np

num_tests = 10

obj = np.random.normal(0.5, 1, [240, 320, 3])

command = 'pickle.dumps(obj)'
setup = 'from __main__ import pickle, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("pickle:  %f seconds" % result)

command = 'cPickle.dumps(obj)'
setup = 'from __main__ import cPickle, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("cPickle:   %f seconds" % result)


command = 'cPickle.dumps(obj, protocol=cPickle.HIGHEST_PROTOCOL)'
setup = 'from __main__ import cPickle, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("cPickle highest:   %f seconds" % result)

command = 'json.dumps(obj.tolist())'
setup = 'from __main__ import json, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("json:   %f seconds" % result)


command = 'msgpack.packb(obj.tolist())'
setup = 'from __main__ import msgpack, obj'
result = timeit.timeit(command, setup=setup, number=num_tests)
print("msgpack:   %f seconds" % result)

Output:

pickle         :   0.847938 seconds
cPickle        :   0.810384 seconds
cPickle highest:   0.004283 seconds
json           :   1.769215 seconds
msgpack        :   0.270886 seconds
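On Python 3, `cPickle` no longer exists and `pickle` uses its C implementation automatically when available, so the same protocol comparison can be sketched with the standard library alone. This is an illustration under stated assumptions, not a rerun of the benchmark above: the nested list of floats stands in for the NumPy array, the helper name `time_dump` is hypothetical, and the timings will differ from the numbers shown.

```python
import json
import pickle
import random
import timeit

# Stand-in data (assumption): a 240 x 320 nested list of floats
# instead of the NumPy array used in the answer.
random.seed(0)
obj = [[random.gauss(0.5, 1) for _ in range(320)] for _ in range(240)]

def time_dump(dump, number=10):
    """Return the total seconds taken by `number` calls of dump(obj)."""
    return timeit.timeit(lambda: dump(obj), number=number)

results = {
    # Protocol 0 is the original text-based pickle format.
    "pickle protocol 0": time_dump(lambda o: pickle.dumps(o, protocol=0)),
    # HIGHEST_PROTOCOL selects the newest binary format the interpreter supports.
    "pickle highest": time_dump(
        lambda o: pickle.dumps(o, protocol=pickle.HIGHEST_PROTOCOL)
    ),
    "json": time_dump(json.dumps),
}

for name, seconds in results.items():
    print("%-18s: %f seconds" % (name, seconds))
```

The binary protocols store each float as fixed-width bytes, whereas protocol 0 and JSON write a textual representation, which is one plausible contributor to the gap.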

  • For future readers: I ran this on Python 3.5 with `msgpack` 0.5.6 and `pickle` (the Python 3 equivalent of `cPickle`), and `pickle` is now faster than `msgpack`: (2 upvotes)