Posts by jo2*_*248

Can bz2 in Python decompress to a file instead of into memory?

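I have used the bz2 library to decompress and read files in memory. However, after reading through the documentation, it seems there is no simple way to decompress a file so that a brand-new file containing the decompressed data is created on the file system, without holding the data in memory. Of course, you could read line by line with BZ2Decompressor and write each line out to a file, but that would be very slow (the files are huge, 50GB+). Is there a method or library I have overlooked that does the same thing in Python as the terminal command bz2 -d myfile.ext.bz2, without resorting to a hacky solution that calls that terminal command through a subprocess?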

An example of how slow bz2 is:

Decompressing the file with bz2 -d: 104 seconds

Running the analysis on the decompressed file (which only involves reading it line by line): 183 seconds

with open(file_src) as x:
    for l in x:
        pass  # per-line analysis omitted in the original post


Decompressing the file and running the analysis in one pass: more than 600 seconds (this should be at most 104+183)

import bz2

if file_src.endswith(".bz2"):
    bz_file = bz2.BZ2File(file_src)
    for l in bz_file:
        pass  # per-line analysis omitted in the original post

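For reference, one way to write the decompressed data straight to a new file without holding it all in memory is to copy it in fixed-size chunks instead of line by line. This is only a minimal sketch using the standard library's bz2.open and shutil.copyfileobj; the function name decompress_bz2_to_file, its parameters, and the 1 MiB chunk size are assumptions made for illustration, and no timing against bz2 -d has been measured here.

```python
import bz2
import shutil


def decompress_bz2_to_file(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream-decompress src_path into dst_path, chunk_size bytes at a time.

    Hypothetical helper: the name and the default chunk size are
    illustrative choices, not something taken from the original post.
    """
    # bz2.open in binary mode returns a file object that decompresses on read.
    with bz2.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # copyfileobj reads and writes in blocks, so at most about one
        # chunk of decompressed data is held in memory at any moment.
        shutil.copyfileobj(src, dst, chunk_size)
```

Called as decompress_bz2_to_file("myfile.ext.bz2", "myfile.ext"), this leaves the compressed file in place, unlike the terminal command, which by default replaces it.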

python compression

jo2*_*248

2018 03-03

Score: 4

Answers: 1

Views: 4228

How to efficiently find all differences between two dictionaries in Python

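I have two dictionaries. I need to find the keys that are missing from each one and, for the keys they share, check whether the values are the same or different.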

dict1 = {..}
dict2 = {..}
# lists of keys that are missing from each dictionary
missing_in_dict1_but_in_dict2 = []
missing_in_dict2_but_in_dict1 = []
# list of keys whose values differ between the two dictionaries
mismatch = []


What is the most efficient way to do this?
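One possible approach, sketched here under the assumption that the keys are hashable and the values can be compared with !=, is to use the set operations that dictionary key views support. The function name diff_dicts and the sample data are hypothetical; the three result names follow the question.

```python
def diff_dicts(dict1, dict2):
    """Return (missing_in_dict1_but_in_dict2, missing_in_dict2_but_in_dict1, mismatch).

    Hypothetical helper for illustration; result lists are in no
    particular order because they come from set operations.
    """
    keys1, keys2 = dict1.keys(), dict2.keys()
    # Key views behave like sets, so difference and intersection work directly.
    missing_in_dict1_but_in_dict2 = list(keys2 - keys1)
    missing_in_dict2_but_in_dict1 = list(keys1 - keys2)
    # Only keys present in both dictionaries can have mismatched values.
    mismatch = [k for k in keys1 & keys2 if dict1[k] != dict2[k]]
    return missing_in_dict1_but_in_dict2, missing_in_dict2_but_in_dict1, mismatch


# Made-up sample data, not from the original post.
dict1 = {"a": 1, "b": 2, "c": 3}
dict2 = {"b": 2, "c": 4, "d": 5}
print(diff_dicts(dict1, dict2))  # (['d'], ['a'], ['c'])
```

Each dictionary is walked a constant number of times, so the work grows roughly linearly with the total number of keys.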

python dictionary

jo2*_*248

lucky-day

Score: 1

Answers: 1

Views: 6777