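Related Questions (0)

Is there a memory-efficient and fast way to load big JSON files in Python?

I have some 500 MB JSON files. If I use the "trivial" json.load to load the content all at once, it consumes a lot of memory.

Is there a way to read the file partially? If it were a line-delimited text file, I would be able to iterate over the lines. I am looking for an analogue of that.

Any suggestions? Thanks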
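
One possible analogue of line-by-line iteration, using only the standard library, is to decode a top-level array one element at a time with json.JSONDecoder.raw_decode while reading the file in chunks. This is a minimal sketch under stated assumptions: the file holds a single top-level JSON array, and iter_json_array and chunk_size are hypothetical names introduced here for illustration.

```python
import io
import json


def iter_json_array(fp, chunk_size=65536):
    """Yield the elements of a top-level JSON array one at a time.

    Assumes fp is a text-mode file object whose content is one JSON array.
    The separator handling is lax: stray or trailing commas are tolerated.
    """
    decoder = json.JSONDecoder()
    buf = ""
    at_eof = False

    def fill():
        # Read one more chunk into the buffer, or record end of file.
        nonlocal buf, at_eof
        chunk = fp.read(chunk_size)
        if chunk:
            buf += chunk
        else:
            at_eof = True

    # Consume leading whitespace and the opening bracket.
    while not buf.strip() and not at_eof:
        fill()
    buf = buf.lstrip()
    if not buf.startswith("["):
        raise ValueError("expected a top-level JSON array")
    buf = buf[1:]

    while True:
        # Skip whitespace and the comma that separates two elements.
        buf = buf.lstrip(" \t\r\n,")
        if buf.startswith("]"):
            return
        try:
            item, end = decoder.raw_decode(buf)
        except json.JSONDecodeError:
            if at_eof:
                raise
            fill()  # the element is probably cut off; read more and retry
            continue
        rest = buf[end:].lstrip()
        if rest[:1] in (",", "]"):
            # The element is followed by a delimiter, so it is complete.
            yield item
            buf = rest
        elif at_eof:
            raise ValueError("malformed or truncated JSON array")
        else:
            fill()  # e.g. a number split across two chunks


# Demo on an in-memory file; a tiny chunk size forces many refills.
demo = io.StringIO('[{"score": 68}, {"score": 78}]')
items = list(iter_json_array(demo, chunk_size=4))
```

Only the current chunk and the element being decoded are held in memory, so peak usage depends on the largest single element rather than on the file size. A very large element is re-parsed after each refill, so this sketch suits arrays of many small items.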

python json large-files

dud*_*ein

2010 03-15

56
votes

5
answers

50k
views

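How to write large JSON data?

I have been trying to write a large amount (> 800 MB) of data to a JSON file; it took a fair amount of trial and error to arrive at this code: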

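import json

def write_to_cube(data):
    # Load the existing contents; the with block closes the file on exit.
    with open('test.json') as file1:
        temp_data = json.load(file1)

    # Merge the new entries into the loaded dictionary.
    temp_data.update(data)

    # Rewrite the whole file with the merged data.
    with open('test.json', 'w') as f:
        json.dump(temp_data, f)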

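To run it, just call the function write_to_cube({"some_data": data})

The problem with this code is that it is fast for small amounts of data, but it breaks down once the test.json file exceeds 800 MB. When I try to update or add data, it takes a very long time.

I know there are external libraries such as simplejson or jsonpickle, but I am not sure how to use them.

Is there any other way to solve this problem?

Update:

I am not sure how this can be a duplicate; the other posts do not mention writing or updating large JSON files, only parsing.

Is there a memory efficient and fast way to load big json files in python?

Reading rather large json files in Python

Neither of the above makes this a duplicate. They say nothing about writing or updating.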
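
The slowdown described above comes from reading and rewriting the entire file on every update. One way around that, sketched here under assumptions, is to switch to an append-only layout with one JSON object per line (often called JSON Lines), so each update only writes its own data. This changes the file format, so any reader of test.json would need adapting; append_update, load_merged and the .jsonl file name are hypothetical names for illustration.

```python
import json
import os
import tempfile


def append_update(path, data):
    """Append one update as a single JSON line instead of rewriting the file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(data) + "\n")


def load_merged(path):
    """Replay every update in order; later lines overwrite earlier keys."""
    merged = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                merged.update(json.loads(line))
    return merged


# Demo in a temporary directory (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "test.jsonl")
append_update(path, {"a": 1})
append_update(path, {"b": 2, "a": 3})
result = load_merged(path)
```

Appending costs time proportional to the size of the update, not the size of the file. The trade-off is that reading back the merged state still walks the whole file, and superseded entries stay on disk until the file is compacted.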

python json

Aks*_*hay

2017 05-23

6
votes

1
answer

10k
views

Could not reserve memory block: error importing JSON in Python

import pandas as pd
with open(r'data.json') as f:
   df = pd.read_json(f, encoding='utf-8')

I get a "could not reserve memory block" error. The JSON file is 300 MB. Is there a limit on how much memory Python can reserve for a running program? My PC has 8 GB of RAM and runs Windows 10.

loading of json file into df
Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm 2018.1.4\helpers\pydev\pydev_run_in_console.py", line 52, in run_file
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2018.1.4\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/Beorn/PycharmProjects/project_0/projekt/test.py", line 7, in <module>
    df = pd.read_json(f, encoding='utf-8')
  File "C:\Users\Beorn\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\io\json\json.py", line 422, in read_json
    result = json_reader.read()
  File "C:\Users\Beorn\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\io\json\json.py", …
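
The interpreter path in the traceback contains "Python36-32", which suggests (an assumption based only on the folder name) a 32-bit Python build. A 32-bit process can address at most 4 GB regardless of how much RAM is installed, so it may be worth checking the interpreter's bitness first. A minimal sketch using only the standard library:

```python
import struct
import sys

# Size of a C pointer in bits: 32 on a 32-bit build, 64 on a 64-bit build.
pointer_bits = struct.calcsize("P") * 8

# Cross-check: sys.maxsize only exceeds 2**32 on 64-bit builds.
is_64bit = sys.maxsize > 2**32

print(f"Python {sys.version_info.major}.{sys.version_info.minor}, {pointer_bits}-bit")
```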

python import json

Beo*_*orn

2022 05-17

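6
votes

1
answer

10k
views

How to find unique values in a large JSON file?

I have two JSON files, data_large (150.1 MB) and data_small (7.5 KB). The content of each file has the form [{"score": 68},{"score": 78}]. I need to find the list of unique scores in each file.

When processing data_small, I did the following and it finished in 0.1 secs.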

import json

with open('data_small') as f:
    content = json.load(f)

print(content)  # I'll be applying the logic to find the unique values later.

But when processing data_large, I did the following and my system hung and became so slow that I had to force a shutdown to bring it back to normal speed. It took around 2 mins to print the contents.

import json

with open('data_large') as f:
    content = json.load(f)

print(content)  # I'll be applying the logic to find the unique values later.

How can I make the program more efficient when processing large data sets?
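
The "unique values" logic that the snippets above defer can be sketched with a set comprehension. This assumes every record is a dict with a "score" key, as in the sample content shown; unique_scores is a hypothetical helper name. Printing only the result, instead of the whole loaded list, also avoids dumping 150 MB of text to the console, which may account for part of the delay (an assumption, not something measured here).

```python
import json


def unique_scores(records):
    """Return the distinct "score" values of the records, sorted ascending."""
    return sorted({record["score"] for record in records})


# Small in-memory sample in the same shape as the files in the question.
sample = json.loads('[{"score": 68}, {"score": 78}, {"score": 68}]')
scores = unique_scores(sample)
print(scores)
```

The set holds each distinct score once, so the extra memory grows with the number of unique values, not with the number of records.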

python json

pyt*_*der

2014 01-04

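5
votes

1
answer

9804
views

Django fixtures: the process loading initial data is getting killed

I have been working on refining and refactoring 57k+ records from two legacy databases into a single Django-compatible entity. Now that I am done, I dumped it as a fixture and am trying to load it in the production environment.

My problem is that the process gets "Killed" after a while. My process is: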

./manage.py syncdb --noinput
./manage.py loaddata core/fixtures/auth.json  # just a default user
./manage.py migrate

Result:

Running migrations for django_extensions:  # custom apps migrate just fine
 - Migrating forwards to 0001_empty.
 > django_extensions:0001_empty
 - Loading initial data for django_extensions.
Installed 0 object(s) from 0 fixture(s)
Running migrations for myotherapp:
 - Migrating forwards to 0001_initial.
 > myotherapp:0001_initial
 - Loading initial data for myotherapp.
Installed 4 object(s) from 1 fixture(s)  # my other app with a fixture migrates ok
Running migrations …

django django-south django-fixtures

gwa*_*dze

2013 10-29

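4
votes

1
answer

2400
views

How to handle large JSON files in PyTorch?

I am working on a time series problem. The different training time series are stored in a single large 30 GB JSON file. In TensorFlow, I know how to use TFRecords. Is there a similar approach in PyTorch?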

time-series deep-learning pytorch

Sha*_*ana

2020 08-04

1
vote

1
answer

1467
views

JSON schema generator in Python

I am using this resource to generate the schema: https://github.com/wolverdude/GenSON/

I have the following JSON file

{
 'name':'Sam',
},
{
 'name':'Jack',
}

and so on ...

I would like to know how to iterate over a large JSON file. I want to parse each JSON file and pass it to GenSON to generate the schema

{
  "$schema": "http://json-schema.org/schema#",
  "type": "object",
  "properties": {
     "name": {
       "type": [
        "string"
      ]
   }
},
  "required": [
    "name"
  ]
}

python json

sam*_*pin

2019 05-08

1
vote

1
answer

4442
views