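I have a 20 GB .ndjson file that I want to open with Python. The file is too big, so I found a way to split it into 50 files with an online tool. This is the tool: https://pinetools.com/split-files
Now I have a file with the extension .ndjson.000 (I don't know what that is).
I tried to open it as a json or csv file to read it in pandas, but it doesn't work. Do you know how to solve this?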
import json
import pandas as pd
First approach:
df = pd.read_json('dump.ndjson.000', lines=True)
Error: ValueError: Unmatched ''"' when when decoding 'string'
Second approach:
with open('dump.ndjson.000', 'r') as f:
    my_data = f.read()
print(my_data)
Error: json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 104925061 (char 104925060)
I think the problem is that my file contains some emojis, and I don't know how to encode them.
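One possible explanation, offered as an assumption rather than a confirmed diagnosis: a tool that splits a file by byte count can cut a record (or even a multi-byte emoji) in half at the end of each part, which would be consistent with the "Unterminated string" error. Below is a minimal sketch that reads such a part line by line and skips any line that is not valid JSON. The helper name `iter_ndjson` is hypothetical, and the file is assumed to be UTF-8 encoded.

```python
import json
from typing import Any, Iterator


def iter_ndjson(path: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of an NDJSON file.

    Lines that fail to parse (for example a record truncated at a split
    boundary) are reported and skipped instead of aborting the whole read.
    """
    # errors="replace" keeps decoding going if a multi-byte character
    # (such as an emoji) was cut in the middle by a byte-based split.
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                print(f"skipping line {line_no}: {exc.msg}")
```

The records can then be collected with something like `pd.DataFrame(list(iter_ndjson('dump.ndjson.000')))`. Note that skipped records are lost unless the broken tail of one part is joined with the head of the next part.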
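I have an ndjson (newline-delimited JSON) file that I need to parse to get data for some logical operations. Is there a good way to parse an ndjson file using golang? A sample ndjson is given below.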
{"a":"1","b":"2","c":[{"d":"100","e":"10"}]}
{"a":"2","b":"2","c":[{"d":"101","e":"11"}]}
{"a":"3","b":"2","c":[{"d":"102","e":"12"}]}
How do I open the ndJSON format in SQL Server 2016? I can open the JSON format, but I have no idea how to handle ndJSON.
Is there a specific function in SQL Server to do this, or is there another way?
Declare @JSON varchar(max)
SELECT @JSON = BulkColumn
FROM OPENROWSET (BULK 'C:\examplepath\filename.JSON', SINGLE_CLOB) as j
Select * FROM OPENJSON(@JSON)
With (House varchar(50),
Car varchar(4000) '$.Attributes.Car',
Door varchar(4000) '$.Attributes.Door',
Bathroom varchar(4000) '$.Attributes.Bathroom' ,
Basement varchar(4000) '$.Attributes.Basement' ,
Attic varchar(4000) '$.Attributes.Attic'
) as Dataset
Go
JSON format:
[
{"House":"Blue","Attributes":{"Car":"Camry","Door":"Small","Bathroom":"Medium","Basement":"Dark","Attic":"1"}},
{"House":"Red","Attributes":{"Car":"Thunderbird","Door":"Large","Bathroom":"Small","Basement":"Light","Attic":"4"}}
]
ndJSON format:
{"House":"Blue","Attributes":{"Car":"Camry","Door":"Small","Bathroom":"Medium","Basement":"Dark","Attic":"1"}}
{"House":"Red","Attributes":{"Car":"Thunderbird","Door":"Large","Bathroom":"Small","Basement":"Light","Attic":"4"}}
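The difference between the two formats above is only the surrounding brackets and the commas between records. As an illustration outside SQL Server, here is a minimal Python sketch that turns ndJSON text into a single JSON array, which is the shape the OPENJSON query above already handles. The function name `ndjson_to_array` is hypothetical, and the sketch assumes the whole input fits in memory.

```python
import json


def ndjson_to_array(ndjson_text: str) -> str:
    """Convert newline-delimited JSON text into one JSON array string."""
    # Split on "\n" only; strip() also removes a trailing "\r" from
    # Windows line endings and drops blank lines.
    records = [
        json.loads(line)
        for line in ndjson_text.split("\n")
        if line.strip()
    ]
    return json.dumps(records)
```

Writing the returned string to a file would give a regular JSON document that can be loaded with the OPENROWSET approach shown earlier.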