I have a JSON file that I want to convert to a CSV file. How can I do this with Python?
I tried:
import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    csv_file.writerow(item)
f.close()
However, it did not work. I am using Django, and I received an error.
I then tried the following:
import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    f.writerow(item)  # <- changed
f.close()
I then got another error.
Sample JSON file:
[{
    "pk": 22,
    "model": "auth.permission",
    "fields": {
        "codename": "add_logentry",
        "name": "Can add log entry",
        "content_type": 8
    }
}, {
    "pk": 23,
    "model": "auth.permission",
    "fields": {
        "codename": "change_logentry",
        "name": "Can change log entry",
        "content_type": 8
    }
}, {
    "pk": 24,
    "model": "auth.permission",
    "fields": {
        "codename": "delete_logentry",
        "name": "Can delete log entry",
        "content_type": 8
    }
}, {
    "pk": 4,
    "model": "auth.permission",
    "fields": {
        "codename": "add_group",
        "name": "Can add group",
        "content_type": 2
    }
}, {
    "pk": 10,
    "model": "auth.permission",
    "fields": {
        "codename": "add_message",
        "name": "Can add message",
        "content_type": 4
    }
}
]
YOU*_*YOU 115
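I am not sure whether this question has already been solved, but let me paste what I did for reference.
First, your JSON has nested objects, so it normally cannot be converted directly to CSV. You need to change it to something like this:
{
    "pk": 22,
    "model": "auth.permission",
    "codename": "add_logentry",
    "content_type": 8,
    "name": "Can add log entry"
},
......]
Here is my code to generate CSV from that: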
import csv
import json

x = """[
    {
        "pk": 22,
        "model": "auth.permission",
        "fields": {
            "codename": "add_logentry",
            "name": "Can add log entry",
            "content_type": 8
        }
    },
    {
        "pk": 23,
        "model": "auth.permission",
        "fields": {
            "codename": "change_logentry",
            "name": "Can change log entry",
            "content_type": 8
        }
    },
    {
        "pk": 24,
        "model": "auth.permission",
        "fields": {
            "codename": "delete_logentry",
            "name": "Can delete log entry",
            "content_type": 8
        }
    }
]"""

rows = json.loads(x)

# On Python 3, open in text mode with newline="" (the original used "wb+" on Python 2)
with open("test.csv", "w", newline="") as out_file:
    f = csv.writer(out_file)
    # Write the CSV header. If you don't need it, remove this line.
    f.writerow(["pk", "model", "codename", "name", "content_type"])
    for row in rows:
        f.writerow([row["pk"],
                    row["model"],
                    row["fields"]["codename"],
                    row["fields"]["name"],
                    row["fields"]["content_type"]])
You will get this output:
pk,model,codename,name,content_type
22,auth.permission,add_logentry,Can add log entry,8
23,auth.permission,change_logentry,Can change log entry,8
24,auth.permission,delete_logentry,Can delete log entry,8
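The "change it to a flat object" step does not have to be done by hand. Here is a minimal sketch, assuming every record keeps its nested values under a single "fields" key as in the sample data. The helper name lift_fields is made up for this illustration, and io.StringIO stands in for a real output file.

```python
import csv
import io
import json


def lift_fields(record):
    """Return a flat copy of record with the nested "fields" dict merged in."""
    flat = {key: value for key, value in record.items() if key != "fields"}
    flat.update(record.get("fields", {}))
    return flat


raw = """[
  {"pk": 22, "model": "auth.permission",
   "fields": {"codename": "add_logentry", "name": "Can add log entry", "content_type": 8}}
]"""

rows = [lift_fields(record) for record in json.loads(raw)]

buffer = io.StringIO()  # stands in for open("test.csv", "w", newline="")
writer = csv.DictWriter(
    buffer, fieldnames=["pk", "model", "codename", "name", "content_type"]
)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Because csv.DictWriter looks values up by column name, the order of keys inside each JSON object no longer matters.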
vmg*_*vmg 86
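With the pandas library, this is as easy as using two commands!
pandas.read_json()
converts a JSON string to a pandas object (either a Series or a DataFrame). Then, assuming the result is stored as df:
df.to_csv()
can either return a string or write directly to a CSV file.
Given the verbosity of the previous answers, we should all thank pandas for the shortcut.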
Ale*_*ail 85
I am assuming that your JSON file will decode into a list of dictionaries. First we need a function which will flatten the JSON objects:
def flattenjson( b, delim ):
    val = {}
    for i in b.keys():
        if isinstance( b[i], dict ):
            # Recurse into nested objects and prefix their keys
            get = flattenjson( b[i], delim )
            for j in get.keys():
                val[ i + delim + j ] = get[j]
        else:
            val[i] = b[i]
    return val
The result of running this snippet on your JSON object:
flattenjson( {
    "pk": 22,
    "model": "auth.permission",
    "fields": {
        "codename": "add_message",
        "name": "Can add message",
        "content_type": 8
    }
}, "__" )
is
{
    "pk": 22,
    "model": "auth.permission",
    "fields__codename": "add_message",
    "fields__name": "Can add message",
    "fields__content_type": 8
}
After applying this function to each dict in the input array of JSON objects:
# list() so the result can be iterated more than once on Python 3
input = list( map( lambda x: flattenjson( x, "__" ), input ) )
and finding the relevant column names:
columns = [ x for row in input for x in row.keys() ]
columns = list( set( columns ) )
it's not hard to run this through the csv module:
# newline='' lets the csv module control line endings on Python 3
with open( fname, 'w', newline='' ) as out_file:
    csv_w = csv.writer( out_file )
    csv_w.writerow( columns )
    for i_r in input:
        csv_w.writerow( [ i_r.get( x, "" ) for x in columns ] )
I hope this helps!
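Put together, the steps above can be sketched as one self-contained script. This is an illustration rather than the exact code from this answer: the names flatten and to_csv_text are made up here, columns are kept in first-seen order instead of going through a set (so the header is deterministic), and io.StringIO stands in for the output file.

```python
import csv
import io


def flatten(obj, delim="__", prefix=""):
    """Flatten nested dicts into one dict whose keys are joined by delim."""
    flat = {}
    for key, value in obj.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, delim, name + delim))
        else:
            flat[name] = value
    return flat


def to_csv_text(records, delim="__"):
    """Flatten every record and render the result as CSV text."""
    rows = [flatten(record, delim) for record in records]
    # Collect column names in the order they are first seen
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(columns)
    for row in rows:
        # Missing values become empty cells
        writer.writerow([row.get(name, "") for name in columns])
    return buffer.getvalue()


records = [
    {"pk": 22, "model": "auth.permission",
     "fields": {"codename": "add_logentry", "content_type": 8}},
    {"pk": 4, "model": "auth.permission",
     "fields": {"codename": "add_group"}},
]
csv_text = to_csv_text(records)
```

The second record deliberately lacks content_type to show that a missing key simply produces an empty cell.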
Ale*_*lli 35
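JSON can represent a wide variety of data structures: a JS "object" is roughly like a Python dict (with string keys), a JS "array" is roughly like a Python list, and you can nest them as long as the final "leaf" elements are numbers or strings.
CSV can essentially represent only a 2-D table, optionally with a first row of "headers", i.e. "column names", which can make the table interpretable as a list of dicts instead of the normal interpretation, a list of lists (again, "leaf" elements can be numbers or strings).
So, in the general case, you can't translate an arbitrary JSON structure to CSV. In a few special cases you can (an array of arrays with no further nesting; an array of objects which all have exactly the same keys). Which special case, if any, applies to your problem? The details of the solution depend on which special case you have. Given the astonishing fact that you don't even mention which one applies, I suspect you may not have considered the constraint, that neither usable case in fact applies, and that your problem is impossible to solve. But please do clarify!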
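For reference, the two special cases mentioned above map directly onto the standard csv module. A minimal sketch, with made-up function names (rows_to_csv, records_to_csv) and made-up sample data, writing to io.StringIO instead of a file:

```python
import csv
import io
import json


def rows_to_csv(rows):
    """Array of arrays with no further nesting: one inner array per CSV line."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def records_to_csv(records):
    """Array of objects that all share the same keys: keys become the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


table = json.loads('[[1, "a"], [2, "b"]]')
records = json.loads('[{"id": 1, "tag": "a"}, {"id": 2, "tag": "b"}]')

table_csv = rows_to_csv(table)
records_csv = records_to_csv(records)
```

The sample data in the question fits neither case as-is, because each object carries a nested "fields" object that has to be flattened first.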
Mik*_*ass 26
A generic solution which translates any JSON list of flat objects to CSV.
Pass the input.json file as the first argument on the command line.
import csv, json, sys

# Read the JSON file named on the command line
with open(sys.argv[1]) as input_file:
    data = json.load(input_file)

output = csv.writer(sys.stdout)
output.writerow(data[0].keys())  # header row
for row in data:
    output.writerow(row.values())
Dan*_*erz 23
This code should work for you, assuming that your JSON data is in a file called data.json.
import json
import csv

with open("data.json") as file:
    data = json.load(file)

# newline="" lets the csv module control line endings
with open("data.csv", "w", newline="") as file:
    csv_file = csv.writer(file)
    for item in data:
        fields = list(item['fields'].values())
        csv_file.writerow([item['pk'], item['model']] + fields)
Ret*_*402 16
This is easy with csv.DictWriter(); a detailed implementation could look like this:
import csv
import json

def read_json(filename):
    # Close the file once its contents have been parsed
    with open(filename) as f:
        return json.load(f)

def write_csv(data, filename):
    with open(filename, 'w', newline='') as outf:
        writer = csv.DictWriter(outf, list(data[0].keys()))
        writer.writeheader()
        for row in data:
            writer.writerow(row)

# usage
write_csv(read_json('test.json'), 'output.csv')
Note that this assumes that all of your JSON objects have the same fields.
Here is the reference which may help you.
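If the objects do not all share the same fields, csv.DictWriter raises ValueError by default for any key missing from fieldnames. One way around that, sketched here under the assumption that the objects are flat, is to build fieldnames from the union of all keys and let restval fill the gaps. The function name write_csv_any_fields and the sample records are made up for this example.

```python
import csv
import io


def write_csv_any_fields(records, out, missing=""):
    """Write records to out, using the union of all keys as the header."""
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    # restval is written for every column a record does not have
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval=missing)
    writer.writeheader()
    writer.writerows(records)
    return fieldnames


records = [
    {"pk": 22, "model": "auth.permission"},
    {"pk": 4, "note": "no model"},
]
buffer = io.StringIO()  # stands in for open('output.csv', 'w', newline='')
fields = write_csv_any_fields(records, buffer)
csv_text = buffer.getvalue()
```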
Tre*_*ney 16
Use json_normalize from pandas. The sample data is in a file named test.json. encoding='utf-8' has been used here, but may not be necessary in other cases. The code below uses the pathlib library; .open is a method of pathlib.Path. Use DataFrame.to_csv(...) to save the data to a CSV file.
import pandas as pd
# As of pandas 1.0, pandas.io.json.json_normalize is deprecated and json_normalize is exposed in the top-level namespace.
# from pandas.io.json import json_normalize
from pathlib import Path
import json

# set path to file
p = Path(r'c:\some_path_to_file\test.json')

# read json
with p.open('r', encoding='utf-8') as f:
    data = json.loads(f.read())

# create dataframe
df = pd.json_normalize(data)

# dataframe view
#  pk            model  fields.codename           fields.name  fields.content_type
#  22  auth.permission     add_logentry     Can add log entry                    8
#  23  auth.permission  change_logentry  Can change log entry                    8
#  24  auth.permission  delete_logentry  Can delete log entry                    8
#   4  auth.permission        add_group         Can add group                    2
#  10  auth.permission      add_message       Can add message                    4

# save to csv
df.to_csv('test.csv', index=False, encoding='utf-8')
CSV output:
pk,model,fields.codename,fields.name,fields.content_type
22,auth.permission,add_logentry,Can add log entry,8
23,auth.permission,change_logentry,Can change log entry,8
24,auth.permission,delete_logentry,Can delete log entry,8
4,auth.permission,add_group,Can add group,2
10,auth.permission,add_message,Can add message,4
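I was having trouble with Dan's proposed solution, but this worked for me:
import json
import csv

f = open('test.json')
data = json.load(f)
f.close()

# newline='' replaces the Python 2 'wb+' mode
with open('test.csv', 'w', newline='') as out_file:
    writer = csv.writer(out_file)
    for item in data:
        # list() is required on Python 3, where values() returns a view
        writer.writerow([item['pk'], item['model']] + list(item['fields'].values()))
"test.json" contains the following: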
[
    {"pk": 22, "model": "auth.permission", "fields":
        {"codename": "add_logentry", "name": "Can add log entry", "content_type": 8 } },
    {"pk": 23, "model": "auth.permission", "fields":
        {"codename": "change_logentry", "name": "Can change log entry", "content_type": 8 } },
    {"pk": 24, "model": "auth.permission", "fields":
        {"codename": "delete_logentry", "name": "Can delete log entry", "content_type": 8 } }
]
Alec's answer is great, but it doesn't work when there are multiple levels of nesting. Here is a modified version that supports multiple levels of nesting. It also makes the header names a bit nicer if the nested object already specifies its own key (e.g. Firebase Analytics / BigTable / BigQuery data):
"""Converts JSON with nested fields into a flattened CSV file.
"""
import sys
import json
import csv
import jsonlines
from orderedset import OrderedSet

# from https://stackoverflow.com/a/28246154/473201
def flattenjson( b, prefix='', delim='/', val=None ):
    if val is None:
        val = {}
    if isinstance( b, dict ):
        for j in b.keys():
            flattenjson(b[j], prefix + delim + j, delim, val)
    elif isinstance( b, list ):
        get = b
        for j in range(len(get)):
            key = str(j)
            # If the nested data contains its own key, use that as the header instead.
            if isinstance( get[j], dict ):
                if 'key' in get[j]:
                    key = get[j]['key']
            flattenjson(get[j], prefix + delim + key, delim, val)
    else:
        # Leaf value: store it under the accumulated path
        val[prefix] = b
    return val

def main(argv):
    if len(argv) < 2:
        raise ValueError('Please specify a JSON file to parse')
    print("Loading and Flattening...")
    filename = argv[1]
    allRows = []
    fieldnames = OrderedSet()
    # The input is read as JSON Lines: one JSON object per line
    with jsonlines.open(filename) as reader:
        for obj in reader:
            flattened = flattenjson(obj)
            fieldnames.update(flattened.keys())
            allRows.append(flattened)
    print("Exporting to CSV...")
    outfilename = filename + '.csv'
    count = 0
    with open(outfilename, 'w', newline='') as file:
        csvwriter = csv.DictWriter(file, fieldnames=fieldnames)
        csvwriter.writeheader()
        for obj in allRows:
            csvwriter.writerow(obj)
            count += 1
    print("Wrote %d rows" % count)

if __name__ == '__main__':
    main(sys.argv)
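Note that the script above reads its input through the third-party jsonlines package, so it expects one JSON object per line rather than a single JSON array. If installing that package is not an option, the reading step can be approximated with the standard library. A minimal sketch, where read_json_lines is a made-up helper name and io.StringIO stands in for an opened file:

```python
import io
import json


def read_json_lines(lines):
    """Yield one decoded object per non-empty line of a JSON Lines source."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


# Stands in for: open(filename, encoding='utf-8')
sample = io.StringIO('{"pk": 22}\n\n{"pk": 23}\n')
objects = list(read_json_lines(sample))
```

For a file that holds one JSON array instead, a single json.load call on the open file returns the same kind of list.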
This is a modification of @MikeRepass's answer. This version writes the CSV to a file, and works for both Python 2 and Python 3.
import csv, json

input_file = "data.json"
output_file = "data.csv"

with open(input_file) as f:
    content = json.load(f)

try:
    context = open(output_file, 'w', newline='')  # Python 3
except TypeError:
    context = open(output_file, 'wb')  # Python 2

with context as file:
    writer = csv.writer(file)
    writer.writerow(content[0].keys())  # header row
    for row in content:
        writer.writerow(row.values())