I am new to Python and pandas. I am trying to convert a pandas DataFrame to nested JSON. The .to_json() method does not give me enough flexibility to achieve my goal.
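To illustrate the limitation, here is a minimal sketch (not part of the original question) of what to_json produces with orient="records". The variable name df and the choice of two rows and four columns from the sample data below are assumptions made for brevity.

```python
import json
import pandas as pd

# Two rows taken from the sample data in the question (subset of columns).
df = pd.DataFrame({
    "ID": [1, 1],
    "Location": ["BREST", "BREST"],
    "timestamp": ["1807-01-01", "1807-02-01"],
    "tide": [6905.0, 6931.0],
})

# orient="records" yields one flat JSON object per row, so the station
# fields are repeated in every object and nothing is nested.
flat = json.loads(df.to_json(orient="records"))
print(json.dumps(flat, indent=2))
```

Each row becomes its own object, so the repeated station fields are not folded together; any nesting has to be built separately.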
Here are some data points from the DataFrame (as comma-separated CSV):
,ID,Location,Country,Latitude,Longitude,timestamp,tide
0,1,BREST,FRA,48.383,-4.495,1807-01-01,6905.0
1,1,BREST,FRA,48.383,-4.495,1807-02-01,6931.0
2,1,BREST,FRA,48.383,-4.495,1807-03-01,6896.0
3,1,BREST,FRA,48.383,-4.495,1807-04-01,6953.0
4,1,BREST,FRA,48.383,-4.495,1807-05-01,7043.0
2508,7,CUXHAVEN 2,DEU,53.867,8.717,1843-01-01,7093.0
2509,7,CUXHAVEN 2,DEU,53.867,8.717,1843-02-01,6688.0
2510,7,CUXHAVEN 2,DEU,53.867,8.717,1843-03-01,6493.0
2511,7,CUXHAVEN 2,DEU,53.867,8.717,1843-04-01,6723.0
2512,7,CUXHAVEN 2,DEU,53.867,8.717,1843-05-01,6533.0
4525,9,MAASSLUIS,NLD,51.918,4.25,1848-02-01,6880.0
4526,9,MAASSLUIS,NLD,51.918,4.25,1848-03-01,6700.0
4527,9,MAASSLUIS,NLD,51.918,4.25,1848-04-01,6775.0
4528,9,MAASSLUIS,NLD,51.918,4.25,1848-05-01,6580.0
4529,9,MAASSLUIS,NLD,51.918,4.25,1848-06-01,6685.0
6540,8,WISMAR 2,DEU,53.898999999999994,11.458,1848-07-01,6957.0
6541,8,WISMAR 2,DEU,53.898999999999994,11.458,1848-08-01,6944.0
6542,8,WISMAR 2,DEU,53.898999999999994,11.458,1848-09-01,7084.0
6543,8,WISMAR 2,DEU,53.898999999999994,11.458,1848-10-01,6898.0
6544,8,WISMAR 2,DEU,53.898999999999994,11.458,1848-11-01,6859.0
8538,10,SAN FRANCISCO,USA,37.806999999999995,-122.465,1854-07-01,6909.0
8539,10,SAN FRANCISCO,USA,37.806999999999995,-122.465,1854-08-01,6940.0
8540,10,SAN FRANCISCO,USA,37.806999999999995,-122.465,1854-09-01,6961.0
8541,10,SAN FRANCISCO,USA,37.806999999999995,-122.465,1854-10-01,6952.0
8542,10,SAN FRANCISCO,USA,37.806999999999995,-122.465,1854-11-01,6952.0
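One possible way to fold rows like these into one record per station is to group on the station columns and collect each group's timestamp and tide values into a dictionary. This is only a sketch under assumptions: the names df, station_cols and records are hypothetical, and the "Tide-Data" key is chosen to mirror the question's wording.

```python
import json
import pandas as pd

# A few rows copied from the sample data; `df` is an assumed variable name.
df = pd.DataFrame(
    [
        [1, "BREST", "FRA", 48.383, -4.495, "1807-01-01", 6905.0],
        [1, "BREST", "FRA", 48.383, -4.495, "1807-02-01", 6931.0],
        [9, "MAASSLUIS", "NLD", 51.918, 4.25, "1848-02-01", 6880.0],
        [9, "MAASSLUIS", "NLD", 51.918, 4.25, "1848-03-01", 6700.0],
    ],
    columns=["ID", "Location", "Country", "Latitude", "Longitude",
             "timestamp", "tide"],
)

# Columns whose values repeat for every row of the same station.
station_cols = ["ID", "Location", "Country", "Latitude", "Longitude"]

records = []
# sort=False keeps the stations in order of first appearance.
for keys, group in df.groupby(station_cols, sort=False):
    record = dict(zip(station_cols, keys))
    # Convert NumPy scalars to plain Python types so json can serialize them.
    record["ID"] = int(record["ID"])
    record["Latitude"] = float(record["Latitude"])
    record["Longitude"] = float(record["Longitude"])
    # Nest the per-timestamp readings under a single key.
    record["Tide-Data"] = {
        ts: float(tide)
        for ts, tide in zip(group["timestamp"], group["tide"])
    }
    records.append(record)

nested_json = json.dumps(records, indent=2)
print(nested_json)
```

The tide values stay floats here, as they are in the CSV; casting them to int would be a separate decision that depends on whether the full data contains missing readings.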
There is a lot of repeated information, and I would like to have JSON like this:
[
{
"ID": 1,
"Location": "BREST",
"Latitude": 48.383,
"Longitude": -4.495,
"Country": "FRA",
"Tide-Data": {
"1807-02-01": 6931,
"1807-03-01": 6896,
"1807-04-01": 6953,
"1807-05-01": 7043
}
},
{
"ID": 5,
"Location": "HOLYHEAD",
"Latitude": 53.31399999999999,
"Longitude": -4.62,
"Country": "GBR", …Run Code Online (Sandbox Code Playgroud) 我经常使用pandas groupby来生成堆叠表.但后来我经常想要将生成的嵌套关系输出到json.有没有办法从它生成的堆栈表中提取嵌套的json字段?
Suppose I have a DataFrame like:
year office candidate amount
2010 mayor joe smith 100.00
2010 mayor jay gould 12.00
2010 govnr pati mara 500.00
2010 govnr jess rapp 50.00
2010 govnr jess rapp 30.00
I can do:
grouped = df.groupby(['year', 'office', 'candidate']).sum()
print(grouped)
amount
year office candidate
2010 mayor joe smith 100
jay gould 12
govnr pati mara 500
jess rapp 80
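Beautiful! Of course, what I would really like is to get nested JSON with a command along the lines of grouped.to_json. But that feature is not available. Are there any workarounds?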
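One possible workaround, sketched here and not taken from the original post, is to walk the MultiIndex of the grouped result and build plain dictionaries, which json.dumps can then serialize. The nesting chosen (year, then office, then a list of single-entry candidate objects) is an assumption about the intended shape, and the name nested is hypothetical.

```python
import json
import pandas as pd

# The example table from the question.
df = pd.DataFrame({
    "year": [2010, 2010, 2010, 2010, 2010],
    "office": ["mayor", "mayor", "govnr", "govnr", "govnr"],
    "candidate": ["joe smith", "jay gould", "pati mara",
                  "jess rapp", "jess rapp"],
    "amount": [100.00, 12.00, 500.00, 50.00, 30.00],
})

# sort=False keeps groups in order of first appearance.
grouped = df.groupby(["year", "office", "candidate"], sort=False).sum()

nested = {}
# Each index entry of the grouped frame is a (year, office, candidate) tuple.
for (year, office, candidate), amount in grouped["amount"].items():
    # JSON object keys must be strings, so the year is converted.
    offices = nested.setdefault(str(year), {})
    offices.setdefault(office, []).append({candidate: float(amount)})

print(json.dumps(nested, indent=2))
```

Building ordinary dicts and lists first keeps the serialization step trivial and makes it easy to change the nesting later.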
So, what I really want is:
{"2010": {"mayor": [
{"joe smith": 100},
{"jay gould": 12}
]
},
{"govnr": [
{"pati mara":500},
{"jess rapp": 80}
] …

As the title explains, I have been trying to generate a nested JSON object. In this case, I have for loops that pull data from a dictionary, dic. Here is the code:
f = open("test_json.txt", 'w')
flag = False
temp = ""
start = "{\n\t\"filename\"" + " : \"" + initial_filename + "\",\n\t\"data\"" + " : " + " [\n"
end = "\n\t]" + "\n}"
f.write(start)
# dic maps each keyword to a list of occurrences
for i, (key, value) in enumerate(dic.items()):
    f.write("{\n\t\"keyword\":" + "\"" + str(key) + "\"" + ",\n")
    f.write("\"term_freq\":" + str(len(value)) + ",\n")
    f.write("\"lists\":[\n\t")
    for item in value:
        f.write("{\n")
        f.write("\t\t\"occurance\" :" + str(item) + "\n")
        # Check last object
        if value.index(item) + 1 == len(value):
            f.write("}\n")
            f.write("]\n")
        else:
            f.write("},")  # close occurrence object
    # Check last item in dic
    if i == len(dic) - 1:
        flag = True
    if flag:
        f.write("}")
    else:
        f.write("},")
…
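An alternative to writing brackets and commas by hand is to build the nested structure from ordinary dicts and lists and let the standard json module serialize it. The sketch below is an assumption about the intended layout, inferred from the keys the loop writes; the sample values for initial_filename and dic are hypothetical stand-ins.

```python
import json

# Hypothetical stand-ins for `initial_filename` and `dic` from the question.
initial_filename = "example.txt"
dic = {"apple": [3, 17], "pear": [8]}

# Build the nested structure with plain Python containers.
document = {
    "filename": initial_filename,
    "data": [
        {
            "keyword": str(key),
            "term_freq": len(value),
            # Key spelling kept as in the question's code.
            "lists": [{"occurance": item} for item in value],
        }
        for key, value in dic.items()
    ],
}

# json.dump takes care of quoting, commas and closing brackets.
with open("test_json.txt", "w") as f:
    json.dump(document, f, indent=4)
```

With this approach there is no need to detect the last item of each list, since the serializer places separators itself.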