I am using Python 3.6 and trying to load a JSON file (350 MB) into a pandas DataFrame with the code below. However, I get the following error:
data_json_str = "[" + ",".join(data) + "]"
TypeError: sequence item 0: expected str instance, bytes found
How can I fix the error?
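
The TypeError most likely comes from opening the file in binary mode ('rb') in the code that follows: readlines() then returns bytes objects, while str.join() accepts only str. Below is a minimal sketch of one possible fix, assuming the file holds one JSON object per line and is UTF-8 encoded (neither is confirmed by the question). The helper name load_json_lines and the two-line sample file are hypothetical stand-ins for nutrients.json.

```python
import json
import os
import tempfile


def load_json_lines(path, encoding="utf-8"):
    """Read a file holding one JSON object per line into a list of dicts."""
    # Text mode ('r') yields str lines, so ",".join() no longer sees bytes.
    with open(path, "r", encoding=encoding) as f:
        lines = [line.strip() for line in f if line.strip()]
    # Wrap the objects in brackets so the whole text is one JSON array.
    data_json_str = "[" + ",".join(lines) + "]"
    return json.loads(data_json_str)


# Hypothetical sample file standing in for the real 350 MB file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", delete=False, encoding="utf-8"
) as tmp:
    tmp.write('{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n')
    sample_path = tmp.name

records = load_json_lines(sample_path)
os.remove(sample_path)
print(records)
```

The resulting list of dicts can be passed to pd.DataFrame(records). If the one-object-per-line assumption holds, pd.read_json(path, lines=True) may be a shorter route that skips the manual join.
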
import pandas as pd
# read the entire file into a python array
with open('C:/Users/Alberto/nutrients.json', 'rb') as f:
    data = f.readlines()
# remove the trailing "\n" from each line
data = map(lambda x: x.rstrip(), data)
# each element of 'data' is an individual JSON object.
# i want to convert it into an *array* of JSON objects
# which, in …

I have a dataframe that looks like this:
ID  phone_numbers
1   [{u'updated_at': u'2017-12-02 15:29:54', u'created_at': u'2017-12-02 15:29:54', u'sms': 0, u'number': u'1112223333', u'consumer_id': 12345, u'organization_id': 1, u'active': 1, u'deleted_at': None, u'type': u'default', u'id': 1234}]
I want to take the phone_numbers column and flatten the information in it so that I can query the "id" field.
When I try:
json_normalize(df.phone_numbers)
I get the error:
AttributeError: 'str' object has no attribute 'itervalues'
I am not sure why this error is raised or why I cannot flatten this column.
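
A plausible reading of the error is that each cell of phone_numbers holds the text of a Python 2 list of dicts rather than the list itself, so json_normalize ends up calling a dict method (itervalues) on a str. The sketch below assumes that reading. It parses each cell with ast.literal_eval and builds one flat row per phone entry using only the standard library. The helper names parse_cell and flatten are hypothetical, and the sample cell is copied from the question.

```python
import ast

# Sample cell from the question, stored as a string (assumed, not confirmed).
raw_cell = (
    "[{u'updated_at': u'2017-12-02 15:29:54', "
    "u'created_at': u'2017-12-02 15:29:54', u'sms': 0, "
    "u'number': u'1112223333', u'consumer_id': 12345, "
    "u'organization_id': 1, u'active': 1, u'deleted_at': None, "
    "u'type': u'default', u'id': 1234}]"
)


def parse_cell(cell):
    """Return a list of dicts whether the cell is a string or already a list."""
    if isinstance(cell, str):
        # literal_eval safely evaluates Python literals, including u'...' and None.
        return ast.literal_eval(cell)
    return cell


def flatten(ids, cells):
    """Build one flat dict per phone entry, tagged with the parent row's ID."""
    rows = []
    for row_id, cell in zip(ids, cells):
        for entry in parse_cell(cell):
            rows.append({"ID": row_id, **entry})
    return rows


rows = flatten([1], [raw_cell])
print(rows[0]["id"])
```

With a real DataFrame, flatten(df.ID, df.phone_numbers) would give a list of dicts that pd.DataFrame can consume directly, after which the "id" field is an ordinary column.
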
Edit:
The data was originally a JSON string read from a response object (r.text):
https://docs.google.com/document/d/1Iq4PMcGXWx6O48sWqqYnZjG6UMSZoXfmN1WadQLkWYM/edit?usp=sharing
Edit:
I converted the column I need to flatten to JSON with this command:
a = df.phone_numbers.to_json()
{"0":[{"updated_at":"2018-04-12 12:24:04","created_at":"2018-04-12 12:24:04","sms":0,"number":"","consumer_id":123,"org_id":123,"active":1,"deleted_at":null,"type":"default","id":123}]}
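
The to_json() output above is a JSON object keyed by row label, with each value being a list of phone entries. As a rough sketch, it can be parsed back with json.loads and flattened into one record per entry. The "row" key added here is a hypothetical name for the original row label.

```python
import json

# The string produced by df.phone_numbers.to_json() in the question.
a = (
    '{"0":[{"updated_at":"2018-04-12 12:24:04",'
    '"created_at":"2018-04-12 12:24:04","sms":0,"number":"",'
    '"consumer_id":123,"org_id":123,"active":1,"deleted_at":null,'
    '"type":"default","id":123}]}'
)

parsed = json.loads(a)

# One flat dict per phone entry, remembering which row it came from.
rows = [
    {"row": label, **entry}
    for label, entries in parsed.items()
    for entry in entries
]
print(rows[0]["id"])
```

Note that the row labels come back as strings ("0"), because JSON object keys are always strings.
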

I have a pandas dataframe that looks like this:
User | Query| Filters
-----------------------------------------------------------------------------------------
1 | abc | [{u'Op': u'and', u'Type': u'date', u'Val': u'1992'},{u'Op': u'and', u'Type': u'sex', u'Val': u'F'}]
1 | efg | [{u'Op': u'and', u'Type': u'date', u'Val': u'2000'},{u'Op': u'and', u'Type': u'col', u'Val': u'Blue'}]
1 | fgs | [{u'Op': u'and', u'Type': u'date', u'Val': u'2001'},{u'Op': u'and', u'Type': u'col', u'Val': u'Red'}]
2 | hij | [{u'Op': u'and', u'Type': u'date', u'Val': u'2002'}]
2 | dcv | [{u'Op': u'and', u'Type': u'date', u'Val': u'2001'},{u'Op': u'and', u'Type': u'sex', u'Val': u'F'}]
2 | …