meh*_*man 5 python json dataframe pandas
I have the following sample dataframe:
import pandas as pd

d = {'key': ['foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar', 'crow', 'crow', 'crow', 'crow'],
     'count': [12, 3, 5, 5, 3, 1, 4, 1, 7, 3, 8, 2],
     'text': ["hello", "i", "am", "a", "piece", "of", "text", "have", "a", "nice", "day", "friends"],
     }
df = pd.DataFrame(data=d)
df
Output:
     key  count     text
0    foo     12    hello
1    foo      3        i
2    foo      5       am
3    foo      5        a
4    bar      3    piece
5    bar      1       of
6    bar      4     text
7    bar      1     have
8   crow      7        a
9   crow      3     nice
10  crow      8      day
11  crow      2  friends
I stacked the dataframe with:
df.set_index("key").stack()
which gives:
key
foo   count         12
      text       hello
      count          3
      text           i
      count          5
      text          am
      count          5
      text           a
bar   count          3
      text       piece
      count          1
      text          of
      count          4
      text        text
      count          1
      text        have
crow  count          7
      text           a
      count          3
      text        nice
      count          8
      text         day
      count          2
      text     friends
dtype: object
I am now trying to write the stacked df out as a JSON file, but when I call to_json() I get this error:
ValueError: Series index must be unique for orient='index'
The expected output would have text and count grouped by key:
[
{
"key": "19",
"values": [
{
text: 'hello',
count: 12
},
{
content: 'i',
count: 3
},
{
content: 'am',
count: 5
},
...
]
]
As mentioned in the comments, your expected output is not valid JSON. You need "some_key": [...] at the same level as "key": "bar".
For example, with groupby:
import json

# One record per key; the remaining columns of each group become a list of dicts.
json_str = json.dumps([{'key': k, 'values': d.to_dict('records')}
                       for k, d in df.drop('key', axis=1).groupby(df['key'])],
                      indent=2)
Output:
[
  {
    "key": "bar",
    "values": [
      {
        "count": 3,
        "text": "piece"
      },
      {
        "count": 1,
        "text": "of"
      },
      {
        "count": 4,
        "text": "text"
      },
      {
        "count": 1,
        "text": "have"
      }
    ]
  },
  {
    "key": "crow",
    "values": [
      {
        "count": 7,
        "text": "a"
      },
      {
        "count": 3,
        "text": "nice"
      },
      {
        "count": 8,
        "text": "day"
      },
      {
        "count": 2,
        "text": "friends"
      }
    ]
  },
  {
    "key": "foo",
    "values": [
      {
        "count": 12,
        "text": "hello"
      },
      {
        "count": 3,
        "text": "i"
      },
      {
        "count": 5,
        "text": "am"
      },
      {
        "count": 5,
        "text": "a"
      }
    ]
  }
]
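Since the goal was a JSON file, here is a self-contained sketch of the whole round trip. It first reproduces the ValueError: stacking leaves repeated (key, column) pairs in the index, and to_json() on a Series defaults to orient='index', which requires unique labels. It then groups, keeps the keys in their original order, and writes the file. The sort=False argument, the file name grouped.json, and the temp-directory location are my own choices for illustration and are not part of the question.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

d = {'key': ['foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar',
             'crow', 'crow', 'crow', 'crow'],
     'count': [12, 3, 5, 5, 3, 1, 4, 1, 7, 3, 8, 2],
     'text': ["hello", "i", "am", "a", "piece", "of", "text", "have",
              "a", "nice", "day", "friends"]}
df = pd.DataFrame(data=d)

# The stacked Series has duplicate index entries such as ('foo', 'count'),
# so the default orient='index' serialisation is rejected.
stacked = df.set_index("key").stack()
try:
    stacked.to_json()
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Group the non-key columns by key. sort=False (an assumption about what
# is wanted) keeps keys in first-appearance order instead of alphabetical.
grouped = [
    {"key": k, "values": g.to_dict("records")}
    for k, g in df.drop(columns="key").groupby(df["key"], sort=False)
]

# Hypothetical output location; change the path as needed.
out_path = Path(tempfile.gettempdir()) / "grouped.json"
with open(out_path, "w", encoding="utf-8") as fh:
    json.dump(grouped, fh, indent=2)
```

With sort=False the records come out in the order foo, bar, crow; drop that argument to get the alphabetical order shown above.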