Nik*_*pta 0 python json list dataframe pandas
I have a DataFrame with a single column, where each cell holds a list of dicts:
A
[{"A": 28, "B": "abc"},{"A": 29, "B": "def"},{"A": 30, "B": "hij"}]
[{"A": 31, "B": "hij"},{"A": 32, "B": "abc"}]
[{"A": 28, "B": "abc"}]
[{"A": 28, "B": "abc"},{"A": 29, "B": "def"},{"A": 30, "B": "hij"}]
[{"A": 28, "B": "abc"},{"A": 29, "B": "klm"},{"A": 30, "B": "nop"}]
[{"A": 28, "B": "abc"},{"A": 29, "B": "xyz"}]
The output should look like this:
A B
28,29,30 abc,def,hij
31,32 hij,abc
28 abc
28,29,30 abc,def,hij
28,29,30 abc,klm,nop
28,29 abc,xyz
How can I split the list of objects into columns by key name and store the values as comma-separated strings, as shown above?
You can do this with stack followed by groupby:
df.A.apply(pd.Series).stack().\
apply(pd.Series).groupby(level=0).\
agg(lambda x :','.join(x.astype(str)))
Out[457]:
A B
0 28,29,30 abc,def,hij
1 31,32 hij,abc
2 28 abc
3 28,29,30 abc,def,hij
4 28,29,30 abc,klm,nop
Input data:
import pandas as pd

# Each cell of column 'A' holds a list of dicts with keys "A" and "B"
df = pd.DataFrame({'A': [
    [{"A": 28, "B": "abc"}, {"A": 29, "B": "def"}, {"A": 30, "B": "hij"}],
    [{"A": 31, "B": "hij"}, {"A": 32, "B": "abc"}],
    [{"A": 28, "B": "abc"}],
    [{"A": 28, "B": "abc"}, {"A": 29, "B": "def"}, {"A": 30, "B": "hij"}],
    [{"A": 28, "B": "abc"}, {"A": 29, "B": "klm"}, {"A": 30, "B": "nop"}],
]})
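As an alternative to the stack/groupby chain, the same result can probably be reached with plain Python over each cell, which avoids the repeated `apply(pd.Series)` calls. This is only a sketch: the helper name `join_by_key` is made up for illustration, and it assumes every cell is a list of dicts.

```python
import pandas as pd

df = pd.DataFrame({'A': [
    [{"A": 28, "B": "abc"}, {"A": 29, "B": "def"}, {"A": 30, "B": "hij"}],
    [{"A": 31, "B": "hij"}, {"A": 32, "B": "abc"}],
    [{"A": 28, "B": "abc"}],
]})


def join_by_key(records):
    """Collect the values of each key across a list of dicts, joined by commas."""
    collected = {}
    for record in records:
        for key, value in record.items():
            collected.setdefault(key, []).append(str(value))
    return {key: ','.join(values) for key, values in collected.items()}


# One output row per input row; columns come from the dict keys
result = pd.DataFrame([join_by_key(cell) for cell in df['A']], index=df.index)
print(result)
```

Keys missing from a given row would simply show up as NaN in that row of `result`.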
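As for your follow-up question about reading the data from a CSV file:
import ast
import pandas as pd

# Read column 'A' as plain strings, then parse each string into a list of dicts
df = pd.read_csv(r'your.csv', dtype={'A': object})
df['A'] = df['A'].apply(ast.literal_eval)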
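To see why the `ast.literal_eval` step is needed, here is a small round trip through an in-memory CSV. It is a sketch under assumptions: `io.StringIO` stands in for `your.csv`, and the two-row frame is made-up sample data. After `read_csv` the cells are strings, and only after parsing do they become lists of dicts again.

```python
import ast
import io

import pandas as pd

# Made-up sample data, written to an in-memory buffer instead of a real file
original = pd.DataFrame({'A': [
    [{"A": 28, "B": "abc"}, {"A": 29, "B": "def"}],
    [{"A": 31, "B": "hij"}],
]})
buffer = io.StringIO()
original.to_csv(buffer, index=False)
buffer.seek(0)

# After reading, each cell is the string form of the list, not a list
loaded = pd.read_csv(buffer, dtype={'A': object})
cell_type_before = type(loaded.loc[0, 'A'])

# literal_eval safely parses Python literals (lists, dicts, numbers, strings)
loaded['A'] = loaded['A'].apply(ast.literal_eval)
cell_type_after = type(loaded.loc[0, 'A'])

print(cell_type_before, cell_type_after)
```

If the file contains strict JSON with `true`, `false`, or `null`, `ast.literal_eval` will raise an error; in that case `json.loads` is likely the better parser.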