I have the following DataFrame in pandas:
key no lpm
ab_12 1 12
ab_12 2 11
ab_12 3 11
ac_12 1 12
ac_12 2 11
ac_12 4 11
ad_12 1 12
ad_12 2 11
ad_12 3 11
The DataFrame I want looks like this:
key no_1 no_2 no_3 no_4
ab_12 12 11 11 does not exist
ac_12 12 11 does not exist 11
ad_12 12 11 11 does not exist
I am doing the following in pandas, but it does not give me what I need:
df= df.melt('key').groupby(['key', 'value']).unstack(fill_value='Does not exist')
Use set_index with unstack, together with add_prefix:
df = df.set_index(['key', 'no'])['lpm'].unstack(fill_value='Does not exist').add_prefix('no_')
print (df)
no no_1 no_2 no_3 no_4
key
ab_12 12 11 11 Does not exist
ac_12 12 11 Does not exist 11
ad_12 12 11 11 Does not exist
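For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample in the question, and the name `wide` is only an illustrative choice. Note that filling with a string mixes text and numbers, so the resulting columns are likely to be of object dtype.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "key": ["ab_12"] * 3 + ["ac_12"] * 3 + ["ad_12"] * 3,
    "no": [1, 2, 3, 1, 2, 4, 1, 2, 3],
    "lpm": [12, 11, 11, 12, 11, 11, 12, 11, 11],
})

# Move key/no into the index, select lpm, then spread the `no` level
# across the columns; missing (key, no) combinations get the fill text.
wide = (
    df.set_index(["key", "no"])["lpm"]
      .unstack(fill_value="Does not exist")
      .add_prefix("no_")
)
print(wide)
```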
If this solution fails because there are duplicate (key, no) pairs, you have to aggregate:
df = (df.groupby(['key', 'no'])['lpm']
        .mean()
        .unstack(fill_value='Does not exist')
        .add_prefix('no_'))
Or:
df = (df.pivot_table(index='key',
                     columns='no',
                     values='lpm',
                     fill_value='Does not exist',
                     aggfunc='mean').add_prefix('no_'))
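To see why aggregation matters, here is a small sketch using made-up data in which the pair (ab_12, 1) appears twice. The frame `dup` and the flag names are hypothetical and not from the question. The plain set_index/unstack route should raise a ValueError here, while the groupby variant averages the duplicates first.

```python
import pandas as pd

# Hypothetical data: the (ab_12, 1) pair occurs twice.
dup = pd.DataFrame({
    "key": ["ab_12", "ab_12", "ab_12", "ac_12"],
    "no": [1, 1, 2, 4],
    "lpm": [12, 14, 11, 11],
})

# Check for repeated (key, no) pairs before reshaping.
has_duplicates = dup.duplicated(["key", "no"]).any()

try:
    dup.set_index(["key", "no"])["lpm"].unstack()
    unstack_failed = False
except ValueError:
    # Two values compete for the same cell, so the reshape is rejected.
    unstack_failed = True

# Aggregating first leaves exactly one value per (key, no) pair.
wide = (
    dup.groupby(["key", "no"])["lpm"]
       .mean()
       .unstack(fill_value="Does not exist")
       .add_prefix("no_")
)
print(wide)
```

Here mean is just one option; another aggregation such as max or first may suit the data better, depending on what a duplicate means in your case.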
EDIT: To add a suffix as well, use add_suffix:
df = (df.set_index(['key', 'no'])['lpm']
        .unstack(fill_value='Does not exist')
        .add_prefix('no_')
        .add_suffix('_lpm'))
print (df)
no no_1_lpm no_2_lpm no_3_lpm no_4_lpm
key
ab_12 12 11 11 Does not exist
ac_12 12 11 Does not exist 11
ad_12 12 11 11 Does not exist
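The outputs above keep key as the index and show no as a label over the columns, whereas the desired output in the question has key as an ordinary column. If that flat layout is what you need, one possible sketch (an addition, not part of the original answer) is to chain rename_axis and reset_index. The name `flat` is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "key": ["ab_12"] * 3 + ["ac_12"] * 3 + ["ad_12"] * 3,
    "no": [1, 2, 3, 1, 2, 4, 1, 2, 3],
    "lpm": [12, 11, 11, 12, 11, 11, 12, 11, 11],
})

flat = (
    df.set_index(["key", "no"])["lpm"]
      .unstack(fill_value="Does not exist")
      .add_prefix("no_")
      .rename_axis(None, axis=1)  # drop the "no" label above the columns
      .reset_index()              # turn the key index back into a column
)
print(flat)
```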