Joh*_*Doe 2 dataframe python-3.x pandas
I have a large pandas DataFrame. Part of it looks like this:
Rule_Name  Rule_Seq_No  Condition  Expression  Type
Rule P     1            ID         19909       Action
Rule P     1            Type       A           Condition
Rule P     1            System     B           Condition
Rule P     2            ID         19608       Action
Rule P     2            Type       A           Condition
Rule P     2            System     C           Condition
Rule S     1            ID         19909       Action
Rule S     1            Type       A           Condition
Rule S     1            System     M           Condition
Rule S     2            ID         19608       Action
Rule S     2            Type       C           Condition
Rule S     2            System     F           Condition
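The table contains a set of rules, each with sequence numbers.
I tried several functions, such as merge, groupby and apply, but could not get the desired output.
The expected output should look like this: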
Rule_Name  Rule_Seq_No  Condition              Action
Rule P     1            (Type=A)and(System=B)  19909
Rule P     2            (Type=A)and(System=C)  19608
Rule S     1            (Type=A)and(System=M)  19909
Rule S     2            (Type=C)and(System=F)  19608
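For rows that share the same rule name and sequence number and whose Type is Condition, I want to merge the rows into one. Where Type is Action, the value should be shown in a separate column.
Use: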
df1 = (df.assign(Condition = '(' + df['Condition'] + '=' + df['Expression'] + ')')
         .groupby(['Rule_Name','Rule_Seq_No','Type'])
         .agg({'Condition': 'and'.join, 'Expression':'first'})
         .unstack()
         .drop([('Condition','Action'), ('Expression','Condition')], axis=1)
         .droplevel(axis=1, level=0)
         .reset_index()
         .rename_axis(None, axis=1))
print (df1)

  Rule_Name  Rule_Seq_No              Condition Action
0    Rule P            1  (Type=A)and(System=B)  19909
1    Rule P            2  (Type=A)and(System=C)  19608
2    Rule S            1  (Type=A)and(System=M)  19909
3    Rule S            2  (Type=C)and(System=F)  19608
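Explanation:
1. Join the Condition and Expression columns with = and wrap the result in ().
2. Aggregate with GroupBy.agg, using join for Condition and first for Expression.
3. Reshape with DataFrame.unstack.
4. Remove the unneeded columns with DataFrame.drop, passing tuples because the columns are a MultiIndex.
5. Remove the top level of the MultiIndex with DataFrame.droplevel.
6. Clean up the result with DataFrame.reset_index and DataFrame.rename_axis.

Edit:
Solution for older pandas versions (before 0.24, which lack DataFrame.droplevel), using Index.droplevel instead: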
df1 = (df.assign(Condition = '(' + df['Condition'] + '=' + df['Expression'] + ')')
         .groupby(['Rule_Name','Rule_Seq_No','Type'])
         .agg({'Condition': 'and'.join, 'Expression':'first'})
         .unstack()
         .drop([('Condition','Action'), ('Expression','Condition')], axis=1))
df1.columns = df1.columns.droplevel(level=0)
df1 = df1.reset_index().rename_axis(None, axis=1)
print (df1)

  Rule_Name  Rule_Seq_No              Condition Action
0    Rule P            1  (Type=A)and(System=B)  19909
1    Rule P            2  (Type=A)and(System=C)  19608
2    Rule S            1  (Type=A)and(System=M)  19909
3    Rule S            2  (Type=C)and(System=F)  19608
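
To see what each step of the chain contributes, the intermediate frames can be materialised one at a time. The following is only a minimal sketch: it rebuilds the sample data from the question, assumes that Expression holds strings (the question does not show the dtypes), and uses the illustrative names rows, step1, step2, step3 and result, which are not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the question. Expression is stored as strings here,
# which is an assumption: the question does not show the column dtypes.
rows = [
    ("Rule P", 1, "ID", "19909", "Action"),
    ("Rule P", 1, "Type", "A", "Condition"),
    ("Rule P", 1, "System", "B", "Condition"),
    ("Rule P", 2, "ID", "19608", "Action"),
    ("Rule P", 2, "Type", "A", "Condition"),
    ("Rule P", 2, "System", "C", "Condition"),
    ("Rule S", 1, "ID", "19909", "Action"),
    ("Rule S", 1, "Type", "A", "Condition"),
    ("Rule S", 1, "System", "M", "Condition"),
    ("Rule S", 2, "ID", "19608", "Action"),
    ("Rule S", 2, "Type", "C", "Condition"),
    ("Rule S", 2, "System", "F", "Condition"),
]
df = pd.DataFrame(
    rows,
    columns=["Rule_Name", "Rule_Seq_No", "Condition", "Expression", "Type"],
)

# Step 1: turn every row into a "(Condition=Expression)" string.
step1 = df.assign(Condition="(" + df["Condition"] + "=" + df["Expression"] + ")")

# Step 2: one row per (rule, sequence number, Type); the strings are joined
# with "and", and the first Expression of each group is kept.
step2 = (step1.groupby(["Rule_Name", "Rule_Seq_No", "Type"])
              .agg({"Condition": "and".join, "Expression": "first"}))

# Step 3: move Type from the row index into the columns, which produces
# two-level column labels such as ('Condition', 'Action').
step3 = step2.unstack()
print(step3.columns.tolist())

# Steps 4 to 6: drop the two unwanted column pairs, remove the top column
# level, then restore a flat index and clear the leftover "Type" axis name.
result = (step3.drop([("Condition", "Action"), ("Expression", "Condition")], axis=1)
               .droplevel(level=0, axis=1)
               .reset_index()
               .rename_axis(None, axis=1))
print(result)
```

Printing step3.columns makes it easier to see why DataFrame.drop needs tuples: after unstack, each column label is a pair made of the original column name and a value of Type.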