Jam*_*and 8 python pandas columnsorting
I have data like this in a pandas DataFrame:
id import_id investor_id loan_id meta
35736 unremit_loss_100312 Q05 0051765139 {u'total_paid': u'75', u'total_expense': u'75'}
35737 unremit_loss_100313 Q06 0051765140 {u'total_paid': u'77', u'total_expense': u'78'}
35739 unremit_loss_100314 Q06 0051765141 {u'total_paid': u'80', u'total_expense': u'65'}
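How can I sort the rows by total_expense, which is a value inside the JSON-like meta column (that is, the total_expense key of the meta field)?
The output should be: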
id import_id investor_id loan_id meta
35739 unremit_loss_100314 Q06 0051765141 {u'total_paid': u'80', u'total_expense': u'65'}
35736 unremit_loss_100312 Q05 0051765139 {u'total_paid': u'75', u'total_expense': u'75'}
35737 unremit_loss_100313 Q06 0051765140 {u'total_paid': u'77', u'total_expense': u'78'}
Use:
print (df)
id import_id investor_id loan_id \
0 35736 unremit_loss_100312 Q05 51765139
1 35736 unremit_loss_100312 Q05 51765139
2 35736 unremit_loss_100312 Q05 51765139
meta
0 {u'total_paid': u'75', u'total_expense': u'75'}
1 {u'total_paid': u'75', u'total_expense': u'20'}
2 {u'total_paid': u'75', u'total_expense': u'100'}
import ast
df['meta'] = df['meta'].apply(ast.literal_eval)
df = df.iloc[df['meta'].str['total_expense'].astype(int).argsort()]
print (df)
id import_id investor_id loan_id \
1 35736 unremit_loss_100312 Q05 51765139
0 35736 unremit_loss_100312 Q05 51765139
2 35736 unremit_loss_100312 Q05 51765139
meta
1 {'total_paid': '75', 'total_expense': '20'}
0 {'total_paid': '75', 'total_expense': '75'}
2 {'total_paid': '75', 'total_expense': '100'}
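The cast to int matters because the values stored in meta are strings, and strings compare character by character. Below is a minimal, self-contained sketch of that difference using the same three total_expense values as above. The DataFrame is trimmed to two columns for brevity, and the names expense, as_text, as_int and sorted_df are illustrative only.

```python
import ast

import pandas as pd

# Trimmed copy of the sample data; meta starts out as a string column.
df = pd.DataFrame({
    "id": [35736, 35736, 35736],
    "meta": [
        "{u'total_paid': u'75', u'total_expense': u'75'}",
        "{u'total_paid': u'75', u'total_expense': u'20'}",
        "{u'total_paid': u'75', u'total_expense': u'100'}",
    ],
})

# Parse each string into a real dict so the key can be looked up.
df["meta"] = df["meta"].apply(ast.literal_eval)

# Pull total_expense out of every dict; the values are still strings here.
expense = df["meta"].str["total_expense"]

# Sorting the raw strings is lexicographic, so '100' comes before '20'.
as_text = expense.sort_values().tolist()

# Sorting after the cast is numeric.
as_int = expense.astype(int).sort_values().tolist()

# Positional reordering of the whole frame by the numeric order.
sorted_df = df.iloc[expense.astype(int).argsort().to_numpy()]
print(sorted_df)
```

Without astype(int), the row with total_expense '100' would be placed first.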
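If some rows might be missing the total_expense key, replace the missing values with an integer lower than every other value, such as -1, so that those rows come first: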
print (df)
id import_id investor_id loan_id \
0 35736 unremit_loss_100312 Q05 51765139
1 35736 unremit_loss_100312 Q05 51765139
2 35736 unremit_loss_100312 Q05 51765139
meta
0 {u'total_paid': u'75', u'total_expense': u'75'}
1 {u'total_paid': u'75', u'total_expense': u'20'}
2 {u'total_paid': u'75'}
df['meta'] = df['meta'].apply(ast.literal_eval)
df = df.iloc[df['meta'].str.get('total_expense').fillna(-1).astype(int).argsort()]
print (df)
id import_id investor_id loan_id \
2 35736 unremit_loss_100312 Q05 51765139
1 35736 unremit_loss_100312 Q05 51765139
0 35736 unremit_loss_100312 Q05 51765139
meta
2 {'total_paid': '75'}
1 {'total_paid': '75', 'total_expense': '20'}
0 {'total_paid': '75', 'total_expense': '75'}
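If picking a sentinel such as -1 is undesirable (for example because negative expenses are possible), one possible variation is to leave the missing values as NaN and let sort_values place them. This is only a sketch and not part of the approach above: it assumes pandas 1.1 or later for the key argument, and the trimmed DataFrame and the helper name expense_key are made up for illustration.

```python
import ast

import pandas as pd

# Trimmed copy of the sample data; the last row has no total_expense key.
df = pd.DataFrame({
    "id": [35736, 35736, 35736],
    "meta": [
        "{u'total_paid': u'75', u'total_expense': u'75'}",
        "{u'total_paid': u'75', u'total_expense': u'20'}",
        "{u'total_paid': u'75'}",
    ],
})
df["meta"] = df["meta"].apply(ast.literal_eval)


def expense_key(col):
    # Hypothetical helper: a missing key yields None, which
    # to_numeric(errors="coerce") turns into NaN.
    return pd.to_numeric(col.str.get("total_expense"), errors="coerce")


# na_position="first" puts the rows without total_expense at the top.
sorted_df = df.sort_values("meta", key=expense_key, na_position="first")
print(sorted_df)
```

Passing na_position="last" instead would move the rows without the key to the bottom, with no change to the key function.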
Another solution, using a temporary helper column:
df['new'] = df['meta'].str.get('total_expense').astype(int)
df = df.sort_values('new').drop('new', axis=1)
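As a quick check, here is the helper-column approach applied to the data from the question, written as a single chain with assign so the original df is not modified. This is a sketch: only three of the five columns are reproduced, and it assumes every row has a total_expense key (a missing key would make astype(int) fail).

```python
import ast

import pandas as pd

# Data from the question, reduced to the columns needed for the example.
df = pd.DataFrame({
    "id": [35736, 35737, 35739],
    "import_id": [
        "unremit_loss_100312",
        "unremit_loss_100313",
        "unremit_loss_100314",
    ],
    "meta": [
        "{u'total_paid': u'75', u'total_expense': u'75'}",
        "{u'total_paid': u'77', u'total_expense': u'78'}",
        "{u'total_paid': u'80', u'total_expense': u'65'}",
    ],
})
df["meta"] = df["meta"].apply(ast.literal_eval)

sorted_df = (
    # Temporary numeric column holding total_expense.
    df.assign(new=df["meta"].str.get("total_expense").astype(int))
      .sort_values("new")
      # Drop the helper once the rows are in order.
      .drop(columns="new")
)
print(sorted_df)
```

The resulting id order is 35739, 35736, 35737, which matches the output requested in the question.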