I have the following DataFrame:
country Duration
0 Afghanistan 0 days
1 Afghanistan 4 days
2 Afghanistan 0 days
3 Afghanistan 22 days
4 Afghanistan 6 days
... ...
316813 Zimbabwe (Rhodesia) 36 days
316814 Zimbabwe (Rhodesia) 6 days
316815 Zimbabwe (Rhodesia) 223 days
316816 Zimbabwe (Rhodesia) 6 days
316817 Zimbabwe (Rhodesia) 0 days
I want to convert the Duration column into the following categories:
< 1 week, 1-4 weeks, 1-3 months, 3-6 months, 6-9 months, 9-12 months
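How can I do this?
A month is not really a well-defined time interval, because months differ in length. That said, you can convert the Duration column to timedelta with pd.to_timedelta and then bin it with pd.cut, treating a month as roughly 30 days: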
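import pandas as pd

# A month is approximated as 30 days; '36500D' is only a large catch-all upper edge.
pd.cut(pd.to_timedelta(df['Duration']),
       bins=pd.to_timedelta([0, '1W', '30D', '90D', '180D', '270D', '365D', '36500D']),
       include_lowest=True,  # so that 0-day durations land in the first bin
       labels=['< 1 week', '1-4 weeks', '1-3 months', '3-6 months', '6-9 months', '9-12 months', '> 12 months']
)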
Output:
0 < 1 week
1 < 1 week
2 < 1 week
3 1-4 weeks
4 < 1 week
... NaN
316813 1-3 months
316814 < 1 week
316815 6-9 months
316816 < 1 week
316817 < 1 week
Name: Duration, dtype: category
Categories (7, object): ['< 1 week' < '1-4 weeks' < '1-3 months' < '3-6 months' < '6-9 months' < '9-12 months' < '> 12 months']
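One detail worth noting: with the default right-closed bins, a duration of exactly 7 days is labeled '< 1 week'. If that matters, a possible variant is to use left-closed bins via right=False, so each bin covers [lower, upper). The sketch below is only an illustration under that assumption; the helper name categorize_durations, the column name Duration_category, and the small sample frame are hypothetical and not part of the original question.

```python
import pandas as pd

# Bin edges: a month is approximated as 30 days (same assumption as above);
# '36500D' is just a large catch-all upper edge.
EDGES = pd.to_timedelta(['0D', '7D', '30D', '90D', '180D', '270D', '365D', '36500D'])
LABELS = ['< 1 week', '1-4 weeks', '1-3 months', '3-6 months',
          '6-9 months', '9-12 months', '> 12 months']

def categorize_durations(durations):
    """Hypothetical helper: bin durations into left-closed intervals [lower, upper)."""
    return pd.cut(pd.to_timedelta(durations), bins=EDGES, right=False, labels=LABELS)

# Small made-up sample, including the 7-day boundary case.
df = pd.DataFrame({'Duration': ['0 days', '4 days', '7 days', '22 days', '36 days', '223 days']})
df['Duration_category'] = categorize_durations(df['Duration'])
```

With right=False there is no need for include_lowest, since 0 days already falls inside the first interval, and a 7-day duration moves into '1-4 weeks'.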