use*_*135 2 python datetime pandas
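Currently, I'm using the following for loop to assign each month its numeric equivalent, based on an if condition for each month. It seems quite efficient at runtime, but it is too manual and ugly for my taste.
How could this be done better? I imagine it could be improved by somehow simplifying/condensing the multiple conditions, or by using some kind of translator made for date conversions. Which would be preferable?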
#make numeric month
combined = combined.sort_values('month')
combined.index = range(len(combined))
combined['month_numeric'] = None
for i in combined['month'].unique():
    first = combined['month'].searchsorted(i, side='left')
    last = combined['month'].searchsorted(i, side='right')
    first_num = list(first)[0] #gives first instance
    last_num = list(last)[0] #gives last instance
    if i == 'January':
        combined['month_numeric'][first_num:last_num] = "01"
    elif i == 'February':
        combined['month_numeric'][first_num:last_num] = "02"
    elif i == 'March':
        combined['month_numeric'][first_num:last_num] = "03"
    elif i == 'April':
        combined['month_numeric'][first_num:last_num] = "04"
    elif i == 'May':
        combined['month_numeric'][first_num:last_num] = "05"
    elif i == 'June':
        combined['month_numeric'][first_num:last_num] = "06"
    elif i == 'July':
        combined['month_numeric'][first_num:last_num] = "07"
    elif i == 'August':
        combined['month_numeric'][first_num:last_num] = "08"
    elif i == 'September':
        combined['month_numeric'][first_num:last_num] = "09"
    elif i == 'October':
        combined['month_numeric'][first_num:last_num] = "10"
    elif i == 'November':
        combined['month_numeric'][first_num:last_num] = "11"
    elif i == 'December':
        combined['month_numeric'][first_num:last_num] = "12"
You can use to_datetime, then take month, convert it to a string and pad it with zfill:
print (pd.to_datetime(df['month'], format='%B').dt.month.astype(str).str.zfill(2))
Sample:
import pandas as pd
df = pd.DataFrame({ 'month': ['January','February', 'December']})
print (df)
      month
0   January
1  February
2  December
print (pd.to_datetime(df['month'], format='%B').dt.month.astype(str).str.zfill(2))
0    01
1    02
2    12
Name: month, dtype: object
Another solution is map with a dict d:
d = {'January':'01','February':'02','December':'12'}
print (df['month'].map(d))
0    01
1    02
2    12
Name: month, dtype: object
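The dict above only covers the three months in the sample. As a minimal sketch, the full mapping could be built from the standard library's calendar.month_name instead of being typed out by hand. The names month_to_num and month_numeric are only illustrative, and this assumes the default English locale, since calendar.month_name is locale-dependent:

```python
import calendar
import pandas as pd

# Build month name -> zero-padded number from the standard library.
# calendar.month_name[0] is an empty string, so the range starts at 1.
# Assumption: the process runs under an English (or default C) locale.
month_to_num = {calendar.month_name[i]: f"{i:02d}" for i in range(1, 13)}

df = pd.DataFrame({'month': ['January', 'February', 'December']})

# map looks each value up in the dict; names missing from it become NaN.
df['month_numeric'] = df['month'].map(month_to_num)
print(df)
```

Any value that is not an exact key (different casing, stray whitespace) comes back as NaN rather than raising, so it may be worth checking the result with isna() afterwards.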
Timings:
df = pd.DataFrame({ 'month': ['January','February', 'December']})
print (df)
df = pd.concat([df]*1000).reset_index(drop=True)
print (pd.to_datetime(df['month'], format='%B').dt.month.astype(str).str.zfill(2))
print (df['month'].map({'January':'01','February':'02','December':'12'}))
In [200]: %timeit (pd.to_datetime(df['month'], format='%B').dt.month.astype(str).str.zfill(2))
100 loops, best of 3: 13.5 ms per loop
In [201]: %timeit (df['month'].map({'January':'01','February':'02','December':'12'}))
1000 loops, best of 3: 462 µs per loop
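The %timeit lines above need IPython. A rough sketch of the same comparison with the standard timeit module follows; the helper names via_to_datetime and via_map are made up for illustration, and the absolute numbers will differ by machine and pandas version:

```python
import timeit
import pandas as pd

df = pd.DataFrame({'month': ['January', 'February', 'December']})
df = pd.concat([df] * 1000).reset_index(drop=True)
d = {'January': '01', 'February': '02', 'December': '12'}

def via_to_datetime():
    # Parse the month names, take the month number, zero-pad as a string.
    return pd.to_datetime(df['month'], format='%B').dt.month.astype(str).str.zfill(2)

def via_map():
    # Plain dictionary lookup per value.
    return df['month'].map(d)

# Sanity check: both approaches should produce the same strings.
assert via_to_datetime().tolist() == via_map().tolist()

for func in (via_to_datetime, via_map):
    seconds = timeit.timeit(func, number=100) / 100
    print(f"{func.__name__}: {seconds * 1e3:.3f} ms per call")
```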