How can I map month names to their numeric equivalents in Python/Pandas?

use*_*135 2 python datetime pandas

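Currently, I'm using the following for loop to assign each month its numeric equivalent, with one if condition per month. It seems to be quite efficient at runtime, but it is too manual and ugly for my taste.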

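How could this be done better? I'm thinking it could be improved by simplifying or condensing the multiple conditions in some way, or by using some kind of translator made for date conversions. Which of the two would be preferable?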

#make numeric month

combined = combined.sort_values('month')
combined.index = range(len(combined))
combined['month_numeric'] = None

for i in combined['month'].unique():
    first = combined['month'].searchsorted(i, side='left')
    last = combined['month'].searchsorted(i, side='right')
    first_num = list(first)[0] #gives first instance
    last_num = list(last)[0] #gives last instance
    if i == 'January':
        combined['month_numeric'][first_num:last_num] = "01"
    elif i == 'February':
        combined['month_numeric'][first_num:last_num] = "02"
    elif i == 'March':
        combined['month_numeric'][first_num:last_num] = "03"
    elif i == 'April':
        combined['month_numeric'][first_num:last_num] = "04"
    elif i == 'May':
        combined['month_numeric'][first_num:last_num] = "05"
    elif i == 'June':
        combined['month_numeric'][first_num:last_num] = "06"
    elif i == 'July':
        combined['month_numeric'][first_num:last_num] = "07"
    elif i == 'August':
        combined['month_numeric'][first_num:last_num] = "08"
    elif i == 'September':
        combined['month_numeric'][first_num:last_num] = "09"
    elif i == 'October':
        combined['month_numeric'][first_num:last_num] = "10"
    elif i == 'November':
        combined['month_numeric'][first_num:last_num] = "11"
    elif i == 'December':
        combined['month_numeric'][first_num:last_num] = "12"

jez*_*ael 5

You can use to_datetime, then take month, convert it to a string, and pad it with zfill:

print (pd.to_datetime(df['month'], format='%B').dt.month.astype(str).str.zfill(2))

Sample:

import pandas as pd

df = pd.DataFrame({ 'month': ['January','February', 'December']})
print (df)
      month
0   January
1  February
2  December

print (pd.to_datetime(df['month'], format='%B').dt.month.astype(str).str.zfill(2))
0    01
1    02
2    12
Name: month, dtype: object
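
As a side note, the same zero-padded result can probably be reached in one formatting step. The sketch below is a variant that is not part of the answer above: it assumes English month names and swaps the `.dt.month.astype(str).str.zfill(2)` chain for `.dt.strftime('%m')`, which formats the month as a two-digit string directly.

```python
import pandas as pd

df = pd.DataFrame({'month': ['January', 'February', 'December']})

# Parse the full month name (%B), then format it back as a
# zero-padded two-digit month number (%m).
month_numeric = pd.to_datetime(df['month'], format='%B').dt.strftime('%m')
print(month_numeric)
```

Whether this is any faster than the zfill chain has not been measured here; it is mainly a shorter spelling of the same idea.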

Another solution is map with a dict d:

d = {'January':'01','February':'02','December':'12'}

print (df['month'].map(d))
0    01
1    02
2    12
Name: month, dtype: object
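
The dict above only covers the three months in the sample. A minimal sketch of a full twelve-month lookup follows, assuming English month names and the default locale; the name `month_to_num` is made up for illustration, and building the dict from the standard library's `calendar.month_name` is a convenience that is not in the original answer.

```python
import calendar
import pandas as pd

# Build a name -> zero-padded number lookup for all twelve months.
# calendar.month_name[0] is an empty string, so start at index 1.
month_to_num = {calendar.month_name[i]: f"{i:02d}" for i in range(1, 13)}

df = pd.DataFrame({'month': ['January', 'February', 'December']})
df['month_numeric'] = df['month'].map(month_to_num)
print(df)
```

Note that map leaves NaN for any value that is missing from the dict (for example a misspelled month), so it may be worth checking the result with `isna()` afterwards.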

Timings (on 3,000 rows):

df = pd.DataFrame({ 'month': ['January','February', 'December']})
print (df)
df = pd.concat([df]*1000).reset_index(drop=True)

print (pd.to_datetime(df['month'], format='%B').dt.month.astype(str).str.zfill(2))
print (df['month'].map({'January':'01','February':'02','December':'12'}))

In [200]: %timeit (pd.to_datetime(df['month'], format='%B').dt.month.astype(str).str.zfill(2))
100 loops, best of 3: 13.5 ms per loop

In [201]: %timeit (df['month'].map({'January':'01','February':'02','December':'12'}))
1000 loops, best of 3: 462 µs per loop
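
The `%timeit` lines above require IPython. For a plain Python script, a rough equivalent could look like the sketch below, which uses the standard `timeit` module; the helper names `via_datetime` and `via_map` are made up for illustration, and the absolute numbers will depend on the machine and the pandas version.

```python
import timeit
import pandas as pd

df = pd.DataFrame({'month': ['January', 'February', 'December']})
df = pd.concat([df] * 1000).reset_index(drop=True)  # 3,000 rows

mapping = {'January': '01', 'February': '02', 'December': '12'}

def via_datetime():
    # Parse the month name, extract the month number, zero-pad it.
    return pd.to_datetime(df['month'], format='%B').dt.month.astype(str).str.zfill(2)

def via_map():
    # Look each month name up in the dict.
    return df['month'].map(mapping)

# Both approaches should give the same values before speed is compared.
assert via_datetime().tolist() == via_map().tolist()

# Average seconds per call over 100 runs.
print('to_datetime:', timeit.timeit(via_datetime, number=100) / 100)
print('map:        ', timeit.timeit(via_map, number=100) / 100)
```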