I have a DataFrame df like the following (excerpt; 'Timestamp' is the index):
Timestamp Value
2012-06-01 00:00:00 100
2012-06-01 00:15:00 150
2012-06-01 00:30:00 120
2012-06-01 01:00:00 220
2012-06-01 01:15:00 80
...and so on.
I need a new column df['weekday'] containing the day of the week for each timestamp.
How can I get this?
EdC*_*ica 75
Use the dt.dayofweek property (Monday is 0, Sunday is 6):
In [2]:
df['weekday'] = df['Timestamp'].dt.dayofweek
df
Out[2]:
Timestamp Value weekday
0 2012-06-01 00:00:00 100 4
1 2012-06-01 00:15:00 150 4
2 2012-06-01 00:30:00 120 4
3 2012-06-01 01:00:00 220 4
4 2012-06-01 01:15:00 80 4
If Timestamp is your index, you need to reset the index first and then call the dt.dayofweek property:
In [14]:
df = df.reset_index()
df['weekday'] = df['Timestamp'].dt.dayofweek
df
Out[14]:
Timestamp Value weekday
0 2012-06-01 00:00:00 100 4
1 2012-06-01 00:15:00 150 4
2 2012-06-01 00:30:00 120 4
3 2012-06-01 01:00:00 220 4
4 2012-06-01 01:15:00 80 4
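Strangely, if you build a Series from the index to avoid resetting it, you get NaN values. The same happens if you call dt.dayofweek on the result of reset_index without assigning that result back to the original df. In both cases the new Series carries a fresh integer index (0, 1, 2, ...), and column assignment aligns on index labels, none of which match the Timestamp index: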
In [16]:
df['weekday'] = pd.Series(df.index).dt.dayofweek
df
Out[16]:
Value weekday
Timestamp
2012-06-01 00:00:00 100 NaN
2012-06-01 00:15:00 150 NaN
2012-06-01 00:30:00 120 NaN
2012-06-01 01:00:00 220 NaN
2012-06-01 01:15:00 80 NaN
In [17]:
df['weekday'] = df.reset_index()['Timestamp'].dt.dayofweek
df
Out[17]:
Value weekday
Timestamp
2012-06-01 00:00:00 100 NaN
2012-06-01 00:15:00 150 NaN
2012-06-01 00:30:00 120 NaN
2012-06-01 01:00:00 220 NaN
2012-06-01 01:15:00 80 NaN
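The NaN results above come from label alignment, and a small reproduction may make that easier to see. The sketch below rebuilds a three-row version of the question's data (the column names weekday_nan and weekday are only illustrative) and uses df.index.to_series(), which keeps the timestamps as the Series index so the assignment lines up:

```python
import pandas as pd

# Small stand-in for the DataFrame in the question (Timestamp is the index)
idx = pd.to_datetime([
    "2012-06-01 00:00:00",
    "2012-06-01 00:15:00",
    "2012-06-01 00:30:00",
])
df = pd.DataFrame({"Value": [100, 150, 120]}, index=idx)
df.index.name = "Timestamp"

# pd.Series(df.index) gets a fresh RangeIndex (0, 1, 2); on assignment pandas
# aligns on labels, finds no matching timestamps, and fills the column with NaN
df["weekday_nan"] = pd.Series(df.index).dt.dayofweek

# to_series() reuses the DatetimeIndex as the Series index, so labels match
df["weekday"] = df.index.to_series().dt.dayofweek

print(df)
```

This is one possible workaround, not the only one; accessing the attribute directly on the index avoids the intermediate Series altogether.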
Edit
As user @joris pointed out, you can simply access the weekday attribute of the index, so the following works and is more compact:
df['Weekday'] = df.index.weekday
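As a minimal, self-contained sketch of the index-based approach (the sample dates and the weekday_name column are assumptions added for illustration), the same index can also return day names through day_name():

```python
import pandas as pd

# Two sample rows on different days, with Timestamp as the index
idx = pd.DatetimeIndex(
    ["2012-06-01 00:00:00", "2012-06-02 00:15:00"], name="Timestamp"
)
df = pd.DataFrame({"Value": [100, 150]}, index=idx)

# Integer day of week taken straight from the index: Monday=0 ... Sunday=6
df["Weekday"] = df.index.weekday

# Day names from the same index (English by default)
df["weekday_name"] = df.index.day_name()

print(df)
```

Because the values come from the index itself, they are assigned positionally and no alignment problem arises.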
小智 8
You can get the day name like this:
import pandas as pd
# Call day_name() on the index itself; wrapping it in pd.Series(df.index) would misalign and yield NaN
df['weekday'] = df.index.day_name()
If Timestamp is a datetime column, you can use:
df['weekday'] = df['Timestamp'].apply(lambda x: x.weekday())
or
df['weekday'] = pd.to_datetime(df['Timestamp']).apply(lambda x: x.weekday())
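For comparison, here is a hedged sketch of a vectorized alternative to the apply calls above, assuming Timestamp is an ordinary column holding date strings (the sample rows and the weekday_apply column are illustrative only). The .dt accessor computes the same values without a Python-level lambda:

```python
import pandas as pd

# Timestamp as a regular column of strings, with the default integer index
df = pd.DataFrame({
    "Timestamp": [
        "2012-06-01 00:00:00",
        "2012-06-02 00:15:00",
        "2012-06-03 00:30:00",
    ],
    "Value": [100, 150, 120],
})

# Parse once, then reuse the datetime Series
ts = pd.to_datetime(df["Timestamp"])

# Vectorized accessor: Monday=0 ... Sunday=6
df["weekday"] = ts.dt.weekday

# Row-by-row version from the answer, kept here only to compare results
df["weekday_apply"] = ts.apply(lambda x: x.weekday())

print(df)
```

Both columns hold the same integers; the accessor form tends to be the more idiomatic choice on large frames.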