Grouping data by season according to the exact dates

Question

Mar*_*tte 5 python-2.7 pandas pandas-groupby

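I have a CSV file containing 4 years of data, and I am trying to group the data by season across all 4 years. In other words, I only need to aggregate my entire dataset into the 4 seasons and plot it. Here is a sample of my data file: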

timestamp,heure,lat,lon,impact,type
2006-01-01 00:00:00,13:58:43,33.837,-9.205,10.3,1
2006-01-02 00:00:00,00:07:28,34.5293,-10.2384,17.7,1
2007-02-01 00:00:00,23:01:03,35.0617,-1.435,-17.1,2
2007-02-02 00:00:00,01:14:29,36.5685,0.9043,36.8,1
2008-01-01 00:00:00,05:03:51,34.1919,-12.5061,-48.9,1
2008-01-02 00:00:00,05:03:51,34.1919,-12.5061,-48.9,1
....
2011-12-31 00:00:00,05:03:51,34.1919,-12.5061,-48.9,1

This is the output I want:

winter     (the mean value of impacts)
summer     (the mean value of impacts)
autumn      ....
spring      .....

This is the code I tried:

import pandas as pd

names = ["timestamp", "heure", "lat", "lon", "impact", "type"]
data = pd.read_csv('flash.txt', names=names, parse_dates=['timestamp'], index_col=['timestamp'], dayfirst=True)

# Season boundaries expressed as day-of-year numbers
spring = range(80, 172)
summer = range(172, 264)
fall = range(264, 355)

def season(x):
    if x in spring:
        return 'Spring'
    if x in summer:
        return 'Summer'
    if x in fall:
        return 'Fall'
    else:
        return 'Winter'

data['SEASON'] = data.index.to_series().dt.month.map(lambda x : season(x))
data['impact'] = data['impact'].abs()
seasonly = data.groupby('SEASON')['impact'].mean()

I got this terrible result:

Where did I go wrong?

Answer 1

jez*_*ael 4

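You need DatetimeIndex.dayofyear. The ranges in season() are day-of-year numbers, but dt.month only yields values from 1 to 12, so every row falls through to 'Winter':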

data['SEASON'] = data.index.dayofyear.map(season)
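
To make the difference concrete, here is a minimal, self-contained sketch comparing the two mappings. The four timestamps and impact values are hypothetical sample rows (one per season), not the contents of flash.txt, and the names `by_month` and `by_day` are introduced only for illustration.

```python
import pandas as pd

# Season boundaries as day-of-year numbers, copied from the question
spring = range(80, 172)
summer = range(172, 264)
fall = range(264, 355)

def season(day_of_year):
    """Map a day-of-year number (1-366) to a season name."""
    if day_of_year in spring:
        return 'Spring'
    if day_of_year in summer:
        return 'Summer'
    if day_of_year in fall:
        return 'Fall'
    return 'Winter'

# Hypothetical sample: one timestamp per season (assumption, not real data)
data = pd.DataFrame(
    {'impact': [10.3, -17.1, 36.8, -48.9]},
    index=pd.to_datetime(['2006-01-15', '2006-04-15', '2006-07-15', '2006-10-15']),
)

# Month numbers (1-12) never reach 80, so every row is labelled 'Winter'
by_month = data.index.month.map(season)

# Day-of-year numbers are what the ranges were written for
by_day = data.index.dayofyear.map(season)

print(list(by_month))
print(list(by_day))
```

With the month-based mapping all four rows end up in a single group, which would explain a result that contains only one season.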

Another solution, using pandas.cut:

bins = [0, 91, 183, 275, 366]
labels=['Winter', 'Spring', 'Summer', 'Fall']
doy = data.index.dayofyear
data['SEASON1'] = pd.cut(doy + 11 - 366*(doy > 355), bins=bins, labels=labels)
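
As a rough end-to-end sketch of the pandas.cut approach, the snippet below builds a small frame, labels each row and computes the mean absolute impact per season. The sample rows are hypothetical, and `shifted` is a name introduced here for readability. Note that these bin edges sit one day later than the ranges in the question's season() function (for example, day 80 is 'Spring' there but 'Winter' here), so treat the boundaries as approximate.

```python
import pandas as pd

# Hypothetical sample rows (assumption: not taken from flash.txt)
data = pd.DataFrame(
    {'impact': [10.3, -17.7, -17.1, 36.8, -48.9, 20.0]},
    index=pd.to_datetime([
        '2006-01-01', '2006-12-31', '2007-04-15',
        '2007-07-15', '2008-10-15', '2008-04-20',
    ]),
)
data['impact'] = data['impact'].abs()

bins = [0, 91, 183, 275, 366]
labels = ['Winter', 'Spring', 'Summer', 'Fall']
doy = data.index.dayofyear

# Shift the calendar by 11 days and wrap days after 355 around to the start,
# so that late December and January to mid-March share the first bin
shifted = doy + 11 - 366 * (doy > 355)
data['SEASON'] = pd.cut(shifted, bins=bins, labels=labels)

# Mean absolute impact per season, over all years together
seasonly = data.groupby('SEASON', observed=True)['impact'].mean()
print(seasonly)
```

Because the season label ignores the year, rows from all 4 years fall into the same four groups, which matches the output requested in the question.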
