工作日重新订购熊猫系列

Sim*_*mon 4 python python-2.7 pandas anaconda

使用Pandas,我已经输入了一个CSV文件然后创建了一系列数据,以找出一周中哪些日子崩溃最多:

crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
Run Code Online (Sandbox Code Playgroud)

在此输入图像描述

然后我将其绘制出来,但当然它将它们按照与系列相同的排序顺序绘制.

crashes_by_day.plot(kind='bar')
Run Code Online (Sandbox Code Playgroud)

在此输入图像描述

将这些重新排名为Mon,Tue,Wed,Thur,Fri,Sat,Sun的最有效方法是什么?

我是否必须将其分解为列表?谢谢.

jez*_*ael 8

你可以使用Ordered Categorical然后sort_index:

print bc
   DAY_OF_WEEK    a    b
0       Sunday  0.7  0.5
1       Monday  0.4  0.1
2      Tuesday  0.3  0.2
3    Wednesday  0.4  0.1
4     Thursday  0.3  0.6
5       Friday  0.4  0.9
6     Saturday  0.3  0.2
7       Sunday  0.7  0.5
8       Monday  0.4  0.1
9      Tuesday  0.3  0.2
10   Wednesday  0.4  0.1
11    Thursday  0.3  0.6
12      Friday  0.4  0.9
13    Saturday  0.3  0.2
14      Sunday  0.7  0.5
15      Monday  0.4  0.1
16     Tuesday  0.3  0.2
17   Wednesday  0.4  0.1
18    Thursday  0.3  0.6
19      Friday  0.4  0.9
20    Saturday  0.3  0.2
Run Code Online (Sandbox Code Playgroud)
bc['DAY_OF_WEEK'] = pd.Categorical(bc['DAY_OF_WEEK'], categories=
    ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday', 'Sunday'],
    ordered=True)

print bc['DAY_OF_WEEK']
0        Sunday
1        Monday
2       Tuesday
3     Wednesday
4      Thursday
5        Friday
6      Saturday
7        Sunday
8        Monday
9       Tuesday
10    Wednesday
11     Thursday
12       Friday
13     Saturday
14       Sunday
15       Monday
16      Tuesday
17    Wednesday
18     Thursday
19       Friday
20     Saturday
Name: DAY_OF_WEEK, dtype: category
Categories (7, object): [Monday < Tuesday < Wednesday < Thursday < Friday < Saturday < Sunday]
Run Code Online (Sandbox Code Playgroud)
crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
crashes_by_day = crashes_by_day.sort_index()
print crashes_by_day
Monday       3
Tuesday      3
Wednesday    3
Thursday     3
Friday       3
Saturday     3
Sunday       3
dtype: int64

crashes_by_day.plot(kind='bar')
Run Code Online (Sandbox Code Playgroud)

下一个可能的解决方案Categorical是通过映射设置排序:

crashes_by_day = bc['DAY_OF_WEEK'].value_counts().reset_index()
crashes_by_day.columns = ['DAY_OF_WEEK', 'count']
print crashes_by_day
  DAY_OF_WEEK  count
0    Thursday      3
1   Wednesday      3
2      Friday      3
3     Tuesday      3
4      Monday      3
5    Saturday      3
6      Sunday      3

days = ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday', 'Sunday']
mapping = {day: i for i, day in enumerate(days)}
key = crashes_by_day['DAY_OF_WEEK'].map(mapping)
print key
0    3
1    2
2    4
3    1
4    0
5    5
6    6
Name: DAY_OF_WEEK, dtype: int64

crashes_by_day = crashes_by_day.iloc[key.argsort()].set_index('DAY_OF_WEEK')
Run Code Online (Sandbox Code Playgroud)
print crashes_by_day
             count
DAY_OF_WEEK       
Monday           3
Tuesday          3
Wednesday        3
Thursday         3
Friday           3
Saturday         3
Sunday           3

crashes_by_day.plot(kind='bar')
Run Code Online (Sandbox Code Playgroud)

图形