Python pandas DataFrame assignment

Mat*_*ist 5 python dataframe python-3.x pandas

I am following a Lynda tutorial in which they use the following code:

import pandas as pd
import seaborn
flights = seaborn.load_dataset('flights')
flights_indexed = flights.set_index(['year','month'])
flights_unstacked = flights_indexed.unstack()
flights_unstacked['passengers','total']  = flights_unstacked.sum(axis=1)

It works perfectly for them. In my case, however, the code does not run: the last line keeps raising an error.

TypeError: cannot insert an item into a CategoricalIndex that is not already an existing category

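I know the video uses Python 2, but since I am learning this for work (where we use Python 3), I am on Python 3. I have been able to figure out most of the differences, but I cannot work out how to create this new column 'total' holding the sum of the passengers.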

Max*_*axU 8

The root cause of this error message is the categorical nature of the month column:

In [42]: flights.dtypes
Out[42]:
year             int64
month         category
passengers       int64
dtype: object

In [43]: flights.month.cat.categories
Out[43]: Index(['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December'], dtype='object')

You are trying to add a column labelled total, which is not one of the existing categories, and pandas does not allow that.

Workaround:

In [45]: flights['month'] = flights.month.cat.add_categories('total')

In [46]: x = flights.pivot(index='year', columns='month', values='passengers')

In [47]: x['total'] = x.sum(1)

In [48]: x
Out[48]:
month  January  February  March  April    May   June   July  August  September  October  November  December   total
year
1949     112.0     118.0  132.0  129.0  121.0  135.0  148.0   148.0      136.0    119.0     104.0     118.0  1520.0
1950     115.0     126.0  141.0  135.0  125.0  149.0  170.0   170.0      158.0    133.0     114.0     140.0  1676.0
1951     145.0     150.0  178.0  163.0  172.0  178.0  199.0   199.0      184.0    162.0     146.0     166.0  2042.0
1952     171.0     180.0  193.0  181.0  183.0  218.0  230.0   242.0      209.0    191.0     172.0     194.0  2364.0
1953     196.0     196.0  236.0  235.0  229.0  243.0  264.0   272.0      237.0    211.0     180.0     201.0  2700.0
1954     204.0     188.0  235.0  227.0  234.0  264.0  302.0   293.0      259.0    229.0     203.0     229.0  2867.0
1955     242.0     233.0  267.0  269.0  270.0  315.0  364.0   347.0      312.0    274.0     237.0     278.0  3408.0
1956     284.0     277.0  317.0  313.0  318.0  374.0  413.0   405.0      355.0    306.0     271.0     306.0  3939.0
1957     315.0     301.0  356.0  348.0  355.0  422.0  465.0   467.0      404.0    347.0     305.0     336.0  4421.0
1958     340.0     318.0  362.0  348.0  363.0  435.0  491.0   505.0      404.0    359.0     310.0     337.0  4572.0
1959     360.0     342.0  406.0  396.0  420.0  472.0  548.0   559.0      463.0    407.0     362.0     405.0  5140.0
1960     417.0     391.0  419.0  461.0  472.0  535.0  622.0   606.0      508.0    461.0     390.0     432.0  5714.0
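As a minimal sketch of the same workaround that does not need the seaborn dataset, the snippet below builds a tiny hand-made frame. The name `flights_small` and its two-month layout are assumptions for illustration; the values are the January and February figures for 1949 and 1950 from the table above. It uses the assignment form of `add_categories`, because the `inplace` argument is not available in recent pandas releases.

```python
import pandas as pd

# Hypothetical stand-in for the seaborn 'flights' data: two years, two months.
flights_small = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": pd.Categorical(
        ["January", "February", "January", "February"],
        categories=["January", "February"],
    ),
    "passengers": [112, 118, 115, 126],
})

# Register 'total' as a valid category first; add_categories returns a new
# categorical, so the result is assigned back to the column.
flights_small["month"] = flights_small["month"].cat.add_categories("total")

# Reshape to one row per year and one column per month.
wide = flights_small.pivot(index="year", columns="month", values="passengers")

# 'total' is now an allowed label for the categorical column index.
wide["total"] = wide.sum(axis=1)
print(wide)
```

Whether the totals come out as integers or floats may depend on the pandas version, so the sketch makes no claim about the dtype.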

UPDATE: alternatively, if you do not want to touch the original DF, you can get rid of the categorical column level in the flights_unstacked DF:

In [76]: flights_unstacked.columns = \
    ...:     flights_unstacked.columns \
    ...:     .set_levels(flights_unstacked.columns.get_level_values(1).categories,
    ...:                 level=1)
    ...:

In [77]: flights_unstacked['passengers','total']  = flights_unstacked.sum(axis=1)

In [78]: flights_unstacked
Out[78]:
      passengers
month    January February March April  May June July August September October November December total
year
1949         112      118   132   129  121  135  148    148       136     119      104      118  1520
1950         115      126   141   135  125  149  170    170       158     133      114      140  1676
1951         145      150   178   163  172  178  199    199       184     162      146      166  2042
1952         171      180   193   181  183  218  230    242       209     191      172      194  2364
1953         196      196   236   235  229  243  264    272       237     211      180      201  2700
1954         204      188   235   227  234  264  302    293       259     229      203      229  2867
1955         242      233   267   269  270  315  364    347       312     274      237      278  3408
1956         284      277   317   313  318  374  413    405       355     306      271      306  3939
1957         315      301   356   348  355  422  465    467       404     347      305      336  4421
1958         340      318   362   348  363  435  491    505       404     359      310      337  4572
1959         360      342   406   396  420  472  548    559       463     407      362      405  5140
1960         417      391   419   461  472  535  622    606       508     461      390      432  5714
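Another option in the same spirit, offered only as a sketch and not as part of the original answer, is to turn the month labels into plain strings before reshaping, so the column index is never categorical in the first place. The frame `flights_small` and the helper variable `month_order` are hypothetical names used for illustration. Because unstacking string labels sorts them alphabetically, the calendar order is restored explicitly from the original categories.

```python
import pandas as pd

# Hypothetical stand-in for the seaborn 'flights' data: two years, two months.
flights_small = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": pd.Categorical(
        ["January", "February", "January", "February"],
        categories=["January", "February"],
    ),
    "passengers": [112, 118, 115, 126],
})

# Remember the calendar order before dropping the categorical dtype.
month_order = list(flights_small["month"].cat.categories)

# Convert month to plain strings on a copy, then reshape as in the question.
unstacked = (
    flights_small.assign(month=flights_small["month"].astype(str))
    .set_index(["year", "month"])
    .unstack()
)

# unstack() sorts string labels alphabetically; put the months back in order.
unstacked = unstacked[[("passengers", m) for m in month_order]]

# With a non-categorical column level, a new label can be added directly.
unstacked["passengers", "total"] = unstacked.sum(axis=1)
print(unstacked)
```

The original `flights_small` frame keeps its categorical month column, since `assign` works on a copy.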