hel*_*loB 5 python pandas statsmodels
I'm trying to run a basic seasonal_decompose on the commonly used airline passengers dataset, which begins with these rows:
Month
1949-02 4.770685
1949-03 4.882802
1949-04 4.859812
1949-05 4.795791
1949-06 4.905275
1949-07 4.997212
1949-08 4.997212
1949-09 4.912655
1949-10 4.779123
1949-11 4.644391
1949-12 4.770685
1950-01 4.744932
1950-02 4.836282
1950-03 4.948760
1950-04 4.905275
1950-05 4.828314
1950-06 5.003946
1950-07 5.135798
1950-08 5.135798
Freq: M, Name: Passengers, dtype: float64
My index type is:
pandas.tseries.period.PeriodIndex
I tried running some very simple code:
from statsmodels.tsa.seasonal import seasonal_decompose
log_passengers.interpolate(inplace = True)
decomposition = seasonal_decompose(log_passengers)
Here is the full output of the error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-113-bf122d457673> in <module>()
1 from statsmodels.tsa.seasonal import seasonal_decompose
2 log_passengers.interpolate(inplace = True)
----> 3 decomposition = seasonal_decompose(log_passengers)
/Users/ann/anaconda/lib/python3.5/site-packages/statsmodels/tsa/seasonal.py in seasonal_decompose(x, model, filt, freq)
56 statsmodels.tsa.filters.convolution_filter
57 """
---> 58 _pandas_wrapper, pfreq = _maybe_get_pandas_wrapper_freq(x)
59 x = np.asanyarray(x).squeeze()
60 nobs = len(x)
/Users/ann/anaconda/lib/python3.5/site-packages/statsmodels/tsa/filters/_utils.py in _maybe_get_pandas_wrapper_freq(X, trim)
44 index = X.index
45 func = _get_pandas_wrapper(X, trim)
---> 46 freq = index.inferred_freq
47 return func, freq
48 else:
pandas/src/properties.pyx in pandas.lib.cache_readonly.__get__ (pandas/lib.c:44097)()
/Users/ann/anaconda/lib/python3.5/site-packages/pandas/tseries/base.py in inferred_freq(self)
233 """
234 try:
--> 235 return frequencies.infer_freq(self)
236 except ValueError:
237 return None
/Users/ann/anaconda/lib/python3.5/site-packages/pandas/tseries/frequencies.py in infer_freq(index, warn)
854
855 if com.is_period_arraylike(index):
--> 856 raise TypeError("PeriodIndex given. Check the `freq` attribute "
857 "instead of using infer_freq.")
858 elif isinstance(index, pd.TimedeltaIndex):
TypeError: PeriodIndex given. Check the `freq` attribute instead of using infer_freq.
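The last frame of the traceback can be reproduced with pandas alone, which may help isolate the problem from statsmodels. This is a minimal sketch, assuming a made-up six-month PeriodIndex rather than the real dataset; the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical monthly PeriodIndex, similar in shape to the one in the question.
index = pd.period_range("1949-02", periods=6, freq="M")

# The frequency is stored on the index itself, so it can be read directly.
print(index.freq)

# Frequency inference is refused for a PeriodIndex and raises TypeError.
try:
    pd.infer_freq(index)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

So the error comes from asking pandas to infer a frequency on an index that already carries one, not from the data values.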
Here is what I've tried:
decomposition = seasonal_decompose(log_passengers, infer_freq = True) produces the error: TypeError: seasonal_decompose() got an unexpected keyword argument 'infer_freq'
decomposition = seasonal_decompose(log_passengers, freq = 'M') produces the error: TypeError: PeriodIndex given. Check the `freq` attribute instead of using infer_freq.
set([x.freq for x in log_passengers.index]) does yield a set containing only one frequency: {<MonthEnd>}
I've seen some discussion of this in various GitHub issues (https://github.com/pydata/pandas/issues/6771), but nothing discussed there seems to help. Any suggestions on how to fix this, or on what I'm doing wrong in this simple seasonal_decompose call?
seasonal_decompose does not accept a PeriodIndex. The workaround is to convert the index to a DatetimeIndex with the to_timestamp method:
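from statsmodels.tsa.seasonal import seasonal_decompose
# Fill any missing values in place before decomposing.
log_passengers.interpolate(inplace=True)
# Replace the PeriodIndex with a DatetimeIndex.
log_passengers.index = log_passengers.index.to_timestamp()
decomposition = seasonal_decompose(log_passengers)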