I have some data, and when I import it I get the following unwanted columns. I'm looking for an easy way to delete all of them:
'Unnamed: 24', 'Unnamed: 25', 'Unnamed: 26', 'Unnamed: 27',
'Unnamed: 28', 'Unnamed: 29', 'Unnamed: 30', 'Unnamed: 31',
'Unnamed: 32', 'Unnamed: 33', 'Unnamed: 34', 'Unnamed: 35',
'Unnamed: 36', 'Unnamed: 37', 'Unnamed: 38', 'Unnamed: 39',
'Unnamed: 40', 'Unnamed: 41', 'Unnamed: 42', 'Unnamed: 43',
'Unnamed: 44', 'Unnamed: 45', 'Unnamed: 46', 'Unnamed: 47',
'Unnamed: 48', 'Unnamed: 49', 'Unnamed: 50', 'Unnamed: 51',
'Unnamed: 52', 'Unnamed: 53', 'Unnamed: 54', 'Unnamed: 55',
'Unnamed: 56', 'Unnamed: 57', 'Unnamed: 58', 'Unnamed: 59',
'Unnamed: 60'
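One possible way to drop columns like the ones listed above is to select by name rather than by position. This is a minimal sketch, not the asker's code: the frame `df` below and its columns `a` and `b` are hypothetical stand-ins for the imported data.

```python
import pandas as pd

# Hypothetical frame standing in for the imported data (assumption, not from the question)
df = pd.DataFrame({
    "a": [1, 2],
    "b": [3, 4],
    "Unnamed: 24": [None, None],
    "Unnamed: 25": [None, None],
})

# Boolean mask over the column labels: True where the name starts with "Unnamed"
is_unnamed = df.columns.str.startswith("Unnamed")

# Keep only the columns that are not flagged by the mask
cleaned = df.loc[:, ~is_unnamed]

print(list(cleaned.columns))  # ['a', 'b']
```

Matching on the name avoids hard-coding positions, so it keeps working if the number of leading columns changes.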
They are 0-indexed, so I tried something like
df.drop(df.columns[[22, 23, 24, 25, …

Problem statement: I want to go from this dataframe, which is basically one-hot encoded,
In [2]: pd.DataFrame({"monkey":[0,1,0],"rabbit":[1,0,0],"fox":[0,0,1]})
Out[2]:
   fox  monkey  rabbit
0    0       0       1
1    0       1       0
2    1       0       0
to this one, which is the "reverse" of one-hot encoding.
In [3]: pd.DataFrame({"animal":["monkey","rabbit","fox"]})
Out[3]:
   animal
0  monkey
1  rabbit
2     fox
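I imagine there is some clever use of apply or zip to do this, but I'm not sure how... Can anyone help?
I haven't had any success trying to solve this with indexing and the like.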
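A minimal sketch of one way to reverse a one-hot encoding, assuming every row contains exactly one 1: `DataFrame.idxmax(axis=1)` returns, for each row, the label of the column holding the largest value. The names `one_hot` and `result` are illustrative only.

```python
import pandas as pd

# The one-hot frame from the question
one_hot = pd.DataFrame({"monkey": [0, 1, 0], "rabbit": [1, 0, 0], "fox": [0, 0, 1]})

# For each row, take the label of the column that holds the maximum (the 1)
animal = one_hot.idxmax(axis=1)

result = pd.DataFrame({"animal": animal})
print(result["animal"].tolist())  # ['rabbit', 'monkey', 'fox']
```

For this particular input the labels come out as rabbit, monkey, fox, following the row order of the one-hot frame. A row with no 1 in it would still receive a label (the first column), so such rows may need to be masked separately, for example with `one_hot.sum(axis=1) == 0`.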

I have looked at the statsmodels examples, but I don't see many that apply cross-validation to time series.
Suppose I have something like this:
In [1]: from __future__ import print_function
In [2]: import numpy as np
In [3]: import statsmodels.api as sm
In [4]: import pandas as pd
In [5]: from statsmodels.tsa.arima_process import arma_generate_sample
In [6]: np.random.seed(12345)
In [7]: arparams = np.array([.75, -.25])
In [8]: maparams = np.array([.65, .35])
In [9]: arparams = np.r_[1, -arparams]
In [10]: maparams = np.r_[1, maparams]
In [11]: nobs = 250
In [12]: y = arma_generate_sample(arparams, …
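One common approach to cross-validating a time series is an expanding-window (rolling-origin) scheme, in which each test block comes strictly after its training block. The sketch below is an illustration under stated assumptions, not the asker's code: `expanding_window_splits`, `fit_ar2`, and `forecast_ar2` are hypothetical helpers, the series is simulated here with NumPy rather than with `arma_generate_sample`, and a least-squares AR(2) fit stands in for a real statsmodels model.

```python
import numpy as np

def expanding_window_splits(n_obs, n_splits, test_size):
    """Yield (train_idx, test_idx) pairs; each test block follows its training block in time."""
    first_test_start = n_obs - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough observations for the requested splits")
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        yield np.arange(test_start), np.arange(test_start, test_start + test_size)

def fit_ar2(y):
    """Least-squares AR(2) fit (stand-in model); returns [intercept, phi1, phi2]."""
    X = np.column_stack([np.ones(len(y) - 2), y[1:-1], y[:-2]])
    coef, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
    return coef

def forecast_ar2(coef, history, steps):
    """Iterate the AR(2) recursion forward from the last two observed values."""
    prev2, prev1 = history[-2], history[-1]
    out = []
    for _ in range(steps):
        nxt = coef[0] + coef[1] * prev1 + coef[2] * prev2
        out.append(nxt)
        prev2, prev1 = prev1, nxt
    return np.array(out)

# Simulated placeholder series (assumption: not the data from the question)
rng = np.random.default_rng(12345)
y = np.zeros(250)
for t in range(2, 250):
    y[t] = 0.75 * y[t - 1] - 0.25 * y[t - 2] + rng.standard_normal()

# Refit on each training window and score the forecast on the block that follows it
errors = []
for train_idx, test_idx in expanding_window_splits(len(y), n_splits=5, test_size=10):
    coef = fit_ar2(y[train_idx])
    pred = forecast_ar2(coef, y[train_idx], steps=len(test_idx))
    errors.append(np.mean((y[test_idx] - pred) ** 2))

cv_mse = float(np.mean(errors))
print(cv_mse)
```

The key design choice is that observations are never shuffled, so the model is always evaluated on data later than anything it was fitted on. If scikit-learn is available, `sklearn.model_selection.TimeSeriesSplit` produces similar index pairs and could replace the hand-written generator.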