小编tro*_*rob的帖子

Python:使用Google财经下载索引数据

我成功地从Google财经下载了股票数据,如下所示:

import pandas as pd
from pandas_datareader import data as web   
import datetime
start = datetime.datetime(2016,1,1)
end   = datetime.date.today()
apple = web.DataReader('aapl', 'google', start, end)

Run Code Online (Sandbox Code Playgroud)

我以为我能够为索引数据使用相同的框架.但这不起作用:

spx = web.DataReader('INDEXSP', 'google', start, end)

RemoteDataError: Unable to read URL: http://www.google.com/finance/historical

Run Code Online (Sandbox Code Playgroud)

谷歌不支持索引吗？

或者我需要一个不同的协议？

python google-finance google-finance-api quantitative-finance yahoo-finance

tro*_*rob

2017 05-24

7
推荐指数

1
解决办法

4473
查看次数

使用pivot的Pandas KeyError

我是Python的新手,我想用Python来复制一个常见的excel任务.如果已经回答了这样的问题,请告诉我.我一直无法找到它.我有以下pandas数据帧(数据):

Date    Stage   SubStage    Value
12/31/2015   1.00   a   0.896882891
1/1/2016     1.00   a   0.0458843
1/2/2016     1.00   a   0.126805588
1/3/2016     1.00   b   0.615824461
1/4/2016     1.00   b   0.245092069
1/5/2016     1.00   c   0.121936318
1/6/2016     1.00   c   0.170198128
1/7/2016     1.00   c   0.735872415
1/8/2016     1.00   c   0.542361912
1/4/2016     2.00   a   0.723769247
1/5/2016     2.00   a   0.305570257
1/6/2016     2.00   b   0.47461605
1/7/2016     2.00   b   0.173702623
1/8/2016     2.00   c   0.969260251
1/9/2016     2.00   c   0.017170798

Run Code Online (Sandbox Code Playgroud)

在excel中,我可以使用数据透视表来生成以下内容:

使用'数据'的excel数据透视表

在python中执行以下操作似乎是合理的:

data.pivot(index='Date',columns = ['Stage','SubStage'],values = 'Value')

Run Code Online (Sandbox Code Playgroud)

但这会产生:

KeyError: 'Level Stage not found'

Run Code Online (Sandbox Code Playgroud)

是什么赋予了？

python pivot-table pandas keyerror

tro*_*rob

2016 05-11

3
推荐指数

1
解决办法

5732
查看次数

Python:foward填充nans和0

假设我有一个数据帧df1,它有零和nans:

dates = pd.date_range('20170101',periods=20)
df1 = pd.DataFrame(np.random.randint(10,size=(20,3)),index=dates,columns=['foo','bar','see'])
df1.iloc[3:12,0] = np.nan
df1.iloc[6:17,1] = 0

Run Code Online (Sandbox Code Playgroud)

什么是前进填充zeors和nans的简洁方法？我试过以下:

df1 = (df1.fillna(method='ffill', inplace=True)).replace(to_replace=0, method='ffill')

AttributeError: 'NoneType' object has no attribute 'replace'

Run Code Online (Sandbox Code Playgroud)

python fill missing-data pandas

tro*_*rob

2017 07-13

3
推荐指数

1
解决办法

624
查看次数

Pandas sumif有重复的列名

通过下面的df3列对df2列进行求和的最佳方法是什么？

df = pd.DataFrame(np.random.rand(25).reshape((5,5)),index = ['A','B','C','D','E'])
df1 = pd.DataFrame(np.random.rand(15).reshape((5,3)),index = ['A','B','C','D','E'])
df2 = pd.concat([df,df1],axis=1)
df3 =  pd.DataFrame(np.random.rand(25).reshape((5,5)),columns = np.arange(5),index = ['A','B','C','D','E'])

Run Code Online (Sandbox Code Playgroud)

答案是df3的形状.

为清晰起见编辑:

df = pd.DataFrame(np.ones(25).reshape((5,5)),index = ['A','B','C','D','E'])
df1 = pd.DataFrame(np.ones(15).reshape((5,3))*2,index = ['A','B','C','D','E'],columns = [1,3,4])
df2 = pd.concat([df,df1],axis=1)
df3 =  pd.DataFrame(np.empty((5,5)),columns = np.arange(5),index = ['A','B','C','D','E'])
print(df2)
     0    1    2    3    4    1    3    4
A  1.0  1.0  1.0  1.0  1.0  2.0  2.0  2.0
B  1.0  1.0  1.0  1.0  1.0  2.0  2.0  2.0
C  1.0  1.0  1.0  1.0  1.0  2.0  2.0 …

Run Code Online (Sandbox Code Playgroud)

python dataframe pandas

tro*_*rob

2017 04-21

2
推荐指数

1
解决办法

345
查看次数