Sheets of an Excel workbook from a URL into a `pandas.DataFrame`

ben*_*oss 11 python url xlrd pandas

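After looking at different ways to read a URL link pointing to an .xls file, I decided to go with xlrd.

I'm having a hard time converting the 'xlrd.book.Book' type to a 'pandas.DataFrame'.

I have the following: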

import pandas
import xlrd 
import urllib2

link ='http://www.econ.yale.edu/~shiller/data/chapt26.xls'
socket = urllib2.urlopen(link)

#this line gets me the excel workbook 
xlfile = xlrd.open_workbook(file_contents = socket.read())

#storing the sheets
sheets = xlfile.sheets()

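I want to take the last of the sheets and import it as a pandas.DataFrame. Any ideas on how I can accomplish this? I've tried pandas.ExcelFile.parse(), but it wants a path to an Excel file. I could of course save the file to memory and then parse it (using tempfile or something), but I'm trying to follow pythonic guidelines and use functionality that is probably already written into pandas.

As always, any guidance is greatly appreciated.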

DSM*_*DSM 24

You can pass your socket to ExcelFile:

>>> import pandas as pd
>>> import urllib2
>>> link = 'http://www.econ.yale.edu/~shiller/data/chapt26.xls'
>>> socket = urllib2.urlopen(link)
>>> xd = pd.ExcelFile(socket)
NOTE *** Ignoring non-worksheet data named u'PDVPlot' (type 0x02 = Chart)
NOTE *** Ignoring non-worksheet data named u'ConsumptionPlot' (type 0x02 = Chart)
>>> xd.sheet_names
[u'Data', u'Consumption', u'Calculations']
>>> df = xd.parse(xd.sheet_names[-1], header=None)
>>> df
                                   0   1   2   3         4
0        Average Real Interest Rate: NaN NaN NaN  1.028826
1    Geometric Average Stock Return: NaN NaN NaN  0.065533
2              exp(geo. Avg. return) NaN NaN NaN  0.067728
3  Geometric Average Dividend Growth NaN NaN NaN  0.012025
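On Python 3, where urllib2 has been replaced by urllib.request, the same idea might look roughly like the sketch below. It downloads the workbook, wraps the bytes in an in-memory buffer, and parses the last sheet. The helper names last_sheet_from_bytes and last_sheet_from_url are made up for illustration, and the sketch assumes pandas has a reader engine installed for the file type (xlrd for .xls).

```python
import io
import urllib.request

import pandas as pd


def last_sheet_from_bytes(raw: bytes, **parse_kwargs) -> pd.DataFrame:
    """Parse the last worksheet of an Excel workbook held in memory.

    Hypothetical helper; extra keyword arguments go to ExcelFile.parse().
    """
    # Wrapping the bytes in BytesIO gives pandas a seekable, file-like object.
    with pd.ExcelFile(io.BytesIO(raw)) as workbook:
        # sheet_names lists worksheets in workbook order, so [-1] is the last one.
        return workbook.parse(workbook.sheet_names[-1], **parse_kwargs)


def last_sheet_from_url(url: str, **parse_kwargs) -> pd.DataFrame:
    """Download a workbook and return its last worksheet (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        return last_sheet_from_bytes(response.read(), **parse_kwargs)
```

With the link from the question, this would be called as last_sheet_from_url(link, header=None). Nothing touches the disk, so no tempfile is needed.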


小智 6

You can pass the URL directly to pandas.read_excel():

import pandas as pd

link ='http://www.econ.yale.edu/~shiller/data/chapt26.xls'
# Replace 'sheetname' with the name of the sheet to load, e.g. 'Calculations'
data = pd.read_excel(link, 'sheetname')
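If the sheet name is not known in advance, one possible variation is to ask read_excel for every sheet and keep the last one. This is only a sketch: read_last_sheet is a hypothetical helper name, and it assumes a pandas version where the keyword is spelled sheet_name.

```python
import pandas as pd


def read_last_sheet(source, **read_kwargs) -> pd.DataFrame:
    """Return the last worksheet of `source` (URL, path, or file-like object).

    Hypothetical helper; extra keyword arguments go to pandas.read_excel().
    """
    # sheet_name=None makes read_excel return a dict of
    # {sheet name: DataFrame} covering every worksheet, in workbook order.
    frames = pd.read_excel(source, sheet_name=None, **read_kwargs)
    last_name = list(frames)[-1]
    return frames[last_name]
```

For the workbook in the question this would be read_last_sheet(link, header=None). Note that it parses all sheets before discarding the others, which is fine for small files but wasteful for large ones.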