Pandas: xlrd engine is passed, but a ValueError is still raised

Dar*_*dav 3 python excel xlrd python-3.x pandas

I am trying to read an xls file from a URL:

Using requests:

import pandas as pd
import requests

page = requests.get(url)  # url points to an .xls file
df = pd.read_excel(page.content, engine='xlrd')  # engine is passed explicitly



File "f:\python36\lib\site-packages\pandas\util\_decorators.py", line 118, in wrapper
    return func(*args, **kwargs)
  File "f:\python36\lib\site-packages\pandas\io\excel.py", line 230, in read_excel
    io = ExcelFile(io, engine=engine)
  File "f:\python36\lib\site-packages\pandas\io\excel.py", line 296, in __init__
    raise ValueError('Must explicitly set engine if not passing in'
ValueError: Must explicitly set engine if not passing in buffer or path for io.

# If I pass a random engine name, it raises an unsupported-engine error, but with 'xlrd' it raises "Must explicitly set engine".

I tried saving the file and then reading it:

with open('file.xls','wb') as f:
    f.write(page.content)

df = pd.read_excel('file.xls', engine='xlrd')  # this works

Edit:

I also tried passing page.text, which raises:

ValueError: embedded null character

unu*_*tbu 5

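If the first argument to pd.read_excel is a str, it is interpreted as the path to a file (or a URL). To pass the contents of the file directly to read_excel, wrap the contents in a BytesIO, which makes them a file-like object.

So, use: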

from io import BytesIO  # standard-library class; no need to reach into pandas internals

df = pd.read_excel(BytesIO(page.content), engine='xlrd')
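Putting the pieces together, here is a minimal sketch of the download-and-parse flow. The helper names to_buffer and read_xls_from_url, the timeout value, and the raise_for_status() check are my own additions for illustration and are not part of the question; the sketch also assumes requests, pandas, and xlrd are installed.

```python
from io import BytesIO


def to_buffer(content):
    """Wrap raw bytes (e.g. page.content) in a seekable, file-like object.

    Hypothetical helper: it rejects str on purpose, because read_excel
    treats a str as a path or URL rather than as file contents.
    """
    if not isinstance(content, (bytes, bytearray)):
        raise TypeError("expected bytes (page.content), not str (page.text)")
    return BytesIO(content)


def read_xls_from_url(url):
    """Download an .xls file and parse it in memory (hypothetical helper)."""
    # Imported here so that to_buffer can be used without these packages.
    import pandas as pd
    import requests

    page = requests.get(url, timeout=30)  # timeout is an assumed value
    page.raise_for_status()               # fail early on HTTP errors
    return pd.read_excel(to_buffer(page.content), engine="xlrd")
```

Calling read_xls_from_url(url) with the question's url should then return the DataFrame without writing a temporary file to disk. The type check in to_buffer is only a guard against the page.text mistake described in the edit above.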