Vik*_*tam 2 python unzip zipfile pandas
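I am trying to read the WGIData.csv file into a pandas DataFrame. WGIData.csv is inside a zip file that I download from this URL:
http://databank.worldbank.org/data/download/WGI_csv.zip
But when I try to read it, it raises the error BadZipFile: File is not a zip file
Here is my Python code: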
import pandas as pd
from urllib.request import urlopen
from zipfile import ZipFile

class Get_Data():
    def Return_csv_from_zip(self, url):
        self.zip = urlopen(url)
        self.myzip = ZipFile(self.zip)
        self.myzip = self.zip.extractall(self.myzip)
        self.file = pd.read_csv(self.myzip)
        self.zip.close()
        return self.file

url = 'http://databank.worldbank.org/data/download/WGI_csv.zip'
data = Get_Data()
df = data.Return_csv_from_zip(url)
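urlopen() returns an HTTPResponse object, which you cannot pass directly to ZipFile(): ZipFile needs a seekable file-like object, and the response stream is not seekable. You can read() the response and wrap the bytes in io.BytesIO() to do what you need: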
In []:
from io import BytesIO
z = urlopen('http://databank.worldbank.org/data/download/WGI_csv.zip')
myzip = ZipFile(BytesIO(z.read())).extract('WGIData.csv')
pd.read_csv(myzip)
Out[]:
   Country Name  Country Code                                     Indicator Name    Indicator Code  1996  \
0      Anguilla           AIA                    Control of Corruption: Estimate            CC.EST   NaN
1      Anguilla           AIA           Control of Corruption: Number of Sources         CC.NO.SRC   NaN
2      Anguilla           AIA             Control of Corruption: Percentile Rank        CC.PER.RNK   NaN
3      Anguilla           AIA  Control of Corruption: Percentile Rank, Lower ...  CC.PER.RNK.LOWER   NaN
4      Anguilla           AIA  Control of Corruption: Percentile Rank, Upper ...  CC.PER.RNK.UPPER   NaN
5      Anguilla           AIA              Control of Corruption: Standard Error        CC.STD.ERR   NaN
...
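
If you would rather not write WGIData.csv to disk at all, the same idea can be kept entirely in memory. The following is only a sketch under that assumption, not the method from the answer above: the helper names fetch_bytes and open_zip_member are hypothetical, and the whole archive is assumed to fit in memory.

```python
from io import BytesIO
from urllib.request import urlopen
from zipfile import ZipFile


def fetch_bytes(url):
    """Download the whole response body into memory (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read()


def open_zip_member(zip_bytes, member_name):
    """Return one archive member as a seekable in-memory buffer (hypothetical helper)."""
    # BytesIO gives ZipFile the seekable file-like object it needs.
    with ZipFile(BytesIO(zip_bytes)) as archive:
        # ZipFile.read() returns the decompressed bytes of the named member.
        return BytesIO(archive.read(member_name))
```

Since pd.read_csv() accepts file-like objects, something like pd.read_csv(open_zip_member(fetch_bytes(url), 'WGIData.csv')) should then load the data without creating an extracted copy of the file.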