python pandas read_excel: is a sep parameter available?

Question

I am trying to read an .xlsx file into a pandas DataFrame using pd.read_excel("C:/...").

The problem is that I get only one column, which contains all of the data separated by ",".

|---| "Country","Year","Export" |  
|---|---------------------------|  
| 0 | Canada,2017,3002          |  
| 1 | Bulgaria,2016,3960        |  
| 2 | Germany,2015,3818         |

But this is not the format I want... I would like to get three columns, as in the table below.

|---| "Country"    | "Year"   | "Export"   |  
|---|--------------|----------| -----------|  
|1  | Canada       | 2017     |       3002 |  
|2  | Bulgaria     | 2016     |       3960 |  
|3  | Germany      | 2015     |       3818 |

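So I am looking for something like the sep=',' or delimiter=',' parameter that pd.read_csv has. I have gone through the pandas.read_excel documentation but have not found a parameter that handles this...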

Thanks!

Answer 1

ALo*_*llz 3

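One option is to save the .xlsx as a csv file. If you open it in a text editor, you should see that the troublesome column is saved inside quotes, with its values separated by commas, for example: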

"Country,Year,Export",...  
"Canada,2017,3002",...
"Bulgaria,2016,3960",...        
"Germany,2015,3818",...

You can then read this file with pd.read_csv(), which will create a single column named 'Country,Year,Export' that looks like

  Country,Year,Export
0    Canada,2017,3002
1  Bulgaria,2016,3960
2   Germany,2015,3818

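To make the read_csv step concrete, here is a minimal sketch that feeds the quoted lines to pd.read_csv() through an in-memory buffer instead of a file on disk. The sample text is an assumption: it mirrors the rows above but leaves out the elided trailing fields, and the name csv_text is purely illustrative.

```python
import io

import pandas as pd

# Assumed sample of what the saved csv looks like in a text editor:
# each row is one quoted field that itself contains commas.
csv_text = (
    '"Country,Year,Export"\n'
    '"Canada,2017,3002"\n'
    '"Bulgaria,2016,3960"\n'
    '"Germany,2015,3818"\n'
)

# Because the commas sit inside the quotes, read_csv treats each row
# as a single field, so the result has exactly one column.
df = pd.read_csv(io.StringIO(csv_text))

print(df.columns.tolist())  # ['Country,Year,Export']
print(df.shape)             # (3, 1)
```

This is why passing a different separator does not help here: the quoting, not the separator, keeps the values together.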

You can then split it into separate columns with str.split()

df[['Country', 'Year', 'Export']] = pd.DataFrame(df['Country,Year,Export'].str.split(',').tolist())

  Country,Year,Export   Country  Year Export
0    Canada,2017,3002    Canada  2017   3002
1  Bulgaria,2016,3960  Bulgaria  2016   3960
2   Germany,2015,3818   Germany  2015   3818

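A possible variation on the split step, sketched here rather than taken from the answer: Series.str.split() accepts expand=True, which returns the pieces as a DataFrame directly, so the intermediate list is not needed. The sketch below also derives the new column names from the combined header and converts the numeric columns, since splitting leaves every value as a string. The in-memory frame is an assumed stand-in for the single-column result above, and the names combined and result are illustrative.

```python
import pandas as pd

# Assumed stand-in for the single-column frame produced earlier.
df = pd.DataFrame(
    {"Country,Year,Export": [
        "Canada,2017,3002",
        "Bulgaria,2016,3960",
        "Germany,2015,3818",
    ]}
)

combined = df.columns[0]  # 'Country,Year,Export'

# expand=True returns one column per piece instead of a list per row.
result = df[combined].str.split(",", expand=True)

# Reuse the combined header for the names; strip stray quotes in case
# the header cell was stored with them.
result.columns = [name.strip('"') for name in combined.split(",")]

# The split values are strings, so convert the numeric columns explicitly.
result = result.astype({"Year": int, "Export": int})

print(result)
```

Unlike the assignment in the answer, this builds a new frame that does not carry the original combined column along.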
