Unable to export a pandas DataFrame to Excel / encoding

kni*_*fni 6 python xlwt pandas

I am unable to export one of my DataFrames because of an encoding problem.

sjM.dtypes

Customer Name              object
Total Sales               float64
Sales Rank                float64
Visit_Frequency           float64
Last_Sale          datetime64[ns]
dtype: object

The CSV export works fine:

path = 'c:\\test'
sjM.to_csv(path + '.csv')   # Works

But the Excel export fails:

sjM.to_excel(path + '.xls')

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "testing.py", line 338, in <module>
    sjM.to_excel(path + '.xls')
  File "c:\Anaconda\Lib\site-packages\pandas\core\frame.py", line 1197, in to_excel
    excel_writer.save()
  File "c:\Anaconda\Lib\site-packages\pandas\io\excel.py", line 595, in save
    return self.book.save(self.path)
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 662, in save
    doc.save(filename, self.get_biff_data())
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 637, in get_biff_data
    shared_str_table   = self.__sst_rec()
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 599, in __sst_rec
    return self.__sst.get_biff_record()
  File "c:\Anaconda\Lib\site-packages\xlwt\BIFFRecords.py", line 76, in get_biff_record
    self._add_to_sst(s)
  File "c:\Anaconda\Lib\site-packages\xlwt\BIFFRecords.py", line 91, in _add_to_sst
    u_str = upack2(s, self.encoding)
  File "c:\Anaconda\Lib\site-packages\xlwt\UnicodeUtils.py", line 50, in upack2
    us = unicode(s, encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x81 in position 22: ordinal not in range(128)

I know the problem comes from the 'Customer Name' column, because the export to Excel works fine once that column is dropped.

I have tried following the advice from this question (Python pandas to_excel 'utf8' codec can't decode byte), using a function to decode and re-encode the offending columns:

def changeencode(data):
    cols = data.columns
    for col in cols:
        if data[col].dtype == 'O':
            data[col] = data[col].str.decode('latin-1').str.encode('utf-8')
    return data

sJM = changeencode(sjM)

sjM['Customer Name'].str.decode('utf-8')

L2-00864                         SETIA 2
K1-00279                     BERKAT JAYA
L2-00664                        TK. ANTO
BR00035                   BRASIL JAYA,TK
RA00011               CV. RAHAYU SENTOSA

So the conversion to unicode appears to have succeeded, yet the export still fails:

sjM.to_excel(path + '.xls')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\Anaconda\Lib\site-packages\pandas\core\frame.py", line 1197, in to_excel
    excel_writer.save()
  File "c:\Anaconda\Lib\site-packages\pandas\io\excel.py", line 595, in save
    return self.book.save(self.path)
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 662, in save
    doc.save(filename, self.get_biff_data())
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 637, in get_biff_data
    shared_str_table   = self.__sst_rec()
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 599, in __sst_rec
    return self.__sst.get_biff_record()
  File "c:\Anaconda\Lib\site-packages\xlwt\BIFFRecords.py", line 76, in get_biff_record
    self._add_to_sst(s)
  File "c:\Anaconda\Lib\site-packages\xlwt\BIFFRecords.py", line 91, in _add_to_sst
    u_str = upack2(s, self.encoding)
  File "c:\Anaconda\Lib\site-packages\xlwt\UnicodeUtils.py", line 50, in upack2
    us = unicode(s, encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 22: ordinal not in range(128)

  1. Why does it fail even though the conversion to unicode appears to have succeeded?
  2. How can I work around this to export that DataFrame to Excel?

@Jeff

Thanks for pointing me in the right direction.

Steps I used:

Install xlsxwriter (it is not bundled with pandas), then export with that engine:

sjM.to_excel(path + '.xlsx', sheet_name='Sheet1', engine='xlsxwriter')

Jef*_*eff 3

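You need pandas >= 0.13 together with the xlsxwriter Excel engine, which supports writing unicode natively. xlwt, the default engine, will support passing an encoding option; that will be available in 0.14.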

See here for the engine documentation.
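
As for why the decode/re-encode step in the question did not help, here is a minimal sketch. It is written for Python 3 and uses only the standard library, so it mirrors the Python 2 behaviour rather than reproducing it. The sample name `raw` is made up; only the 0x81 byte is taken from the first traceback. It suggests that encoding back to UTF-8 hands the writer bytes again, so an implicit ASCII decode still fails, this time on the UTF-8 lead byte 0xc2.

```python
# Python 3 sketch: bytes vs. text, mirroring the two tracebacks in the question.
# The sample value is hypothetical; only the 0x81 byte comes from the traceback.
raw = b"TOKO \x81 JAYA"  # latin-1 encoded bytes containing 0x81

# Step used in the question: decode as latin-1, then encode back to UTF-8.
reencoded = raw.decode("latin-1").encode("utf-8")
print(type(reencoded).__name__)  # the result is a byte string again, not text


def ascii_error(data):
    """Return the UnicodeDecodeError raised by an ASCII decode, or None."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return exc
    return None


err_before = ascii_error(raw)
err_after = ascii_error(reencoded)

# Both byte strings fail at the same position, but on different bytes:
# 0x81 in the original, and the UTF-8 lead byte 0xc2 after re-encoding.
print(hex(raw[err_before.start]))
print(hex(reencoded[err_after.start]))

# Stopping after the decode keeps real text, which is what a
# unicode-aware writer needs.
text = raw.decode("latin-1")
print(type(text).__name__)
```

In other words, the column probably needed to stay decoded (unicode) rather than be re-encoded, and the writer has to accept unicode, which is what the xlsxwriter engine provides.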