UnicodeDecodeError in Python 3 when importing a CSV file

Rya*_*ini 15 python csv unicode non-ascii-characters python-3.x

I'm trying to import a CSV file with the following code:

    import csv
    import sys

    def load_csv(filename):
        # Open file for reading
        file = open(filename, 'r')

        # Read in file
        return csv.reader(file, delimiter=',', quotechar='\n')

    def main(argv):
        csv_file = load_csv("myfile.csv")

        for item in csv_file:
            print(item)

    if __name__ == "__main__":
        main(sys.argv[1:])

Here is a sample of my CSV file:

    foo,bar,test,1,2
    this,wont,work,because,α

And the error:

    Traceback (most recent call last):
      File "test.py", line 22, in <module>
        main(sys.argv[1:])
      File "test.py", line 18, in main
        for item in csv_file:
      File "/usr/lib/python3.2/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 40: ordinal not in range(128)

Apparently it hits the non-ASCII character at the end of the CSV and throws this error, but I'm at a loss as to how to fix it. Any help?

This is on:

    Python 3.2.3 (default, Apr 23 2012, 23:35:30)
    [GCC 4.7.0 20120414 (prerelease)] on linux2

jfs*_*jfs 15

It seems your problem boils down to:

print("α")

You can fix it by setting the PYTHONIOENCODING environment variable:

$ PYTHONIOENCODING=utf-8 python3 test.py > output.txt

Note:

$ python3 test.py 

should work as-is if your terminal configuration supports it, where test.py is:

import csv

with open('myfile.csv', newline='', encoding='utf-8') as file:
    for row in csv.reader(file):
        print(row)

If open() is called without the encoding argument shown above, you'll get a UnicodeDecodeError with LC_ALL=C.

Also, with LC_ALL=C you'll get a UnicodeEncodeError even without the redirection, i.e., PYTHONIOENCODING is necessary in that case.
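
The two failure modes above (decoding on read, encoding on print) can be reproduced in memory without touching the locale. This is only a sketch: it assumes the offending field is the Greek letter alpha (U+03B1), which is consistent with byte 0xce at position 40 but is not confirmed by the question, and the helper names `read_rows`, `describe_read` and `describe_write` are made up for illustration.

```python
import csv
import io

# Assumption: the last field is the Greek letter alpha (U+03B1). UTF-8 encodes
# it as the two bytes 0xCE 0xB1, which puts 0xCE at byte offset 40.
RAW = "foo,bar,test,1,2\nthis,wont,work,because,\u03b1\n".encode("utf-8")


def read_rows(raw, encoding):
    """Decode raw CSV bytes with the given encoding, then parse them."""
    text = raw.decode(encoding)
    return list(csv.reader(io.StringIO(text, newline="")))


def describe_read(raw, encoding):
    """Return the parsed rows, or a short message if decoding fails."""
    try:
        return read_rows(raw, encoding)
    except UnicodeDecodeError as exc:
        return "UnicodeDecodeError at byte offset {}".format(exc.start)


def describe_write(text, encoding):
    """Encode text the way a stream with that encoding would have to."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return "UnicodeEncodeError at character {}".format(exc.start)


# Reading: UTF-8 succeeds, ASCII (the encoding the C locale selects on
# older Python 3 releases) fails.
utf8_rows = describe_read(RAW, "utf-8")
ascii_read = describe_read(RAW, "ascii")

# Writing: the same split applies on the output side.
utf8_write = describe_write("\u03b1", "utf-8")
ascii_write = describe_write("\u03b1", "ascii")

# ascii() keeps this output printable even on an ASCII-only stdout.
for result in (utf8_rows, ascii_read, utf8_write, ascii_write):
    print(ascii(result))
```

The read side corresponds to the `encoding` argument of open(); the write side corresponds to the encoding of sys.stdout, which is what PYTHONIOENCODING overrides.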


The*_*ude 13

According to the Python docs, you have to set the file's encoding. Here is an example from the docs:

import csv

with open('some.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

Edit: Your problem seems to be with printing. Try the pretty printer:

import csv
import pprint

with open('some.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        pprint.pprint(row)

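  • Setting the file's encoding did not solve the problem... `file = open(filename, 'r', encoding='utf-8')` still gives me `UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 40: ordinal not in range(128)` (3 upvotes)
  • `export PYTHONIOENCODING=utf-8` solved my problem. (2 upvotes)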
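
When it is unclear whether the read side or the print side is at fault, it may help to look at which encodings Python has actually picked. The following is a rough diagnostic sketch, not part of any answer above; `encoding_report` is a hypothetical helper name, and it relies on the assumption that open() without an encoding argument falls back to the locale's preferred encoding.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the encodings Python will use for files and for stdout."""
    return {
        # Fallback for open() when no encoding argument is given.
        "open() default": locale.getpreferredencoding(False),
        # Encoding that print() uses; may be None for replaced streams.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Environment override, or None if it is not set.
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
    }


report = encoding_report()
for name, value in report.items():
    print("{}: {}".format(name, value))
```

If "open() default" reports an ASCII variant, pass `encoding='utf-8'` to open(); if "stdout" does, set PYTHONIOENCODING or fix the terminal locale.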

小智 6

Another option is to mask the error by passing an error handler:

with open('some.csv', newline='', errors='replace') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

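This replaces any bytes in the file that cannot be decoded with the Unicode replacement character (U+FFFD), so the original data in those positions is lost.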
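
To see what the handler does, here is a small in-memory sketch. It assumes the undecodable bytes are the UTF-8 encoding of the Greek letter alpha (0xCE 0xB1), which matches the traceback but is otherwise a guess, and it forces the ASCII codec explicitly instead of relying on the locale.

```python
# Assumption: the undecodable bytes are the UTF-8 encoding of U+03B1.
RAW = b"this,wont,work,because,\xce\xb1"

# errors="replace" swaps each undecodable byte for U+FFFD instead of raising.
lossy = RAW.decode("ascii", errors="replace")

# Decoding with the right encoding keeps the original character.
exact = RAW.decode("utf-8")

# ascii() keeps the output printable on an ASCII-only stdout.
print(ascii(lossy[-2:]))
print(ascii(exact[-1]))
```

The lossy result has one replacement character per bad byte, so this is only appropriate when losing those characters is acceptable.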