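I have a huge text file that I want to open.
I am reading the file in chunks to avoid the memory problems that come with reading too much of the file at once.
Code snippet: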
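import re

def open_delimited(fileName, args):
    # args is the mode string passed to open(), e.g. "r"
    with open(fileName, args, encoding="UTF16") as infile:
        chunksize = 10000
        remainder = ''
        # Read fixed-size chunks until read() returns '' at end of file
        for chunk in iter(lambda: infile.read(chunksize), ''):
            pieces = re.findall(r"(\d+)\s+(\d+_\d+)", remainder + chunk)
            for piece in pieces[:-1]:
                yield piece
            # Guard against a chunk with no match, where pieces[-1] would raise IndexError
            if pieces:
                # The last match may be cut off at the chunk boundary, so re-parse it with the next chunk
                remainder = '{} {} '.format(*pieces[-1])
        if remainder:
            yield remainder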
The code raises the error UnicodeDecodeError: 'utf16' codec can't decode bytes in position 8190-8191: unexpected end of data.
I tried UTF8 and got the error UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 0: invalid start byte.
latin-1 and iso-8859-1 …
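
One way to narrow this down might be to look at the raw bytes before picking a codec. A byte 0xff at position 0 is consistent with a UTF-16 little-endian byte-order mark (FF FE), although that alone does not explain the "unexpected end of data" error. The sketch below is only an illustration under that assumption: guess_encoding_from_bom and sniff_file are hypothetical helper names, not part of the code above, and a missing BOM proves nothing about the real encoding.

```python
import codecs

# Byte-order marks to check. UTF-32 comes first because the UTF-32 LE BOM
# starts with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding_from_bom(head):
    """Return a codec name suggested by a leading BOM in `head` (bytes), or None."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None


def sniff_file(path, size=16):
    """Hypothetical helper: read the first `size` bytes and guess from the BOM."""
    with open(path, "rb") as f:  # binary mode, so no decoding happens here
        head = f.read(size)
    return head, guess_encoding_from_bom(head)
```

Calling sniff_file on the problem file and printing the returned bytes would show whether the file really starts with a BOM, and which one, before any encoding is passed to open().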