UnicodeDecodeError: unexpected end of data

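I have a huge text file that I want to open.
I am reading the file in chunks to avoid the memory problems that come with reading too much of it at once.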

Code snippet:

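import re

def open_delimited(fileName, args):
    # args is the file mode passed straight to open(), e.g. "r"
    with open(fileName, args, encoding="UTF16") as infile:
        chunksize = 10000
        remainder = ''
        # read() returns '' at end of file, which stops the iterator
        for chunk in iter(lambda: infile.read(chunksize), ''):
            pieces = re.findall(r"(\d+)\s+(\d+_\d+)", remainder + chunk)
            for piece in pieces[:-1]:
                yield piece
            # carry the last match over, since it may be cut off at the chunk boundary
            remainder = '{} {} '.format(*pieces[-1])
        if remainder:
            yield remainder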

The code raises the error UnicodeDecodeError: 'utf16' codec can't decode bytes in position 8190-8191: unexpected end of data.
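For context, CPython's UTF-16 decoder reports this reason when the input ends right after the first half of a surrogate pair. The sketch below reproduces the message on made-up sample data (not my actual file) and shows that an incremental decoder, which holds back the dangling half until more bytes arrive, does not fail on the same split. The names `text`, `cut` and `decode_reason` are only for illustration.

```python
import codecs

# Made-up sample: U+1F600 lies outside the BMP, so UTF-16 stores it
# as a surrogate pair (two 2-byte code units).
text = "a\U0001F600"
raw = text.encode("utf-16-le")   # 2 bytes for "a" + 4 bytes for the pair
cut = raw[:-2]                   # data now ends right after the high surrogate


def decode_reason(data):
    """Return the decoded text, or the decoder's error reason on failure."""
    try:
        return data.decode("utf-16-le")
    except UnicodeDecodeError as exc:
        return exc.reason


# Decoding the cut buffer in one shot treats it as the complete input.
one_shot = decode_reason(cut)
print(one_shot)

# An incremental decoder buffers the dangling half until the rest arrives.
decoder = codecs.getincrementaldecoder("utf-16-le")()
streamed = decoder.decode(cut) + decoder.decode(raw[-2:], final=True)
print(streamed == text)
```

This only illustrates one way the message can arise; it does not establish what is wrong with the file in question.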

I tried UTF8 and got the error UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 0: invalid start byte.
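A byte 0xff at position 0 is consistent with a little-endian byte order mark (FF FE), which a UTF-8 decoder rejects. As a rough check, the first few bytes of a file can be compared against the BOM constants in the standard `codecs` module. This is a minimal sketch: `sniff_bom` is a hypothetical helper, the sample bytes are made up, and a file without a BOM will simply return `None`.

```python
import codecs


def sniff_bom(first_bytes):
    """Guess a codec name from a leading byte order mark, or return None."""
    # Check the 4-byte UTF-32 marks first: the UTF-32 LE mark (FF FE 00 00)
    # begins with the UTF-16 LE mark (FF FE).
    candidates = [
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in candidates:
        if first_bytes.startswith(bom):
            return name
    return None


# Made-up sample: the "utf-16" codec writes a BOM before the text.
sample = "12 34_56".encode("utf-16")
guess = sniff_bom(sample[:4])
print(guess)
```

For a real file, the same function could be applied to the result of `open(path, "rb").read(4)`.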

latin-1 iso-8859-1 …

unicode python-3.x

Votes: 9 | Answers: 1 | Views: 20k
