Related troubleshooting solutions (0)

How do I determine the encoding of text?

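I received some encoded text, but I don't know which character set was used. Is there a way to determine the encoding of a text file using Python? The related question "How can I detect the encoding/codepage of a text file" deals with C#.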
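For context, a minimal sketch of one heuristic approach, using only the standard library. It checks for a byte order mark and then tries a list of candidate encodings in order. The function name `guess_encoding` and the candidate list are assumptions made for illustration, not something from the question. An encoding cannot be determined with certainty from bytes alone, so the result is only a guess; third-party detectors such as chardet apply statistical models and may do better.

```python
import codecs

# Hypothetical candidate list. Order matters: latin-1 accepts every byte
# sequence, so it must come last and acts only as a fallback.
CANDIDATES = ("utf-8", "cp1252", "latin-1")

# UTF-32 marks are checked before UTF-16 because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def guess_encoding(data, candidates=CANDIDATES):
    """Return the first encoding that decodes `data` cleanly, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None
```

To use it on a file, read the file in binary mode (`open(path, 'rb').read()`) and pass the bytes in. A successful decode only shows that the bytes are valid in that encoding, not that the encoding is the one the author used.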

python encoding text-files

Nop*_*ope

2019-02-12

204
votes

7
answers

200k
views

UnicodeDecodeError: 'utf-8' codec can't decode byte

Here is my code:

for line in open('u.item'):
    # read each line
    pass

Whenever I run this code, it gives the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 2892: invalid continuation byte

I tried to solve this by passing an extra argument to open(), so the code looks like this:

for line in open('u.item', encoding='utf-8'):
    # read each line
    pass

But it gives the same error again. What should I do? Please help.
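A note on why the second attempt changes nothing: the error message shows that the file was already being decoded as UTF-8, so passing `encoding='utf-8'` explicitly selects the same codec. Byte 0xe9 is "é" in Latin-1 and cp1252, which suggests (but does not prove) that the file uses a single-byte Western encoding. The sketch below assumes Latin-1. The helper name `read_lines` and the temporary stand-in file are hypothetical and only there to make the example self-contained.

```python
import os
import tempfile


def read_lines(path, encoding="latin-1"):
    """Yield lines from `path`, decoded with `encoding`.

    Bytes that are invalid in `encoding` become U+FFFD instead of raising.
    """
    with open(path, encoding=encoding, errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")


# Temporary stand-in for 'u.item' (assumed content) containing byte 0xe9.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"1|Caf\xe9 (1995)\n")
demo_path = tmp.name

latin1_lines = list(read_lines(demo_path))         # 0xe9 is decoded as 'é'
utf8_lines = list(read_lines(demo_path, "utf-8"))  # 0xe9 is replaced by U+FFFD
os.remove(demo_path)
```

Choosing the right encoding preserves the text. `errors="replace"` alone only hides the problem by substituting a placeholder character, so it is best kept as a safety net.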

python character-encoding python-3.x

Suj*_*itS

2019-07-03

179
votes

11
answers

420k
views

Python: how to check whether a file name is UTF8?

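I have a PHP script that builds a list of the files in a directory. However, PHP only sees file names in English and completely ignores file names in other languages, such as Russian or Asian languages.

After a lot of effort, I found the only solution that works for me: use a Python script to rename the files to UTF8 so that the PHP script can process them afterwards.

(After PHP has processed the files, I rename them to English; I do not keep them in UTF8.)

I used the following Python script, which works fine: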

import sys
import os
import glob
import ntpath
from random import randint

for infile in glob.glob(os.path.join('C:\\MyFiles', u'*')):
    if os.path.isfile(infile):
        infile_utf8 = infile.encode('utf8')
        os.rename(infile, infile_utf8)

The problem is that it also converts file names that are already in UTF8. I need a way to skip the conversion when the file name is already UTF8.

I am trying this Python script:

for infile in glob.glob(os.path.join('C:\\MyFiles', u'*')):
    if os.path.isfile(infile):
        try:
            infile.decode('UTF-8', 'strict')
        except UnicodeDecodeError:
            infile_utf8 = infile.encode('utf8')
            os.rename(infile, infile_utf8)

However, if the file name is already in UTF8, I get a fatal error:

UnicodeDecodeError: 'ascii' codec can't …
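As an illustration of the check being asked about, here is a minimal Python 3 sketch that operates on raw bytes. It assumes the file names are obtained as `bytes`, for example by passing a bytes path to `os.listdir`. The helper names `is_valid_utf8` and `non_utf8_names` are hypothetical. The scripts above are Python 2, where byte strings and text mix implicitly, so this is not a drop-in replacement for them.

```python
import os


def is_valid_utf8(raw):
    """Return True if the bytes in `raw` decode as strict UTF-8."""
    try:
        raw.decode("utf-8", "strict")
    except UnicodeDecodeError:
        return False
    return True


def non_utf8_names(directory):
    """List entries of `directory` whose raw byte names are not valid UTF-8."""
    raw_dir = os.fsencode(directory)
    # os.listdir returns bytes names when it is given a bytes path.
    return [name for name in os.listdir(raw_dir) if not is_valid_utf8(name)]
```

Pure ASCII names always pass this check, and some byte sequences from other encodings can also happen to be valid UTF-8, so the test is a heuristic rather than proof of the original encoding.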

python windows unicode filenames utf-8

Phy*_*ser

2013-10-03

5
votes

1
answer

10k
views

Tag statistics

python ×3

character-encoding ×1

encoding ×1

filenames ×1

python-3.x ×1

text-files ×1

unicode ×1

utf-8 ×1

windows ×1
