ziy*_*ang 2 python for-loop exception
I am enumerating the characters of a large character set like this (GB2312 as an example; the real one is much larger):
def get_gb2312_characters():
    higher_range = range(0xb0, 0xf7 + 1)
    lower_range = range(0xa1, 0xfe + 1)
    # see http://en.wikipedia.org/wiki/GB_2312#Encodings_of_GB2312
    for higher in higher_range:
        for lower in lower_range:
            encoding = (higher << 8) | lower
            yield encoding.to_bytes(2, byteorder='big').decode(encoding='gb2312')

for c in get_gb2312_characters():
    print(c)
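This does not work because the code page has some "gaps" (or "garbage" byte combinations). When the program tries to fetch a character from the generator in the final for line, it raises a UnicodeDecodeError. The problem is that I cannot wrap the for loop in try...except: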
try:
    for c in gb2312:
        print(c)
except UnicodeDecodeError:
    pass
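With this version the loop terminates as soon as an exception is raised. Putting the try...except inside the for loop does not help either: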
for c in gb2312:
    try:
        print(c)
    except UnicodeDecodeError:
        pass
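because the exception is not raised inside the loop body; it is raised while the generator is producing the next value. Is there a way to work around this? Thanks.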
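A small sketch of why neither placement works, assuming a hypothetical helper `decode_pairs` and a hand-picked list of byte pairs (neither is part of the original code). Once an exception propagates out of a generator, the generator is finished, so the values after the bad one are never produced:

```python
def decode_pairs(pairs):
    # Hypothetical helper: decode each 2-byte sequence as GB2312.
    for raw in pairs:
        yield raw.decode('gb2312')

# b'\xff\xff' is not a valid GB2312 sequence, so decoding it raises.
gen = decode_pairs([b'\xb0\xa1', b'\xff\xff', b'\xb0\xa2'])

seen = []
try:
    for c in gen:
        seen.append(c)
except UnicodeDecodeError:
    pass

# The exception escaped the generator and finished it, so the third
# pair is never decoded even though it is valid.
leftover = list(gen)
print(seen, leftover)
```

The try...except here only decides where the error is caught on the caller's side; it cannot make the generator resume.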
Try it with the try...except inside the function, within the for loop:
for higher in higher_range:
    for lower in lower_range:
        encoding = (higher << 8) | lower
        try:
            yield encoding.to_bytes(2, byteorder='big').decode(encoding='gb2312')
        except UnicodeDecodeError:
            pass
Values that fail to decode are silently ignored, and the generator yields only the valid ones.
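For reference, here is the fragment above placed back into the full function from the question, as a self-contained sketch. The final `list(...)` call and the `chars` name are only for illustration. Because the exception is caught inside the generator, it never propagates out, and iteration continues with the next byte pair:

```python
def get_gb2312_characters():
    higher_range = range(0xb0, 0xf7 + 1)
    lower_range = range(0xa1, 0xfe + 1)
    for higher in higher_range:
        for lower in lower_range:
            encoding = (higher << 8) | lower
            try:
                yield encoding.to_bytes(2, byteorder='big').decode(encoding='gb2312')
            except UnicodeDecodeError:
                # Unassigned byte pair ("gap" in the code page): skip it.
                pass

# Illustration only: collect everything the generator yields.
chars = list(get_gb2312_characters())
print(len(chars), chars[0])
```

The caller's loop needs no exception handling at all with this version, since only successfully decoded characters reach it.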