Why can't mbstring detect latin1 characters as cp1252?

ASCII characters are detected as valid latin1, but not as cp1252.

    mb_detect_encoding("a", ["ISO-8859-1"], true);   // "ISO-8859-1"
    mb_detect_encoding("a", ["Windows-1252"], true); // false

The additional characters in the 0x80-0x9F range are detected as either encoding:

    mb_detect_encoding("\x80", ["ISO-8859-1"], true);   // "ISO-8859-1"
    mb_detect_encoding("\x80", ["Windows-1252"], true); // "Windows-1252"

But the common extended characters are not. Take the character é at 0xE9. PHP detects it as ISO-8859-1, but not as the superset Windows-1252:

    mb_detect_encoding("\xE9", ["ISO-8859-1"], true);   // "ISO-8859-1"
    mb_detect_encoding("\xE9", ["Windows-1252"], true); // false

Converting the additional characters to UTF-8 requires the Windows character set, and that conversion works as expected:

    mb_convert_encoding("a\xE9\x80", "UTF-8", "Windows-1252"); // aé€
    mb_convert_encoding("a\xE9\x80", "UTF-8", "ISO-8859-1");   // aé<control>

I can live with this limitation (detect as ISO, convert as Windows), but I'd like to know why it happens.
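The detect-as-ISO, convert-as-Windows workaround can be sketched roughly as follows. This is a minimal sketch rather than a recommended fix: the helper name `latin1BytesToUtf8` is made up for illustration, and it assumes the input really is single-byte Western text.

```php
<?php
// Hypothetical helper (not part of mbstring): check the bytes against
// ISO-8859-1, then convert them as Windows-1252 so that bytes in the
// 0x80-0x9F range map to printable characters such as the euro sign.
function latin1BytesToUtf8(string $bytes): ?string
{
    // Strict detection against a single candidate encoding.
    $detected = mb_detect_encoding($bytes, ['ISO-8859-1'], true);
    if ($detected === false) {
        return null; // not recognised as ISO-8859-1
    }

    // Convert using the superset, regardless of the detected name.
    return mb_convert_encoding($bytes, 'UTF-8', 'Windows-1252');
}

// Print the UTF-8 result for the sample bytes as hex, for inspection.
echo bin2hex(latin1BytesToUtf8("a\xE9\x80")), "\n";
```

The detection step is kept only to mirror the two-step approach described here; the conversion itself never uses the detected name.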
