How do I create a string containing invalid Unicode characters in Zsh?

$ printf %s $invalid_byte_sequence| uconv -x any-name
Conversion to Unicode from codepage failed at input byte position 0. Bytes: 80 Error: Illegal character found
Conversion to Unicode from codepage failed at input byte position 1. Bytes: 81 Error: Illegal character found
$ printf %s $other_invalid_byte_sequence| uconv -x any-name
Conversion to Unicode from codepage failed at input byte position 0. Bytes: c2 Error: Illegal character found
Conversion to Unicode from codepage failed at input byte position 1. Bytes: c2 Error: Truncated character found
$ printf %s $non_character| uconv -x any-name
\N{<noncharacter-FFFE>}
$ printf %s $not_valid_anymore| uconv -x any-name
Conversion to Unicode from codepage failed at input byte position 0. Bytes: f4 90 80 80 Error: Illegal character found
$ printf %s $utf16_surrogate | uconv -x any-name
Conversion to Unicode from codepage failed at input byte position 0. Bytes: ed a0 80 Error: Illegal character found
$ printf %s $unassigned | uconv -x any-name
\N{<unassigned-0378>}
$ printf %s $unicode_8_and_above_only | uconv -x any-name
\N{<unassigned-1F917>}
$
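
The transcript above does not show how the variables were set. As a rough sketch only, they could be rebuilt from the bytes and code points that uconv reports. The assignments below are an assumption rather than the original definitions: the UTF-8 encodings of U+FFFE, U+0378 and U+1F917 were worked out by hand, and hex is a hypothetical helper added purely for inspection.

```shell
# Assumed reconstruction; byte values follow the uconv output above.
# printf octal escapes are used so this also runs in plain sh (\200 is 0x80).
invalid_byte_sequence=$(printf '\200\201')            # 80 81: lone continuation bytes
other_invalid_byte_sequence=$(printf '\302\302')      # c2 c2: lead byte, no continuation
non_character=$(printf '\357\277\276')                # ef bf be: U+FFFE
not_valid_anymore=$(printf '\364\220\200\200')        # f4 90 80 80: would be U+110000
utf16_surrogate=$(printf '\355\240\200')              # ed a0 80: would be U+D800
unassigned=$(printf '\315\270')                       # cd b8: U+0378
unicode_8_and_above_only=$(printf '\360\237\244\227') # f0 9f a4 97: U+1F917

# Hypothetical helper: print the bytes of its argument as lowercase hex.
hex() { printf %s "$1" | od -An -tx1 | tr -d ' \n'; }

hex "$not_valid_anymore"; echo
```

In zsh itself, $'...' quoting with \x escapes (for example $'\x80\x81') should produce the same bytes without a command substitution.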

With GNU grep, you can use grep . to check whether it finds at least one character in the input:

l=(invalid_byte_sequence other_invalid_byte_sequence non_character
  not_valid_anymore utf16_surrogate unassigned unicode_8_and_above_only)
for c ($l) print -r ${(P)c} | grep -q . && print $c

For me, that gives:

non_character
not_valid_anymore
utf16_surrogate
unassigned
unicode_8_and_above_only

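In other words, my grep still treats some of these invalid, non-character, or not-yet-assigned code points as being (or containing) characters. YMMV with other grep implementations or other utilities.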
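
As an illustration of checking with a different utility, the sketch below asks iconv to convert its input from UTF-8 to UTF-8 and looks only at the exit status. It is not part of the answer above: is_valid_utf8 is a hypothetical helper name, and which sequences a given iconv accepts (non-characters, surrogates, code points above U+10FFFF) is implementation-dependent.

```shell
# Hypothetical helper: succeeds only if iconv can decode stdin as UTF-8.
# The converted output and any diagnostics are discarded; only the exit status matters.
is_valid_utf8() { iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; }

# Lone continuation bytes (80 81), like the first sequence tested with uconv.
if printf '\200\201' | is_valid_utf8; then
  echo "iconv accepted the input"
else
  echo "iconv rejected the input"
fi
```

The same function can be fed each of the test strings in turn, for example with printf %s "$utf16_surrogate" | is_valid_utf8, to compare its verdicts with those of uconv and grep.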
