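Does Perl's \w match all alphanumeric characters defined in the Unicode standard?
For example, will \w match all (say) Chinese and Russian alphanumeric characters?
I wrote a simple test script (see below), which suggests that \w does match the non-ASCII alphanumeric characters I tested "as expected". But the test is obviously far from exhaustive.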
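#!/usr/bin/perl
use strict;
use warnings;
use utf8;                    # this source file contains UTF-8 string literals
binmode(STDOUT, ':utf8');    # write UTF-8 to STDOUT
# Sample strings that are expected to consist solely of \w characters.
my @ok;
$ok[0] = "abcdefghijklmnopqrstuvwxyz";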
$ok[1] = "éèëáàåäö??ž?í???øáý?óæš?ô?";
$ok[2] = "??ü??âi?ó?????íá??????????";
$ok[3] = "??????????????????????????";
$ok[4] = "??????????????????????????";
$ok[5] = "?????????????????????";
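foreach my $ok (@ok) {
    # Abort if any sample contains a character that \w does not match.
    die "a sample string contains a character that \\w does not match\n"
        unless $ok =~ /^\w+$/;
}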
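A more exhaustive check could walk every code point instead of relying on hand-picked samples. The following is only a sketch under some assumptions: it takes "alphanumeric" to mean the Unicode properties Alphabetic or Decimal_Number (Nd), it needs Perl 5.14 or later for the /u modifier, and the helper name word_mismatches is made up for illustration. It compares \w against the Unicode tables bundled with the running Perl, so it says nothing about characters added in later versions of the standard.

```perl
#!/usr/bin/perl
use 5.014;       # /u regex modifier (Unicode rules for \w on every string)
use strict;
use warnings;

# Hypothetical helper: return the code points in [$from, $to] that this Perl
# classifies as Alphabetic or as a decimal digit, but that \w fails to match.
sub word_mismatches {
    my ($from, $to) = @_;
    my @missed;
    for my $cp ($from .. $to) {
        next if $cp >= 0xD800 && $cp <= 0xDFFF;    # skip surrogates
        my $ch = chr($cp);
        # Assumption: "alphanumeric" means Alphabetic or Decimal_Number.
        next unless $ch =~ /[\p{Alphabetic}\p{Nd}]/;
        push @missed, $cp unless $ch =~ /\w/u;
    }
    return @missed;
}

# Scan the whole Unicode code space and report any gaps.
my @missed = word_mismatches(0, 0x10FFFF);
printf "%d alphanumeric code points not matched by \\w\n", scalar @missed;
printf "U+%04X\n", $_ for @missed;
```

The /u modifier matters here: without it (or an equivalent such as the unicode_strings feature), characters in the range U+0080 to U+00FF held in a non-UTF-8 string may be matched under native 8-bit rules rather than Unicode rules, which would show up as false mismatches.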