How to split Chinese characters in PHP?

Som*_*ent 2 php split character cjk

I need some help splitting Chinese characters mixed with English words and numbers in PHP.

For example, if I read

FrontPage 2000???????

I would like to get

FrontPage, 2000, ?,?,?,?,?,?,?

or

FrontPage, 2,0,0,0, ?,?,?,?,?,?,?

How can I do this?

Thanks in advance :)

nop*_*ole 11

Assuming you are working with UTF-8 (or can convert your input to UTF-8 with iconv or a similar tool), use the u pattern modifier (doc: http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php):

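<?php
$s = "FrontPage 2000???????";
// With the u modifier, "." matches one UTF-8 character rather than one byte.
// preg_match_all() returns the number of matches and fills $matches.
print_r(preg_match_all('/./u', $s, $matches));
echo "\n";
print_r($matches);
?>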

gives:

21
Array
(
    [0] => Array
        (
            [0] => F
            [1] => r
            [2] => o
            [3] => n
            [4] => t
            [5] => P
            [6] => a
            [7] => g
            [8] => e
            [9] =>  
            [10] => 2
            [11] => 0
            [12] => 0
            [13] => 0
            [14] => ?
            [15] => ?
            [16] => ?
            [17] => ?
            [18] => ?
            [19] => ?
            [20] => ?
        )

)

Note that my source file is also saved as UTF-8, and $s contains those characters.
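If the input is not already UTF-8, the conversion step mentioned at the top of this answer could look roughly like this. It is only a sketch: it assumes the text arrives as GBK (a guess for illustration, since the question does not say which encoding is involved), that the iconv extension is available, and the byte string is a made-up sample.

```php
<?php
// Hypothetical input: the two characters "中文" encoded as GBK bytes.
$raw = "\xD6\xD0\xCE\xC4";

// With the u modifier, preg_match() returns false when the subject is not
// valid UTF-8, so an empty pattern works as a cheap validity check.
$isUtf8 = preg_match('//u', $raw) === 1;

// Convert only when needed; 'GBK' is an assumption about the source encoding.
$utf8 = $isUtf8 ? $raw : iconv('GBK', 'UTF-8', $raw);

// After conversion, the per-character split behaves as described above.
preg_match_all('/./u', $utf8, $matches);
print_r($matches[0]);
```

If the real source encoding is something else, only the first argument to iconv() should need to change.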

The following matches each alphanumeric run as a single group:

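<?php
$s = "FrontPage 2000???????";
// ASCII letters and digits are matched as runs; anything else, one character at a time.
// The class is spelled out instead of \w because, under the u modifier, \w may also
// match CJK characters on PHP builds that enable PCRE's Unicode properties.
print_r(preg_match_all('/([A-Za-z0-9]+)|(.)/u', $s, $matches));
echo "\n";
print_r($matches[0]);
?>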

Result:

10
Array
(
    [0] => FrontPage
    [1] =>  
    [2] => 2000
    [3] => ?
    [4] => ?
    [5] => ?
    [6] => ?
    [7] => ?
    [8] => ?
    [9] => ?
)
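To get closer to the two outputs asked for in the question (no separate entry for the space, and optionally one entry per digit), the pattern can be adjusted. A rough sketch, assuming UTF-8 input; the sample string and variable names are made up for illustration, and the CJK characters are a stand-in rather than the asker's text:

```php
<?php
// Hypothetical sample input, saved in a UTF-8 source file.
$s = "FrontPage 2000中文教程";

// Variant 1: ASCII letter/digit runs stay whole; every other
// non-whitespace character becomes its own entry. \S skips the space.
preg_match_all('/[A-Za-z0-9]+|\S/u', $s, $m);
$grouped = $m[0];

// Variant 2: letter runs stay whole, but digits fall through to \S
// and are therefore split one by one.
preg_match_all('/[A-Za-z]+|\S/u', $s, $m);
$digitsSplit = $m[0];

echo implode(', ', $grouped), "\n";
echo implode(', ', $digitsSplit), "\n";
```

Using \S instead of . is what drops the whitespace entry; everything else follows the same idea as the pattern above.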

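  • Note that `.` alone does not keep combining character sequences together: `ä` written as U+0061 followed by U+0308 is matched as two separate characters. (2 upvotes)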
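One possible way around the combining-character issue raised in this comment is PCRE's `\X` escape, which matches an extended grapheme cluster instead of a single code point. A small sketch, assuming PHP 7.0 or later (for the `\u{...}` string escape) and a made-up test string:

```php
<?php
// "a" followed by U+0308 COMBINING DIAERESIS, then a plain "b".
$s = "a\u{0308}b";

// "." splits per code point, so the base letter and the accent are separated.
preg_match_all('/./u', $s, $byCodePoint);

// \X keeps a base character together with its combining marks.
preg_match_all('/\X/u', $s, $byCluster);

echo count($byCodePoint[0]), "\n"; // 3 code points
echo count($byCluster[0]), "\n";   // 2 grapheme clusters
```

For plain CJK text without combining marks, the two patterns should produce the same split.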