Select the thousands separator using RegEx

Question

Ham*_*deh 4 php regex localization numbers

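I need to change the decimal separator of the numbers in a given string.

Which RegEx selects only the thousands separator character in the string?

It should select it only when there are digits around it. For example, only in 123,456 do I need to select and replace the ,

I'm converting English numbers into Persian (e.g. Hello 123 becomes Hello ۱۲۳). Now I need to replace the decimal separator with the Persian version too, but I don't know how to select it with a regex. For example, Hello 121,534 must become Hello ۱۲۱/۵۳۴

The character that needs to be replaced is , with /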

Answer 1

Bar*_*mar 5

Use a regular expression with lookarounds.

$new_string = preg_replace('/(?<=\d),(?=\d)/', '/', $string);

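DEMO

(?<=\d) means there has to be a digit before the comma, and (?=\d) means there has to be a digit after it. Because these are lookarounds, the digits are not part of the match, so they are not replaced.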
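To see the lookarounds working together with the digit conversion mentioned in the question, here is a minimal sketch. The helper name `toPersian` and the digit map are assumptions made for illustration; they are not part of the answer above.

```php
<?php
// Hypothetical helper: swaps the separator, then maps ASCII digits to Persian digits.
function toPersian(string $text): string
{
    // Replace a comma only when a digit sits on both sides of it.
    $text = preg_replace('/(?<=\d),(?=\d)/', '/', $text);

    // Map 0-9 to the Persian digits U+06F0..U+06F9.
    $ascii   = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'];
    $persian = ['۰', '۱', '۲', '۳', '۴', '۵', '۶', '۷', '۸', '۹'];

    return str_replace($ascii, $persian, $text);
}

echo toPersian('Hello 121,534, bye'), "\n"; // Hello ۱۲۱/۵۳۴, bye
```

The comma after 534 is followed by a space, not a digit, so it is left alone.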

Answer 2

hak*_*kre 5

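Judging from your question, the main problem you face is converting English numbers into Persian.

PHP ships a library that formats and parses numbers according to locale. You find it in the class NumberFormatter, which makes use of the Unicode Common Locale Data Repository (CLDR) to handle, ultimately, all languages known to the world.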

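So converting the number 123,456 from en_GB (or en_US) to fa_IR is shown in this little example:

$string = '123,456';
// Parse with the source locale, then format with the target locale.
$float = (new NumberFormatter('en_GB', NumberFormatter::DECIMAL))->parse($string);
var_dump(
    (new NumberFormatter('fa_IR', NumberFormatter::DECIMAL))->format($float)
);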

Output:

string(14) "۱۲۳٬۴۵۶"

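(Play with it on 3v4l.org)

Now this shows (somehow) how to convert the number. I'm not very firm with Persian, so please excuse me if I used the wrong locale here. There may also be options to tell which character to use for grouping, but for the moment this example only shows that the conversion of the numbers is taken care of by existing libraries. You don't need to re-invent this. Even "re-invent" is a misleading wording: this is nothing a single person could do, or at least it would be a bit crazy to do it alone.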
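On the grouping character: NumberFormatter exposes setSymbol(), which can override the separator that the locale would otherwise supply. The following is a sketch under the assumption that '/' is the character wanted, as stated in the question. It needs the intl extension.

```php
<?php
// Requires the intl extension.
$outFormat = new NumberFormatter('fa_IR', NumberFormatter::DECIMAL);

// Assumption: '/' is the grouping character the question asks for.
$outFormat->setSymbol(NumberFormatter::GROUPING_SEPARATOR_SYMBOL, '/');

$formatted = $outFormat->format(123456);
echo $formatted, "\n";
```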

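With the number conversion clarified, the question remains how to do it across a whole text. Well, why not locate all potential places, try to parse each match, and convert it to the other locale if (and only if) parsing succeeds.

Luckily, the NumberFormatter::parse() method returns false if parsing fails (more error reporting is available in case you are interested in the details), so this is viable.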
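A small sketch of that failure check, assuming the intl extension is loaded. The sample inputs are made up for illustration; getErrorMessage() is the NumberFormatter method that reports the last error.

```php
<?php
// Requires the intl extension.
$inFormat = new NumberFormatter('en_US', NumberFormatter::DECIMAL);

foreach (['123,456', 'abc'] as $candidate) {
    $result = $inFormat->parse($candidate);
    if (false === $result) {
        // parse() returned false: report why and move on.
        printf("%s: not a number (%s)\n", $candidate, $inFormat->getErrorMessage());
        continue;
    }
    printf("%s: parsed as %s\n", $candidate, var_export($result, true));
}
```

The strict comparison with false matters, because a successfully parsed 0 is also falsy.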

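For the regular expression matching, all it needs is a pattern that matches a number (the largest match wins); the replacement can then be done by a callback. In the following example the translation is verbose so that the actual parsing and formatting is more visible: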

# some text
$buffer = <<<TEXT
it need to only select , when there is number around it. for example only
when 123,456 i need to select and replace "," I'm converting English
numbers into Persian (e.g: "Hello 123" becomes "Hello ۱۲۳"). now I need to
replace the Decimal separator with Persian version too. but I don't know how
I can select it with regex. e.g: "Hello 121,534" most become
"Hello ۱۲۱/۵۳۴" The character that needs to be replaced is , with /
TEXT;

# prepare formatters
$inFormat = new NumberFormatter('en_GB', NumberFormatter::DECIMAL);
$outFormat = new NumberFormatter('fa_IR', NumberFormatter::DECIMAL);

$bufferWithFarsiNumbers = preg_replace_callback(
    '(\b[1-9]\d{0,2}(?:[ ,.]\d{3})*\b)u',
    function (array $matches) use ($inFormat, $outFormat) {
        [$number] = $matches;

        # keep the original text when the match is not a parsable number
        $result = $inFormat->parse($number);
        if (false === $result) {
            return $number;
        }

        # verbose on purpose: original, parsed value, formatted value
        return sprintf("< %s (%.4f) = %s >", $number, $result, $outFormat->format($result));
    },
    $buffer
);

echo $bufferWithFarsiNumbers;

Output:

it need to only select , when there is number around it. for example only
when < 123,456 (123456.0000) = ۱۲۳٬۴۵۶ > i need to select and replace "," I'm converting English
numbers into Persian (e.g: "Hello < 123 (123.0000) = ۱۲۳ >" becomes "Hello ۱۲۳"). now I need to
replace the Decimal separator with Persian version too. but I don't know how
I can select it with regex. e.g: "Hello < 121,534 (121534.0000) = ۱۲۱٬۵۳۴ >" most become
"Hello ۱۲۱/۵۳۴" The character that needs to be replaced is , with /

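The magic here is just two-fold: preg_replace_callback with a regular expression pattern finds the string parts, and the number conversion then works on them. The pattern should suit the needs of your question, but it is relatively easy to refine, because it only has to describe the whole-number part; false positives are filtered out thanks to the NumberFormatter class: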

                    pattern for Unicode UTF-8 strings
                                 |
(\b[1-9]\d{0,2}(?:[ ,.]\d{3})*\b)u
  |                 |          |
  |        grouping character  |
  |                            |
word boundary -----------------+

(Play with it on regex101.com)
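For readers who prefer the breakdown in code, the same pattern can be written with the x (extended) flag so that each part carries a comment. This is only a sketch: the `~` delimiter and the sample sentence are choices made here for illustration.

```php
<?php
// Same pattern as above, spread out with the x flag so it can carry comments.
$pattern = '~
    \b              # word boundary before the number
    [1-9]\d{0,2}    # leading block: one to three digits, no leading zero
    (?:
        [ ,.]       # grouping character: space, comma or dot
        \d{3}       # followed by exactly three digits
    )*              # any number of further blocks
    \b              # word boundary after the number
~xu';

preg_match_all($pattern, 'Prices: 121,534 and 1.000.000, not 0123', $matches);
print_r($matches[0]); // 121,534 and 1.000.000
```

Whitespace inside the character class still counts under the x flag, so the space remains a valid grouping character.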

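Edit:

To only match the same grouping character across all the thousands blocks, you can create a named group and refer back to it with a backreference for the repetition:

(\b[1-9]\d{0,2}(?:(?<grouping_char>[ ,.])\d{3}(?:\k<grouping_char>\d{3})*)?\b)u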

(Now this is becoming less easy to read; decipher it and play with it on regex101.com)
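One detail is easy to get wrong with named groups in PCRE: a backreference such as \k<name> repeats the exact text the group captured, whereas a subroutine call such as (?&name) re-runs the group's sub-pattern and may therefore match a different character. The sketch below compares the two; the short group name `g` and the anchored patterns are simplifications made for this comparison.

```php
<?php
// Backreference: \k<g> must repeat the exact character captured by group g.
$sameChar = '~^[1-9]\d{0,2}(?:(?<g>[ ,.])\d{3}(?:\k<g>\d{3})*)?$~';

// Subroutine call: (?&g) re-runs the sub-pattern [ ,.], so any of the three may follow.
$anyChar = '~^[1-9]\d{0,2}(?:(?<g>[ ,.])\d{3}(?:(?&g)\d{3})*)?$~';

foreach (['1,234,567', '1.234.567', '1,234.567'] as $input) {
    printf(
        "%-10s backreference: %s, subroutine: %s\n",
        $input,
        preg_match($sameChar, $input) ? 'match' : 'no match',
        preg_match($anyChar, $input) ? 'match' : 'no match'
    );
}
```

Only the backreference version rejects the mixed input 1,234.567.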

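To finalize the answer, only the return clause needs to be condensed to return $outFormat->format($result);. The $outFormat NumberFormatter may need some more configuration, but as it is passed into the closure, that can be done when it is created.

(Play with it on 3v4l.org)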
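Put together, the condensed version could look like the sketch below. The wrapper name `convertNumbers` is hypothetical, en_US is assumed as the source locale, and the intl extension is required.

```php
<?php
// Requires the intl extension. convertNumbers() is a hypothetical wrapper name.
function convertNumbers(string $text, NumberFormatter $in, NumberFormatter $out): string
{
    $converted = preg_replace_callback(
        '(\b[1-9]\d{0,2}(?:[ ,.]\d{3})*\b)u',
        function (array $matches) use ($in, $out): string {
            $result = $in->parse($matches[0]);
            // Leave the text untouched when the match is not a parsable number.
            return false === $result ? $matches[0] : $out->format($result);
        },
        $text
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $converted ?? $text;
}

$in  = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
$out = new NumberFormatter('fa_IR', NumberFormatter::DECIMAL);

echo convertNumbers('Hello 121,534', $in, $out), "\n";
```

Passing both formatters in keeps any extra configuration of the output formatter outside the callback.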

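I hope this is helpful and opens up a broader picture, so that you don't look for a solution just because you hit a wall (and only because of that). Regex alone is most often not the answer. I'm pretty sure there are regex freaks who can give you a one-liner that is pretty stable, but the context in which it is used will not be. However, nobody says there is only one answer. Bringing together different levels of doing things (divide and conquer) lets you rely on a stable number conversion even while you are still unsure how to write a regex pattern for English numbers.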

Archived:	6 years, 8 months ago
Views:	106
Last activity:	6 years, 8 months ago