Regex pattern to select data BETWEEN matching quotation marks

Joh*_*tos 5 c# regex vb.net

Suppose I have the following string that I want to run a regular expression on:

This is a test string with "quotation-marks" within it.
The "problem" I am having, per-se, is "knowing" which "quotation-marks"
go with which words.

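Now, suppose I want to replace every - character that sits between quotation marks with, say, a space. I thought I could do this with a regex like the following: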

Find What:      (\"[^"]*?)(\-)([^"]*?\")
Replace With:   $1 $3

The problem I am having is that this pattern does not take into account whether a quotation mark opens or closes a quoted section.

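So, in the example above, the - character in per-se would also be replaced with a space, because it sits between two quotation marks. Those two marks are a closing mark followed by an opening mark, whereas I specifically want to look only at text between an opening mark and its closing mark.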
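To make the failure concrete, here is a minimal sketch that applies the pattern from the question to the second line of the sample text. The class name `NaivePatternDemo` and the method `Apply` are hypothetical names chosen for illustration; only `Regex.Replace` is from .NET itself.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NaivePatternDemo
{
    // The pattern from the question: quote, text, one hyphen, text, quote.
    // It has no notion of which quote opens and which quote closes.
    private const string Pattern = "(\"[^\"]*?)(\\-)([^\"]*?\")";

    public static string Apply(string text)
    {
        return Regex.Replace(text, Pattern, "$1 $3");
    }

    public static void Main()
    {
        string text = "The \"problem\" I am having, per-se, is \"knowing\" which \"quotation-marks\"";

        // The closing quote of "problem" and the opening quote of "knowing"
        // are treated as a pair, so the hyphen in per-se is replaced as well.
        Console.WriteLine(Apply(text));
        // The "problem" I am having, per se, is "knowing" which "quotation marks"
    }
}
```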

How do you account for this in a regex like this one?

I hope this makes sense.

I am using the VB/C# Regex class.


Just to complete the question (and hopefully clarify it a little further), the end result I would like to get is:

This is a test string with "quotation marks" within it.
The "problem" I am having, per-se, is "knowing" which "quotation marks"
go with which words.

Thanks!!

dee*_*see 8

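You are having the same problem as someone trying to match HTML or opening and closing parentheses: regex can only match regular languages, and knowing which " is a closing one and which is an opening one is out of its reach for anything but the trivial cases.

EDIT: As Vasili Syrakis's answer shows, sometimes it can be done, but regex is a fragile solution for this kind of problem.

That said, you can turn your problem into the trivial case. Since you are using .NET, you can simply match every quoted string and use the overload that takes a match evaluator.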

Regex.Replace(text, "\".*?\"", m => m.Value.Replace("-", " "))

Test:

var text = @"This is a test string with ""quotation-marks"" within it.
The ""problem"" I am having, per-se, is ""knowing"" which ""quotation-marks""
go with which words.";

Console.Write(Regex.Replace(text, "\".*?\"", m => m.Value.Replace("-", " ")));
//This is a test string with "quotation marks" within it.
//The "problem" I am having, per-se, is "knowing" which "quotation marks"
//go with which words. 
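One caveat worth noting: in .NET, `.` does not match a newline unless `RegexOptions.Singleline` is set, so `\".*?\"` will not match a quoted string that spans a line break. The sketch below is a variant of the same match-evaluator idea using `[^"]*` instead, which does cross line breaks. The class name `QuotedHyphenReplacer` and the method `ReplaceInQuotes` are hypothetical names for illustration, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedHyphenReplacer
{
    // "[^"]*" matches from one quote to the next one, including across
    // line breaks, because a negated character class also matches '\n'.
    private static readonly Regex QuotedString = new Regex("\"[^\"]*\"");

    public static string ReplaceInQuotes(string text, string replacement)
    {
        // The lambda is converted to a MatchEvaluator and runs once per
        // quoted string; every hyphen inside that match is replaced.
        return QuotedString.Replace(text, m => m.Value.Replace("-", replacement));
    }

    public static void Main()
    {
        string text = "A \"multi-line\nquoted-string\" and a per-se hyphen.";
        Console.WriteLine(ReplaceInQuotes(text, " "));
    }
}
```

Because matching proceeds left to right without overlap, each opening quote is paired with the next quote after it, which is the pairing the question asks for as long as the quotes are balanced.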

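  • @JohnBustos: Vache is right, there is no pure-regex solution that isn't hackish and fragile. Be thankful you are using a regex flavor (.NET) that not only supports lambdas but also makes them easy to use. (2 upvotes)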

Joh*_*ner 6

A conventional (non-regex) way of doing this may be easier to maintain in the long run:

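using System.Text;

// Extension methods must be declared in a static class.
public static class StringExtensions
{
    // Replaces every '-' that lies between an opening and a closing quote.
    public static string replaceDashInQuotes(this string source, string newValue)
    {
        StringBuilder sb = new StringBuilder();

        // Toggled on every quote: true while inside a quoted section.
        bool inquote = false;

        for (int i = 0; i < source.Length; i++)
        {
            if (source[i] == '"')
                inquote = !inquote;

            if (source[i] == '-' && inquote)
                sb.Append(newValue);
            else
                sb.Append(source[i]);
        }

        return sb.ToString();
    }
}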

Then use it like this:

var s = @"This is a test string with ""quotation-marks"" within it.
    The ""problem"" I am having, per-se, is ""knowing"" which ""quotation-marks""
    go with which words.";

MessageBox.Show(s.replaceDashInQuotes(" "));
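The same parity idea can also be sketched with `string.Split`: after splitting on the quote character, the odd-numbered segments are the ones inside quotes. This is an alternative illustration rather than the answer's own code, and the class name `SplitParityDemo` and the method `ReplaceDashInQuotes` are hypothetical.

```csharp
using System;

public static class SplitParityDemo
{
    public static string ReplaceDashInQuotes(string source, string newValue)
    {
        // Segments at odd indexes sit between an opening and a closing quote.
        string[] parts = source.Split('"');

        for (int i = 1; i < parts.Length; i += 2)
        {
            parts[i] = parts[i].Replace("-", newValue);
        }

        // Put the quote characters back between the segments.
        return string.Join("\"", parts);
    }

    public static void Main()
    {
        string s = "The \"problem\" I am having, per-se, is \"knowing\" which \"quotation-marks\"";
        Console.WriteLine(ReplaceDashInQuotes(s, " "));
    }
}
```

Like the character loop, this treats everything after an unmatched opening quote as quoted text, so both versions assume the quotes in the input are balanced.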


Vas*_*kis 5

I racked my brain over this one, and it turns out that specifying non-word boundaries with \B does the trick:

Regex

\B("[^"]*)-([^"]*")\B

Replacement

$1 $2


Demo

http://regex101.com/r/dS0bH8
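The pattern relies on opening quotes being preceded by a non-word character and closing quotes being followed by one. The sketch below shows it working on text shaped like the sample and two inputs where that assumption, or the single-hyphen assumption, breaks down. The class name `NonWordBoundaryDemo`, the method `Apply`, and the two extra inputs are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NonWordBoundaryDemo
{
    // The pattern from this answer, written as a C# string literal.
    private const string Pattern = "\\B(\"[^\"]*)-([^\"]*\")\\B";

    public static string Apply(string text)
    {
        return Regex.Replace(text, Pattern, "$1 $2");
    }

    public static void Main()
    {
        // Works: each opening quote follows a space, each closing quote
        // follows a letter, so \B rejects a match starting at a closing quote.
        Console.WriteLine(Apply("The \"problem\" I am having, per-se, is \"quotation-marks\"."));

        // Breaks: punctuation next to the quotes defeats the \B heuristic.
        // The hyphen outside the quotes is replaced and no-way is left alone.
        Console.WriteLine(Apply("\"wow!\" he said - \"(no-way)\""));

        // Only one hyphen per quoted string is replaced in a single pass.
        Console.WriteLine(Apply("\"a-b-c\""));
        // "a-b c"
    }
}
```

This is the fragility the other answers warn about: the pattern infers the role of a quote from its neighbouring characters rather than from its position in the sequence of quotes.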