How to read RegEx captures in C#

Adm*_*ama 3 c# regex console

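I started working through a C# book and decided to throw RegEx into the mix to make the boring console exercises more interesting. What I want to do is ask the user for their phone number at the console, check it against a RegEx, and then capture the digits so I can format them the way I want. I have all of that working except the RegEx capture part. How do I get the captured values into C# variables?

Also feel free to correct any code formatting or variable naming issues.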

static void askPhoneNumber()
{
    String pattern = @"[(]?(\d{3})[)]?[ -.]?(\d{3})[ -.]?(\d{4})";

    System.Console.WriteLine("What is your phone number?");
    String phoneNumber = Console.ReadLine();

    while (!Regex.IsMatch(phoneNumber, pattern))
    {
        Console.WriteLine("Bad Input");
        phoneNumber = Console.ReadLine();
    }

    Match match = Regex.Match(phoneNumber, pattern);
    Capture capture = match.Groups.Captures;

    System.Console.WriteLine(capture[1].Value + "-" + capture[2].Value + "-" + capture[3].Value);
}

Adr*_*HHH 11

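Match results can be hard to understand. I wrote this code to help me see what was found and where. The intention is that pieces of the output (from the lines marked with //**) can be copied into a program to make use of the values found in the match.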

public static void DisplayMatchResults(Match match)
{
    Console.WriteLine("Match has {0} captures", match.Captures.Count);

    int groupNo = 0;
    foreach (Group mm in match.Groups)
    {
        Console.WriteLine("  Group {0,2} has {1,2} captures '{2}'", groupNo, mm.Captures.Count, mm.Value);

        int captureNo = 0;
        foreach (Capture cc in mm.Captures)
        {
            Console.WriteLine("       Capture {0,2} '{1}'", captureNo, cc);
            captureNo++;
        }
        groupNo++;
    }

    groupNo = 0;
    foreach (Group mm in match.Groups)
    {
        Console.WriteLine("    match.Groups[{0}].Value == \"{1}\"", groupNo, match.Groups[groupNo].Value); //**
        groupNo++;
    }

    groupNo = 0;
    foreach (Group mm in match.Groups)
    {
        int captureNo = 0;
        foreach (Capture cc in mm.Captures)
        {
            Console.WriteLine("    match.Groups[{0}].Captures[{1}].Value == \"{2}\"", groupNo, captureNo, match.Groups[groupNo].Captures[captureNo].Value); //**
            captureNo++;
        }
        groupNo++;
    }
}

A simple example of using this method, given the following input:

Regex regex = new Regex("/([A-Za-z]+)/(\\d+)/");
String text = "some/directory/Pictures/Houses/12/apple/banana/"
            + "cherry/345/damson/elderberry/fig/678/gooseberry";
Match match = regex.Match(text);
DisplayMatchResults(match);

The output is:

Match has 1 captures
  Group  0 has  1 captures '/Houses/12/'
       Capture  0 '/Houses/12/'
  Group  1 has  1 captures 'Houses'
       Capture  0 'Houses'
  Group  2 has  1 captures '12'
       Capture  0 '12'
    match.Groups[0].Value == "/Houses/12/"
    match.Groups[1].Value == "Houses"
    match.Groups[2].Value == "12"
    match.Groups[0].Captures[0].Value == "/Houses/12/"
    match.Groups[1].Captures[0].Value == "Houses"
    match.Groups[2].Captures[0].Value == "12"

Suppose we want to find all matches of the regex above in the text above. Then we can use a MatchCollection in code such as:

MatchCollection matches = regex.Matches(text);
for (int ii = 0; ii < matches.Count; ii++)
{
    Console.WriteLine("Match[{0}]  // of 0..{1}:", ii, matches.Count-1);
    RegexMatchDisplay.DisplayMatchResults(matches[ii]);
}

Its output is:

Match[0]  // of 0..2:
Match has 1 captures
  Group  0 has  1 captures '/Houses/12/'
       Capture  0 '/Houses/12/'
  Group  1 has  1 captures 'Houses'
       Capture  0 'Houses'
  Group  2 has  1 captures '12'
       Capture  0 '12'
    match.Groups[0].Value == "/Houses/12/"
    match.Groups[1].Value == "Houses"
    match.Groups[2].Value == "12"
    match.Groups[0].Captures[0].Value == "/Houses/12/"
    match.Groups[1].Captures[0].Value == "Houses"
    match.Groups[2].Captures[0].Value == "12"
Match[1]  // of 0..2:
Match has 1 captures
  Group  0 has  1 captures '/cherry/345/'
       Capture  0 '/cherry/345/'
  Group  1 has  1 captures 'cherry'
       Capture  0 'cherry'
  Group  2 has  1 captures '345'
       Capture  0 '345'
    match.Groups[0].Value == "/cherry/345/"
    match.Groups[1].Value == "cherry"
    match.Groups[2].Value == "345"
    match.Groups[0].Captures[0].Value == "/cherry/345/"
    match.Groups[1].Captures[0].Value == "cherry"
    match.Groups[2].Captures[0].Value == "345"
Match[2]  // of 0..2:
Match has 1 captures
  Group  0 has  1 captures '/fig/678/'
       Capture  0 '/fig/678/'
  Group  1 has  1 captures 'fig'
       Capture  0 'fig'
  Group  2 has  1 captures '678'
       Capture  0 '678'
    match.Groups[0].Value == "/fig/678/"
    match.Groups[1].Value == "fig"
    match.Groups[2].Value == "678"
    match.Groups[0].Captures[0].Value == "/fig/678/"
    match.Groups[1].Captures[0].Value == "fig"
    match.Groups[2].Captures[0].Value == "678"

Hence:

    matches[1].Groups[0].Value == "/cherry/345/"
    matches[1].Groups[1].Value == "cherry"
    matches[1].Groups[2].Value == "345"
    matches[1].Groups[0].Captures[0].Value == "/cherry/345/"
    matches[1].Groups[1].Captures[0].Value == "cherry"
    matches[1].Groups[2].Captures[0].Value == "345"

The same holds for matches[0] and matches[2].
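To get these values into ordinary variables rather than printing them, one option is to loop over the MatchCollection and read Groups[1] and Groups[2] from each match. The following is only a minimal sketch: the class name MatchValues and the method name Extract are made up for illustration, and it assumes the pattern /([A-Za-z]+)/(\d+)/, which is the one that yields the three matches listed above. It also assumes the digits fit in an int.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the answer's original code.
public static class MatchValues
{
    // Collects (name, number) pairs from every match in the text.
    public static List<KeyValuePair<string, int>> Extract(string text)
    {
        var results = new List<KeyValuePair<string, int>>();
        foreach (Match m in Regex.Matches(text, @"/([A-Za-z]+)/(\d+)/"))
        {
            // Groups[0] is the whole match; Groups[1] and Groups[2] are the
            // two capturing groups of the pattern.
            string name = m.Groups[1].Value;
            int number = int.Parse(m.Groups[2].Value); // assumes a small number
            results.Add(new KeyValuePair<string, int>(name, number));
        }
        return results;
    }

    public static void Main()
    {
        string text = "some/directory/Pictures/Houses/12/apple/banana/"
                    + "cherry/345/damson/elderberry/fig/678/gooseberry";
        foreach (KeyValuePair<string, int> pair in Extract(text))
        {
            Console.WriteLine("{0} -> {1}", pair.Key, pair.Value);
        }
    }
}
```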


Luc*_*ski 10

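The C# regex API can be quite confusing. There are groups and captures:

  • A group represents a capturing group; it is used to extract a substring from the text.
  • There can be several captures per group, if the group appears inside a quantifier.

The hierarchy is:

  • Match
      • Group
          • Capture

(one match can have several groups, and each group can have several captures)

For example: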

Subject: aabcabbc
Pattern: ^(?:(a+b+)c)+$

In this example, there is only one group: (a+b+). This group is inside a quantifier and is matched twice. It generates two captures, aab and abb:

aabcabbc
^^^ ^^^
Cap1  Cap2
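
The two captures can be read back through match.Groups[1].Captures. The sketch below is illustrative only: the class name CaptureDemo and the method name CapturesOfGroup1 are invented for this example. Note that match.Groups[1].Value on its own returns just the last capture (abb here), which is why the Captures collection is needed when a group sits inside a quantifier.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class for the aabcabbc example.
public static class CaptureDemo
{
    // Returns every capture recorded for group 1, in the order matched.
    // If the subject does not match, the returned array is empty.
    public static string[] CapturesOfGroup1(string subject)
    {
        Match match = Regex.Match(subject, @"^(?:(a+b+)c)+$");
        CaptureCollection captures = match.Groups[1].Captures;

        string[] values = new string[captures.Count];
        for (int i = 0; i < captures.Count; i++)
        {
            values[i] = captures[i].Value;
        }
        return values;
    }

    public static void Main()
    {
        // One line per capture of the single group (a+b+).
        foreach (string value in CapturesOfGroup1("aabcabbc"))
        {
            Console.WriteLine(value);
        }
    }
}
```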

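When a group is not inside a quantifier, it generates only one capture. In your case, you have 3 groups, and each one is captured once. You can use match.Groups[1].Value, match.Groups[2].Value and match.Groups[3].Value to extract the 3 substrings you're interested in, without resorting to the capture notion at all.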
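
Applied to the phone number exercise, that could look roughly like the sketch below. The class name PhoneFormatter, the method name Format and the variable names are assumptions made for illustration. One deliberate change from the question's pattern: the separator class is written [ .-] instead of [ -.], because a hyphen between two characters in a class defines a range (here, everything from space to dot), whereas a trailing hyphen is a literal.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical rewrite of the question's askPhoneNumber logic.
public static class PhoneFormatter
{
    // Pattern from the question, with the hyphen moved to the end of the
    // separator class so it is a literal rather than a range.
    private static readonly Regex PhonePattern =
        new Regex(@"[(]?(\d{3})[)]?[ .-]?(\d{3})[ .-]?(\d{4})");

    // Returns the number as "AAA-BBB-CCCC", or null if the input does not match.
    public static string Format(string input)
    {
        Match match = PhonePattern.Match(input);
        if (!match.Success)
        {
            return null;
        }

        // Groups[0] is the whole match; the capturing groups start at index 1.
        string areaCode = match.Groups[1].Value;
        string prefix = match.Groups[2].Value;
        string lineNumber = match.Groups[3].Value;
        return areaCode + "-" + prefix + "-" + lineNumber;
    }

    public static void Main()
    {
        Console.WriteLine("What is your phone number?");
        string line;
        // Stop on end of input so the loop cannot spin forever.
        while ((line = Console.ReadLine()) != null)
        {
            string formatted = Format(line);
            if (formatted != null)
            {
                Console.WriteLine(formatted);
                return;
            }
            Console.WriteLine("Bad Input");
        }
    }
}
```

Matching once and checking match.Success avoids running the regex twice (once for IsMatch, once for Match), as the question's loop does.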

  • @CausingUnderflowsEverywhere The group at index 0 represents the whole match. Capturing groups start at index 1. (4 upvotes)