What is the best way to parse this string inside a string?

leo*_*ora 3 c# regex string parsing

I have the following string:

 string fullString = "group = '2843360' and (team in ('TEAM1', 'TEAM2','TEAM3'))";

I want to parse the following values out of this string:

 string group = ParseoutGroup(fullString);  // Expect "2843360"
 string[] teams = ParseoutTeamNames(fullString); // Expect array with three items

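In terms of the example full string, one or more teams may be listed (not always three as above).

I have this partly working, but my code feels hacky and not very future-proof, so I wanted to see whether there is a better regex solution here, or a more elegant way to parse these values out of the full string. Other things may be added to the string later, so I want this to be as foolproof as possible.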

Val*_*aev 6

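In the simplest case, a regular expression would probably be the best answer. Unfortunately, in this case it looks like we need to parse a subset of the SQL language. While this can be solved with regular expressions, they are not designed for parsing complex languages (nested brackets and escaped strings).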
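For comparison, here is a minimal regex sketch for the simple shape shown in the question. The method names `ParseoutGroup` and `ParseoutTeamNames` come from the question; the class name `SimpleFilterParser` and the patterns themselves are my own assumptions. It deliberately ignores escaped quotes and any `)` inside a team name, which is exactly where a regex approach starts to strain.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class; a sketch for the simple case only.
public static class SimpleFilterParser
{
    // Returns the quoted value that follows "group =", or null if it is absent.
    public static string ParseoutGroup(string input)
    {
        Match m = Regex.Match(input, @"\bgroup\s*=\s*'([^']*)'");
        return m.Success ? m.Groups[1].Value : null;
    }

    // Returns every quoted item inside "team in ( ... )", or an empty array.
    public static string[] ParseoutTeamNames(string input)
    {
        // Capture the raw text between the parentheses of the "in" list.
        Match list = Regex.Match(input, @"\bteam\s+in\s*\(([^)]*)\)");
        if (!list.Success)
            return Array.Empty<string>();

        // Pull each single-quoted item out of the captured list.
        return Regex.Matches(list.Groups[1].Value, @"'([^']*)'")
                    .Cast<Match>()
                    .Select(m => m.Groups[1].Value)
                    .ToArray();
    }
}
```

This handles a variable number of teams, but each new construct added to the string would need another pattern.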

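The requirements may also evolve over time, so that more complex structures need to be parsed.

If company policy allows, I would choose to build an internal DSL to parse this string.

One of my favorite tools for building internal DSLs is called Sprache.

Below you can find an example parser that uses the internal DSL approach.

In the code I define primitives to handle the required SQL operators and compose the final parser out of them.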

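    // Requires the Sprache and NUnit packages. The members below belong
    // inside a test fixture class, with these usings at the top of the file:
    //   using System.Collections.Generic;
    //   using NUnit.Framework;
    //   using Sprache;

    [Test]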
    public void Test()
    {
        string fullString = "group = '2843360' and (team in ('TEAM1', 'TEAM2','TEAM3'))";


        var resultParser =
            from @group in OperatorEquals("group")
            from @and in OperatorEnd()
            from @team in Brackets(OperatorIn("team"))
            select new {@group, @team};
        var result = resultParser.Parse(fullString);
        Assert.That(result.group, Is.EqualTo("2843360"));
        Assert.That(result.team, Is.EquivalentTo(new[] {"TEAM1", "TEAM2", "TEAM3"}));
    }

    private static readonly Parser<char> CellSeparator =
        from space1 in Parse.WhiteSpace.Many()
        from s in Parse.Char(',')
        from space2 in Parse.WhiteSpace.Many()
        select s;

    private static readonly Parser<char> QuoteEscape = Parse.Char('\\');

    private static Parser<T> Escaped<T>(Parser<T> following)
    {
        return from escape in QuoteEscape
               from f in following
               select f;
    }

    private static readonly Parser<char> QuotedCellDelimiter = Parse.Char('\'');

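    // Try the escaped delimiter first; otherwise AnyChar would consume the
    // backslash on its own and the following quote would end the cell early.
    private static readonly Parser<char> QuotedCellContent =
        Escaped(QuotedCellDelimiter).Or(Parse.AnyChar.Except(QuotedCellDelimiter));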

    private static readonly Parser<string> QuotedCell =
        from open in QuotedCellDelimiter
        from content in QuotedCellContent.Many().Text()
        from end in QuotedCellDelimiter
        select content;

    private static Parser<string> OperatorEquals(string column)
    {
        return
            from c in Parse.String(column)
            from space1 in Parse.WhiteSpace.Many()
            from opEquals in Parse.Char('=')
            from space2 in Parse.WhiteSpace.Many()
            from content in QuotedCell
            select content;
    }

    private static Parser<bool> OperatorEnd()
    {
        return
            from space1 in Parse.WhiteSpace.Many()
            from c in Parse.String("and")
            from space2 in Parse.WhiteSpace.Many()
            select true;
    }

    private static Parser<T> Brackets<T>(Parser<T> contentParser)
    {
        return from open in Parse.Char('(')
               from space1 in Parse.WhiteSpace.Many()
               from content in contentParser
               from space2 in Parse.WhiteSpace.Many()
               from close in Parse.Char(')')
               select content;
    }

    private static Parser<IEnumerable<string>> ComaSeparated()
    {
        return from leading in QuotedCell
               from rest in CellSeparator.Then(_ => QuotedCell).Many()
               select Cons(leading, rest);
    }

    private static Parser<IEnumerable<string>> OperatorIn(string column)
    {
        return
            from c in Parse.String(column)
            // At least one whitespace character must separate the column from "in".
            from space1 in Parse.WhiteSpace.AtLeastOnce()
            from opIn in Parse.String("in")
            from space2 in Parse.WhiteSpace.Many()
            from content in Brackets(ComaSeparated())
            from space3 in Parse.WhiteSpace.Many()
            select content;
    }

    private static IEnumerable<T> Cons<T>(T head, IEnumerable<T> rest)
    {
        yield return head;
        foreach (T item in rest)
            yield return item;
    }