I have a string that contains several HTML comments. I need to count the unique matches of an expression.
For example, the string might be:
var teststring = "<!--X1-->Hi<!--X1-->there<!--X2-->";
I currently use this to get the matches:
var regex = new Regex("<!--X.-->");
var matches = regex.Matches(teststring);
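The result is 3 matches. However, I would like it to be only 2, because there are only two unique matches.
I know I could loop over the resulting MatchCollection and remove the extra Match, but I am hoping for a more elegant solution.
Clarification: the sample string is greatly simplified from what is actually used. There could easily be an X8 or X9, and there may be dozens of each in the string.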
Svi*_*ish 24
I would just use the Enumerable.Distinct method, for example:
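using System;
using System.Linq;
using System.Text.RegularExpressions;

string subjectString = "<!--X1-->Hi<!--X1-->there<!--X2--><!--X1-->Hi<!--X1-->there<!--X2-->";
var regex = new Regex(@"<!--X\d-->");
var matches = regex.Matches(subjectString);
// MatchCollection is a non-generic IEnumerable on .NET Framework,
// so OfType<Match>() is needed before the LINQ operators can be used.
var uniqueMatches = matches
    .OfType<Match>()
    .Select(m => m.Value)   // compare by matched text, not by Match object
    .Distinct();
uniqueMatches.ToList().ForEach(Console.WriteLine);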
Output:
<!--X1-->
<!--X2-->
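Since the question asks for a count rather than the values themselves, the same idea can end with Count() instead of printing. The following is only a sketch: the class UniqueMatchCounter and the method CountUnique are hypothetical names made up for illustration, not part of any library.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class UniqueMatchCounter
{
    // Hypothetical helper: number of distinct matched strings for a pattern.
    public static int CountUnique(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
            .Cast<Match>()          // MatchCollection -> IEnumerable<Match>
            .Select(m => m.Value)   // keep only the matched text
            .Distinct()             // drop repeated values
            .Count();
    }

    public static void Main()
    {
        var teststring = "<!--X1-->Hi<!--X1-->there<!--X2-->";
        Console.WriteLine(CountUnique(teststring, @"<!--X\d-->")); // prints 2
    }
}
```

Distinct() compares the strings by value, so dozens of repeated X8 or X9 comments would each be counted once.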
As for doing it in the regular expression itself, could you use this?
(<!--X\d-->)(?!.*\1.*)
It seems to work on your test string, at least when tested in RegexBuddy =)
// (<!--X\d-->)(?!.*\1.*)
//
// Options: dot matches newline
//
// Match the regular expression below and capture its match into backreference number 1 «(<!--X\d-->)»
//    Match the characters “<!--X” literally «<!--X»
//    Match a single digit 0..9 «\d»
//    Match the characters “-->” literally «-->»
// Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?!.*\1.*)»
//    Match any single character «.*»
//       Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
//    Match the same text as most recently matched by capturing group number 1 «\1»
//    Match any single character «.*»
//       Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
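For reference, here is a minimal sketch of how the lookahead pattern might be used from C#. The class LookaheadDemo and the method LastOccurrences are hypothetical names for illustration. RegexOptions.Singleline is assumed to be the .NET counterpart of the "dot matches newline" option listed above, and the trailing .* inside the lookahead is dropped because it does not change what the assertion accepts.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Hypothetical helper: returns each comment only where no identical
    // comment appears later in the input, i.e. its last occurrence.
    public static string[] LastOccurrences(string input)
    {
        // Singleline lets '.' match newlines as well.
        var regex = new Regex(@"(<!--X\d-->)(?!.*\1)", RegexOptions.Singleline);
        return regex.Matches(input)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        var teststring = "<!--X1-->Hi<!--X1-->there<!--X2-->";
        var unique = LastOccurrences(teststring);
        Console.WriteLine(unique.Length);            // prints 2
        Console.WriteLine(string.Join(",", unique)); // prints <!--X1-->,<!--X2-->
    }
}
```

Note that the pattern keeps the last occurrence of each comment rather than the first, and the lookahead may scan the remainder of the string for every candidate, so on long inputs the Distinct approach is likely the cheaper one.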