What constitutes a valid group name?
var re = new Regex(@"(?<what-letters-can-go-here>pattern)");
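The allowed characters are [a-zA-Z0-9_].
According to the Microsoft documentation:
name must not contain any punctuation characters and cannot begin with a number.
That is not very specific, though, so let's look at the source code.
The source of the System.Text.RegularExpressions.RegexParser class shows that the allowed characters are essentially [a-zA-Z0-9_]. To be precise, however, the method used to check whether a character is valid in a capture group name carries this comment: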
    internal static bool IsWordChar(char ch) {
        // According to UTS#18 Unicode Regular Expressions (http://www.unicode.org/reports/tr18/)
        // RL 1.4 Simple Word Boundaries  The class of <word_character> includes all Alphabetic
        // values from the Unicode character database, from UnicodeData.txt [UData], plus the U+200C
        // ZERO WIDTH NON-JOINER and U+200D ZERO WIDTH JOINER.
        return CharInClass(ch, WordClass) || ch == ZeroWidthJoiner || ch == ZeroWidthNonJoiner;
    }
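One way to probe the rule empirically is to build a pattern around a candidate name and see whether the parser accepts it. The sketch below is only an illustration: GroupNameProbe and IsValidGroupName are hypothetical names, not part of .NET, and the list of candidates is arbitrary. It relies on the Regex constructor throwing an ArgumentException (or a subclass of it) on a parse error, and it also checks GetGroupNames() so that a candidate containing '>' or '-' is not reported as valid merely because the pattern happened to parse.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupNameProbe
{
    // Hypothetical helper: returns true if the parser accepts `name`
    // as the name of a capture group and registers it under that exact name.
    public static bool IsValidGroupName(string name)
    {
        try
        {
            var re = new Regex("(?<" + name + ">x)");
            // Guard against candidates such as "a>b", where the pattern parses
            // but the group that gets defined is "a", not the full candidate.
            return Array.IndexOf(re.GetGroupNames(), name) >= 0;
        }
        catch (ArgumentException)
        {
            // Parse errors surface as ArgumentException or a type derived from it.
            return false;
        }
    }

    public static void Main()
    {
        // Arbitrary candidates chosen for illustration.
        string[] candidates = { "word", "with_underscore", "a1", "1a", "has space", "dollar$" };
        foreach (var candidate in candidates)
        {
            Console.WriteLine("{0,-16} {1}", candidate, IsValidGroupName(candidate));
        }
    }
}
```

Note that a hyphen inside the angle brackets is not treated as part of the name: .NET reads (?<name1-name2>...) as a balancing group definition, so a name like the one in the question is parsed differently from a plain named group.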
If you want to test this yourself, this .NET fiddle confirms that many non-punctuation characters are not allowed in the name of a capture group: