Regex to match a subgroup that ends with "/"

Eva*_*tis 2 javascript regex

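I have two URLs and need to capture the string that follows the domain extension, but only if it is a two-character string ending with "/". So far I have this: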

var t1 = "http://www.test.net/shop/test-3";
var t2 = "http://www.test.net/gb/shop/test-2";

var rgx = /\.([a-z]{0,3})\/([a-z]{2}\/)?/;

console.log(rgx.exec(t1));
console.log(rgx.exec(t2));

It outputs:

[".net/", "net", undefined]
[".net/gb/", "net", "gb/"]

This is correct, except that I don't want to capture "gb/", but "gb" instead. Any ideas? I'm stumped.

Wik*_*żew 6

One technique you can use is to put a capturing group inside an optional non-capturing group:

/\.([a-z]{0,3})\/(?:([a-z]{2})\/)?/
                 ^^^^           ^^

See the regex demo.

var t1 = "http://www.test.net/shop/test-3";
var t2 = "http://www.test.net/gb/shop/test-2";
console.log(/\.([a-z]+)\/(?:([a-z]{2})\/)?/.exec(t1));
console.log(/\.([a-z]+)\/(?:([a-z]{2})\/)?/.exec(t2));
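To make the difference concrete, here is a minimal sketch that runs the pattern from the question and the corrected pattern against the same URLs and compares group 2. The helper name secondGroup and the constant names are hypothetical, introduced only for illustration.

```javascript
// Pattern from the question: the slash sits inside capturing group 2.
const withSlash = /\.([a-z]{0,3})\/([a-z]{2}\/)?/;
// Corrected pattern: the slash is matched by the optional non-capturing
// group, so capturing group 2 holds only the two letters.
const withoutSlash = /\.([a-z]{0,3})\/(?:([a-z]{2})\/)?/;

// secondGroup is a hypothetical helper, not part of any library.
function secondGroup(pattern, url) {
  const match = pattern.exec(url);
  // exec returns null when nothing matches; group 2 is undefined
  // when the optional part did not take part in the match.
  return match ? match[2] : null;
}

const withCode = "http://www.test.net/gb/shop/test-2";
const withoutCode = "http://www.test.net/shop/test-3";

console.log(secondGroup(withSlash, withCode));       // gb/
console.log(secondGroup(withoutSlash, withCode));    // gb
console.log(secondGroup(withoutSlash, withoutCode)); // undefined
```

The outer (?: ... )? still makes the whole "two letters plus slash" sequence optional, but because it does not capture, the group numbering stays the same as in the original pattern.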

As for an alternative, this regex looks safer because it is more precise:

/^https?:\/\/[^\/]+\.([a-z]+)\/(?:([a-z]{2})\/)?/

See this regex demo.

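Details:

  • ^ - start of the string
  • https?:\/\/ - the protocol part (http:// or https://)
  • [^\/]+\.([a-z]+)\/ - the domain part: one or more characters other than /, then a ., then the TLD (1 or more letters, [a-z]+) captured into Group 1, then a /
  • (?:([a-z]{2})\/)? - an optional sequence of:
    • ([a-z]{2}) - Group 2, capturing 2 lowercase ASCII letters
    • \/ - a slash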

var t1 = "http://www.test.net/shop/test-3";
var t2 = "http://www.test.net/gb/shop/test-2";
console.log(/^https?:\/\/[^\/]+\.([a-z]+)\/(?:([a-z]{2})\/)?/.exec(t1));
console.log(/^https?:\/\/[^\/]+\.([a-z]+)\/(?:([a-z]{2})\/)?/.exec(t2));
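If the result is consumed in several places, the anchored pattern could be wrapped in a small function. The sketch below is one possible way to do that, assuming an engine with named capture groups (ES2018) and the ?? operator (ES2020). The names parseUrlPrefix, tld and segment are hypothetical and do not come from the question.

```javascript
// Same anchored pattern as above, with named groups added for readability.
// The group names tld and segment are illustrative assumptions.
const URL_PREFIX =
  /^https?:\/\/[^\/]+\.(?<tld>[a-z]+)\/(?:(?<segment>[a-z]{2})\/)?/;

// parseUrlPrefix is a hypothetical helper, not part of any library.
function parseUrlPrefix(url) {
  const match = URL_PREFIX.exec(url);
  if (match === null) {
    return null; // not an http(s) URL of the expected shape
  }
  const { tld, segment } = match.groups;
  // segment is undefined when the optional two-letter part is absent;
  // normalise that to null so callers get a single "missing" value.
  return { tld, segment: segment ?? null };
}

console.log(parseUrlPrefix("http://www.test.net/shop/test-3"));
console.log(parseUrlPrefix("http://www.test.net/gb/shop/test-2"));
```

Returning null for a non-match and for a missing segment keeps the caller from having to distinguish between null and undefined.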