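I have two URLs and need to capture a string right after the domain extension, but only if it is a two-character string followed by "/". So far I have this: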
var t1 = "http://www.test.net/shop/test-3";
var t2 = "http://www.test.net/gb/shop/test-2";
var rgx = /\.([a-z]{0,3})\/([a-z]{2}\/)?/;
console.log(rgx.exec(t1));
console.log(rgx.exec(t2));
It outputs:
[".net/", "net", undefined]
[".net/gb/", "net", "gb/"]
This is correct, except that I don't want to capture "gb/", just "gb". Any ideas? I'm stumped.
One technique you can use is a capturing group inside an optional non-capturing group:
/\.([a-z]{0,3})\/(?:([a-z]{2})\/)?/
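Here the outer (?:...)? is the optional non-capturing group, and the inner ([a-z]{2}) captures only the two letters, so the trailing slash is matched but left out of group 2.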
See the regex demo:
var t1 = "http://www.test.net/shop/test-3";
var t2 = "http://www.test.net/gb/shop/test-2";
console.log(/\.([a-z]+)\/(?:([a-z]{2})\/)?/.exec(t1));
console.log(/\.([a-z]+)\/(?:([a-z]{2})\/)?/.exec(t2));
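When using the result in code, keep in mind that exec returns null if nothing matches, and that group 2 is undefined when the optional part is absent. A minimal sketch of reading it safely, assuming a hypothetical helper name getTwoLetterSegment (not part of the original answer):

```javascript
// Hypothetical helper: returns the two-letter segment that directly
// follows the TLD, or null when it is missing or the URL does not match.
function getTwoLetterSegment(url) {
  var m = /\.([a-z]+)\/(?:([a-z]{2})\/)?/.exec(url);
  // m is null on no match; m[2] is undefined when the optional
  // non-capturing group did not take part in the match.
  return m && m[2] !== undefined ? m[2] : null;
}

console.log(getTwoLetterSegment("http://www.test.net/shop/test-3"));    // null
console.log(getTwoLetterSegment("http://www.test.net/gb/shop/test-2")); // "gb"
```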
As for an alternative approach, this regex seems safer because it is more precise:
/^https?:\/\/[^\/]+\.([a-z]+)\/(?:([a-z]{2})\/)?/
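Details:
^ - start of the string
https?:\/\/ - the protocol part (http:// or https://)
[^\/]+\.([a-z]+)\/ - the domain part: one or more characters other than /, then a ., then the TLD (1 or more letters, [a-z]+) captured into Group 1, then a /
(?:([a-z]{2})\/)? - an optional sequence:
    ([a-z]{2}) - Group 2, capturing 2 lowercase ASCII letters
    \/ - a slash
var t1 = "http://www.test.net/shop/test-3";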
var t2 = "http://www.test.net/gb/shop/test-2";
console.log(/^https?:\/\/[^\/]+\.([a-z]+)\/(?:([a-z]{2})\/)?/.exec(t1));
console.log(/^https?:\/\/[^\/]+\.([a-z]+)\/(?:([a-z]{2})\/)?/.exec(t2));
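If numbered groups become hard to follow, the same anchored pattern could also be written with named groups (ES2018 and later). This is only a sketch of that variation, not part of the original answer; the group names tld and prefix and the function name parsePrefix are illustrative assumptions.

```javascript
// Same anchored pattern as above, with named groups (requires ES2018+).
// The names "tld" and "prefix" are illustrative, not from the answer.
var urlRgx = /^https?:\/\/[^\/]+\.(?<tld>[a-z]+)\/(?:(?<prefix>[a-z]{2})\/)?/;

// Hypothetical wrapper: returns { tld, prefix } or null when the input
// is not an http(s) URL of the expected shape.
function parsePrefix(url) {
  var m = urlRgx.exec(url);
  if (!m) return null;
  // groups.prefix is undefined when the optional part is absent.
  return { tld: m.groups.tld, prefix: m.groups.prefix || null };
}

console.log(parsePrefix("http://www.test.net/shop/test-3"));
console.log(parsePrefix("http://www.test.net/gb/shop/test-2"));
```

Because the pattern is anchored with ^ and requires the protocol, strings that merely contain something like ".net/gb/" somewhere in the middle no longer match.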