RegExp.exec() sporadically returns null

cpa*_*pak 77 javascript regex

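I'm seriously going nuts over this. I've spent a disproportionate amount of time trying to figure out what's going on here, so please give me a hand =)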

I need to do some RegExp matching on strings in JavaScript. Unfortunately, it behaves very strangely. This code:

var rx = /(cat|dog)/gi;
var w = new Array("I have a cat and a dog too.", "There once was a dog and a cat.", "I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.");

for (var i in w) {
    var m = null;
    m = rx.exec(w[i]);
    if(m){
        document.writeln("<pre>" + i + "\nINPUT: " + w[i] + "\nMATCHES: " + m.slice(1) + "</pre>");
    }else{
        document.writeln("<pre>" + i + "\n'" + w[i] + "' FAILED.</pre>");
    }
}

returns "cat" and "dog" for the first two elements, as it should, but then some of the exec() calls start returning null. I don't understand why.

I've posted a fiddle here where you can run and edit the code.

So far I've tried this in Chrome and Firefox.

Cheers!

/Christopher

Sil*_*ost 67

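Oh, here it is. Because you're defining the regex as global, it matches cat first, and on the second pass of the loop it continues from that position and matches dog. So, basically, you just need to reset the regex (its internal pointer) as well. Compare this, where the regex is created inside the loop: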

var w = new Array("I have a cat and a dog too.", "I have a cat and a dog too.", "I have a cat and a dog too.", "I have a cat and a dog too.");

for (var i in w) {
    var rx = /(cat|dog)/gi;
    var m = null;
    m = rx.exec(w[i]);
    if(m){
        document.writeln("<p>" + i + "<br/>INPUT: " + w[i] + "<br/>MATCHES: " + w[i].length + "</p>");
    }else{
        document.writeln("<p><b>" + i + "<br/>'" + w[i] + "' FAILED.</b><br/>" + w[i].length + "</p>");
    }
    document.writeln(m);
}

  • I'd give you 1000 votes if I could. This not only saved me time, it also blew my mind. (2 upvotes)

Fro*_*ode 65

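A regex object has a lastIndex property, which is updated when you run exec. So when you exec the regex on, for example, "I have a cat and a dog too.", lastIndex is set to 12. The next time you run exec on the same regex object, it starts looking from index 12. So you have to reset the lastIndex property between runs.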
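To make the lastIndex behaviour concrete, here is a minimal sketch that records where each exec call starts and what it leaves behind. The helper name traceExec and the shape of the log entries are assumptions made for illustration; only exec and lastIndex come from the answer above.

```javascript
// Record how a global regex carries state from one exec() call to the next.
// traceExec is a hypothetical helper, not part of any library.
function traceExec(rx, inputs) {
  var log = [];
  for (var i = 0; i < inputs.length; i++) {
    var startedAt = rx.lastIndex;       // position this search starts from
    var m = rx.exec(inputs[i]);
    log.push({
      startedAt: startedAt,
      match: m ? m[1] : null,
      lastIndexAfter: rx.lastIndex      // exec sets this back to 0 when it fails
    });
  }
  return log;
}

var trace = traceExec(/(cat|dog)/gi, [
  "I have a cat and a dog too.",
  "There once was a dog and a cat.",
  "I have a cat and a dog too.",
  "There once was a dog and a cat."
]);
console.log(trace);
```

With these inputs the first call matches "cat" and leaves lastIndex at 12, the second starts at 12 and matches "dog" (lastIndex 20), the third starts at 20 and fails, which resets lastIndex to 0, and the fourth succeeds again. That alternation is the sporadic null from the question.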

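  • Thanks for the explanation! Setting `myRe.lastIndex = 0;` before subsequent uses helped a lot. (8 upvotes)
  • I think this should be the accepted answer, since it follows the best practice of reusing the same regex object. (2 upvotes)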

ESL*_*ESL 27

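Two things:

  1. lastIndex needs to be reset when you use the g (global) flag. To fix this, I recommend simply assigning 0 to the lastIndex member of the RegExp object. This performs better than destroying and recreating the object.
  2. Be careful when using the in keyword to walk an Array, because it can produce unexpected results with some libraries. Sometimes you should check with something like isNaN(i), or, if you know the array has no holes, use a classic for loop.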

The code could be:

var rx = /(cat|dog)/gi;
var w = ["I have a cat and a dog too.", "There once was a dog and a cat.", "I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat."];

for (var i in w)
 if(!isNaN(i))        // Optional, check it is an element if Array could have some odd members.
  {
   var m = null;
   m = rx.exec(w[i]); // Run
   rx.lastIndex = 0;  // Reset
   if(m)
    {
     document.writeln("<pre>" + i + "\nINPUT: " + w[i] + "\nMATCHES: " + m.slice(1) + "</pre>");
    } else {
     document.writeln("<pre>" + i + "\n'" + w[i] + "' FAILED.</pre>");
    }
  }
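The second point can be sketched as follows. The Array.prototype.first property stands in for whatever a library might add and is purely hypothetical; the sketch only shows that for...in also visits such inherited keys, while an index-based loop does not.

```javascript
// Hypothetical library code: an enumerable property added to Array.prototype.
Array.prototype.first = function () { return this[0]; };

var rx = /(cat|dog)/gi;
var w = ["I have a cat and a dog too.", "There once was a dog and a cat."];

// for...in visits the inherited "first" key as well as the indices.
var keysSeen = [];
for (var k in w) {
  keysSeen.push(k);
}

// A classic index-based loop only visits the elements.
var matches = [];
for (var i = 0; i < w.length; i++) {
  rx.lastIndex = 0;                // reset before searching a new string
  var m = rx.exec(w[i]);
  matches.push(m ? m[1] : null);
}

delete Array.prototype.first;      // remove the demo property again

console.log(keysSeen);             // the keys "0", "1" and "first"
console.log(matches);              // "cat" for the first string, "dog" for the second
```

In the for...in version, w["first"] is a function, so exec would coerce it to a string and search its source text, which is the kind of unexpected result the isNaN(i) guard filters out.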

  • This should be the accepted answer. Setting `rx.lastIndex = 0` is much better than recreating the RegExp object inside the loop. (2 upvotes)

Anonymous 5

I had a similar problem using only /g, and the solutions proposed here did not work for me in Firefox 3.6.8. I got my script working by using

var myRegex = new RegExp("my string", "g");

I'm adding this in case someone else runs into the same problem I had with the solutions above.
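A minimal sketch of that constructor-based approach, applied to the pattern from the question: every call to new RegExp returns a new object whose lastIndex starts at 0, so no state is shared between inputs. The helper name firstMatch is an assumption for illustration and does not appear in the answer.

```javascript
// Build a fresh global regex for every input so no lastIndex state is shared.
// firstMatch is a hypothetical helper, not part of any library.
function firstMatch(source, flags, input) {
  var rx = new RegExp(source, flags);   // new object, lastIndex starts at 0
  var m = rx.exec(input);
  return m ? m[0] : null;
}

var inputs = [
  "I have a cat and a dog too.",
  "I have a cat and a dog too.",
  "I have a cat and a dog too."
];
var results = [];
for (var i = 0; i < inputs.length; i++) {
  results.push(firstMatch("(cat|dog)", "gi", inputs[i]));
}
console.log(results);                   // "cat" for every input
```

The trade-off is the one raised earlier in the thread: the pattern is rebuilt on every call, whereas resetting lastIndex reuses a single object.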