Regex: find a string that does not contain a substring

Art*_*tem 21 regex

I have a large piece of text:

"Big piece of text. This sentence includes 'regexp' word. And this
sentence doesn't include that word"

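I need to find a substring that starts with 'this' and ends with 'word' but does not contain the word 'regexp'.

In this case, the string "this sentence doesn't include that word" is exactly what I want to get.

How can I do this with a regular expression?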

And*_*ark 39

With the ignore-case option enabled, the following should work:

\bthis\b(?:(?!\bregexp\b).)*?\bword\b

Example: http://www.rubular.com/r/g6tYcOy8IT

Explanation:

\bthis\b           # match the word 'this', \b is for word boundaries
(?:                # start group, repeated zero or more times, as few as possible
   (?!\bregexp\b)    # fail if 'regexp' can be matched (negative lookahead)
   .                 # match any single character
)*?                # end group
\bword\b           # match 'word'

The \b around each word makes sure you aren't matching on substrings, such as the 'this' in 'thistle' or the 'word' in 'wordy'.
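To see the effect of the word boundaries in isolation, here is a small Python sketch. The sample text and variable names are made up for illustration; only the `\bthis\b` fragment comes from the answer.

```python
import re

# \b asserts a word boundary, so whole words match but embedded ones do not.
bounded = re.compile(r"\bthis\b", re.IGNORECASE)
unbounded = re.compile(r"this", re.IGNORECASE)

sample = "A thistle is not this."  # hypothetical sample text

# Start offsets of every hit for each pattern.
bounded_hits = [m.start() for m in bounded.finditer(sample)]
unbounded_hits = [m.start() for m in unbounded.finditer(sample)]

print(bounded_hits)    # only the standalone word
print(unbounded_hits)  # also the "this" inside "thistle"
```

Without \b, the pattern also fires inside "thistle", which is the false positive the boundaries are there to prevent.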

This works by checking, at each character between the start word and the end word, that the excluded word does not begin there.
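As a rough check, the pattern can be run against the text from the question. This sketch uses Python's `re` module rather than Rubular, and assumes the text is on a single line; the variable names are illustrative.

```python
import re

# The pattern from the answer, compiled with the ignore-case option.
PATTERN = re.compile(
    r"\bthis\b(?:(?!\bregexp\b).)*?\bword\b",
    re.IGNORECASE,
)

# Text from the question, joined onto one line (an assumption of this sketch).
text = (
    "Big piece of text. This sentence includes 'regexp' word. "
    "And this sentence doesn't include that word"
)

# The attempt starting at the first "This" is abandoned when it reaches
# 'regexp', so only the second sentence produces a match.
matches = [m.group(0) for m in PATTERN.finditer(text)]
print(matches)
```

Note that `.` does not match a newline by default, so if the text really wraps across lines, a dot-matches-newline option (`re.DOTALL` in Python) would be needed for the match to span the line break.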

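  • This is exactly what I need! Thanks! (2 upvotes)
  • +1 for the good explanation of the regex and the link to play around with it - I was able to apply this to something similar and would have struggled without the explanation. I'm tired of answers that just give some code without saying how it works. (2 upvotes)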

Igo*_*bin 7

Use a lookahead assertion.

If you want to check that a string does not contain a given substring, you can write:

/^(?!.*substring)/

You must also check that the line begins with this and ends with word:

/^this(?!.*substring).*word$/
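A minimal Python sketch of this whole-line check, assuming each candidate is a single line and using 'regexp' in place of the placeholder 'substring'. The candidate strings and names are made up for illustration, and the ignore-case flag is an addition so that "This" is accepted too.

```python
import re

# ^this ... word$ with a negative lookahead that rejects any line
# containing "regexp" anywhere after the leading "this".
LINE_CHECK = re.compile(r"^this(?!.*regexp).*word$", re.IGNORECASE)

candidates = [
    "this sentence doesn't include that word",  # accepted
    "This sentence includes 'regexp' word",     # rejected: contains regexp
    "that sentence ends with word",             # rejected: wrong start
]

accepted = [c for c in candidates if LINE_CHECK.search(c)]
print(accepted)
```

Because the pattern is anchored at both ends, it tests a whole line rather than searching for a substring inside a longer text.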

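Another problem here is that you don't want to find strings; you want to find sentences (if I understand your task correctly).

So the solution will look like this: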

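perl -e '
  local $/;                    # slurp mode: read all input at once
  $_=<>;
  while($_ =~ /(.*?[.])/g) {   # walk the text one sentence at a time
    $s=$1;
    # \s* tolerates the space left after the previous period; /i ignores case
    print $s if $s =~ /^\s*this(?!.*substring).*word[.]$/i
  };'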

Usage example:

$ cat 1.pl
local $/;
$_=<>;
while($_ =~ /(.*?[.])/g) {
    $s=$1;
    print $s if $s =~ /^\s*this(?!.*regexp).*word[.]/i;
};

$ cat 1.txt
This sentence has the "regexp" word. This sentence doesn't have the word. This sentence does have the "regexp" word again.

$ cat 1.txt | perl 1.pl 
 This sentence doesn't have the word.
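For readers who don't use Perl, the same sentence-by-sentence filter might look roughly like this in Python. This is a sketch that mirrors 1.pl, not a replacement for it; the function name `matching_sentences` is hypothetical, and the input is the content of 1.txt above.

```python
import re

# Same two regexes as 1.pl: one splits off sentences, one filters them.
SENTENCE = re.compile(r".*?[.]")
WANTED = re.compile(r"^\s*this(?!.*regexp).*word[.]", re.IGNORECASE)


def matching_sentences(text):
    """Return the sentences that start with 'this', end with 'word.',
    and do not contain 'regexp'."""
    return [s for s in SENTENCE.findall(text) if WANTED.search(s)]


text = (
    'This sentence has the "regexp" word. '
    "This sentence doesn't have the word. "
    'This sentence does have the "regexp" word again.'
)

result = matching_sentences(text)
print(result)
```

As in the Perl output, the surviving sentence keeps the leading space that followed the previous period.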