Java regular expression pattern

rec*_*lax 1 java regex

I need help with this problem. Take a look at the following regular expression:

Pattern pattern = Pattern.compile("[A-Za-z]+(\\-[A-Za-z]+)");
Matcher matcher = pattern.matcher(s1);

I want to find words like "home-made" or "aaaa-bbb", but not "aaa - bbb" and not "aaa - aa - aaa". Basically, I want the following:

word - hyphen - word.

It works for everything, except that this pattern passes: "aaa - aaa - aaa", and it should not. Which regular expression works for this case?

How*_*ard 5

The backslash can be removed from the expression:

"[A-Za-z]+-[A-Za-z]+"

The following code should work:

Pattern pattern = Pattern.compile("[A-Za-z]+-[A-Za-z]+");
Matcher matcher = pattern.matcher("aaa-bbb");
boolean match = matcher.matches();    // true only if the whole string matches

Note that you can use Matcher.matches() instead of Matcher.find() to check whether the complete string matches.
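To make the difference between the two calls concrete, here is a minimal sketch. The class name MatchesVsFind and the helper methods matchesWhole and findsInside are hypothetical names chosen for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchesVsFind {
    // The simplified expression from above: letters, one hyphen, letters.
    static final Pattern WORD_HYPHEN_WORD = Pattern.compile("[A-Za-z]+-[A-Za-z]+");

    // matches(): the pattern has to cover the entire input.
    public static boolean matchesWhole(String input) {
        Matcher matcher = WORD_HYPHEN_WORD.matcher(input);
        return matcher.matches();
    }

    // find(): the pattern may match any substring of the input.
    public static boolean findsInside(String input) {
        Matcher matcher = WORD_HYPHEN_WORD.matcher(input);
        return matcher.find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("aaa-bbb"));      // true
        System.out.println(matchesWhole("aaa-aaa-aaa"));  // false
        System.out.println(findsInside("aaa-aaa-aaa"));   // true, finds "aaa-aaa"
        System.out.println(matchesWhole("aaa - bbb"));    // false
    }
}
```

With find(), a chain of hyphenated parts is accepted because its first two parts already satisfy the pattern, which is why the whole-string check rejects it while the substring search does not.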

If you instead want to search inside a string with Matcher.find(), you can use the expression

"(^|\\s)[A-Za-z]+-[A-Za-z]+(\\s|$)"
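A minimal sketch of how this whitespace-delimited expression behaves with Matcher.find(). The class name WhitespaceDelimited, the helper firstWord, and the extra capture group around the word are assumptions added for illustration; without that group, group() would also return the whitespace consumed on either side of the word.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WhitespaceDelimited {
    // Same expression as above, with an added group 2 around the word itself
    // (the extra parentheses are an assumption made for this sketch).
    static final Pattern DELIMITED =
            Pattern.compile("(^|\\s)([A-Za-z]+-[A-Za-z]+)(\\s|$)");

    // Returns the first whitespace-delimited word-hyphen-word, or null if none.
    public static String firstWord(String input) {
        Matcher matcher = DELIMITED.matcher(input);
        return matcher.find() ? matcher.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstWord("It is home-made today"));  // home-made
        System.out.println(firstWord("It is home-made."));       // null
        System.out.println(firstWord("aaa-aaa-aaa"));            // null
    }
}
```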

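But note that this only finds words delimited by whitespace (so a word followed by punctuation, as in "aaa-bbb.", is not found). To cover that case as well, you can use lookbehinds and lookaheads: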

"(?<![A-Za-z-])[A-Za-z]+-[A-Za-z]+(?![A-Za-z-])"

which reads as follows:

(?<![A-Za-z-])        // before the match there must not be a letter (A-Z, a-z) or a -
[A-Za-z]+             // the match itself starts with one or more letters
-                     // followed by a -
[A-Za-z]+             // followed by one or more letters
(?![A-Za-z-])         // and must not be followed by another letter or a -

An example:

Pattern pattern = Pattern.compile("(?<![A-Za-z-])[A-Za-z]+-[A-Za-z]+(?![A-Za-z-])");
Matcher matcher = pattern.matcher("It is home-made.");
if (matcher.find()) {
    System.out.println(matcher.group());    // => home-made
}
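The same lookaround expression can also be used to collect every word-hyphen-word in a longer text. The following is a sketch only; the class name HyphenatedWords, the helper findAll, and the sample sentence are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HyphenatedWords {
    // Lookaround expression from above: the match may not be preceded or
    // followed by a letter or a hyphen.
    static final Pattern HYPHENATED =
            Pattern.compile("(?<![A-Za-z-])[A-Za-z]+-[A-Za-z]+(?![A-Za-z-])");

    // Collects every match in the order find() reports them.
    public static List<String> findAll(String text) {
        List<String> words = new ArrayList<>();
        Matcher matcher = HYPHENATED.matcher(text);
        while (matcher.find()) {
            words.add(matcher.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // "aaa-aaa-aaa" is skipped because each candidate pair touches another hyphen.
        System.out.println(findAll("It is home-made, not store-bought or aaa-aaa-aaa."));
        // => [home-made, store-bought]
    }
}
```

Because lookarounds do not consume characters, punctuation next to a word is neither required nor included in the result, unlike with the whitespace-delimited expression.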