How do I escape '+' in a regex pattern to highlight keywords?

Gna*_*nam 0 java regex escaping

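I'm implementing keyword highlighting in Java. I use java.util.regex.Pattern to highlight (make bold) keywords within String content. The following code works fine for alphanumeric keywords, but not for keywords that contain certain special characters. For example, I want to highlight the keyword c++, which contains the special character + (plus sign), but it isn't highlighted correctly. How do I escape the + character so that c++ gets highlighted?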

public static void main(String[] args)
{
    String content = "java,c++,ejb,struts,j2ee,hibernate";
    System.out.println("CONTENT: " + content);
    String highlight = "C++";

    System.out.println("HIGHLIGHT KEYWORD: " + highlight);

    //highlight = highlight.replaceAll(Pattern.quote("+"), "\\\\+");
    java.util.regex.Pattern pattern = java.util.regex.Pattern.compile("\\b" + highlight + "\\b", java.util.regex.Pattern.CASE_INSENSITIVE);
    System.out.println("PATTERN: " + pattern.pattern());
    java.util.regex.Matcher matcher = pattern.matcher(content);

    while (matcher.find()) {
        System.out.println("Match found!!!");
        for (int i = 0; i <= matcher.groupCount(); i++) {
            System.out.println(matcher.group(i));
            content = matcher.replaceAll("<B>" + matcher.group(i) + "</B>");
        }
    }
    System.out.println("RESULT: " + content);
}

Output:
CONTENT: java,c++,ejb,struts,j2ee,hibernate
HIGHLIGHT KEYWORD: C++
PATTERN: \bC++\b
Match found!!!
c
RESULT: java,c++,ejb,struts,j2ee,hibernate


I even tried escaping '+' before calling Pattern.compile,

highlight = highlight.replaceAll(Pattern.quote("+"), "\\\\+");

but I still can't get the syntax right. Can someone help me solve this?

Sea*_*oyd 6

This should do what you need:

Pattern pattern = Pattern.compile(
    "\\b" 
    + Pattern.quote(highlight)
    + "\\b",
    Pattern.CASE_INSENSITIVE);
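For context, Pattern.quote wraps its argument in \Q...\E so that every character inside is matched literally, which is what keeps + from being read as a quantifier. A minimal sketch of that behavior, without any boundary checks (the class name QuoteDemo and the helper firstLiteralMatch are made up for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteDemo {
    // Hypothetical helper: returns the first literal, case-insensitive
    // occurrence of keyword in content, or null if there is none.
    static String firstLiteralMatch(String content, String keyword) {
        Matcher m = Pattern.compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE)
                           .matcher(content);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(Pattern.quote("c++"));                      // prints \Qc++\E
        System.out.println(firstLiteralMatch("java,C++,ejb", "c++")); // prints C++
    }
}
```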

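Update: You're right, the above doesn't work for C++ (\b matches a word boundary, and + is not a word character, so there is no boundary after ++). We need a more complex solution: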

Pattern pattern = Pattern.compile(
    "\\b" 
    + Pattern.quote(highlight)
    + "(?![^\\p{Punct}\\s])", // matches if the match is not followed by
                              // anything other than whitespace or punctuation
    Pattern.CASE_INSENSITIVE);
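To see the difference between the two trailing conditions, here is a small comparison sketch. The class name BoundaryDemo and both helper methods are hypothetical; the patterns themselves are the ones shown above.

```java
import java.util.regex.Pattern;

public class BoundaryDemo {
    // Trailing \b: requires a word/non-word transition right after the keyword.
    static boolean foundWithWordBoundary(String content, String keyword) {
        return Pattern.compile("\\b" + Pattern.quote(keyword) + "\\b",
                Pattern.CASE_INSENSITIVE).matcher(content).find();
    }

    // Trailing lookahead: the next character must be punctuation,
    // whitespace, or the end of the input.
    static boolean foundWithLookahead(String content, String keyword) {
        return Pattern.compile("\\b" + Pattern.quote(keyword) + "(?![^\\p{Punct}\\s])",
                Pattern.CASE_INSENSITIVE).matcher(content).find();
    }

    public static void main(String[] args) {
        String content = "java,c++,ejb";
        System.out.println(foundWithWordBoundary(content, "c++")); // prints false
        System.out.println(foundWithLookahead(content, "c++"));    // prints true
        System.out.println(foundWithLookahead("c++x", "c++"));     // prints false
    }
}
```

The \b version fails because both + and the following comma are non-word characters, so there is no boundary between them.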

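Update in response to comments: It seems you need more logic in your pattern creation. Here's a helper method that creates the pattern for you: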

private static final String WORD_BOUNDARY = "\\b";
// edit this to suit your needs:
private static final String ALLOWED = "[^,.!\\-\\s]";
private static final String LOOKAHEAD = "(?!" + ALLOWED + ")";
private static final String LOOKBEHIND = "(?<!" + ALLOWED + ")";

public static Pattern createHighlightPattern(final String highlight) {
    final Pattern pattern = Pattern.compile(
            (Character.isLetterOrDigit(highlight.charAt(0)) 
             ? WORD_BOUNDARY : LOOKBEHIND)
            + Pattern.quote(highlight)
            + (Character.isLetterOrDigit(highlight.charAt(highlight.length() - 1))
             ? WORD_BOUNDARY : LOOKAHEAD),
            Pattern.CASE_INSENSITIVE);
    return pattern;
}

Here's some test code to check that it works:

private static void testMatch(final String haystack, final String needle) {
    final Matcher matcher = createHighlightPattern(needle).matcher(haystack);
    if (!matcher.find())
        System.out.println("Failed to find pattern " + needle);
    while (matcher.find())
        System.out.println("Found additional match: " + matcher.group() +
                           " for pattern " + needle);
}

public static void main(final String[] args) {
    final String testString = "java,c++,hibernate,.net,asp.net,c#,spring";
    testMatch(testString, "java");
    testMatch(testString, "c++");
    testMatch(testString, ".net");
    testMatch(testString, "c#");
}

When I run this method, I don't see any output (which is good :-))
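To tie this back to the original goal of making keywords bold, here is one possible way to use the helper for the replacement itself. This is only a sketch: the class name KeywordHighlighter and the highlight method are assumptions, not part of the code above, and the replacement string "<B>$0</B>" relies on $0 referring to the whole match so that the original casing in the content is preserved.

```java
import java.util.regex.Pattern;

public class KeywordHighlighter {
    private static final String WORD_BOUNDARY = "\\b";
    // Characters that are NOT treated as delimiters; edit to suit your needs.
    private static final String ALLOWED = "[^,.!\\-\\s]";
    private static final String LOOKAHEAD = "(?!" + ALLOWED + ")";
    private static final String LOOKBEHIND = "(?<!" + ALLOWED + ")";

    // Same logic as createHighlightPattern above; assumes a non-empty keyword.
    static Pattern createHighlightPattern(String highlight) {
        char first = highlight.charAt(0);
        char last = highlight.charAt(highlight.length() - 1);
        return Pattern.compile(
                (Character.isLetterOrDigit(first) ? WORD_BOUNDARY : LOOKBEHIND)
                + Pattern.quote(highlight)
                + (Character.isLetterOrDigit(last) ? WORD_BOUNDARY : LOOKAHEAD),
                Pattern.CASE_INSENSITIVE);
    }

    // Hypothetical helper: wraps every match of keyword in <B>...</B>.
    static String highlight(String content, String keyword) {
        return createHighlightPattern(keyword).matcher(content).replaceAll("<B>$0</B>");
    }

    public static void main(String[] args) {
        String content = "java,c++,ejb,struts,j2ee,hibernate";
        System.out.println(highlight(content, "C++"));
        // prints java,<B>c++</B>,ejb,struts,j2ee,hibernate
    }
}
```

Calling replaceAll once on the matcher avoids the find() loop from the question, which reassigned content while the matcher was still iterating over the old string.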