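For some reason, I want to scan the contents of a Java file (for example TagMatchingInterface.java) and extract the class name (TagMatchingInterface) with a regular expression. However, my regex matches the wrong class name, because the keywords (class/interface/enum) also appear inside comments: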
/**
*
* @author XXXX
* Introduction: A common interface that judges all kinds of algorithm tags.
* some other comment
*/
public class TagMatchingInterface
{
// content
public class InnerClazz{
// content
}
}
Here is my pattern:
public Pattern CLASS_PATTERN = Pattern.compile("(?:public\\s)?(?:.*\\s)?(class|interface|enum)\\s+([$_a-zA-Z][$_a-zA-Z0-9]*)");
....
Matcher matcher = CLASS_PATTERN.matcher(content);
if (matcher.find()) {
System.out.println(matcher.group(2));
}
Any idea how to fix my regex?
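To make the failure concrete, here is a minimal sketch that runs the pattern above against the sample file. The class name GreedyPatternDemo and the helper firstName are hypothetical names used only for this illustration. The optional `(?:.*\s)?` part can consume any text on a line, so the first `find()` stops at the word "interface" inside the comment.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyPatternDemo {
    // The pattern from the question, unchanged.
    static final Pattern CLASS_PATTERN = Pattern.compile(
        "(?:public\\s)?(?:.*\\s)?(class|interface|enum)\\s+([$_a-zA-Z][$_a-zA-Z0-9]*)");

    // The sample file from the question, joined into one string.
    static final String SAMPLE = String.join("\n",
        "/**",
        "*",
        "* @author XXXX",
        "* Introduction: A common interface that judges all kinds of algorithm tags.",
        "* some other comment",
        "*/",
        "public class TagMatchingInterface",
        "{",
        "// content",
        "public class InnerClazz{",
        "// content",
        "}",
        "}");

    // Hypothetical helper: returns the name captured by the first match, or null.
    static String firstName(String content) {
        Matcher matcher = CLASS_PATTERN.matcher(content);
        return matcher.find() ? matcher.group(2) : null;
    }

    public static void main(String[] args) {
        // The first keyword in the text is "interface" in the comment,
        // so the captured "name" is the word that follows it.
        System.out.println(firstName(SAMPLE)); // prints "that"
    }
}
```

The captured word is "that", taken from "A common interface that judges ...", not the real class name.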
(?<=\n|\A)(?:public\s)?(class|interface|enum)\s([^\n\s]*)

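This regular expression does the following: it matches at the start of a line, allows the line to begin with public or not, requires class, interface, or enum next, and captures the name that follows.
Note: I recommend using the global and case-insensitive flags.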
Live example
https://regex101.com/r/vR0iK3/1
Sample text
/**
*
* @author XXXX
* Introduction: A common interface that judges all kinds of algorithm tags.
* some other comment
*/
public class TagMatchingInterface
{
// content
public class InnerClazz{
// content
}
}
Sample matches
[0][0] = public class TagMatchingInterface
[0][1] = class
[0][2] = TagMatchingInterface
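
In Java, the same regex could be applied roughly as follows. This is a sketch under a few assumptions: the class name LineAnchoredDemo and the helper declaredNames are hypothetical, the inner class is assumed to be indented as in normally formatted source, and the "global" flag is expressed as a loop over `find()`. The case-insensitive flag is left out here because Java keywords are always lowercase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineAnchoredDemo {
    // The regex from the answer, with backslashes doubled for a Java string literal.
    static final Pattern DECLARATION = Pattern.compile(
        "(?<=\\n|\\A)(?:public\\s)?(class|interface|enum)\\s([^\\n\\s]*)");

    // Sample text; the indentation of the inner class is an assumption.
    static final String SAMPLE = String.join("\n",
        "/**",
        " *",
        " * @author XXXX",
        " * Introduction: A common interface that judges all kinds of algorithm tags.",
        " * some other comment",
        " */",
        "public class TagMatchingInterface",
        "{",
        "    // content",
        "    public class InnerClazz{",
        "        // content",
        "    }",
        "}");

    // Hypothetical helper: collects group 2 of every match (the "global" behaviour).
    static List<String> declaredNames(String content) {
        List<String> names = new ArrayList<>();
        Matcher matcher = DECLARATION.matcher(content);
        while (matcher.find()) {
            names.add(matcher.group(2));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(declaredNames(SAMPLE)); // prints [TagMatchingInterface]
    }
}
```

Because the look-behind only accepts a newline or the start of the input immediately before the match, indented declarations and keywords in the middle of a comment line are skipped. A declaration that starts in column one is matched, and since group 2 is `[^\n\s]*`, a line such as `public class InnerClazz{` would be captured as `InnerClazz{`, brace included.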
Explanation of the pattern and its capture groups:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?<=                     look behind to see if there is:
----------------------------------------------------------------------
\n                       '\n' (newline)
----------------------------------------------------------------------
|                        OR
----------------------------------------------------------------------
\A                       Start of the string
----------------------------------------------------------------------
)                        end of look-behind
----------------------------------------------------------------------
(?:                      group, but do not capture (optional
                         (matching the most amount possible)):
----------------------------------------------------------------------
public                   'public'
----------------------------------------------------------------------
\s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
)?                       end of grouping
----------------------------------------------------------------------
(                        group and capture to \1:
----------------------------------------------------------------------
class                    'class'
----------------------------------------------------------------------
|                        OR
----------------------------------------------------------------------
interface                'interface'
----------------------------------------------------------------------
|                        OR
----------------------------------------------------------------------
enum                     'enum'
----------------------------------------------------------------------
)                        end of \1
----------------------------------------------------------------------
\s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
(                        group and capture to \2:
----------------------------------------------------------------------
[^\n\s]*                 any character except: '\n' (newline),
                         whitespace (\n, \r, \t, \f, and " ") (0
                         or more times (matching the most amount
                         possible))
----------------------------------------------------------------------
)                        end of \2
----------------------------------------------------------------------