How to capture all words starting with an uppercase letter?

Bra*_*rad 4 java regex

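I want to create a Java regex that matches every word that starts with an uppercase letter and is followed by uppercase or lowercase letters, where any of these letters may carry accents.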

Examples:

Where

Àdónde

RAPIDO

阿斯特

Can you help me?

Tim*_*ker 8

Regex:

\b\p{Lu}\p{L}*\b

Java string:

"(?U)\\b\\p{Lu}\\p{L}*\\b"

Explanation:

\b      # Match at a word boundary (start of word)
\p{Lu}  # Match an uppercase letter
\p{L}*  # Match any number of letters (any case)
\b      # Match at a word boundary (end of word)
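As a minimal sketch of how the pattern above might be used, the following collects every match from a string. The class name `UppercaseWords`, the method `find`, and the sample sentence are made up for illustration; the accented letters are written as Unicode escapes so that the result does not depend on the source file encoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UppercaseWords {
    // (?U) enables UNICODE_CHARACTER_CLASS (Java 7+), so \b treats
    // accented letters as word characters.
    private static final Pattern WORD =
            Pattern.compile("(?U)\\b\\p{Lu}\\p{L}*\\b");

    // Hypothetical helper: returns all capitalized words found in text.
    public static List<String> find(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // \u00C0 is an uppercase A with grave, \u00F3 is a lowercase o with acute
        String sample = "Where is \u00C0d\u00F3nde, RAPIDO or lower?";
        System.out.println(find(sample));
    }
}
```

With this sample, `find` should return the three words that begin with an uppercase letter and skip `is`, `or`, and `lower`.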

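Caveat: this only works correctly in recent Java versions (JDK 7); in others, you may need to substitute a longer sub-regex for \b. As you can see here, you might need to use (kudos to @tchrist)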

(?:(?<=[\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])(?![\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])|(?<![\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])(?=[\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]]))

in place of \b, so the Java string would look like this:

"(?:(?<=[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}\\[\\p{InEnclosedAlphanumerics}&&\\p{So}]\\])(?![\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}\\[\\p{InEnclosedAlphanumerics}&&\\p{So}]\\])|(?<![\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}\\[\\p{InEnclosedAlphanumerics}&&\\p{So}]\\])(?=[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}\\[\\p{InEnclosedAlphanumerics}&&\\p{So}]\\]))\\p{Lu}\\p{L}*(?:(?<=[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}\\[\\p{InEnclosedAlphanumerics}&&\\p{So}]\\])(?![\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}\\[\\p{InEnclosedAlphanumerics}&&\\p{So}]\\])|(?<![\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}\\[\\p{InEnclosedAlphanumerics}&&\\p{So}]\\])(?=[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}\\[\\p{InEnclosedAlphanumerics}&&\\p{So}]\\]))"
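Because the long string repeats the same character class eight times, one way to keep it readable is to assemble it from named pieces. The sketch below follows the regex form given above for \b, with the nested character class left unescaped. The names `LegacyBoundary`, `WORD_CHAR`, `BOUNDARY`, and `find` are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LegacyBoundary {
    // The set of characters treated as "word" characters in the
    // replacement for \b (letters, marks, digits, connectors, ...).
    static final String WORD_CHAR =
            "[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}"
            + "[\\p{InEnclosedAlphanumerics}&&\\p{So}]]";

    // A boundary is a position with a word character on exactly one side.
    static final String BOUNDARY =
            "(?:(?<=" + WORD_CHAR + ")(?!" + WORD_CHAR + ")"
            + "|(?<!" + WORD_CHAR + ")(?=" + WORD_CHAR + "))";

    // Same shape as \b\p{Lu}\p{L}*\b, with \b spelled out by hand.
    static final Pattern WORD =
            Pattern.compile(BOUNDARY + "\\p{Lu}\\p{L}*" + BOUNDARY);

    // Hypothetical helper: returns all capitalized words found in text.
    public static List<String> find(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }
}
```

Building the pattern this way means a change to the word-character set only has to be made in one place.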