Jmi*_*ini 76 java regex title-case camelcasing
I found a brilliant regex for extracting the parts of a camelCase or TitleCase expression.
(?<!^)(?=[A-Z])
It works as expected.
For example, in Java:
String s = "loremIpsum";
String[] words = s.split("(?<!^)(?=[A-Z])");
// words now equals new String[]{"lorem", "Ipsum"}
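My problem is that it does not work in some cases:
In my opinion, the result should be:
In other words, given n uppercase characters:
Any ideas on how to improve this regex?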
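To make the problem concrete, here is a minimal Java sketch of how the pattern above behaves on runs of uppercase letters. The inputs `VALUE` and `eclipseRCPExt` are the samples discussed in the answers below; the class name `OriginalSplitDemo` and the helper `split` are illustrative names only.

```java
import java.util.Arrays;

class OriginalSplitDemo {
    // The pattern from the question: split before every uppercase letter,
    // except at the very start of the string.
    static final String ORIGINAL = "(?<!^)(?=[A-Z])";

    static String[] split(String s) {
        return s.split(ORIGINAL);
    }

    public static void main(String[] args) {
        // A single capital per word works as intended.
        System.out.println(Arrays.toString(split("loremIpsum")));    // [lorem, Ipsum]
        // Every capital starts a new piece, so runs of capitals are broken up.
        System.out.println(Arrays.toString(split("VALUE")));         // [V, A, L, U, E]
        System.out.println(Arrays.toString(split("eclipseRCPExt"))); // [eclipse, R, C, P, Ext]
    }
}
```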
NPE*_*NPE 106
The following regex works for all of the above examples:
public static void main(String[] args)
{
    for (String w : "camelValue".split("(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])")) {
        System.out.println(w);
    }
}
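It works by making the negative lookbehind ignore not only matches at the start of the string, but also matches where an uppercase letter is preceded by another uppercase letter. This handles cases like "VALUE".
On its own, the first part of the regex fails on "eclipseRCPExt" because it does not split between "RCP" and "Ext". That is the purpose of the second clause: (?<!^)(?=[A-Z][a-z]). This clause allows a split before every uppercase letter that is followed by a lowercase letter, except at the start of the string.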
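The division of labour between the two clauses can be seen by running each one separately. The following is a small sketch, not part of the original answer; the class name `TwoClauseSplitDemo` and the constant names are made up for illustration.

```java
import java.util.Arrays;

class TwoClauseSplitDemo {
    // Clause 1: before an uppercase letter that is neither at the start
    // of the string nor preceded by another uppercase letter.
    static final String AFTER_NON_UPPER = "(?<!(^|[A-Z]))(?=[A-Z])";
    // Clause 2: before an uppercase letter that is followed by a lowercase
    // letter, except at the start of the string.
    static final String BEFORE_UPPER_LOWER = "(?<!^)(?=[A-Z][a-z])";
    // The full pattern from the answer is the alternation of both clauses.
    static final String COMBINED = AFTER_NON_UPPER + "|" + BEFORE_UPPER_LOWER;

    static String[] splitWith(String regex, String s) {
        return s.split(regex);
    }

    public static void main(String[] args) {
        String s = "eclipseRCPExt";
        System.out.println(Arrays.toString(splitWith(AFTER_NON_UPPER, s)));    // [eclipse, RCPExt]
        System.out.println(Arrays.toString(splitWith(BEFORE_UPPER_LOWER, s))); // [eclipseRCP, Ext]
        System.out.println(Arrays.toString(splitWith(COMBINED, s)));           // [eclipse, RCP, Ext]
        System.out.println(Arrays.toString(splitWith(COMBINED, "VALUE")));     // [VALUE]
    }
}
```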
rid*_*ner 71
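It looks like you are making this more complicated than it needs to be. For camelCase, the split location is simply anywhere an uppercase letter immediately follows a lowercase letter: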
(?<=[a-z])(?=[A-Z])
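Here is how this regex splits your example data:
value -> value
camelValue -> camel / Value
TitleValue -> Title / Value
VALUE -> VALUE
eclipseRCPExt -> eclipse / RCPExt
The only difference from your desired output is eclipseRCPExt, which I would argue is split correctly here.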
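Note: this answer recently got an upvote, and I realized there is a better way...
By adding a second alternative to the regex above, all of the OP's test cases are split correctly.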
(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])
Here is how the improved regex splits the example data:
value -> value
camelValue -> camel / Value
TitleValue -> Title / Value
VALUE -> VALUE
eclipseRCPExt -> eclipse / RCP / Ext
Edit 2013-08-24: added the improved version to handle the RCPExt -> RCP / Ext case.
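The results listed above can be reproduced with `String.split` in Java. This is a minimal sketch; the class name `LookaroundSplitDemo` and the helper `describe` are illustrative, and only the two patterns come from this answer.

```java
class LookaroundSplitDemo {
    // First version: split only between a lowercase and an uppercase letter.
    static final String SIMPLE = "(?<=[a-z])(?=[A-Z])";
    // Improved version: also split between an uppercase letter and an
    // uppercase letter that starts a new capitalized word.
    static final String IMPROVED = "(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])";

    // Formats a result in the "input -> part / part" style used in this answer.
    static String describe(String regex, String s) {
        return s + " -> " + String.join(" / ", s.split(regex));
    }

    public static void main(String[] args) {
        String[] samples = {"value", "camelValue", "TitleValue", "VALUE", "eclipseRCPExt"};
        for (String s : samples) {
            System.out.println(describe(IMPROVED, s));
        }
    }
}
```

Both lookarounds are zero-width, so no characters are consumed and nothing is lost from the pieces.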
dea*_*dog 10
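I couldn't get aix's solution to work (and it doesn't work on RegExr either), so I came up with my own, which I have tested and which seems to do exactly what you are looking for: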
((^[a-z]+)|([A-Z]{1}[a-z]+)|([A-Z]+(?=([A-Z][a-z])|($))))
Here is an example of using it:
; Regex Breakdown: This will match against each word in Camel and Pascal case strings, while properly handling acronyms.
; (^[a-z]+) Match against any lower-case letters at the start of the string.
; ([A-Z]{1}[a-z]+) Match against Title case words (one upper case followed by lower case letters).
; ([A-Z]+(?=([A-Z][a-z])|($))) Match against multiple consecutive upper-case letters, leaving the last upper case letter out the match if it is followed by lower case letters, and including it if it's followed by the end of the string.
newString := RegExReplace(oldCamelOrPascalString, "((^[a-z]+)|([A-Z]{1}[a-z]+)|([A-Z]+(?=([A-Z][a-z])|($))))", "$1 ")
newString := Trim(newString)
Here I separate each word with a space, so here are some examples of how strings are converted:
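For readers working in Java rather than with `RegExReplace`, the same match-based idea can be sketched with `Matcher.find()`: instead of splitting at boundaries, each word is matched and collected. This is an adaptation rather than the author's code; the class name `WordMatchDemo` and its methods are hypothetical, and only the pattern is taken from above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordMatchDemo {
    // Same pattern as above: leading lowercase run, Title-case word,
    // or a run of capitals that stops before the next Title-case word.
    static final Pattern WORD = Pattern.compile(
        "((^[a-z]+)|([A-Z]{1}[a-z]+)|([A-Z]+(?=([A-Z][a-z])|($))))");

    // Collects every match of the pattern, in order.
    static List<String> words(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = WORD.matcher(s);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    // Joins the matched words with single spaces.
    static String spaced(String s) {
        return String.join(" ", words(s));
    }

    public static void main(String[] args) {
        System.out.println(spaced("camelValue"));    // camel Value
        System.out.println(spaced("VALUE"));         // VALUE
        System.out.println(spaced("eclipseRCPExt")); // eclipse RCP Ext
    }
}
```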
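The solution above does what the original post asks for, but I also needed a regex for camel and pascal case strings that contain numbers, so I came up with this variation to include them: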
((^[a-z]+)|([0-9]+)|([A-Z]{1}[a-z]+)|([A-Z]+(?=([A-Z][a-z])|($)|([0-9]))))
And an example of using it:
; Regex Breakdown: This will match against each word in Camel and Pascal case strings, while properly handling acronyms and including numbers.
; (^[a-z]+) Match against any lower-case letters at the start of the string.
; ([0-9]+) Match against one or more consecutive numbers (anywhere in the string, including at the start).
; ([A-Z]{1}[a-z]+) Match against Title case words (one upper case followed by lower case letters).
; ([A-Z]+(?=([A-Z][a-z])|($)|([0-9]))) Match against multiple consecutive upper-case letters, leaving the last upper case letter out the match if it is followed by lower case letters, and including it if it's followed by the end of the string or a number.
newString := RegExReplace(oldCamelOrPascalString, "((^[a-z]+)|([0-9]+)|([A-Z]{1}[a-z]+)|([A-Z]+(?=([A-Z][a-z])|($)|([0-9]))))", "$1 ")
newString := Trim(newString)
Here are some examples of how strings with numbers are converted using this regex:
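The replace-and-trim step translates fairly directly to Java's `String.replaceAll`. The sketch below is an adaptation under that assumption; the class name `NumberAwareSpacingDemo` and the sample inputs `HTML5Parser` and `version2Update` are made up for illustration and do not come from the original post.

```java
class NumberAwareSpacingDemo {
    // The number-aware pattern from above; group 1 wraps the whole match.
    static final String WORD_OR_NUMBER =
        "((^[a-z]+)|([0-9]+)|([A-Z]{1}[a-z]+)|([A-Z]+(?=([A-Z][a-z])|($)|([0-9]))))";

    // Appends a space after every matched word or number, then trims the
    // trailing space left after the last match.
    static String spaced(String s) {
        return s.replaceAll(WORD_OR_NUMBER, "$1 ").trim();
    }

    public static void main(String[] args) {
        System.out.println(spaced("HTML5Parser"));    // HTML 5 Parser
        System.out.println(spaced("version2Update")); // version 2 Update
        System.out.println(spaced("eclipseRCPExt"));  // eclipse RCP Ext
    }
}
```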
A variant that uses Unicode letter categories instead of A-Z:
s.split("(?<=\\p{Ll})(?=\\p{Lu})|(?<=\\p{L})(?=\\p{Lu}\\p{Ll})");
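The pattern splits at either of two positions.
Either between a lowercase letter and an uppercase letter, e.g. parseXML -> parse, XML.
Or after any letter and before an uppercase letter that is followed by a lowercase letter, e.g. XMLParser -> XML, Parser.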
import static java.util.stream.Collectors.joining;
import static org.junit.Assert.assertEquals;

import java.util.regex.Pattern;
import org.junit.Test;

public class SplitCamelCaseTest {
    // Boundary between a lowercase letter and an uppercase letter.
    static String BETWEEN_LOWER_AND_UPPER = "(?<=\\p{Ll})(?=\\p{Lu})";
    // Boundary after any letter and before an uppercase letter followed by a lowercase letter.
    static String BEFORE_UPPER_AND_LOWER = "(?<=\\p{L})(?=\\p{Lu}\\p{Ll})";
    static Pattern SPLIT_CAMEL_CASE = Pattern.compile(
        BETWEEN_LOWER_AND_UPPER + "|" + BEFORE_UPPER_AND_LOWER
    );

    public static String splitCamelCase(String s) {
        return SPLIT_CAMEL_CASE.splitAsStream(s)
            .collect(joining(" "));
    }

    @Test
    public void testSplitCamelCase() {
        assertEquals("Camel Case", splitCamelCase("CamelCase"));
        assertEquals("lorem Ipsum", splitCamelCase("loremIpsum"));
        assertEquals("XML Parser", splitCamelCase("XMLParser"));
        assertEquals("eclipse RCP Ext", splitCamelCase("eclipseRCPExt"));
        assertEquals("VALUE", splitCamelCase("VALUE"));
    }
}
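The practical difference from the `[a-z]`/`[A-Z]` patterns only shows up with non-ASCII letters. The sketch below compares the two on the made-up identifier größeÄnderung, written with Unicode escapes so the source file stays ASCII; the class name `UnicodeSplitDemo` and the sample string are assumptions for illustration.

```java
import java.util.Arrays;

class UnicodeSplitDemo {
    // ASCII-only pattern from an earlier answer.
    static final String ASCII_ONLY = "(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])";
    // Unicode-category pattern from this answer.
    static final String UNICODE = "(?<=\\p{Ll})(?=\\p{Lu})|(?<=\\p{L})(?=\\p{Lu}\\p{Ll})";

    static String[] split(String regex, String s) {
        return s.split(regex);
    }

    public static void main(String[] args) {
        // A lowercase word followed by a word starting with A-umlaut.
        String s = "gr\u00f6\u00dfe\u00c4nderung";
        // The ASCII pattern sees no [A-Z] letter, so nothing is split.
        System.out.println(split(ASCII_ONLY, s).length); // 1
        // The Unicode pattern treats A-umlaut as an uppercase letter.
        System.out.println(split(UNICODE, s).length);    // 2
        System.out.println(Arrays.toString(split(UNICODE, "XMLParser"))); // [XML, Parser]
    }
}
```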