Nee*_*raj 18 java string str-replace
I am cleaning up some email addresses produced by Tesseract OCR.
Here is my code:
if (email != null) {
email = email.replaceAll(" ", "");
email = email.replaceAll("caneer", "career");
email = email.replaceAll("canaer", "career");
email = email.replaceAll("canear", "career");
email = email.replaceAll("caraer", "career");
email = email.replaceAll("carear", "career");
email = email.replace("|", "l");
email = email.replaceAll("}", "j");
email = email.replaceAll("j3b", "job");
email = email.replaceAll("gmaii.com", "gmail.com");
email = email.replaceAll("hotmaii.com", "hotmail.com");
email = email.replaceAll(".c0m", ".com");
email = email.replaceAll(".coin", ".com");
email = email.replaceAll("consuit", "consult");
}
return email;
But the output is incorrect.
Input:
amrut=ac.hrworks@g mai|.com
Output:
lalcl.lhlrlwlolrlklsl@lglmlalil|l.lclolml
But when I assign the result to a new String after each replacement, it works fine. Why doesn't repeated assignment to the same String work?
Bri*_*ach 38
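You will notice in the Javadoc for String.replaceAll() that the first argument is a regular expression.
A period (.) has a special meaning in a regex, as do the pipe (|) and the curly brace (}). You need to escape all of them, for example: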
email = email.replaceAll("gmaii\\.com", "gmail.com");
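To make the effect concrete, here is a minimal sketch (the class name RegexPitfalls and the sample strings are made up for illustration). An unescaped | is an alternation of two empty patterns, so it matches the empty string at every position, and an unescaped . matches any character:

```java
public class RegexPitfalls {
    // Unescaped pipe: alternation of two empty patterns, so it matches
    // the empty string before every character and at the end.
    static String unescapedPipe(String s) {
        return s.replaceAll("|", "l");
    }

    // Escaped pipe: matches only a literal '|'.
    static String escapedPipe(String s) {
        return s.replaceAll("\\|", "l");
    }

    // Unescaped dot: ".c0m" matches ANY character followed by "c0m".
    static String unescapedDot(String s) {
        return s.replaceAll(".c0m", ".com");
    }

    public static void main(String[] args) {
        System.out.println(unescapedPipe("ab"));   // prints "lalbl"
        System.out.println(escapedPipe("mai|"));   // prints "mail"
        System.out.println(unescapedDot("xc0m"));  // prints ".com" (the 'x' is swallowed)
    }
}
```

The first result has the same shape as the output in the question, where an l appears around every character.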
小智 10
(Is this Java?)
Note that in Java, replaceAll accepts a regular expression, and a dot matches any character. You need to escape the dot or use
somestring.replaceAll(Pattern.quote("gmail.com"), "replacement");
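As a rough sketch of the literal-matching options (the class name LiteralReplace and the sample inputs are assumptions for illustration): Pattern.quote makes the regex engine treat the text literally, and String.replace(CharSequence, CharSequence) replaces every occurrence without using a regex at all.

```java
import java.util.regex.Pattern;

public class LiteralReplace {
    // Regex-based call, but the pattern is quoted so '.' is a literal dot.
    static String withQuote(String s) {
        return s.replaceAll(Pattern.quote("gmaii.com"), "gmail.com");
    }

    // Plain literal replacement of all occurrences; no regex involved.
    static String withReplace(String s) {
        return s.replace("gmaii.com", "gmail.com");
    }

    public static void main(String[] args) {
        System.out.println(withQuote("user@gmaii.com"));   // prints "user@gmail.com"
        System.out.println(withReplace("user@gmaii.com")); // prints "user@gmail.com"
        System.out.println(withQuote("gmaiixcom"));        // prints "gmaiixcom" (no literal dot, no match)
    }
}
```

When no pattern matching is needed, String.replace is usually the simpler of the two.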
Also note the typo here:
email = emai.replaceAll("canear", "career");
should be
email = email.replaceAll("canear", "career");
You have to escape . as \\. like the following:
if (email != null) {
email = email.replaceAll(" ", "");
email = email.replaceAll("caneer", "career");
email = email.replaceAll("canaer", "career");
email = email.replaceAll("canear", "career");
email = email.replaceAll("caraer", "career");
email = email.replaceAll("carear", "career");
email = email.replace("|", "l");
email = email.replaceAll("}", "j");
email = email.replaceAll("j3b", "job");
email = email.replaceAll("gmaii\\.com", "gmail.com");
email = email.replaceAll("hotmaii\\.com", "hotmail.com");
email = email.replaceAll("\\.c0m", ".com");
email = email.replaceAll("\\.coin", ".com");
email = email.replaceAll("consuit", "consult");
}
return email;
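Once you realize that the first argument of replaceAll() is a regex, you can reduce the number of comparisons.
For example, you can catch the possible misspellings of the word career with a single regex:
email = email.replaceAll("ca[nr][ea][ea]r", "career");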
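A small sketch of the character-class approach (the class name CareerFix is hypothetical). Inside square brackets | is a literal character rather than alternation, so a class such as [nr] is enough to mean "n or r". Note that the pattern is slightly broader than the original list of five misspellings: it also matches career itself and variants like canaar.

```java
public class CareerFix {
    // One pattern covering caneer, canaer, canear, caraer, carear.
    static String fixCareer(String s) {
        return s.replaceAll("ca[nr][ea][ea]r", "career");
    }

    public static void main(String[] args) {
        System.out.println(fixCareer("caneer@x.com")); // prints "career@x.com"
        System.out.println(fixCareer("carear@x.com")); // prints "career@x.com"
        // With '|' inside the brackets, a literal pipe is accepted as well:
        System.out.println("ca|eer".matches("ca[n|r][e|a][e|a]r")); // prints "true"
        System.out.println("ca|eer".matches("ca[nr][ea][ea]r"));    // prints "false"
    }
}
```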
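I guess you did not know that the first argument of replaceAll is a regular expression.
., | and } may be interpreted differently from what you expect.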
. Any character (may or may not match line terminators)
http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
For whitespace, you would be better off using
\s A whitespace character: [ \t\n\x0B\f\r]
and escape the other special characters with a leading \\
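Putting these points together, a minimal sketch might look like the following. The class name EmailCleaner and the method clean are made up for illustration, and only a few of the replacements from the question are shown. Single-character fixes use the literal String.replace, so no escaping is needed there.

```java
public class EmailCleaner {
    static String clean(String email) {
        if (email == null) {
            return null;
        }
        // \s removes spaces, tabs and line breaks, not just ' '.
        email = email.replaceAll("\\s", "");
        // Literal replacements: String.replace does not interpret regex syntax.
        email = email.replace("|", "l");
        email = email.replace("}", "j");
        // Regex replacement with the dot escaped so it matches only a real '.'.
        email = email.replaceAll("\\.c0m", ".com");
        return email;
    }

    public static void main(String[] args) {
        // Input taken from the question.
        System.out.println(clean("amrut=ac.hrworks@g mai|.com")); // prints "amrut=ac.hrworks@gmail.com"
        System.out.println(clean("a\tb\n@x.c0m"));                // prints "ab@x.com"
    }
}
```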