Say I have two strings,
String s1 = "AbBaCca";
String s2 = "bac";
I want to perform a check that returns whether s2 is contained within s1. I can do this with:
return s1.contains(s2);
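I am pretty sure that contains() is case sensitive, but I can't determine this for sure from reading the documentation. If it is, then I suppose my best method would be something like: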
return s1.toLowerCase().contains(s2.toLowerCase());
All this aside, is there another (possibly better) way to accomplish this without caring about case sensitivity?
Dav*_* L. 310
Yes, contains is case sensitive. You can use java.util.regex.Pattern with the CASE_INSENSITIVE flag for case-insensitive matching:
Pattern.compile(Pattern.quote(wantedStr), Pattern.CASE_INSENSITIVE).matcher(source).find();
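EDIT: If s2 contains regexp special characters (of which there are many), it's important to quote it first. I've corrected my answer since it is the first one people will see, but vote up Matt Quail's since he pointed this out.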
Mat*_*ail 256
A problem with the answer by Dave L. arises when s2 contains regex markup such as \d.
You want to call Pattern.quote() on s2:
Pattern.compile(Pattern.quote(s2), Pattern.CASE_INSENSITIVE).matcher(s1).find();
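To make the quoting point concrete, here is a minimal sketch comparing the two variants. The class and method names (QuoteDemo, containsUnquoted, containsQuoted) are made up for illustration; only Pattern.compile(), Pattern.quote() and Pattern.CASE_INSENSITIVE come from the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class QuoteDemo {
    // Treats `needle` as a regular expression (the pitfall described above).
    static boolean containsUnquoted(String haystack, String needle) {
        return Pattern.compile(needle, Pattern.CASE_INSENSITIVE)
                .matcher(haystack).find();
    }

    // Treats `needle` as literal text by quoting it first.
    static boolean containsQuoted(String haystack, String needle) {
        return Pattern.compile(Pattern.quote(needle), Pattern.CASE_INSENSITIVE)
                .matcher(haystack).find();
    }
}
```

With s1 = "AbBaCca", the unquoted variant reports a hit for "b.c" because the dot matches any character, while the quoted variant only reports a hit when the literal text is present. Also note that CASE_INSENSITIVE on its own only folds US-ASCII letters; if non-ASCII text matters, combining it with Pattern.UNICODE_CASE is probably what you want.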
muh*_*dto 149
You can use
org.apache.commons.lang3.StringUtils.containsIgnoreCase("AbBaCca", "bac");
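The Apache Commons library is very useful for this sort of thing. And this particular method may be better than regular expressions, as regex tends to be comparatively expensive in terms of performance.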
icz*_*cza 117
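A faster implementation: utilizing String.regionMatches()
Using regexp can be relatively slow. That doesn't matter if you just want to check a single case. But if you have an array or a collection of thousands or hundreds of thousands of strings, things can get pretty slow.
The solution presented below uses neither regular expressions nor toLowerCase() (which is also slow because it creates new strings and throws them away after the check).
The solution builds on the String.regionMatches() method, which seems to be little known. It checks whether 2 String regions match, but what's important is that it also has an overload with a handy ignoreCase parameter.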
public static boolean containsIgnoreCase(String src, String what) {
    final int length = what.length();
    if (length == 0)
        return true; // Empty string is contained

    final char firstLo = Character.toLowerCase(what.charAt(0));
    final char firstUp = Character.toUpperCase(what.charAt(0));

    for (int i = src.length() - length; i >= 0; i--) {
        // Quick check before calling the more expensive regionMatches() method:
        final char ch = src.charAt(i);
        if (ch != firstLo && ch != firstUp)
            continue;

        if (src.regionMatches(true, i, what, 0, length))
            return true;
    }

    return false;
}
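For readers who haven't met the ignoreCase overload, here is a tiny sketch of what a single regionMatches() call does at one fixed offset. RegionMatchesDemo and matchesAt are hypothetical names used only for this illustration.

```java
// Hypothetical demo class; only String.regionMatches() is a real JDK method.
final class RegionMatchesDemo {
    // True if `what` occurs in `src` starting exactly at index `offset`,
    // ignoring case. Out-of-range offsets simply yield false.
    static boolean matchesAt(String src, int offset, String what) {
        // Signature: regionMatches(ignoreCase, toffset, other, ooffset, len)
        return src.regionMatches(true, offset, what, 0, what.length());
    }
}
```

For "AbBaCca" and "bac", the call succeeds at offset 2 (the region "BaC") and fails at offset 1 (the region "bBa"). The containsIgnoreCase() method above is essentially this check repeated over every candidate offset.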
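This speed analysis is not meant to be rocket science, just a rough picture of how fast the different methods are.
I compared 5 methods:
1. Our containsIgnoreCase() method, built on String.regionMatches().
2. Converting both strings to lower case and calling String.contains().
3. Converting the source string to lower case and calling String.contains() with a pre-cached, lower-cased substring. This solution is already less flexible because it tests a predefined substring.
4. Using a regular expression (Pattern.compile().matcher().find()...).
5. Using a regular expression with a pre-created and cached Pattern. This solution is already less flexible because it tests a predefined substring.
Results (calling each method 10 million times), as a table: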
                                             RELATIVE SPEED   1/RELATIVE SPEED
METHOD                          EXEC TIME    TO SLOWEST       TO FASTEST (#1)
------------------------------------------------------------------------------
1. Using regionMatches()          670 ms     10.7x             1.0x
2. 2x lowercase+contains         2829 ms      2.5x             4.2x
3. 1x lowercase+contains cache   2446 ms      2.9x             3.7x
4. Regexp                        7180 ms      1.0x            10.7x
5. Regexp+cached pattern         1845 ms      3.9x             2.8x
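Our method is 4x faster than lowercasing and using contains(), 10x faster than using regular expressions, and still 3x faster even if the Pattern is pre-cached (which loses the flexibility of checking for an arbitrary substring).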
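In case it isn't obvious how the two ratio columns were derived: each is just one execution time divided by another and rounded to one decimal place. A small sketch (RelativeSpeed and ratio are hypothetical names, not part of the benchmark itself):

```java
// Hypothetical helper for reproducing the ratio columns of the table.
final class RelativeSpeed {
    // Ratio of two timings in milliseconds, rounded to one decimal place.
    static double ratio(long slowerMs, long fasterMs) {
        return Math.round(10.0 * slowerMs / fasterMs) / 10.0;
    }
}
```

For example, the "10x faster than regexp" figure is ratio(7180, 670), and the "4x faster than lowercasing" figure is ratio(2829, 670).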
If you're interested in how the analysis was performed, here is the complete runnable application:
import java.util.regex.Pattern;

public class ContainsAnalysis {

    // Case 1 utilizing String.regionMatches()
    public static boolean containsIgnoreCase(String src, String what) {
        final int length = what.length();
        if (length == 0)
            return true; // Empty string is contained

        final char firstLo = Character.toLowerCase(what.charAt(0));
        final char firstUp = Character.toUpperCase(what.charAt(0));

        for (int i = src.length() - length; i >= 0; i--) {
            // Quick check before calling the more expensive regionMatches()
            // method:
            final char ch = src.charAt(i);
            if (ch != firstLo && ch != firstUp)
                continue;

            if (src.regionMatches(true, i, what, 0, length))
                return true;
        }

        return false;
    }

    // Case 2 with 2x toLowerCase() and contains()
    public static boolean containsConverting(String src, String what) {
        return src.toLowerCase().contains(what.toLowerCase());
    }

    // The cached substring for case 3
    private static final String S = "i am".toLowerCase();

    // Case 3 with pre-cached substring and 1x toLowerCase() and contains()
    public static boolean containsConverting(String src) {
        return src.toLowerCase().contains(S);
    }

    // Case 4 with regexp
    public static boolean containsIgnoreCaseRegexp(String src, String what) {
        return Pattern.compile(Pattern.quote(what), Pattern.CASE_INSENSITIVE)
                .matcher(src).find();
    }

    // The cached pattern for case 5
    private static final Pattern P = Pattern.compile(
            Pattern.quote("i am"), Pattern.CASE_INSENSITIVE);

    // Case 5 with pre-cached Pattern
    public static boolean containsIgnoreCaseRegexp(String src) {
        return P.matcher(src).find();
    }

    // Main method: performs speed analysis on different contains methods
    // (case ignored)
    public static void main(String[] args) throws Exception {
        final String src = "Hi, I am Adam";
        final String what = "i am";

        long start, end;
        final int N = 10_000_000;

        start = System.nanoTime();
        for (int i = 0; i < N; i++)
            containsIgnoreCase(src, what);
        end = System.nanoTime();
        System.out.println("Case 1 took " + ((end - start) / 1000000) + "ms");

        start = System.nanoTime();
        for (int i = 0; i < N; i++)
            containsConverting(src, what);
        end = System.nanoTime();
        System.out.println("Case 2 took " + ((end - start) / 1000000) + "ms");

        start = System.nanoTime();
        for (int i = 0; i < N; i++)
            containsConverting(src);
        end = System.nanoTime();
        System.out.println("Case 3 took " + ((end - start) / 1000000) + "ms");

        start = System.nanoTime();
        for (int i = 0; i < N; i++)
            containsIgnoreCaseRegexp(src, what);
        end = System.nanoTime();
        System.out.println("Case 4 took " + ((end - start) / 1000000) + "ms");

        start = System.nanoTime();
        for (int i = 0; i < N; i++)
            containsIgnoreCaseRegexp(src);
        end = System.nanoTime();
        System.out.println("Case 5 took " + ((end - start) / 1000000) + "ms");
    }

}
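One caveat about hand-rolled timing loops like the one above: the JIT compiler may warm up during the first measured case, and it may optimize away calls whose results are never used. The following is only a rough sketch of how one might reduce both effects in plain Java; TimingSketch, countMatches and timeMillis are hypothetical names, and for serious measurements a dedicated harness such as JMH is the usual choice.

```java
import java.util.function.BiPredicate;

// Hypothetical timing helper; a sketch, not a rigorous benchmark harness.
final class TimingSketch {
    // Runs `check` n times and returns how many calls returned true,
    // so that the results are actually consumed by the caller.
    static int countMatches(BiPredicate<String, String> check,
                            String src, String what, int n) {
        int hits = 0;
        for (int i = 0; i < n; i++) {
            if (check.test(src, what))
                hits++;
        }
        return hits;
    }

    // Performs a shorter warm-up run, then times one measured run.
    // Returns the elapsed time of the measured run in milliseconds.
    static long timeMillis(BiPredicate<String, String> check,
                           String src, String what, int n) {
        int warmUpHits = countMatches(check, src, what, n / 10);
        long start = System.nanoTime();
        int hits = countMatches(check, src, what, n);
        long end = System.nanoTime();
        // Use both counters so neither run is trivially dead code.
        if (hits < 0 || warmUpHits < 0)
            throw new IllegalStateException("unreachable");
        return (end - start) / 1_000_000;
    }
}
```

Usage could look like TimingSketch.timeMillis(ContainsAnalysis::containsIgnoreCase, "Hi, I am Adam", "i am", 10_000_000), assuming the ContainsAnalysis class above is on the classpath. Absolute numbers will differ from the table depending on JVM and hardware.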
Phi*_*hil 21
A simpler way of doing this (without worrying about pattern matching) would be to convert both Strings to lowercase:
String foobar = "fooBar";
String bar = "FOO";
if (foobar.toLowerCase().contains(bar.toLowerCase())) {
    System.out.println("It's a match!");
}
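One thing worth knowing about the lowercase approach: the no-argument toLowerCase() uses the JVM's default locale, and in a Turkish locale, for instance, "I" lowercases to a dotless "ı" rather than "i". A locale-independent variant might look like the sketch below; LowerCaseContains is a hypothetical class name, and choosing Locale.ROOT is an assumption about what behavior you want.

```java
import java.util.Locale;

// Hypothetical helper: lowercase-and-contains with a fixed locale.
final class LowerCaseContains {
    // Lowercases both strings with Locale.ROOT so the result does not
    // depend on the default locale of the machine running the code.
    static boolean containsIgnoreCase(String haystack, String needle) {
        return haystack.toLowerCase(Locale.ROOT)
                .contains(needle.toLowerCase(Locale.ROOT));
    }
}
```

This still allocates two temporary strings per call, so it shares the performance characteristics discussed in the other answers.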
Anonymous 16
Yes, this is achievable:
String s1 = "abBaCca";
String s2 = "bac";
String s1Lower = s1;
// s1Lower is the exact same string; now convert it to lowercase.
// s1 is left intact for printing purposes if needed.
s1Lower = s1Lower.toLowerCase();
String trueStatement = "FALSE!";
if (s1Lower.contains(s2)) {
    // THIS statement will be TRUE
    trueStatement = "TRUE!";
}
return trueStatement;
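This code will return the String "TRUE!" because it finds that your characters are contained. Note that it lowercases only s1, so it relies on s2 already being lowercase.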
Anonymous 6
You can use a regular expression, and it works:
boolean found = s1.matches("(?i).*" + s2+ ".*");
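Two caveats apply to the matches() one-liner: s2 is interpreted as a regex, and "." does not match line terminators by default, so a multi-line s1 can produce a false negative. A hedged variant that addresses both is sketched here; MatchesDemo is a hypothetical class name.

```java
import java.util.regex.Pattern;

// Hypothetical helper showing a more defensive form of the matches() idea.
final class MatchesDemo {
    // Same pattern shape as above, but with s2 quoted so it is taken
    // literally, and the (?s) flag added so '.' also matches line terminators.
    static boolean containsIgnoreCase(String s1, String s2) {
        return s1.matches("(?is).*" + Pattern.quote(s2) + ".*");
    }
}
```

Since matches() recompiles the pattern on every call, this is convenient for one-off checks rather than tight loops.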