How do I split a sentence into words while keeping compound expressions that contain spaces intact?

Cha*_*der 2 java string split

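I need to split a string on spaces, but I need to ignore certain compound keywords that themselves contain spaces. For example, I have the following String: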

String testCase = "The patient is currently being treated for Diabetes with Thiazide diuretics";

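I need to split the string, but keep Thiazide diuretics together as a single compound expression. A plain split on spaces breaks it into two tokens: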

String[] array = testCase.split(" ");

The result must be as follows:

The
patient
is
currently
being
treated
for
Diabetes
with
Thiazide diuretics

How can I do this?

ken*_*ytm 5

In this case you need to work with the regex classes directly; .split() is not suited to your purpose.

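import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "The patient is currently being treated for Diabetes with Thiazide diuretics";

// Try the compound phrase first; otherwise match any run of non-whitespace characters.
Matcher m = Pattern.compile("\\b(?:Thiazide diuretics)\\b|\\S+").matcher(s);
List<String> result = new ArrayList<>();
while (m.find()) {
    result.add(m.group());
}
System.out.println(result);
// [The, patient, is, currently, being, treated, for, Diabetes, with, Thiazide diuretics]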
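The order of the alternatives in that pattern matters: Java tries them left to right at each position, so the compound phrase has to come before \S+. A small sketch to illustrate this (the class AlternationOrder and the helper findAll are names made up for this example, not part of the answer):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationOrder {

    // Hypothetical helper: collects every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "treated with Thiazide diuretics";

        // Compound phrase listed first: it is kept as one token.
        System.out.println(findAll("\\bThiazide diuretics\\b|\\S+", s));
        // [treated, with, Thiazide diuretics]

        // \S+ listed first: it matches "Thiazide" on its own, so the phrase is split.
        System.out.println(findAll("\\S+|\\bThiazide diuretics\\b", s));
        // [treated, with, Thiazide, diuretics]
    }
}
```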

Note: technically, this can also be done with .split() by using lookarounds:

import java.util.Arrays;

String s = "Thiazide not-a-keyword diuretics and Thiazide diuretics keyword";

// Split on a space unless it is preceded by "Thiazide" AND followed by "diuretics".
String[] result = s.split("(?<!Thiazide) | (?!diuretics)");
System.out.println(Arrays.toString(result));
// [Thiazide, not-a-keyword, diuretics, and, Thiazide diuretics, keyword]

However, this does not scale if you have more keywords. Try to avoid it.
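By contrast, the Matcher approach extends to a list of keywords fairly naturally. The following is only a sketch under some assumptions: the class CompoundTokenizer, the method tokenize, and the second keyword "Type 2 Diabetes" are invented for illustration, and every keyword is assumed to start and end with a word character so that \b behaves as expected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class CompoundTokenizer {

    // Hypothetical helper: splits text on whitespace but keeps each listed phrase as one token.
    static List<String> tokenize(String text, List<String> keywords) {
        String alternation = keywords.stream()
                // Longer phrases first, so a long phrase wins over a shorter one that is its prefix.
                .sorted((a, b) -> Integer.compare(b.length(), a.length()))
                // Pattern.quote makes each phrase match literally.
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));

        // With no keywords, fall back to plain whitespace-separated tokens.
        String regex = keywords.isEmpty()
                ? "\\S+"
                : "\\b(?:" + alternation + ")\\b|\\S+";

        Matcher m = Pattern.compile(regex).matcher(text);
        List<String> tokens = new ArrayList<>();
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> keywords = List.of("Thiazide diuretics", "Type 2 Diabetes");
        String s = "The patient is treated for Type 2 Diabetes with Thiazide diuretics";
        System.out.println(tokenize(s, keywords));
        // [The, patient, is, treated, for, Type 2 Diabetes, with, Thiazide diuretics]
    }
}
```

Adding a keyword then only means adding an entry to the list; the regex is rebuilt from it. For a very large keyword list a dictionary-based tokenizer may be a better fit than one big alternation.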