Get the words around a position in a string

use*_*145 9 java string

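I want to get the words surrounding a certain position in a string, for example the two words after it and the two words before it.

For example, consider the string: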

String str = "Hello my name is John and I like to go fishing and hiking I have two sisters and one brother.";
String find = "I";

for (int index = str.indexOf("I"); index >= 0; index = str.indexOf("I", index + 1))
{
    System.out.println(index);
}

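This prints the indexes at which the word "I" occurs. But I would like to get a substring of the words around those positions.

I want to be able to print "John and I like to" and "and hiking I have two".

It should also work for more than a single word: searching for "John and" would return "name is John and I like".

Is there a neat and sensible way to do this?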

acd*_*ior 11

Single word:

You can achieve this with String's split() method. This solution is O(n).

public static void main(String[] args) {
    String str = "Hello my name is John and I like to go fishing and "+
                         "hiking I have two sisters and one brother.";
    String find = "I";

    String[] sp = str.split(" +"); // "+" for multiple spaces
    for (int i = 0; i < sp.length; i++) { // start at 0 so a match in the first two words is not skipped
        if (sp[i].equals(find)) {
            // guard every neighbour index to avoid ArrayIndexOutOfBoundsException
            String surr = (i-2 >= 0 ? sp[i-2]+" " : "") +
                          (i-1 >= 0 ? sp[i-1]+" " : "") +
                          sp[i] +
                          (i+1 < sp.length ? " "+sp[i+1] : "") +
                          (i+2 < sp.length ? " "+sp[i+2] : "");
            System.out.println(surr);
        }
    }
}

Output:

John and I like to
and hiking I have two

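Multiple words:

When find consists of several words, a regex is a great and clean solution. By its nature, however, it misses the cases where the surrounding words also match find (see the example below).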
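To make that limitation concrete, here is a minimal sketch of what such a regex approach might look like. The class name `RegexSurround`, the method `surrounding`, and the exact pattern are assumptions made for illustration, not code from the answer; the sketch also assumes the words inside find are separated by single spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexSurround {
    // Hypothetical helper: every regex match of `find` plus up to two words on each side.
    static List<String> surrounding(String str, String find) {
        Pattern p = Pattern.compile(
            "(?<!\\S)(?:\\S+\\s+){0,2}"     // up to two whole words before the phrase
            + Pattern.quote(find)            // the phrase itself, taken literally
            + "(?!\\S)(?:\\s+\\S+){0,2}");   // up to two whole words after the phrase
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(str);
        while (m.find()) { // find() resumes after the previous match, so matches never overlap
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String str = "Hello my name is John and John and I like to go...";
        // The second "John and" is consumed as trailing context of the first
        // match, so only one result is reported.
        System.out.println(surrounding(str, "John and")); // prints [name is John and John and]
    }
}
```

Because `Matcher.find()` continues after the end of the previous match, an occurrence that was already swallowed as context cannot be reported as a match of its own.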

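The algorithm below handles all cases (the entire solution space). Keep in mind that, by the nature of the problem, this solution is O(n*m) in the worst case (where n is the length of str and m is the length of find).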

public static void main(String[] args) {
    String str = "Hello my name is John and John and I like to go...";
    String find = "John and";

    String[] sp = str.split(" +"); // "+" for multiple spaces

    String[] spMulti = find.split(" +"); // "+" for multiple spaces
    for (int i = 0; i < sp.length; i++) { // start at 0 so a match in the first two words is not skipped
        int j = 0;
        // advance j while the words starting at i keep matching the phrase
        while (j < spMulti.length && i+j < sp.length 
                                  && sp[i+j].equals(spMulti[j])) {
            j++;
        }           
        if (j == spMulti.length) { // found spMulti entirely
            StringBuilder surr = new StringBuilder();
            if (i-2 >= 0){ surr.append(sp[i-2]); surr.append(" "); }
            if (i-1 >= 0){ surr.append(sp[i-1]); surr.append(" "); }
            for (int k = 0; k < spMulti.length; k++) {
                if (k > 0){ surr.append(" "); }
                surr.append(sp[i+k]);
            }
            if (i+spMulti.length < sp.length) {
                surr.append(" ");
                surr.append(sp[i+spMulti.length]);
            }
            if (i+spMulti.length+1 < sp.length) {
                surr.append(" ");
                surr.append(sp[i+spMulti.length+1]);
            }
            System.out.println(surr.toString());
        }
    }
}

Output:

name is John and John and
John and John and I like
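If the number of context words should be adjustable, the same word-by-word scan could be wrapped in a reusable method. The following is only a sketch under assumptions: the class `SurroundingWords`, the method `surrounding`, and the `context` parameter are hypothetical names that do not appear in the answer. It covers the single-word and the multi-word case with one loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SurroundingWords {
    // Hypothetical helper: returns every occurrence of `find` in `str`
    // together with up to `context` words on each side.
    static List<String> surrounding(String str, String find, int context) {
        String[] words = str.trim().split("\\s+");
        List<String> target = Arrays.asList(find.trim().split("\\s+"));
        List<String> all = Arrays.asList(words);
        List<String> results = new ArrayList<>();
        for (int i = 0; i + target.size() <= words.length; i++) {
            // Compare the window of words starting at i with the whole phrase.
            if (!all.subList(i, i + target.size()).equals(target)) {
                continue;
            }
            // Clamp the context window to the bounds of the word array.
            int from = Math.max(0, i - context);
            int to = Math.min(words.length, i + target.size() + context);
            results.add(String.join(" ", Arrays.copyOfRange(words, from, to)));
        }
        return results;
    }

    public static void main(String[] args) {
        String str = "Hello my name is John and John and I like to go...";
        for (String s : surrounding(str, "John and", 2)) {
            System.out.println(s);
        }
    }
}
```

Clamping with `Math.max` and `Math.min` replaces the individual index checks, so a match at the very start or end of the string simply gets less context.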