Java 11: Convert List<String> to TreeMap<String, List<String>> using Collectors

Tha*_*s M 4 java java-stream

I have a list like this:

List<String> customList = Arrays.asList(
   "5000  Buruli ulcer is an infectious disease",
   "6000  characterized by the development",
   "7000  of painless open wounds.",
   "8000  The disease largely occurs in",
   "10000  sub-Saharan Africa and Australia."
);

I want to convert this List to a TreeMap<String, List<String>> like this:

"5000", ["Buruli", "ulcer", "is", "an", "infectious", "disease"]
"6000", ["characterized", "by", "the", "development"]
// etc

My code so far:

TreeMap<String, List<String[]>> collect = customList.stream()
      .map(s -> s.split("  ", 2))
      .collect(Collectors
         .groupingBy(a -> a[0], TreeMap::new, Collectors.mapping(a -> a[1].split(" "), Collectors.toList())));

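I have two problems.

  1. First, TreeMap::new probably won't work, because the resulting order is not the same as in the original List.
  2. Second, I can't seem to find a way to turn the List<String[]> into a List<String>.

Any ideas?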

ern*_*t_k 7

You want to use a LinkedHashMap to preserve the original order. So your code should look like this:

Map<String, List<String>> collect = customList.stream()
    .map(s -> s.split(" +"))
    .collect(Collectors.toMap(a -> a[0], a -> Arrays.asList(a)
        .subList(1, a.length), (a, b) -> a, LinkedHashMap::new));

If your keys are not unique, you can use the grouping collector with something like this (Collectors.flatMapping requires Java 9+):

collect = customList.stream()
    .map(s -> Arrays.asList(s.split(" +")))
    .collect(Collectors.groupingBy(l -> l.get(0), 
        LinkedHashMap::new, 
        Collectors.flatMapping(l -> l.stream().skip(1), Collectors.toList())));

  • @Thomas Yes, I assumed the keys are unique. Otherwise, as Arvind suggested, groupingBy is the way to do it (or, of course, an appropriate merge function in toMap). (2 upvotes)
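The merge-function alternative mentioned in the comment could look roughly like the sketch below. The class name ToMapMergeSketch, the method wordsByKey, and the shortened sample lines with a repeated key are made up for illustration and are not part of the original answer. The values are copied into an ArrayList because the fixed-size view returned by Arrays.asList(...).subList(...) cannot be appended to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this sketch.
class ToMapMergeSketch {

    // Maps each leading key to the words that follow it, keeping first-seen key order.
    static Map<String, List<String>> wordsByKey(List<String> lines) {
        return lines.stream()
            .map(s -> s.split(" +"))
            .collect(Collectors.toMap(
                a -> a[0],
                // Copy into a growable list so the merge function can append to it.
                a -> new ArrayList<String>(Arrays.asList(a).subList(1, a.length)),
                // On a duplicate key, append the new words to the existing list.
                (left, right) -> { left.addAll(right); return left; },
                LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // Assumed sample data: the key 5000 appears twice.
        List<String> lines = Arrays.asList(
            "5000  Buruli ulcer",
            "6000  characterized by",
            "5000  is an infectious disease");

        // prints {5000=[Buruli, ulcer, is, an, infectious, disease], 6000=[characterized, by]}
        System.out.println(wordsByKey(lines));
    }
}
```

For non-unique keys this yields the same shape as the groupingBy/flatMapping version, and it also runs on Java 8, since it does not need Collectors.flatMapping.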

Arv*_*ash 5

Another update:

This update is to meet the following requirement, which the OP mentioned in a comment below the answer:

I want each word as a separate element in the List. With your solution, all elements are in the same List entry. For example, I want 10000=[sub-Saharan Africa and Australia.]

To achieve this, you should not split the string of words.

Demo:

import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        List<String> customList = Arrays.asList(
                   "5000  Buruli ulcer is an infectious disease",
                   "6000  characterized by the development",
                   "7000  of painless open wounds.",
                   "8000  The disease largely occurs in",
                   "10000  sub-Saharan Africa and Australia."
                );
        
        TreeMap<String, List<String>> collect = customList.stream().map(s -> s.split("  ", 2))
                .collect(Collectors.groupingBy(a -> a[0],
                        () -> new TreeMap<String, List<String>>(Comparator.comparingInt(Integer::parseInt)),
                        Collectors.mapping(a -> a[1], Collectors.toList())));
        
        System.out.println(collect);
    }
}

Output:

{5000=[Buruli ulcer is an infectious disease], 6000=[characterized by the development], 7000=[of painless open wounds.], 8000=[The disease largely occurs in], 10000=[sub-Saharan Africa and Australia.]}

Or, based on my original answer:

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        List<String> customList = Arrays.asList(
                   "5000  Buruli ulcer is an infectious disease",
                   "6000  characterized by the development",
                   "7000  of painless open wounds.",
                   "8000  The disease largely occurs in",
                   "10000  sub-Saharan Africa and Australia."
                );

        Map<String, List<String>> collect = customList.stream().map(s -> s.split("\\s+", 2))
                .collect(Collectors.groupingBy(a -> a[0], TreeMap::new,
                        Collectors.mapping(a -> a[1], Collectors.toList())));

        System.out.println(collect);
    }
}

Output:

{10000=[sub-Saharan Africa and Australia.], 5000=[Buruli ulcer is an infectious disease], 6000=[characterized by the development], 7000=[of painless open wounds.], 8000=[The disease largely occurs in]}
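In this output 10000 comes first because a TreeMap created with TreeMap::new sorts its String keys by their natural (lexicographic) ordering, in which "10000" precedes "5000". The sketch below contrasts that with the Comparator.comparingInt(Integer::parseInt) ordering used in the previous demo. The class name KeyOrderSketch, the helper keyOrder, and the dummy Integer values are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical class name, used only for this sketch.
class KeyOrderSketch {

    // Puts every key into the given map (values are dummies) and returns the key order as text.
    static String keyOrder(TreeMap<String, Integer> map, List<String> keys) {
        for (String key : keys) {
            map.put(key, key.length());
        }
        return map.keySet().toString();
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("5000", "6000", "10000");

        // Natural String ordering compares character by character: '1' sorts before '5'.
        // prints [10000, 5000, 6000]
        System.out.println(keyOrder(new TreeMap<String, Integer>(), keys));

        // Comparing the parsed int values restores numeric order.
        // prints [5000, 6000, 10000]
        System.out.println(keyOrder(
            new TreeMap<String, Integer>(Comparator.comparingInt(Integer::parseInt)), keys));
    }
}
```

Note that the numeric comparator assumes every key parses as an int (otherwise Integer.parseInt throws NumberFormatException), and that it treats keys such as "5000" and "05000" as the same key.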

Solution suggested by Aniket:

import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        List<String> customList = Arrays.asList(
                   "5000  Buruli ulcer is an infectious disease",
                   "6000  characterized by the development",
                   "7000  of painless open wounds.",
                   "8000  The disease largely occurs in",
                   "10000  sub-Saharan Africa and Australia."
                );
        
        TreeMap<String, List<String>> collect = customList.stream().map(s -> s.split("  ", 2))
                .collect(Collectors.groupingBy(a -> a[0],
                        () -> new TreeMap<String, List<String>>(Comparator.comparingInt(Integer::parseInt)),
                        Collectors.mapping(a -> Arrays.toString(a[1].split(" ")), Collectors.toList())));

        System.out.println(collect);
    }
}

Output:

{5000=[[Buruli, ulcer, is, an, infectious, disease]], 6000=[[characterized, by, the, development]], 7000=[[of, painless, open, wounds.]], 8000=[[The, disease, largely, occurs, in]], 10000=[[sub-Saharan, Africa, and, Australia.]]}

Original answer:

You are almost there. You can do it like this:

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        List<String> customList = Arrays.asList(
                   "5000  Buruli ulcer is an infectious disease",
                   "6000  characterized by the development",
                   "7000  of painless open wounds.",
                   "8000  The disease largely occurs in",
                   "10000  sub-Saharan Africa and Australia."
                );

        Map<Object, List<Object>> collect = customList.stream().map(s -> s.split("\\s+", 2))
                .collect(Collectors.groupingBy(a -> a[0], TreeMap::new,
                        Collectors.mapping(a -> Arrays.asList(a[1].split("\\s+")), Collectors.toList())));

        System.out.println(collect);
    }
}

Output:

{10000=[[sub-Saharan, Africa, and, Australia.]], 5000=[[Buruli, ulcer, is, an, infectious, disease]], 6000=[[characterized, by, the, development]], 7000=[[of, painless, open, wounds.]], 8000=[[The, disease, largely, occurs, in]]}
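Each value in the output above is a list containing one nested list, so the map is not really a TreeMap<String, List<String>>. One possible way to get a flat list of words per key in numeric key order is to combine two ideas already shown in the answers: the Comparator.comparingInt(Integer::parseInt) map supplier and Collectors.flatMapping (Java 9+). This is only a sketch: the class name FlatWordsSketch and the method wordsByNumericKey are made up, and it assumes every line has a numeric key followed by at least one word.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this sketch.
class FlatWordsSketch {

    // Groups the words of each line under its leading key, with keys sorted numerically.
    // Assumes each line looks like "<int key><whitespace><one or more words>".
    static TreeMap<String, List<String>> wordsByNumericKey(List<String> lines) {
        return lines.stream()
            // Split once into {key, remainder of the line}.
            .map(s -> s.split("\\s+", 2))
            .collect(Collectors.groupingBy(
                a -> a[0],
                // Order keys by int value instead of lexicographically.
                () -> new TreeMap<String, List<String>>(Comparator.comparingInt(Integer::parseInt)),
                // Flatten the words of every line with the same key into one list.
                Collectors.flatMapping(a -> Arrays.stream(a[1].split("\\s+")), Collectors.toList())));
    }

    public static void main(String[] args) {
        List<String> customList = Arrays.asList(
            "5000  Buruli ulcer is an infectious disease",
            "6000  characterized by the development",
            "7000  of painless open wounds.",
            "8000  The disease largely occurs in",
            "10000  sub-Saharan Africa and Australia.");

        System.out.println(wordsByNumericKey(customList));
    }
}
```

With the sample list, the 5000 entry should come first and the 10000 entry last, each holding its individual words.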