pus*_*kin 4 java classification arraylist decision-tree java-8
I have a list of lists:
List<ArrayList<String>> D = new ArrayList<>();
Once populated, it might look like this:
["A","B","Y"]
["C","D","Y"]
["A","D","N"]
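I want to split the list of lists into partitions based on the unique values of one attribute (say, the one at index 1).
The attribute at index 1 has two unique values, "B" and "D", so I want to split the data into two partitions: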
["A","B","Y"]

["C","D","Y"]
["A","D","N"]
and put them into List<ArrayList<ArrayList<String>>> sublists;
Is there a clever way to do this, or should I just do something like the following:
List<ArrayList<ArrayList<String>>> sublists = new ArrayList<>();
int featIdx = 1;
// generate the subsets
for (ArrayList<String> record : D) {
    String val = record.get(featIdx);
    // check if the value exists in sublists
    boolean found = false;
    for (ArrayList<ArrayList<String>> entry : sublists) {
        if (entry.get(0).get(featIdx).equals(val)) {
            entry.add(record);
            found = true;
            break;
        }
    }
    if (!found) {
        sublists.add(new ArrayList<>());
        sublists.get(sublists.size()-1).add(record);
    }
}
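This is one step of the C4.5 decision tree algorithm, so if anyone has experience with it, I would appreciate hearing whether this is the right way to generate the sublists.
Thanks.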
With Java 8, you can use the groupingBy collector:
import java.util.Collection;
import java.util.Map;
import java.util.stream.Collectors;
Map<String, List<List<String>>> grouped = D.stream()
    .collect(Collectors.groupingBy(list -> list.get(1)));
Collection<List<List<String>>> sublists = grouped.values();
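To make the groupingBy approach easier to try out, here is a minimal, self-contained sketch that applies it to the sample data from the question. The class name PartitionDemo and the helper partition are hypothetical names chosen for illustration; only Collectors.groupingBy comes from the answer itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionDemo {

    // Hypothetical helper (not part of the original answer): groups the
    // records by the value found at position featIdx.
    static Map<String, List<ArrayList<String>>> partition(
            List<ArrayList<String>> data, int featIdx) {
        return data.stream()
                .collect(Collectors.groupingBy(record -> record.get(featIdx)));
    }

    public static void main(String[] args) {
        // Sample data taken from the question.
        List<ArrayList<String>> D = new ArrayList<>();
        D.add(new ArrayList<>(Arrays.asList("A", "B", "Y")));
        D.add(new ArrayList<>(Arrays.asList("C", "D", "Y")));
        D.add(new ArrayList<>(Arrays.asList("A", "D", "N")));

        // One map entry per distinct value at index 1 ("B" and "D").
        Map<String, List<ArrayList<String>>> grouped = partition(D, 1);
        grouped.forEach((value, records) ->
                System.out.println(value + " -> " + records));
    }
}
```

Keeping the Map around, rather than only its values(), may be convenient for a decision tree, since each key is the attribute value that labels the corresponding branch.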
Or, as @AlexisC suggested:
import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.groupingBy;

Collection<List<List<String>>> sublists = D.stream()
    .collect(collectingAndThen(groupingBy(list -> list.get(1)), Map::values));
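One difference from the loop in the question may matter: the single-argument groupingBy makes no guarantee about the type of Map it returns, so the partitions are not necessarily produced in the order in which the values first appear. If that order is needed, the three-argument overload of groupingBy accepts a map factory. The sketch below assumes a LinkedHashMap is acceptable; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class OrderedPartitionDemo {

    // Hypothetical helper: returns the partitions as a list, in the order
    // in which each distinct value at featIdx was first encountered.
    static List<List<ArrayList<String>>> partitionInOrder(
            List<ArrayList<String>> data, int featIdx) {
        Map<String, List<ArrayList<String>>> grouped = data.stream()
                .collect(Collectors.groupingBy(
                        record -> record.get(featIdx),
                        LinkedHashMap::new,      // keeps insertion order of keys
                        Collectors.toList()));
        return new ArrayList<>(grouped.values());
    }

    public static void main(String[] args) {
        // Sample data taken from the question.
        List<ArrayList<String>> D = new ArrayList<>();
        D.add(new ArrayList<>(Arrays.asList("A", "B", "Y")));
        D.add(new ArrayList<>(Arrays.asList("C", "D", "Y")));
        D.add(new ArrayList<>(Arrays.asList("A", "D", "N")));

        for (List<ArrayList<String>> part : partitionInOrder(D, 1)) {
            System.out.println(part);
        }
    }
}
```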