One-to-many KStream-KTable join

Liz*_*ury 2 apache-kafka apache-kafka-streams

I have a KStream of universities, where a university is:

University(universityId: String, name: String, studentIds: Seq[String])

val universityKStream = builder.stream[String, University](...)

I also have a KTable of students, where a student is:

Student(studentId: String, name: String)

val studentsKtable = builder.table[String, Student](...)

I want to join the two and produce a topic of ResolvedUniversity objects:

ResolvedUniversity(universityId: String, name: String, students: Seq[Student])

I can't groupBy and aggregate the students by universityId, because the Student object has no universityId field.

Dmi*_*sky 7

Using only the DSL, I think the simplest thing you can do is the following (Java):

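    import java.util.ArrayList;
    import java.util.List;
    import java.util.stream.Collectors;
    import org.apache.kafka.common.serialization.Serde;
    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.*;

    class Student {
        String studentId;
        String name;
    }
    class University {
        String universityId;
        String name;
        List<String> studentIds;
    }
    class ResolvedUniversity {
        String universityId;
        String name;
        // Must be initialized, otherwise students.add(...) in the aggregator throws a NullPointerException.
        List<Student> students = new ArrayList<>();
    }
    // Placeholders: supply real serdes for your types.
    // Pair is a user-supplied tuple type with pair(), left() and right(); it is not part of Kafka Streams.
    Serde<String> stringSerde = null;
    Serde<Student> studentSerde = null;
    Serde<University> universitySerde = null;
    Serde<Pair<University, Student>> pairSerde = null;
    Serde<ResolvedUniversity> resolvedUniversitySerde = null;

    // "topology" is assumed to be a StreamsBuilder, since stream() and table() are its methods.
    StreamsBuilder topology = new StreamsBuilder();

    KStream<String, University> universities = topology
      .stream("universities", Consumed.with(stringSerde, universitySerde));

    KTable<String, Student> students = topology
      .table("students", Consumed.with(stringSerde, studentSerde));

    KTable<String, ResolvedUniversity> resolvedUniversities = universities
      // Fan out: one record per student id, keyed by that id so it can be joined with the table.
      .flatMap((k, v) -> {
          return v.studentIds.stream()
            .map(id -> new KeyValue<>(id, v))
            .collect(Collectors.toList());
      })
      // Look up each student; the result carries the university (left) and the student (right).
      .join(students, Pair::pair, Joined.with(stringSerde, universitySerde, studentSerde))
      // Re-key by university id; the repartition topic needs a serde for the Pair value.
      .groupBy((k, v) -> v.left().universityId, Grouped.with(stringSerde, pairSerde))
      .aggregate(ResolvedUniversity::new,
                 (k, v, a) -> {
                     a.universityId = v.left().universityId;
                     a.name = v.left().name;
                     a.students.add(v.right());
                     return a;
                 },
                 Materialized.with(stringSerde, resolvedUniversitySerde));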
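
The snippet above relies on a `Pair` type with `pair`, `left()` and `right()` that is not shown and is not part of Kafka Streams. A minimal sketch of such a helper, together with an in-memory walk-through of the same three steps (fan out by student id, look up each student, regroup by university id), might look like the following. The class `JoinSketch`, the method `resolve`, the sample data, and the use of plain collections in place of topics are all assumptions for illustration; no Kafka APIs are involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JoinSketch {
    // Hypothetical tuple helper standing in for the Pair type used in the answer.
    static final class Pair<L, R> {
        private final L left;
        private final R right;
        private Pair(L left, R right) { this.left = left; this.right = right; }
        static <L, R> Pair<L, R> pair(L left, R right) { return new Pair<>(left, right); }
        L left() { return left; }
        R right() { return right; }
    }

    static final class Student {
        final String studentId;
        final String name;
        Student(String studentId, String name) { this.studentId = studentId; this.name = name; }
    }

    static final class University {
        final String universityId;
        final String name;
        final List<String> studentIds;
        University(String universityId, String name, List<String> studentIds) {
            this.universityId = universityId;
            this.name = name;
            this.studentIds = studentIds;
        }
    }

    // Returns universityId -> names of the students that could be resolved.
    static Map<String, List<String>> resolve(List<University> universities,
                                             Map<String, Student> studentTable) {
        Map<String, List<String>> resolved = new LinkedHashMap<>();
        for (University u : universities) {
            for (String id : u.studentIds) {              // step 1: fan out, one entry per student id
                Student s = studentTable.get(id);         // step 2: look the student up by id
                if (s == null) {
                    continue;                             // inner join: unmatched ids are dropped
                }
                Pair<University, Student> p = Pair.pair(u, s);
                resolved.computeIfAbsent(p.left().universityId, k -> new ArrayList<>())
                        .add(p.right().name);             // step 3: regroup by university id
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, Student> studentTable = new LinkedHashMap<>();
        studentTable.put("s1", new Student("s1", "Ada"));
        studentTable.put("s2", new Student("s2", "Grace"));
        List<University> universities = Arrays.asList(
            new University("u1", "Example University", Arrays.asList("s1", "s2", "s9")));
        System.out.println(resolve(universities, studentTable)); // prints {u1=[Ada, Grace]}
    }
}
```

The unknown id `s9` disappears from the result, which mirrors what an inner join does when the table has no matching key.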

With this type of join, for historical processing, the students KTable must be "primed" with its data before the KStream records are joined against it.
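
To illustrate why the ordering matters: a stream-table join looks up the table's state at the moment each stream record is processed, so with an inner join a university record that is processed before its students are loaded yields nothing for them. The sketch below imitates that with a plain `HashMap` standing in for the table. `PrimingSketch`, `joinOnArrival`, and the data are hypothetical, and no Kafka APIs are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PrimingSketch {
    // Looks up each student id in the table as it exists right now.
    // Misses are dropped, as in an inner join.
    static List<String> joinOnArrival(List<String> studentIds, Map<String, String> studentTable) {
        List<String> matched = new ArrayList<>();
        for (String id : studentIds) {
            String name = studentTable.get(id);
            if (name != null) {
                matched.add(name);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        Map<String, String> studentTable = new HashMap<>();
        List<String> studentIds = Arrays.asList("s1", "s2");

        // The university record arrives while the table is still empty.
        List<String> before = joinOnArrival(studentIds, studentTable);

        // The table is primed, then the same record is processed again.
        studentTable.put("s1", "Ada");
        studentTable.put("s2", "Grace");
        List<String> after = joinOnArrival(studentIds, studentTable);

        System.out.println(before + " " + after); // prints [] [Ada, Grace]
    }
}
```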