Get all nodes connected to a node in Apache Spark GraphX

Aja*_*pta 6 scala graph apache-spark spark-graphx

Suppose we have the following input in Apache Spark GraphX:

Vertex RDD:

val vertexArray = Array(
  (1L, "Alice"),
  (2L, "Bob"),
  (3L, "Charlie"),
  (4L, "David"),
  (5L, "Ed"),
  (6L, "Fran")
)

Edge RDD:

val edgeArray = Array(
  Edge(1L, 2L, 1),
  Edge(2L, 3L, 1),
  Edge(3L, 4L, 1),
  Edge(5L, 6L, 1)
)

For each node, I need all the nodes connected to it, that is, the connected components of the graph in Apache Spark GraphX:

1,[1,2,3,4]
5,[5,6]

zer*_*323 9

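You can use ConnectedComponents, which returns

a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.

and then reshape the result: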

graph.connectedComponents.vertices.map(_.swap).groupByKey
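For completeness, here is a minimal end-to-end sketch that builds the graph from the question's data and applies the one-liner above. The object name `ConnectedGroups`, the app name, and the `local[*]` master are assumptions made for illustration; adapt them to your own setup. Note that `connectedComponents` ignores edge direction, so 1 -> 2 -> 3 -> 4 ends up in a single component.

```scala
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical driver object; name and Spark settings are illustrative only.
object ConnectedGroups {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ConnectedGroups")
      .master("local[*]") // assumption: local run for experimentation
      .getOrCreate()
    val sc = spark.sparkContext

    // Vertex and edge data copied from the question.
    val vertices: RDD[(VertexId, String)] = sc.parallelize(Seq(
      (1L, "Alice"), (2L, "Bob"), (3L, "Charlie"),
      (4L, "David"), (5L, "Ed"), (6L, "Fran")
    ))
    val edges: RDD[Edge[Int]] = sc.parallelize(Seq(
      Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(3L, 4L, 1), Edge(5L, 6L, 1)
    ))
    val graph = Graph(vertices, edges)

    // Each vertex is labelled with the lowest vertex id in its component;
    // swapping to (componentId, vertexId) lets groupByKey collect the members.
    val groups: RDD[(VertexId, Iterable[VertexId])] =
      graph.connectedComponents().vertices.map(_.swap).groupByKey()

    // Sort on the driver only to make the printed order stable.
    groups.collect().sortBy(_._1).foreach { case (componentId, members) =>
      println(s"$componentId,[${members.toSeq.sorted.mkString(",")}]")
    }

    spark.stop()
  }
}
```

With the sample data this should print `1,[1,2,3,4]` and `5,[5,6]`, matching the output asked for in the question. Keep in mind that `groupByKey` pulls every member of a component into one record, so for graphs with very large components it may be better to keep the `(componentId, vertexId)` pairs as they are.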