Posts by Kar*_*rai

Exact usage of toDebugString() in PySpark

I am new to PySpark and am trying to understand exactly what toDebugString() is used for. Could you explain it using the code snippet below?

    >>> a = sc.parallelize([1,2,3]).distinct()
    >>> print a.toDebugString()
    (8) PythonRDD[27] at RDD at PythonRDD.scala:44 [Serialized 1x Replicated]
     |  MappedRDD[26] at values at NativeMethodAccessorImpl.java:-2 [Serialized 1x Replicated]
     |  ShuffledRDD[25] at partitionBy at NativeMethodAccessorImpl.java:-2 [Serialized 1x Replicated]
     +-(8) PairwiseRDD[24] at distinct at <stdin>:1 [Serialized 1x Replicated]
        |  PythonRDD[23] at distinct at <stdin>:1 [Serialized 1x Replicated]
        |  ParallelCollectionRDD[21] at parallelize at PythonRDD.scala:358 [Serialized 1x Replicated]
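One way to read the output above, offered as a rough sketch rather than a definitive answer: toDebugString() prints the RDD's lineage starting from the final RDD and working back to the source. The number in parentheses is the partition count, lines joined by `|` are narrow dependencies that run in the same stage, and the indented `+-` marks a shuffle boundary where an earlier stage begins. The helper below does not need Spark; it only parses the text pasted from the question. `split_into_stages` and `LINEAGE` are hypothetical names introduced for illustration, and the parsing assumes a single linear lineage (no joins or unions).

```python
import re

# Lineage text copied from the question's toDebugString() output.
LINEAGE = """\
(8) PythonRDD[27] at RDD at PythonRDD.scala:44 [Serialized 1x Replicated]
 |  MappedRDD[26] at values at NativeMethodAccessorImpl.java:-2 [Serialized 1x Replicated]
 |  ShuffledRDD[25] at partitionBy at NativeMethodAccessorImpl.java:-2 [Serialized 1x Replicated]
 +-(8) PairwiseRDD[24] at distinct at <stdin>:1 [Serialized 1x Replicated]
    |  PythonRDD[23] at distinct at <stdin>:1 [Serialized 1x Replicated]
    |  ParallelCollectionRDD[21] at parallelize at PythonRDD.scala:358 [Serialized 1x Replicated]
"""

# Matches an RDD class name followed by its id, e.g. "ShuffledRDD[25]".
RDD_NAME = re.compile(r"(\w+)\[(\d+)\]")


def split_into_stages(lineage):
    """Group RDD class names by stage (hypothetical helper).

    Assumption: a line whose content starts with '+-' sits just below a
    shuffle boundary, so it opens a new group. Only linear lineages are
    handled here.
    """
    stages = [[]]
    for line in lineage.splitlines():
        match = RDD_NAME.search(line)
        if match is None:
            continue  # skip blank or unrecognised lines
        if line.lstrip().startswith("+-"):
            stages.append([])
        stages[-1].append(match.group(1))
    return stages


stages = split_into_stages(LINEAGE)
for index, rdds in enumerate(stages):
    # Index 0 is the final stage; higher indexes run earlier.
    print(index, rdds)
```

Read this way, distinct() on the parallelized list produces two stages: the lower group (ParallelCollectionRDD up to PairwiseRDD) runs first, and after the shuffle the upper group (ShuffledRDD up to the final PythonRDD) runs. The snippet in the question uses the Python 2 print statement; under Python 3, recent PySpark releases appear to return bytes from toDebugString(), so something like `print(a.toDebugString().decode())` is probably needed there.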

rdd pyspark
