value join is not a member of org.apache.spark.rdd.RDD

sds*_*sds 3 scala apache-spark

I am getting this error:

value join is not a member of 
    org.apache.spark.rdd.RDD[(Long, (Int, (Long, String, Array[_0])))
        forSome { type _0 <: (String, Double) }]

The only suggestion I have found is to import org.apache.spark.SparkContext._, which I have already done.

What am I doing wrong?

Edit: changing the code to get rid of the forSome (i.e., so that the object has type org.apache.spark.rdd.RDD[(Long, (Int, (Long, String, Array[(String, Double)])))]) fixed the problem. Is this a bug in Spark?
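
For concreteness, here is a minimal spark-shell sketch of the forSome-free shape described in the edit (the empty Seq is just a stand-in for the real data):

scala> val s = Seq.empty[(Long, (Int, (Long, String, Array[(String, Double)])))]
scala> val r = sc.parallelize(s)
scala> r.join(r) // resolves: K = Long, V = (Int, (Long, String, Array[(String, Double)]))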

Dan*_*bos 6

join is indeed a member of org.apache.spark.rdd.PairRDDFunctions. So why doesn't the implicit conversion kick in?

scala> val s = Seq[(Long, (Int, (Long, String, Array[_0]))) forSome { type _0 <: (String, Double) }]()
scala> val r = sc.parallelize(s)
scala> r.join(r) // Gives your error message.
scala> val p = new org.apache.spark.rdd.PairRDDFunctions(r)
<console>:25: error: no type parameters for constructor PairRDDFunctions: (self: org.apache.spark.rdd.RDD[(K, V)])(implicit kt: scala.reflect.ClassTag[K], implicit vt: scala.reflect.ClassTag[V], implicit ord: Ordering[K])org.apache.spark.rdd.PairRDDFunctions[K,V] exist so that it can be applied to arguments (org.apache.spark.rdd.RDD[(Long, (Int, (Long, String, Array[_0]))) forSome { type _0 <: (String, Double) }])
 --- because ---
argument expression's type is not compatible with formal parameter type;
 found   : org.apache.spark.rdd.RDD[(Long, (Int, (Long, String, Array[_0]))) forSome { type _0 <: (String, Double) }]
 required: org.apache.spark.rdd.RDD[(?K, ?V)]
Note: (Long, (Int, (Long, String, Array[_0]))) forSome { type _0 <: (String, Double) } >: (?K, ?V), but class RDD is invariant in type T.
You may wish to define T as -T instead. (SLS 4.5)
       val p = new org.apache.spark.rdd.PairRDDFunctions(r)
               ^
<console>:25: error: type mismatch;
 found   : org.apache.spark.rdd.RDD[(Long, (Int, (Long, String, Array[_0]))) forSome { type _0 <: (String, Double) }]
 required: org.apache.spark.rdd.RDD[(K, V)]
       val p = new org.apache.spark.rdd.PairRDDFunctions(r)

I'm sure the error message is perfectly clear to everyone else, but just for my own slow self, let's try to understand it. PairRDDFunctions has two type parameters, K and V. Your forSome covers the whole pair, so it cannot be split into separate K and V types. There are no K and V such that RDD[(K, V)] would equal your RDD type.
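
To see the same constraint without Spark, here is a minimal sketch you can paste into a plain Scala 2 REPL. All names are made up for illustration: Bag stands in for RDD, PairOps for PairRDDFunctions, and the ClassTag and Ordering parameters are dropped. The conversion only applies when the element type can be split into a concrete (K, V):

import scala.language.implicitConversions

// Bag plays the role of RDD; PairOps plays the role of PairRDDFunctions.
class Bag[T](val elems: Seq[T])

class PairOps[K, V](self: Bag[(K, V)]) {
  def join(other: Bag[(K, V)]): Seq[(K, (V, V))] =
    for ((k, v1) <- self.elems; (k2, v2) <- other.elems if k == k2)
      yield (k, (v1, v2))
}

implicit def bagToPairOps[K, V](b: Bag[(K, V)]): PairOps[K, V] = new PairOps(b)

// Element type is literally a pair, so K and V are inferred and join resolves.
val ok = new Bag(Seq((1L, "a")))
ok.join(ok)

// Here the forSome wraps the whole pair, so no single K and V fit, the
// conversion does not apply, and uncommenting the join gives
// "value join is not a member of Bag[...]".
val bad = new Bag(Seq.empty[(Long, Array[_0]) forSome { type _0 <: String }])
// bad.join(bad)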

However, you can make the forSome cover only the value instead of the whole pair. Now join works, because this type can be split into K and V.

scala> val s2 = Seq[(Long, (Int, (Long, String, Array[_0])) forSome { type _0 <: (String, Double) })]()
scala> val r2 = sc.parallelize(s2)
scala> r2.join(r2)
res0: org.apache.spark.rdd.RDD[(Long, ((Int, (Long, String, Array[_0])) forSome { type _0 <: (String, Double) }, (Int, (Long, String, Array[_0])) forSome { type _0 <: (String, Double) }))] = MapPartitionsRDD[5] at join at <console>:26
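
If you are starting from an RDD that already has the forSome over the whole pair (as in the question), one blunt way to push the existential into the value position is a cast. This is only a hedged sketch continuing the session above: V is a local type alias introduced here, and the cast is acceptable at runtime only because both element types erase to the same Tuple2:

scala> type V = (Int, (Long, String, Array[_0])) forSome { type _0 <: (String, Double) }
scala> val rFixed = r.asInstanceOf[org.apache.spark.rdd.RDD[(Long, V)]] // same erasure, so no runtime cast failure
scala> rFixed.join(rFixed) // now K = Long and V is the alias above, the same shape that joined in the snippet just shown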