Shi*_*ude 44 scala apache-spark
I'm trying to import spark.implicits._. Apparently, this is an object inside the SparkSession class in Scala. When I import it inside a method like this:

```scala
def f() = {
  val spark = SparkSession()....
  import spark.implicits._
}
```
it works fine, but I'm writing a test class and I want this import to be available for all tests. I tried:
```scala
class SomeSpec extends FlatSpec with BeforeAndAfter {
  var spark: SparkSession = _

  // This won't compile
  import spark.implicits._

  before {
    spark = SparkSession()....
    // This won't either
    import spark.implicits._
  }

  "a test" should "run" in {
    // Even this won't compile (although it already looks bad here)
    import spark.implicits._

    // This was the only way I could make it work
    val spark = this.spark
    import spark.implicits._
  }
}
```
Not only does this look bad, I don't want to repeat it in every test. What is the "correct" way to do this?
blu*_*e10 19
You can do something similar to what is done in the Spark test suites. For example, this will work (inspired by SQLTestData):
```scala
class SomeSpec extends FlatSpec with BeforeAndAfter { self =>
  var spark: SparkSession = _

  private object testImplicits extends SQLImplicits {
    protected override def _sqlContext: SQLContext = self.spark.sqlContext
  }
  import testImplicits._

  before {
    spark = SparkSession.builder().master("local").getOrCreate()
  }

  "a test" should "run" in {
    // implicits are working
    val df = spark.sparkContext.parallelize(List(1, 2, 3)).toDF()
  }
}
```
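A note on the `self =>` alias in the snippet above: inside the nested `testImplicits` object, `this` refers to the object itself, so the outer suite's `spark` must be reached through the alias. A minimal sketch of the same pattern in plain Scala (no Spark needed; `Outer`, `Inner`, and `state` are illustrative names, not Spark's):

```scala
// Inside a nested object, `this` refers to the object itself,
// so the enclosing instance must be reached through the alias.
class Outer { self =>
  var state: String = "initial"

  object Inner {
    // `this` here would be Inner; `self` is the enclosing Outer instance.
    def readOuter: String = self.state
  }
}

object SelfAliasDemo extends App {
  val o = new Outer
  o.state = "updated"
  println(o.Inner.readOuter) // prints "updated"
}
```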
Alternatively, you can use something like SharedSQLContext directly, which provides a testImplicits: SQLImplicits, i.e.:
```scala
class SomeSpec extends FlatSpec with SharedSQLContext {
  import testImplicits._
  // ...
}
```
Keh*_*CAI 10
I think the code in the SparkSession.scala file on GitHub gives you a good hint:
```scala
/**
 * :: Experimental ::
 * (Scala-specific) Implicit methods available in Scala for converting
 * common Scala objects into [[DataFrame]]s.
 *
 * {{{
 *   val sparkSession = SparkSession.builder.getOrCreate()
 *   import sparkSession.implicits._
 * }}}
 *
 * @since 2.0.0
 */
@Experimental
object implicits extends SQLImplicits with Serializable {
  protected override def _sqlContext: SQLContext = SparkSession.this.sqlContext
}
```
Here, the "spark" in "spark.implicits._" is just the sparkSession object we created: implicits is an object defined inside the SparkSession class, so the import needs a concrete session instance as its prefix.
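This also explains why the question's `var spark` versions would not compile: a Scala import prefix must be a stable identifier (a `val`, an `object`, or a package member), and a `var` is not stable. A minimal sketch in plain Scala (the `Container`/`members` names are illustrative, not Spark's):

```scala
// Scala only allows imports from stable identifiers (val / object),
// never from a var.
class Container {
  object members {
    val answer: Int = 42
  }
}

object StableImportDemo extends App {
  var mutable = new Container
  // import mutable.members._        // does not compile: unstable prefix
  val stable = mutable               // rebinding to a val makes it stable
  import stable.members._            // compiles
  println(answer) // prints 42
}
```

This is exactly the questioner's `val spark = this.spark` workaround: rebinding the `var` to a local `val` makes it a stable identifier that can be imported from.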
Here is another reference! I just instantiate the SparkSession first and then import the implicits. Because it is a lazy val (a stable identifier) rather than a var, the import compiles at class level:
```scala
@transient lazy val spark = SparkSession
  .builder()
  .master("spark://master:7777")
  .getOrCreate()

import spark.implicits._
```