sbt test fails for Spark tests

Tags: derby, sbt, apache-spark

I have a simple Spark function to test DataFrame windowing:

    import org.apache.spark.sql.{DataFrame, SparkSession}

    object ScratchPad {

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().master("local[*]").getOrCreate()
        spark.sparkContext.setLogLevel("ERROR")
        get_data_frame(spark).show()
      }

      def get_data_frame(spark: SparkSession): DataFrame = {
        import spark.sqlContext.implicits._
        val hr = spark.sparkContext.parallelize(List(
          ("Steinbeck", "Sales", 100),
          ("Woolf", "IT", 99),
          ("Wodehouse", "Sales", 250),
          ("Hemingway", "IT", 349)
        )
        ).toDF("emp", "dept", "sal")

        import org.apache.spark.sql.expressions.Window
        import org.apache.spark.sql.functions._

        val windowspec = Window.partitionBy($"dept").orderBy($"sal".desc)


        hr.withColumn("rank", row_number().over(windowspec))

      }
    }

I wrote a test like this:

    import com.holdenkarau.spark.testing.DataFrameSuiteBase
    import org.apache.spark.sql.Row
    import org.apache.spark.sql.types._
    import org.scalatest.FunSuite

    class TestDF extends FunSuite with DataFrameSuiteBase  {

      test ("DFs equal") {
        val expected=sc.parallelize(List(
          Row("Wodehouse","Sales",250,1),
          Row("Steinbeck","Sales",100,2),
          Row("Hemingway","IT",349,1),
          Row("Woolf","IT",99,2)
        ))

        val schema=StructType(
          List(
          StructField("emp",StringType,true),
          StructField("dept",StringType,true),
          StructField("sal",IntegerType,false),
          StructField("rank",IntegerType,true)
          )
        )

        val e2=sqlContext.createDataFrame(expected,schema)
        val actual=ScratchPad.get_data_frame(sqlContext.sparkSession)
        assertDataFrameEquals(e2,actual)
      }
    }

This works fine when I right-click the class in IntelliJ and click "Run". When I run the same test with `sbt test`, it fails with the following:

    java.security.AccessControlException: access denied 
    org.apache.derby.security.SystemPermission( "engine", 
    "usederbyinternals" )
        at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
        at java.security.AccessController.checkPermission(AccessController.java:884)
        at org.apache.derby.iapi.security.SecurityUtil.checkDerbyInternalsPrivilege(Unknown Source)
        ...

Here is my sbt build file, nothing fancy. I had to pull in the Hive dependency, otherwise the test wouldn't compile:

    name := "WindowingTest"

    version := "0.1"

    scalaVersion := "2.11.5"


    libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.1"
    libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.1"
    libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.2.1"
    libraryDependencies += "com.holdenkarau" %% "spark-testing-base" % "2.2.0_0.8.0" % "test"

Googling pointed me to DERBY-6648 (https://db.apache.org/derby/releases/release-10.12.1.1.cgi),

which says: Application changes required: users who run Derby under a SecurityManager must edit their policy file and grant the following additional permission to derby.jar, derbynet.jar, and derbyoptionaltools.jar:

    permission org.apache.derby.security.SystemPermission "engine", "usederbyinternals";

Since I didn't install Derby explicitly (it is presumably used internally by Spark), how do I do that?

Dai*_*mon 4

The following quick-and-dirty hack solved the problem:

    System.setSecurityManager(null)

Anyway, since this only affects automated tests, maybe it's not that problematic after all ;)
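The call just has to run before the SparkSession (and hence the Derby-backed metastore) is initialized, e.g. at the top of the test class or in a `beforeAll`. A minimal standalone sketch of the hack itself (the object name is made up; the try/catch is only there because newer JDKs, where the SecurityManager is being phased out, may throw on the setter):

```scala
object DisableSecurityManager {
  def main(args: Array[String]): Unit = {
    // Clearing the SecurityManager before any Derby code runs avoids the
    // AccessControlException seen under `sbt test`.
    try System.setSecurityManager(null)
    catch { case _: UnsupportedOperationException => () } // harmless: no manager to clear
    println(System.getSecurityManager == null)
  }
}
```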

  • It only works as long as you don't use Hive functionality :) (2 upvotes)
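If the hack is not an option (e.g. because Hive features are needed), the permission from DERBY-6648 can instead be granted via a Java policy file passed to a forked test JVM. A sketch, assuming a file named `derby.policy` in the project root (the file name and the decision to grant it to all code, rather than just the Derby jars, are assumptions):

```
// derby.policy (hypothetical file name)
grant {
  permission org.apache.derby.security.SystemPermission "engine", "usederbyinternals";
};
```

Then in `build.sbt`, fork the test JVM and point it at the policy file:

```
fork in Test := true
javaOptions in Test += "-Djava.security.policy=derby.policy"
```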