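We can create a new Spark session in spark-shell by calling spark.newSession. My question is: what is a new Spark session instance actually useful for?
The two most common use cases are:
Keeping sessions whose configuration differs only slightly: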
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_141)
Type in expressions to have them evaluated.
Type :help for more information.
scala> spark.range(100).groupBy("id").count.rdd.getNumPartitions
res0: Int = 200
scala>
scala> val newSpark = spark.newSession
newSpark: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@618a9cb7
scala> newSpark.conf.set("spark.sql.shuffle.partitions", 99)
scala> newSpark.range(100).groupBy("id").count.rdd.getNumPartitions
res2: Int = 99
scala> spark.range(100).groupBy("id").count.rdd.getNumPartitions // No effect on initial session
res3: Int = 200
Separating temporary namespaces:
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_141)
Type in expressions to have them evaluated.
Type :help for more information.
scala> spark.range(1).createTempView("foo")
scala>
scala> spark.catalog.tableExists("foo")
res1: Boolean = true
scala>
scala> val newSpark = spark.newSession
newSpark: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@73418044
scala> newSpark.catalog.tableExists("foo")
res2: Boolean = false
scala> newSpark.range(100).createTempView("foo") // No exception
scala> spark.table("foo").count // No effect on initial session
res4: Long = 1
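Outside the shell, the same two behaviours can be checked in a small standalone program. This is only a sketch: the object name, the local[*] master and the app name are assumptions made for illustration, and it needs Spark SQL on the classpath. It also checks that both sessions run on the same SparkContext, which is what makes newSession cheaper than starting a second application.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name; not part of the original answer.
object NewSessionSketch {
  def main(args: Array[String]): Unit = {
    // Local master and app name are illustrative assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("new-session-sketch")
      .getOrCreate()

    val newSpark = spark.newSession()

    // Both sessions share one underlying SparkContext.
    assert(spark.sparkContext eq newSpark.sparkContext)

    // Use case 1: SQL configuration is isolated per session.
    val before = spark.conf.get("spark.sql.shuffle.partitions")
    newSpark.conf.set("spark.sql.shuffle.partitions", 99L)
    assert(newSpark.conf.get("spark.sql.shuffle.partitions") == "99")
    assert(spark.conf.get("spark.sql.shuffle.partitions") == before)

    // Use case 2: temporary views are isolated per session.
    spark.range(1).createOrReplaceTempView("foo")
    assert(spark.catalog.tableExists("foo"))
    assert(!newSpark.catalog.tableExists("foo"))

    // Stopping either session stops the shared SparkContext.
    spark.stop()
  }
}
```

Because the SparkContext is shared, calling stop() on one session ends the other as well, so in a longer-lived application it is probably best called once, at the very end.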