Posts by use*_*684

On a simple head() call, Koalas throws "Can't get attribute _fill_function" on <module 'pyspark.cloudpickle'

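When I put the following code in a Python script and run it directly with python, I get the error below. When I instead start a pyspark session, import koalas, create the DataFrame, and call head(), it works fine and gives the expected output.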

Is there a specific way the SparkSession needs to be set up for Koalas to work?

from pyspark.sql import SparkSession
import pandas as pd
import databricks.koalas as ks


spark = SparkSession.builder \
        .master("local[*]") \
        .appName("Pycedro Spark Application") \
        .getOrCreate()


kdf = ks.DataFrame({"a": [4, 5, 6],
                    "b": [7, 8, 9],
                    "c": [10, 11, 12]})

print(kdf.head())

Error when running it as a Python script:

  File "/usr/local/Cellar/apache-spark/3.1.1/libexec/python/lib/pyspark.zip/pyspark/worker.py", line 586, in main
    func, profiler, deserializer, serializer = read_command(pickleSer, infile)
  File "/usr/local/Cellar/apache-spark/3.1.1/libexec/python/lib/pyspark.zip/pyspark/worker.py", line 69, in read_command
    command = serializer._read_with_length(file)
  File "/usr/local/Cellar/apache-spark/3.1.1/libexec/python/lib/pyspark.zip/pyspark/serializers.py", line …
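One thing that may be worth checking, offered as an assumption since nothing in the question confirms the cause: the traceback shows the workers running from the Homebrew Spark 3.1.1 install, while a script launched with plain python may import a different, separately installed pyspark on the driver side. The sketch below prints what the driver interpreter actually imports so it can be compared against the 3.1.1 in the traceback. The helper names (major_minor, versions_compatible, report) are hypothetical, and treating a matching major.minor as "compatible" is a simplifying assumption, not a rule taken from Spark.

```python
import os
import re


def major_minor(version):
    """Return (major, minor) parsed from a version string such as '3.1.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_compatible(driver_version, worker_version):
    """Hypothetical check: call two versions compatible if major.minor match."""
    return major_minor(driver_version) == major_minor(worker_version)


def report():
    """Print which pyspark the current interpreter imports, plus SPARK_HOME."""
    try:
        import pyspark
    except ImportError:
        print("pyspark is not importable from this interpreter")
        return
    print("driver pyspark version: ", pyspark.__version__)
    print("driver pyspark location:", os.path.dirname(pyspark.__file__))
    print("SPARK_HOME:             ", os.environ.get("SPARK_HOME", "<not set>"))


if __name__ == "__main__":
    report()
```

Running this with the same python command used for the failing script, and again inside the pyspark session that works, would show whether the two environments pick up the same pyspark. If the versions differ, that would be a candidate explanation to investigate, not a confirmed diagnosis.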

spark-koalas

Score: 6 · Answers: 1 · Views: 3711
