Posted by And*_*res

PySpark MongoDB :: java.lang.NoClassDefFoundError: com/mongodb/client/model/Collation

I am trying to connect to MongoDB Atlas from PySpark, but I run into the following problem:

from pyspark import SparkContext
from pyspark.sql import SparkSession
from pyspark.sql.types import *
from pyspark.sql.functions import *

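# Build the session; the MongoDB connector reads its URIs from these options.
spark = SparkSession.builder \
        .config("spark.mongodb.input.uri", "mongodb+srv://#USER#:#PASS#@test00-la3lt.mongodb.net/db.BUSQUEDAS?retryWrites=true") \
        .config("spark.mongodb.output.uri", "mongodb+srv://#USER#:#PASS#@test00-la3lt.mongodb.net/db.BUSQUEDAS?retryWrites=true") \
        .getOrCreate()

# `sc = SparkContext` would bind the class itself; take the instance from the session.
sc = spark.sparkContext

# Load the collection named in the input URI as a DataFrame.
df = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()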

This code returns the following error:

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-3-346df2de8d22> in <module>()
----> 1 df = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()

c:\users\andres\appdata\local\programs\python\python36\lib\site-packages\pyspark\sql\readwriter.py in load(self, path, format, schema, **options)
    170             return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
    171         else:
--> 172             return self._df(self._jreader.load())
    173 
    174     @since(1.4)

c:\users\andres\appdata\local\programs\python\python36\lib\site-packages\py4j\java_gateway.py in __call__(self, *args)
   1255         answer = …
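For context, a NoClassDefFoundError on com/mongodb/client/model/Collation usually suggests that the MongoDB Java driver on Spark's classpath is missing or, as far as I know, older than 3.4, the release where collation support appeared. One common way to address that is to let Spark resolve the connector together with its transitive driver dependency through the `spark.jars.packages` option. The following is a minimal sketch under stated assumptions, not a confirmed fix for this setup: the helper names `connector_coordinate`, `session_options` and `build_session` are hypothetical, and the Scala and connector versions (2.11 and 2.4.1) are assumptions that would have to match the installed Spark build.

```python
def connector_coordinate(scala_version="2.11", connector_version="2.4.1"):
    """Return the Maven coordinate of the MongoDB Spark connector.

    Both defaults are assumptions; they must match the local Spark build.
    """
    return "org.mongodb.spark:mongo-spark-connector_{}:{}".format(
        scala_version, connector_version
    )


def session_options(uri, coordinate):
    """Collect the Spark options applied by build_session()."""
    return {
        # Ask Spark to download the connector plus its transitive
        # dependencies (including the MongoDB Java driver).
        "spark.jars.packages": coordinate,
        "spark.mongodb.input.uri": uri,
        "spark.mongodb.output.uri": uri,
    }


def build_session(uri, coordinate=None):
    """Create a SparkSession with the connector requested as a package."""
    # Imported lazily so the helpers above work without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder
    options = session_options(uri, coordinate or connector_coordinate())
    for key, value in options.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling `build_session` with the same URI as in the question and then running the same `spark.read.format(...).load()` line would exercise this. Note that `spark.jars.packages` generally only takes effect if it is set before the JVM starts, so in a notebook the kernel likely needs a restart first.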

mongodb apache-spark pyspark

Score: 5 · Answers: 1 · Views: 2702
