Spark MongoDB Connector Scala - Missing database name

Dan*_*nov 6 scala mongodb apache-spark apache-spark-sql

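I've run into a strange problem. I'm trying to connect a local Spark instance to MongoDB using the MongoDB Spark connector.

Apart from setting up Spark, I'm using the following code: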

val readConfig = ReadConfig(Map("uri" -> "mongodb://localhost:27017/movie_db.movie_ratings", "readPreference.name" -> "secondaryPreferred"), Some(ReadConfig(sc)))
val writeConfig = WriteConfig(Map("uri" -> "mongodb://127.0.0.1/movie_db.movie_ratings"))

// Load the movie rating data from Mongo DB
val movieRatings = MongoSpark.load(sc, readConfig).toDF()

movieRatings.show(100)

However, I get the following exception:

java.lang.IllegalArgumentException: Missing database name. Set via the 'spark.mongodb.input.uri' or 'spark.mongodb.input.database' property.

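It is thrown on the line where I set up readConfig. I don't understand why it complains that the URI is not set when the uri property is clearly present in the Map. I may be missing something.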

mrs*_*vas 7

You can configure this on the SparkSession:

val spark = SparkSession.builder()
    .master("local")
    .appName("MongoSparkConnectorIntro")
    .config("spark.mongodb.input.uri", "mongodb://localhost:27017/movie_db.movie_ratings")
    .config("spark.mongodb.input.readPreference.name", "secondaryPreferred")
    .config("spark.mongodb.output.uri", "mongodb://127.0.0.1/movie_db.movie_ratings")
    .getOrCreate()

Create a DataFrame using the configuration:

val readConfig = ReadConfig(Map("uri" -> "mongodb://localhost:27017/movie_db.movie_ratings", "readPreference.name" -> "secondaryPreferred"))
val df = MongoSpark.load(spark, readConfig)

Write df to MongoDB:

MongoSpark.save(
    df.write
        .option("spark.mongodb.output.uri", "mongodb://127.0.0.1/movie_db.movie_ratings")
        .mode("overwrite"))

In your code, the prefixes are missing from the configuration keys:

val readConfig = ReadConfig(Map(
    "spark.mongodb.input.uri" -> "mongodb://localhost:27017/movie_db.movie_ratings", 
    "spark.mongodb.input.readPreference.name" -> "secondaryPreferred"), 
    Some(ReadConfig(sc)))

val writeConfig = WriteConfig(Map(
    "spark.mongodb.output.uri" -> "mongodb://127.0.0.1/movie_db.movie_ratings"))
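Putting the pieces together, here is a minimal sketch of a complete program. It assumes the 2.x line of the MongoDB Spark connector (the `com.mongodb.spark` package used above) and a MongoDB instance running locally. The object name `MovieRatingsExample` is made up for illustration. The session carries the input and output URIs, and the `ReadConfig` only overrides the read preference on top of those defaults.

```scala
import org.apache.spark.sql.SparkSession
import com.mongodb.spark.MongoSpark
import com.mongodb.spark.config.ReadConfig

// Hypothetical object name, used only for this example.
object MovieRatingsExample {
  def main(args: Array[String]): Unit = {
    // The database and collection come from the URIs set on the session.
    val spark = SparkSession.builder()
      .master("local")
      .appName("MongoSparkConnectorIntro")
      .config("spark.mongodb.input.uri", "mongodb://localhost:27017/movie_db.movie_ratings")
      .config("spark.mongodb.output.uri", "mongodb://127.0.0.1/movie_db.movie_ratings")
      .getOrCreate()

    // Take the defaults from the session configuration and override
    // only the read preference.
    val readConfig = ReadConfig(
      Map("spark.mongodb.input.readPreference.name" -> "secondaryPreferred"),
      Some(ReadConfig(spark.sparkContext)))

    // Load the movie rating data from MongoDB as a DataFrame.
    val movieRatings = MongoSpark.load(spark, readConfig)
    movieRatings.show(100)

    spark.stop()
  }
}
```

Because the defaults are read from the session, this sketch only works if `spark.mongodb.input.uri` is set before `ReadConfig(spark.sparkContext)` is called.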