Rea*_*ger 1 hadoop scala cassandra apache-spark apache-spark-sql
I am new to Spark and Cassandra. I am using this code, but it gives me an error.
val dfprev = df.select(col = "se","hu")
val a = dfprev.select("se")
val b = dfprev.select("hu")
val collection = sc.parallelize(Seq(a,b))
collection.saveToCassandra("keyspace", "table", SomeColumns("se","hu"))
When this code reaches the saveToCassandra call, it fails with the following error:
java.lang.IllegalArgumentException: Multiple constructors with the same number of parameters not allowed.
  at com.datastax.spark.connector.util.Reflect$.methodSymbol(Reflect.scala:16)
  at com.datastax.spark.connector.util.ReflectionUtil$.constructorParams(ReflectionUtil.scala:63)
  at com.datastax.spark.connector.mapper.DefaultColumnMapper.<init>(DefaultColumnMapper.scala:45)
  at com.datastax.spark.connector.mapper.LowPriorityColumnMapper$class.defaultColumnMapper(ColumnMapper.scala:51)
  at com.datastax.spark.connector.mapper.ColumnMapper$.defaultColumnMapper(ColumnMapper.scala:55)
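import org.apache.spark.sql.SaveMode

// Select the two columns and write the DataFrame straight to Cassandra.
val dfprev = df.select("se", "hu")
dfprev.write.format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "YOUR_KEYSPACE_NAME", "table" -> "YOUR_TABLE_NAME"))
  .mode(SaveMode.Append)  // add rows to the existing table
  .save()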
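The variables a and b are both DataFrames. sc.parallelize creates an RDD from a local collection of elements; it is not meant to take DataFrames as input. Here it produces an RDD whose elements are DataFrame objects rather than rows, and the connector cannot map that element type to the table columns.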
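If you would rather keep the RDD-based saveToCassandra call from the question, the rows of the DataFrame can be turned into tuples first. This is only a sketch: the helper name saveViaRdd is made up for illustration, and it assumes both columns hold text, which the question does not state.

```scala
import com.datastax.spark.connector._   // adds saveToCassandra and SomeColumns
import org.apache.spark.sql.DataFrame

// Hypothetical helper, not part of the original answer.
// Assumption: "se" and "hu" are both string columns; change the
// type parameters of getAs if the real schema differs.
def saveViaRdd(df: DataFrame): Unit = {
  val pairs = df.select("se", "hu").rdd
    .map(row => (row.getAs[String]("se"), row.getAs[String]("hu")))

  // Each tuple is written as one row; tuple fields map to the listed columns in order.
  pairs.saveToCassandra("keyspace", "table", SomeColumns("se", "hu"))
}
```

The DataFrame writer is usually the simpler route, since it takes the column names and types from the DataFrame schema instead of requiring a manual conversion.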
Note: set spark.cassandra.connection.host, and, if authentication is enabled, spark.cassandra.auth.username and spark.cassandra.auth.password, in the SparkConf.
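As a rough illustration of those settings, the configuration could look something like this. The object name, the app name, the host address and the credentials are all placeholders, and the master URL is assumed to be supplied by spark-submit.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object CassandraSession {
  // Builds a session with the connector settings from the note above.
  // All values are placeholders, not taken from the question.
  def build(): SparkSession = {
    val conf = new SparkConf()
      .setAppName("cassandra-write-example")                   // hypothetical app name
      .set("spark.cassandra.connection.host", "127.0.0.1")     // Cassandra contact point
      .set("spark.cassandra.auth.username", "YOUR_USERNAME")   // only if auth is enabled
      .set("spark.cassandra.auth.password", "YOUR_PASSWORD")   // only if auth is enabled

    SparkSession.builder().config(conf).getOrCreate()
  }
}
```

The same keys can also be passed on the command line with --conf when submitting the job, which keeps credentials out of the source code.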