I set the parameter spark.cassandra.output.batch.size.rows in my SparkConf as follows:
val conf = new SparkConf(true)
.set("spark.cassandra.connection.host", "host")
.set("spark.cassandra.auth.username", "cassandra")
.set("spark.cassandra.auth.password", "cassandra")
.set("spark.cassandra.output.batch.size.rows", "5120")
.set("spark.cassandra.output.concurrent.writes", "10")
But when I run
saveToCassandra("data", "ten_days")
I keep seeing warnings like these in my system.log:
INFO [FlushWriter:7] 2014-11-20 11:11:16,498 Memtable.java (line 395) Completed flushing /var/lib/cassandra/data/system/hints/system-hints-jb-76-Data.db (5747287 bytes) for commitlog position ReplayPosition(segmentId=1416480663951, position=44882909)
INFO [FlushWriter:7] 2014-11-20 11:11:16,499 Memtable.java (line 355) Writing Memtable-ten_days@1656582530(32979978/329799780 serialized/live bytes, 551793 ops)
WARN [Native-Transport-Requests:761] 2014-11-20 11:11:16,499 BatchStatement.java (line 226) Batch of prepared statements for [data.ten_days] is of size 36825, exceeding specified threshold of 5120 by 31705.
WARN [Native-Transport-Requests:777] 2014-11-20 11:11:16,500 BatchStatement.java …