I am trying to save streaming data to Cassandra using Spark and the Spark Cassandra Connector.
I did something like the following:
Create a model class:
public class ContentModel {
    String id;
    String available_at; // may be null

    public ContentModel(String id, String available_at) {
        this.id = id;
        this.available_at = available_at;
    }
}
Map the stream content to the model:
JavaDStream<ContentModel> contentsToModel = myStream.map(new Function<String, ContentModel>() {
    @Override
    public ContentModel call(String content) throws Exception {
        String[] parts = content.split(",");
        return new ContentModel(parts[0], parts[1]);
    }
});
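As a side note on where the null can come from: Java's `String.split(",")` drops trailing empty strings, so a line such as `42,` yields a one-element array and `parts[1]` throws `ArrayIndexOutOfBoundsException` instead of producing a null. The sketch below shows one way the split could be made null-safe, assuming each stream line has the form `id,available_at` and that an empty or missing second field should become null. The helper name `ContentLineParser` is hypothetical and is not part of Spark or the connector.

```java
final class ContentLineParser {
    private ContentLineParser() {}

    /**
     * Splits a line of the assumed form "id,available_at" into exactly two fields.
     * A blank or missing field is returned as null.
     */
    static String[] parse(String content) {
        // A limit of -1 keeps trailing empty strings, so "42," gives ["42", ""].
        String[] parts = content.split(",", -1);
        String[] fields = new String[2];
        for (int i = 0; i < fields.length; i++) {
            if (i < parts.length) {
                String value = parts[i].trim();
                // Treat blank values as absent.
                fields[i] = value.isEmpty() ? null : value;
            }
        }
        return fields;
    }
}
```

Inside `call`, the two array elements could then be passed to the `ContentModel` constructor in place of `parts[0]` and `parts[1]`.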
Save:
CassandraStreamingJavaUtil.javaFunctions(contentsToModel).writerBuilder("data", "contents", CassandraJavaUtil.mapToRow(ContentModel.class)).saveToCassandra();
If some of the values are null, I get the following error:
com.datastax.spark.connector.types.TypeConversionException: Cannot convert object null to struct.ValueRepr.
Is there a way to store null values using the Spark Cassandra Connector?