Posts by Aks*_*mar

Spark NullPointerException

My Spark code looks like this:

val logData = sc.textFile("hdfs://localhost:9000/home/akshat/recipes/recipes/simplyrecipes/*/*/*/*")


def doSomething(line: String): (Long, Long) = {
  val numAs = logData.filter(line => line.contains("a")).count()
  val numBs = logData.filter(line => line.contains("b")).count()
  (numAs, numBs)
}

val mapper = logData.map(doSomething _)

val save = mapper.saveAsTextFile("hdfs://localhost:9000/home/akshat/output3")

mapper has type org.apache.spark.rdd.RDD[(Long, Long)] = MappedRDD. When I try to run the saveAsTextFile action, it fails with java.lang.NullPointerException.

What am I doing wrong, and what should I change to fix this exception?
Thanks in advance!
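
A likely cause, offered as a sketch and not verified against this exact setup: doSomething calls logData.filter(...) while it is itself running inside logData.map(...), and Spark does not support using an RDD inside another RDD's transformation. The example below uses a plain Seq[String] as a stand-in for the RDD so that it runs without Spark. The names CountSketch, flagsFor, and totals are hypothetical. It shows the two shapes that avoid the nesting: a per-line function that reads only its own argument, and whole-dataset counts computed once outside any map.

```scala
object CountSketch {
  // Per-line version: reads only its own argument, so it is safe to pass to map.
  def flagsFor(line: String): (Boolean, Boolean) =
    (line.contains("a"), line.contains("b"))

  // Whole-dataset version: computed once, outside any map call.
  // Assumption: a Seq[String] stands in for the RDD[String] from sc.textFile.
  def totals(lines: Seq[String]): (Long, Long) = {
    val numAs = lines.count(_.contains("a")).toLong
    val numBs = lines.count(_.contains("b")).toLong
    (numAs, numBs)
  }

  def main(args: Array[String]): Unit = {
    val lines = Seq("apple", "bob", "cab", "xyz")
    println(totals(lines))        // counts over the whole dataset
    println(lines.map(flagsFor))  // one result per line
  }
}
```

With a real RDD, the same two shapes would be logData.filter(_.contains("a")).count() called directly on the driver, or logData.map(flagsFor) for a per-line result.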

nullpointerexception apache-spark

Score: 1 | Answers: 1 | Views: 788

Spark type mismatch error

I have a function as follows:

def doSomething(line: RDD[(String, String)]): String = {
  val c = line.toLocalIterator.mkString
  val file2 = KeepEverythingExtractor.INSTANCE.getText(c)
  file2
}

Its type is org.apache.spark.rdd.RDD[(String, String)])String

I have some files stored in HDFS, which I need to access as follows:

val logData = sc.wholeTextFiles("hdfs://localhost:9000/home/akshat/recipes/recipes/simplyrecipes/*/*/*/*")

This has type org.apache.spark.rdd.RDD[(String, String)]

I need to map these files through the doSomething function:

val mapper = logData.map(doSomething)

But I get the following error:

<console>:32: error: type mismatch;
 found   : org.apache.spark.rdd.RDD[(String, String)] => String
 required: ((String, String)) => ?
       val mapper = logData.map(doSomething)
                                ^

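I declared the input and output types in my function, and I am only passing input of that type. Why does this error occur, and what should I change to fix it?
Thanks in advance!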
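
The compiler message suggests that map applies its function to each element of the RDD, here a (String, String) pair, and not to the RDD as a whole. A minimal sketch of a function with the matching shape follows. It assumes a plain Seq[(String, String)] in place of the RDD returned by wholeTextFiles, and a hypothetical extractText helper in place of KeepEverythingExtractor.INSTANCE.getText, so that it runs without Spark or any extraction library. MapSketch and the sample file names are also made up for illustration.

```scala
object MapSketch {
  // Hypothetical stand-in for KeepEverythingExtractor.INSTANCE.getText:
  // strips anything that looks like a tag and trims the result.
  def extractText(html: String): String =
    html.replaceAll("<[^>]*>", "").trim

  // Takes ONE element (path, content), which is the shape map requires,
  // instead of the whole collection of pairs.
  def doSomething(file: (String, String)): String = {
    val (_, content) = file
    extractText(content)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a Seq of (path, content) pairs stands in for the RDD.
    val logData: Seq[(String, String)] = Seq(
      ("a.html", "<p>Hello</p>"),
      ("b.html", "<div>World</div>")
    )
    val mapper = logData.map(doSomething)
    mapper.foreach(println)
  }
}
```

The design point is only the parameter type: once doSomething accepts a single (String, String) pair, it matches the ((String, String)) => ? function that the error message asks for.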

scala type-mismatch apache-spark

Score: 1 | Answers: 1 | Views: 4247