How to use replace on an RDD in Spark Scala

Rav*_*rra 5 scala apache-spark

I have a file of tweets:

396124436845178880,"When's 12.4k gonna roll around",Matty_T_03
396124437168537600,"I really wish I didn't give up everything I did for you.     I'm so mad at my self for even letting it get as far as it did.",savava143
396124436958412800,"I really need to double check who I'm sending my     snapchats to before sending it ",juliannpham
396124437218885632,"@Darrin_myers30 I feel you man, gotta stay prayed up.     Year is important",Ful_of_Ambition
396124437558611968,"tell me what I did in my life to deserve this.",_ItsNotBragging
396124437499502592,"Too many fine men out here...see me drooling",LolaofLife
396124437722198016,"@jaiclynclausen will do",I_harley99

After reading the file into an RDD, I am trying to replace all the special characters:

    val fileReadRdd = sc.textFile(fileInput)
    val fileReadRdd2 = fileReadRdd.map(x => x.map(_.replace(","," ")))
    val fileFlat = fileReadRdd.flatMap(rec => rec.split(" "))

I get the following error:

Error:(41, 57) value replace is not a member of Char
    val fileReadRdd2 = fileReadRdd.map(x => x.map(_.replace(",","")))

Bri*_*new 4

My guess is that:

x => x.map(_.replace(",",""))

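is treating your string as a sequence of characters, so the inner `_` is a `Char` (which has no `replace` method), and what you actually want is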

x => x.replace(",", "")

(i.e. you don't need to map over the "sequence" of characters)
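To make the difference concrete, here is a minimal sketch in plain Scala (no Spark needed to run it). The object name `ReplaceDemo` and the helper `stripCommas` are made up for illustration. The assumption is that the same function body would be passed to `fileReadRdd.map(...)`, since each RDD element from `sc.textFile` is a `String`.

```scala
object ReplaceDemo {
  // Hypothetical helper: the per-line function one would pass to rdd.map.
  // It calls replace on the whole String, not on individual characters.
  def stripCommas(line: String): String = line.replace(",", " ")

  def main(args: Array[String]): Unit = {
    val line = "396124437722198016,\"@jaiclynclausen will do\",I_harley99"

    // String.map visits one Char at a time, so the lambda argument is a Char.
    // Char has no replace(String, String) method, which is what the compiler
    // error in the question reports. A per-character version has to compare
    // and substitute Chars instead:
    val perChar: String = line.map(c => if (c == ',') ' ' else c)

    // Calling replace directly on the String is simpler.
    val whole: String = stripCommas(line)

    println(whole)
    println(perChar == whole) // both approaches yield the same text here

    // Assumed Spark usage, mirroring the question's variable names:
    //   val fileReadRdd2 = fileReadRdd.map(x => x.replace(",", " "))
    //   val fileFlat     = fileReadRdd2.flatMap(rec => rec.split(" "))
  }
}
```

Note that in the question's snippet `fileFlat` is built from `fileReadRdd` rather than `fileReadRdd2`, so the split would presumably need to read from `fileReadRdd2` to see the replaced text.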