I can fill Numeric and String columns using the following:
masterDF = masterDF.na.fill(-1)
masterDF = masterDF.na.fill("")
masterDF = masterDF.na.fill(-1.0)
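But I could not find an API for filling Boolean columns. I tried masterDF = masterDF.na.fill(false), but that is not supported.
Any ideas?
You can pass a Map to fill, where each key is a column name and each value is an Int, Long, Float, Double, String, or Boolean.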
masterDF.na.fill(masterDF.columns.map(_ -> false).toMap)
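The map above names every column of the DataFrame, not only the Boolean ones. If only the Boolean columns should be filled, one option is to build the map from the schema. The following is a minimal sketch under assumptions: the helper `booleanDefaults`, the object name, and the sample columns `id`, `name`, and `active` are hypothetical and not part of the original question, and Spark SQL must be on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{BooleanType, IntegerType, StringType, StructField, StructType}

object BooleanFillSketch {
  // Hypothetical helper: maps the name of every Boolean column to a default value.
  def booleanDefaults(df: DataFrame, default: Boolean): Map[String, Any] =
    df.schema.fields.collect {
      case f if f.dataType == BooleanType => f.name -> default
    }.toMap

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("boolean-fill-sketch")
      .getOrCreate()

    // Sample data (assumed for illustration): the second row has nulls.
    val schema = StructType(Seq(
      StructField("id", IntegerType, nullable = false),
      StructField("name", StringType, nullable = true),
      StructField("active", BooleanType, nullable = true)
    ))
    val rows = Seq(Row(1, "a", true), Row(2, null, null))
    val masterDF = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)

    // Only "active" is in the map, so only its nulls are replaced.
    val filled = masterDF.na.fill(booleanDefaults(masterDF, default = false))
    filled.show()

    spark.stop()
  }
}
```

Because the map only contains Boolean columns, the String and numeric columns can still be filled separately with their own defaults.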
The API documentation says:
/**
 * (Scala-specific) Returns a new `DataFrame` that replaces null values.
 *
 * The key of the map is the column name, and the value of the map is the replacement value.
 * The value must be of the following type: `Int`, `Long`, `Float`, `Double`, `String`, `Boolean`.
 * Replacement values are cast to the column data type.
 *
 * For example, the following replaces null values in column "A" with string "unknown", and
 * null values in column "B" with numeric value 1.0.
 * {{{
 *   df.na.fill(Map(
 *     "A" -> "unknown",
 *     "B" -> 1.0
 *   ))
 * }}}
 *
 * @since 1.3.1
 */
def fill(valueMap: Map[String, Any]): DataFrame = fillMap(valueMap.toSeq)
You can even set a different value for each column in the Map passed to fill.
I hope this answer helps.
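As a rough illustration of per-column defaults, here is a minimal sketch. The column names `name`, `score`, and `active`, the chosen default values, and the helper `fillWithDefaults` are all assumptions made for this example. It needs Spark SQL on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object PerColumnFillSketch {
  // Hypothetical column names and defaults, chosen only for illustration.
  val defaults: Map[String, Any] = Map(
    "name"   -> "unknown", // String column
    "score"  -> -1.0,      // Double column
    "active" -> false      // Boolean column
  )

  // Replaces nulls column by column using the map above.
  def fillWithDefaults(df: DataFrame): DataFrame = df.na.fill(defaults)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("per-column-fill-sketch")
      .getOrCreate()
    import spark.implicits._

    // Option values become nullable columns; None is stored as null.
    val masterDF = Seq(
      (Option("a"), Option(1.5), Option(true)),
      (Option.empty[String], Option.empty[Double], Option.empty[Boolean])
    ).toDF("name", "score", "active")

    fillWithDefaults(masterDF).show()

    spark.stop()
  }
}
```

Annotating the map as `Map[String, Any]` makes the mixed value types explicit and matches the parameter type in the signature quoted above.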