Related troubleshooting solutions (0)

How to flatten a struct in a Spark DataFrame?

I have a DataFrame with the following schema:

 |-- data: struct (nullable = true)
 |    |-- id: long (nullable = true)
 |    |-- keyNote: struct (nullable = true)
 |    |    |-- key: string (nullable = true)
 |    |    |-- note: string (nullable = true)
 |    |-- details: map (nullable = true)
 |    |    |-- key: string
 |    |    |-- value: string (valueContainsNull = true)

How can I flatten the struct and create a new DataFrame with this schema:

     |-- id: long (nullable = true)
     |-- keyNote: struct (nullable = true)
     |    |-- key: string (nullable = true)
     |    |-- note: …
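
One common way to promote the fields of a struct column to top-level columns is Spark's star expansion on the struct name. The sketch below is written in PySpark for brevity; the helper name `flatten_struct` is hypothetical, and `df` is assumed to be a DataFrame with the `data` struct shown above. The same `"data.*"` column expression should also be accepted by `select` in the Java Dataset API.

```python
def flatten_struct(df, struct_col="data"):
    """Promote every direct field of `struct_col` to a top-level column.

    `df` is assumed to be a pyspark.sql.DataFrame. Only one level is
    expanded, so nested structs (keyNote) and maps (details) keep their type.
    """
    # "data.*" asks Spark to expand all direct fields of the struct.
    return df.select(f"{struct_col}.*")
```

Calling `flatten_struct(df).printSchema()` on the DataFrame above is expected to list `id`, `keyNote`, and `details` at the top level, with `keyNote` still a struct.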

java apache-spark apache-spark-sql

24
Score
6
Answers
30k
Views

PySpark cannot resolve column due to data type mismatch

Error encountered in PySpark:

pyspark.sql.utils.AnalysisException: "cannot resolve '`result_set`.`dates`.`trackers`['token']' due to data type mismatch: argument 2 requires integral type, however, ''token'' is of string type.;;\n'Project [result_parameters#517, result_set#518, <lambda>(result_set#518.dates.trackers[token]) AS result_set.dates.trackers.token#705]\n+- Relation[result_parameters#517,result_set#518] json\n"

Data structure:

 |-- result_set: struct (nullable = true)
 |    |-- currency: string (nullable = true)
 |    |-- dates: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |    |    |-- date: string (nullable = true)
 |    |    |    |-- trackers: array (nullable = true)
 |    |    |    | …
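
The error message says that `trackers` is an array, so `['token']` is read as an array index and must be an integer. A possible workaround is to explode each array level into rows before reading a named field. The sketch below assumes `df` is the DataFrame with the schema above and that each element of `trackers` is a struct with a `token` field; that field name is taken from the error message, because the schema listing is truncated. The helper name `tracker_tokens` is hypothetical.

```python
def tracker_tokens(df):
    """Return one row per tracker with its date and token.

    `df` is assumed to be a pyspark.sql.DataFrame with a `result_set`
    struct. The `token` field is an assumption based on the error message.
    """
    # One row per element of the result_set.dates array.
    per_date = df.selectExpr("explode(result_set.dates) AS d")
    # One row per element of the trackers array within each date.
    per_tracker = per_date.selectExpr("d.date AS date", "explode(d.trackers) AS t")
    # t is now a single struct, so looking up a field by name is valid.
    return per_tracker.selectExpr("date", "t.token AS token")
```

Note that `explode` drops rows whose array is null or empty, so dates without trackers would not appear in the result.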

python pyspark

3
Score
1
Answers
30k
Views

Tag statistics

apache-spark ×1

apache-spark-sql ×1

java ×1

pyspark ×1

python ×1