List of field names of a struct in a Spark DataFrame

Par*_*ari 3 schema struct dataframe apache-spark pyspark

I have a DataFrame with the following schema:

root
 |-- _id: long (nullable = true)
 |-- student_info: struct (nullable = true)
 |    |-- firstname: string (nullable = true)
 |    |-- lastname: string (nullable = true)
 |    |-- major: string (nullable = true)
 |    |-- hounour_roll: boolean (nullable = true)
 |-- school_name: string (nullable = true)

How can I get only the list of columns under "student_info"? I.e. ["firstname", "lastname", "major", "honour_roll"]

Zyg*_*ygD 6

All of the following return the list of the struct's field names. The .columns approach looks the cleanest.

df.select("student_info.*").columns
df.schema["student_info"].dataType.names
df.schema["student_info"].dataType.fieldNames()
df.select("student_info.*").schema.names
df.select("student_info.*").schema.fieldNames()