当 DataFrame 为空时抛出 AnalysisException（没有这样的结构字段）

Kzr*_*tof 5 scala apache-spark databricks

我有一个数据框，在其上应用过滤器，然后进行一系列转换。最后，我选择了几列。

//  Filters the event related to a user_principal.
  var filteredCount = events.filter("Properties.EventTypeName == 'user_principal_created' or Properties.EventTypeName == 'user_principal_updated'");
                            // Selects the columns based on the event type.
                            .withColumn("Username", when(col("Properties.EventTypeName") === lit("user_principal_created"), col("Body.Username"))
                            .otherwise(col("Body.NewValue.Username")))
                            .withColumn("FirstName", when(col("Properties.EventTypeName") === lit("user_principal_created"), col("Body.FirstName"))
                            .otherwise(col("Body.NewValue.FirstName")))
                            .withColumn("LastName", when(col("Properties.EventTypeName") === lit("user_principal_created"), col("Body.LastName"))
                            .otherwise(col("Body.NewValue.LastName")))
                            .withColumn("PrincipalId", when(col("Properties.EventTypeName") === lit("user_principal_created"), col("Body.PrincipalId"))
                            .otherwise(col("Body.NewValue.PrincipalId")))
                            .withColumn("TenantId", when(col("Properties.EventTypeName") === lit("user_principal_created"), col("Body.TenantId"))
                            .otherwise(col("Body.NewValue.TenantId")))
                            .withColumnRenamed("Timestamp", "LastChangeTimestamp")
                            // Create the custom primary key.
                            .withColumn("PrincipalUserId", substring(concat(col("TenantId"), lit("-"), col("PrincipalId")), 0, 128))                           
                            // Select the rows.
                            .select("PrincipalUserId", "TenantId", "PrincipalId", "FirstName", "LastName", "Username", "LastChangeTimestamp")

Run Code Online (Sandbox Code Playgroud)

仅当其中存在与events过滤器匹配的行时，它才起作用。如果没有行与过滤器匹配，那么我会得到以下异常：

org.apache.spark.sql.AnalysisException：没有这样的结构字段用户名...

问题

我该如何处理这种情况并防止withColumn失败？

更新

这是可行时的逻辑计划：

== 分析的逻辑计划 == 正文：结构，CitationNumber：字符串，颜色：字符串，CommitReference：字符串，ContactAddress：结构，ControlId：字符串，数据：字符串，依赖项：数组>，描述：字符串，DeviceId：字符串，错误： bigint,ErrorDetails:string,Exemption:struct,ExternalId:string,FeatureId:string,Features:array,FirstName:string,GroupPrincipals:array,GroupType:bigint,Id:bigint,IsAuthorized:boolean,IsDedicatedStorage:boolean,IsEnabled:boolean, IsInitialCreation:boolean,... 33 个以上字段>, Id: string, Properties: struct, Timestamp: string Relation[Body#248,Id#249,Properties#250,Timestamp#251] json

当抛出异常时：

== 分析的逻辑计划 == Body: struct,Id:bigint,IsAuthorized:boolean,Latitude:double,Longitude:double,Name:string,NewValue:struct,OldValue:struct,Ordinal:bigint,ParentZoneId:string,PrincipalId:bigint ,PrincipalName:string,Requirements:array,FeatureId:string,RequirementId:string,ServiceId:string>>,FeatureId:string,RequirementId:string,ServiceId:string>>,RestrictedZoneId:bigint,StreetName:string,TenantId:string,时间戳:string,... 2 个以上字段>, Id: string, Properties: struct, Timestamp: string Relation[Body#44,Id#45,Properties#46,Timestamp#47] json

归档时间：	6 年，9 月前
查看次数：	4366 次
最近记录：	6 年，9 月前