How to implement an EXISTS condition in a Spark DataFrame, as in SQL

Sag*_*tro 3 apache-spark apache-spark-sql pyspark

I would like to know how to implement a SQL-style EXISTS clause the Spark DataFrame way.

ven*_*nus 6

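A LEFT SEMI JOIN is the Spark equivalent of SQL's EXISTS. It keeps each row of the left DataFrame that has at least one match in the right DataFrame, and returns only the left DataFrame's columns.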

val cityDF = Seq(("Delhi", "India"), ("Kolkata", "India"), ("Mumbai", "India"), ("Nairobi", "Kenya"), ("Colombo", "Srilanka")).toDF("City", "Country")

val CodeDF = Seq(("011", "Delhi"), ("022", "Mumbai"), ("033", "Kolkata"), ("044", "Chennai")).toDF("Code", "City")

val finalDF = cityDF.join(CodeDF, cityDF("City") === CodeDF("City"), "left_semi")
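For completeness, here is a minimal self-contained sketch of the same idea, plus the NOT EXISTS counterpart using a "left_anti" join and the equivalent query through spark.sql. The local SparkSession setup, the lowercase name codeDF, the result names (existsDF, notExistsDF, sqlExistsDF) and the temp view names cities and codes are assumptions made for illustration; in spark-shell the spark session already exists and the builder lines can be dropped.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for experimenting; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("exists-sketch")
  .getOrCreate()
import spark.implicits._ // needed for .toDF on a Seq

// Same sample data as above, re-declared so this block runs on its own.
val cityDF = Seq(
  ("Delhi", "India"), ("Kolkata", "India"), ("Mumbai", "India"),
  ("Nairobi", "Kenya"), ("Colombo", "Srilanka")
).toDF("City", "Country")

val codeDF = Seq(
  ("011", "Delhi"), ("022", "Mumbai"), ("033", "Kolkata"), ("044", "Chennai")
).toDF("Code", "City")

// EXISTS: cities that have at least one matching row in codeDF.
// Only cityDF's columns (City, Country) come back.
val existsDF = cityDF.join(codeDF, cityDF("City") === codeDF("City"), "left_semi")

// NOT EXISTS: cities with no matching row in codeDF.
val notExistsDF = cityDF.join(codeDF, cityDF("City") === codeDF("City"), "left_anti")

// The same EXISTS filter written as SQL against temporary views
// (the view names are hypothetical).
cityDF.createOrReplaceTempView("cities")
codeDF.createOrReplaceTempView("codes")
val sqlExistsDF = spark.sql(
  """SELECT * FROM cities c
    |WHERE EXISTS (SELECT 1 FROM codes d WHERE d.City = c.City)""".stripMargin)

existsDF.show()
notExistsDF.show()
```

With this sample data, existsDF should contain Delhi, Kolkata and Mumbai, while notExistsDF should contain Nairobi and Colombo. Chennai appears in neither, because semi and anti joins only ever return rows from the left side.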