How to get table names from a Spark SQL query [PySpark]?

I want to get the table names from a SQL query such as:

select *
from table1 as t1
full outer join table2 as t2
  on t1.id = t2.id

I found a solution in Scala in "How to get table names from a SQL query?":

def getTables(query: String): Seq[String] = {
  val logicalPlan = spark.sessionState.sqlParser.parsePlan(query)
  import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation
  logicalPlan.collect { case r: UnresolvedRelation => r.tableName }
}

This gives me the correct table names when I iterate over the returned sequence with getTables(query).foreach(println):

table1
table2

What is the equivalent syntax in PySpark? The closest I have come across is "How to extract column name and column type from SQL in pyspark".


My attempt based on that approach fails with the following traceback:

Py4JError: An error occurred while calling o78.tableDesc. Trace:
py4j.Py4JException: Method tableDesc([]) does not exist
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
    at py4j.Gateway.invoke(Gateway.java:274)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.base/java.lang.Thread.run(Thread.java:835)

I understand that the problem is that I need to filter the plan for all items of type UnresolvedRelation, but I have not been able to do so in python …
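For what it's worth, here is a minimal sketch of how that filtering might look from Python. It assumes the parsed plan is reached through the private `_jsparkSession` attribute, and that the Py4J-wrapped plan nodes expose `nodeName()`, `children()` and `tableName()` as the Scala classes do. The helper names `scala_seq_to_list`, `collect_unresolved_relations` and `get_tables` are made up for illustration. Treat it as an untested-against-your-version starting point rather than a supported API.

```python
def scala_seq_to_list(seq):
    """Copy a Py4J-wrapped Scala Seq into a Python list via size()/apply(i)."""
    return [seq.apply(i) for i in range(seq.size())]


def collect_unresolved_relations(plan):
    """Walk the plan tree in pre-order and return the table name of every
    node whose nodeName() is "UnresolvedRelation".

    This mirrors the Scala `collect { case r: UnresolvedRelation => ... }`
    by comparing node names instead of pattern matching on the type.
    """
    names = []
    stack = [plan]
    while stack:
        node = stack.pop()
        if node.nodeName() == "UnresolvedRelation":
            names.append(node.tableName())
        # Push children in reverse so they are visited left to right.
        stack.extend(reversed(scala_seq_to_list(node.children())))
    return names


def get_tables(spark, query):
    """Parse `query` without executing it and list the tables it references.

    Assumption: `spark` is a pyspark.sql.SparkSession. `_jsparkSession` is a
    private attribute, so this may break between Spark releases.
    """
    plan = spark._jsparkSession.sessionState().sqlParser().parsePlan(query)
    return collect_unresolved_relations(plan)
```

With an active session this would be called as `get_tables(spark, query)`. Because the traversal only relies on three methods, it can be checked with plain Python stand-ins for the plan nodes before running it against a real cluster.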

python sql scala apache-spark pyspark
