Can we create a new table, with its data, from an existing table in PySpark?

Vik*_*rma 4 apache-spark-sql

The CREATE TABLE syntax for Teradata is:

create table <DBname>.<Tablename>
as
select * from <DBname>.<Tablename>
with data;

How can we create a table in the same way in Spark SQL?

mrs*_*vas 8

It is almost the same in Spark SQL.

Example:

CREATE TABLE tablename 
    STORED AS PARQUET LOCATION 'some/location/incase/of/external/table' 
AS
SELECT *
    FROM source_table
WHERE 1=1
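
Since the question asks about PySpark, here is a minimal sketch of how the statement above might be issued from Python through `spark.sql`. The helper names `build_ctas_sql` and `create_table_as_select` are hypothetical, made up for illustration, and the sketch assumes `spark` is an existing `SparkSession` (with Hive support enabled if you use `STORED AS`). The table names are interpolated as-is, so only pass trusted identifiers.

```python
def build_ctas_sql(target, source, file_format=None, location=None):
    """Build a CREATE TABLE ... AS SELECT statement (hypothetical helper).

    target / source are table names such as "db.new_table".
    file_format and location are optional and map to the
    STORED AS and LOCATION clauses shown in the example above.
    """
    parts = [f"CREATE TABLE {target}"]
    if file_format:
        parts.append(f"STORED AS {file_format}")
    if location:
        parts.append(f"LOCATION '{location}'")
    parts.append(f"AS SELECT * FROM {source}")
    return " ".join(parts)


def create_table_as_select(spark, target, source, **clauses):
    """Run the CTAS statement on an existing SparkSession and return the SQL."""
    sql = build_ctas_sql(target, source, **clauses)
    spark.sql(sql)  # SparkSession.sql executes the statement
    return sql
```

With a session at hand, a call such as `create_table_as_select(spark, "db.new_table", "db.source_table", file_format="PARQUET")` would copy both the schema and the rows, which is what Teradata's `with data` does.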

General syntax (advanced):

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  [(col_name data_type [COMMENT col_comment], ...)]
  [COMMENT table_comment]
  [
   [ROW FORMAT row_format] 
   [STORED AS file_format]
  ]
  [LOCATION path_to_save]
  [AS select_statement]

By the way, Spark supports much more of Hive's syntax and features. See the CTAS documentation for details.
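
As an alternative to writing SQL, the same copy can probably be done with the DataFrame API. The sketch below is not part of the original answer: `copy_table` is a hypothetical helper, `spark` is assumed to be an existing `SparkSession`, and the default format and save mode are illustrative choices.

```python
def copy_table(spark, source, target, file_format="parquet", mode="errorifexists"):
    """Copy an existing table into a new table via the DataFrame API.

    Hypothetical helper: reads `source` with SparkSession.table and
    writes it out with DataFrameWriter.saveAsTable.
    """
    df = spark.table(source)      # load the existing table as a DataFrame
    (df.write
       .format(file_format)       # storage format of the new table
       .mode(mode)                # "errorifexists" fails if target already exists
       .saveAsTable(target))      # create the table and write the rows
    return target
```

For example, `copy_table(spark, "db.source_table", "db.new_table")` creates `db.new_table` with the same columns and rows. Filtering or selecting columns on the DataFrame before `.write` plays the role of the `SELECT` in the SQL form.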