Ard*_*oli 6 apache-spark apache-spark-sql azure-databricks
I am writing Python and Spark SQL in Databricks, and I am using Spark 2.4.5.
I have two tables.
Create table IF NOT EXISTS db_xsi_ed_faits_shahgholi_ardalan.Destination
(
id Int,
Name string,
Deleted int
) USING Delta;
Create table IF NOT EXISTS db_xsi_ed_faits_shahgholi_ardalan.Source
(
id Int,
Name string,
Deleted int
) USING Delta;
I need to run a MERGE command between Source and Destination. I wrote the command below:
%sql
MERGE INTO db_xsi_ed_faits_shahgholi_ardalan.Destination AS D
USING db_xsi_ed_faits_shahgholi_ardalan.Source AS S
ON (S.id = D.id)
-- UPDATE
WHEN MATCHED AND S.Name <> D.Name THEN
UPDATE SET
D.Name = S.Name
-- INSERT
WHEN NOT MATCHED THEN
INSERT (id, Name, Deleted)
VALUES (S.id, S.Name, S.Deleted)
-- DELETE
WHEN NOT MATCHED BY SOURCE THEN
UPDATE SET
D.Deleted = 1
When I run this command, I get the following error:
It seems that NOT MATCHED BY SOURCE is not available in Spark! I need a solution that achieves the same result.
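To pin down the behaviour the three-clause MERGE is meant to have, here is a minimal pure-Python model of it. This is only a sketch: the function name `merge_with_soft_delete` and the representation of each table as a dict keyed by `id` are assumptions made for illustration, and SQL NULL comparison rules are ignored.

```python
from typing import Dict


def merge_with_soft_delete(destination: Dict[int, dict], source: Dict[int, dict]) -> Dict[int, dict]:
    """Model the intended MERGE result; tables are dicts of id -> {"Name", "Deleted"}."""
    # Work on copies so the input "tables" are left untouched.
    result = {key: dict(row) for key, row in destination.items()}

    for key, src in source.items():
        if key in result:
            # WHEN MATCHED AND S.Name <> D.Name THEN UPDATE SET D.Name = S.Name
            if src["Name"] != result[key]["Name"]:
                result[key]["Name"] = src["Name"]
        else:
            # WHEN NOT MATCHED THEN INSERT (id, Name, Deleted)
            result[key] = dict(src)

    for key in result:
        if key not in source:
            # WHEN NOT MATCHED BY SOURCE THEN UPDATE SET D.Deleted = 1
            result[key]["Deleted"] = 1

    return result
```

Rows present on both sides are only renamed, rows present only in Source are inserted, and rows present only in Destination are flagged rather than removed. Any workaround should reproduce exactly that outcome.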
I wrote the following two-step workaround, but I am still looking for a better way:
%sql
MERGE INTO db_xsi_ed_faits_shahgholi_ardalan.Destination AS D
USING db_xsi_ed_faits_shahgholi_ardalan.Source AS S
ON (S.id = D.id)
-- UPDATE
WHEN MATCHED AND S.Name <> D.Name THEN
UPDATE SET
D.Name = S.Name
-- INSERT
WHEN NOT MATCHED THEN
INSERT (id, Name, Deleted)
VALUES (S.id, S.Name, S.Deleted)
;
%sql
-- Logical delete
UPDATE db_xsi_ed_faits_shahgholi_ardalan.Destination
SET Deleted = 1
WHERE db_xsi_ed_faits_shahgholi_ardalan.Destination.id in
(
SELECT
D.id
FROM db_xsi_ed_faits_shahgholi_ardalan.Destination AS D
LEFT JOIN db_xsi_ed_faits_shahgholi_ardalan.Source AS S ON (S.id = D.id)
WHERE S.id is null
)
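The logical delete reads Destination a second time inside the IN subquery. The same rows can be selected with a correlated NOT EXISTS, which drops that extra self-join. The sketch below runs the statement from Python against an in-memory SQLite database, used purely as a stand-in so the example is executable anywhere. The helper names `soft_delete_missing` and `demo` are hypothetical, the table names are shortened, and whether Delta's UPDATE on Spark 2.4.5 accepts this subquery form is an assumption to verify on the cluster.

```python
import sqlite3


def soft_delete_missing(conn: sqlite3.Connection) -> int:
    """Flag Destination rows whose id has no match in Source; return rows changed."""
    cursor = conn.execute(
        """
        UPDATE Destination
        SET Deleted = 1
        WHERE NOT EXISTS (
            SELECT 1 FROM Source AS S WHERE S.id = Destination.id
        )
        """
    )
    conn.commit()
    return cursor.rowcount


def demo() -> list:
    """Build two small tables, run the soft delete, and return (id, Deleted) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Destination (id INTEGER, Name TEXT, Deleted INTEGER);
        CREATE TABLE Source (id INTEGER, Name TEXT, Deleted INTEGER);
        INSERT INTO Destination VALUES (1, 'a', 0), (2, 'b', 0), (3, 'c', 0);
        INSERT INTO Source VALUES (1, 'a', 0), (2, 'b', 0);
        """
    )
    soft_delete_missing(conn)
    rows = conn.execute("SELECT id, Deleted FROM Destination ORDER BY id").fetchall()
    conn.close()
    return rows
```

In `demo`, only the row with id 3 is missing from Source, so it is the only one flagged. This still leaves two statements (MERGE, then UPDATE), so it tidies the workaround rather than replacing it.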