How to update a table in Hive 0.13?

Ani*_*kar 1 hadoop hive acid hiveql

My Hive version is 0.13. I have two tables, table_1 and table_2.

table_1 contains:

customer_id | items | price | updated_date
------------+-------+-------+-------------
10          | watch | 1000  | 20170626
11          | bat   | 400   | 20170625

table_2 contains:

customer_id | items    | price | updated_date
------------+----------+-------+-------------
10          | computer | 20000 | 20170624

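I want to update the records in table_2 whose customer_id already exists there; records whose customer_id is not yet present should be appended to table_2.

Since Hive 0.13 does not support UPDATE, I tried using a join, but it failed.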

lef*_*oin 5

You can use row_number or a full join. Here is an example using row_number:

-- table_1 holds the new rows; table_2 is the target being updated.
insert overwrite table table_2
select customer_id, items, price, updated_date
from
(
select customer_id, items, price, updated_date,
       -- rn = 1 is the table_1 row when one exists, otherwise the existing table_2 row
       row_number() over(partition by customer_id order by new_flag desc) rn
from
    (
     select customer_id, items, price, updated_date, 1 as new_flag
       from table_1
     union all
     select customer_id, items, price, updated_date, 0 as new_flag
       from table_2
    ) all_data
)s where rn=1;

See also this answer for an update using FULL JOIN: https://stackoverflow.com/a/37744071/2700344
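
For reference, the FULL JOIN variant could look roughly like the sketch below. It is not taken from the linked answer; it assumes that customer_id is unique within each table, that table_1 holds the newer rows, and that table_2 is the target. The aliases n (new) and o (old) are only illustrative.

```sql
-- Sketch only: assumes customer_id is unique in both tables,
-- table_1 (alias n) holds the new rows and table_2 (alias o) is the target.
insert overwrite table table_2
select
    coalesce(n.customer_id, o.customer_id) as customer_id,
    -- take every column from the table_1 row when it exists,
    -- otherwise keep the existing table_2 row unchanged
    case when n.customer_id is not null then n.items        else o.items        end as items,
    case when n.customer_id is not null then n.price        else o.price        end as price,
    case when n.customer_id is not null then n.updated_date else o.updated_date end as updated_date
from table_2 o
full outer join table_1 n
  on o.customer_id = n.customer_id;
```

With the sample data from the question, customer 10 matches on both sides, so the table_1 row (watch) replaces the table_2 row (computer), and customer 11 exists only in table_1, so it is appended. The case expressions are used instead of a per-column coalesce so that a NULL in a new row is not silently filled with the old value.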