How to insert rows between consecutive dates in Hive?

Boy*_*lot 2 hadoop hive date insert hiveql

Sample data:

customer    txn_date    tag
A           1-Jan-17    1
A           2-Jan-17    1
A           4-Jan-17    1
A           5-Jan-17    0
B           3-Jan-17    1
B           5-Jan-17    0

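I need to fill in every txn_date that is missing for each customer within the date range 1 January 2017 to 5 January 2017. Inserted rows should get tag 0.

The output should be: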

customer    txn_date    tag
A           1-Jan-17    1   
A           2-Jan-17    1 
A           3-Jan-17    0 (inserted)
A           4-Jan-17    1 
A           5-Jan-17    0  
B           1-Jan-17    0 (inserted)
B           2-Jan-17    0 (inserted)
B           3-Jan-17    1
B           4-Jan-17    0 (inserted)
B           5-Jan-17    0

Dav*_*itz 6

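-- Generate every date in the range, pair each date with every customer,
-- then left join the real rows so missing (customer, date) pairs get tag 0.
select  c.customer
       ,d.txn_date
       ,coalesce(t.tag,0) as tag

from   (select date_add (from_date,i)   as txn_date

        from   (select  date '2017-01-01'   as from_date
                       ,date '2017-01-05'   as to_date
                ) p

                -- space(n) is a string of n spaces, so split yields n+1 elements
                -- and posexplode numbers them i = 0..n
                lateral view
                posexplode(split(space(datediff(p.to_date,p.from_date)),' ')) pe as i,x
        ) d

        -- every customer that appears in t
        cross join (select  distinct
                            customer

                    from    t
                    ) c

        -- attach the existing transactions, if any
        left join   t

        on          t.customer  = c.customer
                and t.txn_date  = d.txn_date
;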
c.customer  d.txn_date  tag
A           2017-01-01  1
A           2017-01-02  1
A           2017-01-03  0
A           2017-01-04  1
A           2017-01-05  0
B           2017-01-01  0
B           2017-01-02  0
B           2017-01-03  1
B           2017-01-04  0
B           2017-01-05  0
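
To see where the five dates come from, the inner subquery can be run on its own. This is only a sketch that reuses the functions from the query above; the one change is that the index i is selected alongside the generated date.

```sql
-- Date generator from the query above, run in isolation to inspect its rows.
select  pe.i
       ,date_add(p.from_date, pe.i) as txn_date
from   (select  date '2017-01-01' as from_date
               ,date '2017-01-05' as to_date
       ) p
       lateral view
       posexplode(split(space(datediff(p.to_date, p.from_date)), ' ')) pe as i, x
;
```

For this range datediff returns 4, space(4) is a string of four spaces, and splitting it on a single space gives five empty strings. posexplode therefore emits i = 0 through 4, and date_add turns those into 2017-01-01 through 2017-01-05. Changing from_date or to_date changes the number of generated rows accordingly.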
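
The join condition compares t.txn_date with a DATE value, so the query presumably relies on txn_date being stored as a date. If the column instead holds strings in the display format from the question (for example 1-Jan-17), which is purely an assumption here, a conversion along these lines could be applied first. The pattern 'd-MMM-yy' is inferred from the sample data, and parsing the month abbreviation may depend on the cluster's locale settings.

```sql
-- Assumption: t.txn_date is a STRING such as '1-Jan-17' (pattern d-MMM-yy).
-- Parse it to a timestamp, keep the date part, and cast to DATE explicitly.
select  customer
       ,cast(to_date(from_unixtime(unix_timestamp(txn_date, 'd-MMM-yy'))) as date) as txn_date
       ,tag
from    t
;
```

Under that assumption, this select could stand in for t as a subquery in the left join, so that both sides of the date comparison have the same type.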