Rewriting a MySQL query for Hive

lea*_*ode 2 sql hive count hiveql

I am trying to join two tables:

Table X

PlayerID   | Name      | Team
007        | Sancho    | Dortmund
010        | Messi     | Barcelona
011        | Werner    | Chelsea
001        | De Gea    | Man Utd
009        | Lewan..ki | Bayern Mun
006        | Pogba     | Man Utd
017        | De Bruyne | Man City
029        | Harvertz  | Chelsea
005        | Upamecano | Leipzig

Table Y

PlayerID    | Name     | Team
010         | Messi    | Man City
007         | Sancho   | Man Utd
006         | Pogba    | Man Utd
017         | De Bruyne| Man City
011         | Werner   | Liverpool
006         | Pogba    | Real Madrid

using this query:

select avg(y.playerID is not null) as accuracy_ratio
from x
left join y 
    on  y.playerID = x.playerID
    and y.name     = x.name
    and y.team     = x.team

However, when I run the query I get the error "Only numeric or string type arguments are accepted but boolean is passed." I assume the query above only works in MySQL. How can I rewrite it for Hive?

Som*_*omy 5

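I realize this is related to your earlier post, where GMB provided a solution for MySQL. As the error message says, Hive's avg() does not accept a boolean argument, so convert the condition to 1 or 0 with a CASE expression: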

select avg(case when y.playerID is not null then 1 else 0 end) as accuracy_ratio
from x
left join y 
    on  y.playerID = x.playerID
    and y.name     = x.name
    and y.team     = x.team
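
For comparison, here are two other ways to write the same ratio in Hive. This is only a sketch and has not been run against your data: the first uses Hive's built-in if() in place of CASE, the second divides the number of matched rows by the total number of joined rows (count(expr) skips NULLs, and the / operator in Hive returns a double). Table and column names are the ones from the question.

```sql
-- Variant 1: if() instead of CASE; 1 for a matched row, 0 otherwise
select avg(if(y.playerID is not null, 1, 0)) as accuracy_ratio
from x
left join y
    on  y.playerID = x.playerID
    and y.name     = x.name
    and y.team     = x.team;

-- Variant 2: matched rows divided by all joined rows
-- count(y.playerID) ignores the NULLs produced by unmatched rows of x
select count(y.playerID) / count(*) as accuracy_ratio
from x
left join y
    on  y.playerID = x.playerID
    and y.name     = x.name
    and y.team     = x.team;
```

On the sample tables, two of the nine rows in x (Pogba and De Bruyne) have an exact match in y, so each form should give 2/9, roughly 0.22. Like the CASE version, both assume that a row of x matches at most one row of y; an exact duplicate in y would add extra joined rows and skew the ratio.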