I have been trying to write an R script that queries an Impala database. This is the query:
select columnA, max(columnB) from databaseA.tableA where columnC in (select distinct(columnC) from databaseB.tableB ) group by columnA order by columnA
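When I run this query manually (through impala-shell, i.e. outside the R script), I get the table contents. However, when I run the same query from the R script, I get the following error: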
[1] "HY000 140 [Cloudera][ImpalaODBC] (140) Unsupported query."
[2] "[RODBC] ERROR: Could not SQLExecDirect 'select columnA, max(columnB) from databaseA.tableA where columnC in (select distinct(columnC) from databaseB.tableB ) group by columnA order by columnA'
closing unused RODBC handle 1
Why does the query fail when it is run through R, and how can I fix it? Thanks in advance :)
Edit 1:
The connection script looks like this:
library("RODBC");
connection <- odbcConnect("Impala");
query <- "select columnA, max(columnB) from databaseA.tableA where columnC in (select distinct(columnC) from databaseB.tableB ) group by columnA order by columnA";
data <- sqlQuery(connection,query);
You need to install the relevant driver; please see the following link
I had the same problem, and all I had to do was update the ODBC driver.
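To see which driver version R is actually picking up before and after an update, RODBC can report it for an open connection. This is only a rough sketch: it assumes the DSN is called "Impala" as in the question, and the helper names `driver_fields` and `report_driver` are made up for illustration.

```r
# Hypothetical helper: from the named character vector returned by
# RODBC::odbcGetInfo(), keep only the entries whose names start with "Driver".
driver_fields <- function(info) {
  info[grepl("^Driver", names(info))]
}

# Hypothetical wrapper: open the DSN, read the driver details, close again.
# The default DSN name "Impala" is taken from the question.
report_driver <- function(dsn = "Impala") {
  connection <- RODBC::odbcConnect(dsn)
  # odbcConnect() returns an object of class "RODBC" only on success.
  if (!inherits(connection, "RODBC")) {
    stop("could not connect to DSN ", dsn)
  }
  on.exit(RODBC::odbcClose(connection))
  driver_fields(RODBC::odbcGetInfo(connection))
}

# Usage, on a machine where the DSN is configured:
# print(report_driver("Impala"))
```

Running it once before and once after the driver update should show whether the new driver is the one the DSN really points to.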
Also, if you can, update the odbcConnect call to pass a username and password, changing
connection <- odbcConnect("Impala");
to
connection <- odbcConnect("Impala", uid="root", pwd="password")
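Putting that together with the query from the question, something like the following might help surface the driver's error text instead of silently getting a character vector back. It is a sketch under assumptions: the DSN name "Impala" and the `uid`/`pwd` values are the placeholders used above, and `query_succeeded` and `run_query` are hypothetical helper names, not part of RODBC.

```r
# Hypothetical helper: for a SELECT, sqlQuery() returns a data frame on
# success; with the default errors = TRUE it returns a character vector
# of error messages on failure.
query_succeeded <- function(result) {
  is.data.frame(result)
}

# Hypothetical wrapper: connect with credentials, run the query, and stop
# with the driver's messages if it fails. The connection is always closed.
run_query <- function(query, dsn, uid, pwd) {
  connection <- RODBC::odbcConnect(dsn, uid = uid, pwd = pwd)
  # odbcConnect() returns an object of class "RODBC" only on success.
  if (!inherits(connection, "RODBC")) {
    stop("could not connect to DSN ", dsn)
  }
  on.exit(RODBC::odbcClose(connection))
  result <- RODBC::sqlQuery(connection, query)
  if (!query_succeeded(result)) {
    stop(paste(result, collapse = "\n"))
  }
  result
}

# Usage, with the placeholder credentials from the answer:
# query <- "select columnA, max(columnB) from databaseA.tableA where columnC in (select distinct(columnC) from databaseB.tableB ) group by columnA order by columnA"
# data <- run_query(query, dsn = "Impala", uid = "root", pwd = "password")
```

Closing the handle explicitly should also avoid the "closing unused RODBC handle" message shown in the error output.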