Uma*_*war · tags: apache-spark, apache-spark-sql
I get a partitioning error when running the following query in the Spark shell:

Expected only partition pruning predicates: ((((isnotnull(tenant_suite#478) && isnotnull(DS#477)) && (DS#477 >= 2017-06-01)) && (DS#477 <= 2017-06-25)) && (tenant_suite#478 = SAMS_CORESITE))

I am not sure why this error is thrown. Can anyone help me with this?
SELECT
    A.*
FROM
(   ----------- SUBQUERY 1
    SELECT *
    FROM
        T2 -- PARTITION COLUMNS ARE DS AND TENANT_SUITE
    WHERE
        DS BETWEEN '2017-06-01' AND '2017-06-25' --date_sub(TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP())),1)
        AND tenant_suite = 'CORESITE'
) A
JOIN
(   ----------- SUBQUERY 2
    SELECT
        concat(concat(visid_high,'-',visid_low),'-',visit_num) AS VISIT_ID
        ,concat(visid_high,'-',visid_low) AS VISITOR_ID
        ,MAX(DS) AS EVENT_DT
    FROM
        T2 -- PARTITION COLUMNS ARE DS AND TENANT_SUITE
    WHERE
        tenant_suite = 'CORESITE'
        AND DS BETWEEN '2017-06-01' AND '2017-06-25' --date_sub(TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP())),1)
    GROUP BY concat(concat(visid_high,'-',visid_low),'-',visit_num), concat(visid_high,'-',visid_low)
) B
ON  A.VISIT_ID = B.VISIT_ID
AND A.VISITOR_ID = B.VISITOR_ID
AND A.VISIT_DT = B.EVENT_DT
GROUP BY A.VISIT_DT;
This is related to case sensitivity of partition column names in Spark; see the corresponding Spark Jira issue.
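Based on the Jira issue above, a possible workaround is to reference the partition columns in the same case in which they are stored in the metastore. This is a hedged sketch: it assumes the table's partition columns were registered in lowercase as `ds` and `tenant_suite`, which you would need to confirm with `DESCRIBE T2`:

```sql
-- Sketch, not a verified fix: assumes the metastore registered the
-- partition columns in lowercase (ds, tenant_suite).
SELECT *
FROM
    T2
WHERE
    ds BETWEEN '2017-06-01' AND '2017-06-25'
    AND tenant_suite = 'CORESITE';
```

If the column case in the query already matches the metastore, another thing to check is the `spark.sql.caseSensitive` setting (e.g. `spark.sql("SET spark.sql.caseSensitive=false")`), since the analyzer's partition-pruning check can behave differently depending on it.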