Does anyone know the best way in Apache Spark SQL to achieve the same result as a standard SQL qualify() + rnk or row_number statement?
For example:
I want my final result to be a new Spark DataFrame with the 3 most recent records (as determined by statement_date, descending) for each of the 100 unique account_numbers, therefore 300 final records in total.
In standard Teradata SQL, I can do the following:
select * from statement_data
qualify row_number()
over (partition by acct_id order by statement_date desc) <= 3
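For context on what an answer might look like: as far as I know, open-source Spark SQL has no QUALIFY clause, so the usual equivalent is a window function plus a filter. Below is a minimal PySpark sketch under that assumption; the table name statement_data and the helper column name rnk are taken from the question, while the variable names w and result are mine.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, row_number
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()

# Assumes statement_data is already registered as a table/DataFrame
# with acct_id and statement_date columns, as described above.
statement_data = spark.table("statement_data")

# Rank each account's statements newest-first, mirroring the Teradata
# partition by acct_id / order by statement_date desc window.
w = Window.partitionBy("acct_id").orderBy(col("statement_date").desc())

# Keep the 3 most recent statements per account, then drop the helper column.
result = (statement_data
          .withColumn("rnk", row_number().over(w))
          .filter(col("rnk") <= 3)
          .drop("rnk"))

The same shape should also work in plain Spark SQL by computing row_number() over the window in a subquery and filtering on it in the outer query, since the window column can't be referenced directly in a WHERE clause.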