Spark DataFrame 1:
+------+-------+---------+----+---+-------+
|city  |product|date     |sale|exp|wastage|
+------+-------+---------+----+---+-------+
|city 1|prod 1 |9/29/2017|358 |975|193    |
|city 1|prod 2 |8/25/2017|50  |687|201    |
|city 1|prod 3 |9/9/2017 |236 |431|169    |
|city 2|prod 1 |9/28/2017|358 |975|193    |
|city 2|prod 2 |8/24/2017|50  |687|201    |
|city 3|prod 3 |9/8/2017 |236 |431|169    |
+------+-------+---------+----+---+-------+
Spark DataFrame 2:
+------+-------+---------+----+---+-------+
|city  |product|date     |sale|exp|wastage|
+------+-------+---------+----+---+-------+
|city 1|prod 1 |9/29/2017|358 |975|193    |
|city 1|prod 2 |8/25/2017|50  |687|201    |
|city 1|prod 3 |9/9/2017 |230 |430|160    |
|city 1|prod 4 |9/27/2017|350 |90 |190    |
|city 2|prod 2 …

Find the previous month's sale for each city from a Spark DataFrame
+----+--------+----+
|City|   Month|Sale|
+----+--------+----+
|  c1|JAN-2017|  49|
|  c1|FEB-2017|  46|
|  c1|MAR-2017|  83|
|  c2|JAN-2017|  59|
|  c2|MAY-2017|  60|
|  c2|JUN-2017|  49|
|  c2|JUL-2017|  73|
+----+--------+----+
The required output is:
+----+--------+----+-------------+
|City|   Month|Sale|previous_sale|
+----+--------+----+-------------+
|  c1|JAN-2017|  49|         NULL|
|  c1|FEB-2017|  46|           49|
|  c1|MAR-2017|  83|           46|
|  c2|JAN-2017|  59|         NULL|
|  c2|MAY-2017|  60|           59|
…
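Judging from the rows shown, previous_sale looks like the sale from the previous available row for the same city when ordered by month (c2 MAY-2017 picks up 59 from JAN-2017, not a NULL for a missing April). Here is a minimal plain-Python sketch of that logic, just to pin down the semantics. The names `MONTHS`, `month_key` and `add_previous_sale` are made up for illustration, and the sorting by a parsed (year, month) key is my assumption, since the Month column is a string that would not sort chronologically on its own.

```python
from itertools import groupby

# Sample rows copied from the question: (City, Month, Sale).
rows = [
    ("c1", "JAN-2017", 49),
    ("c1", "FEB-2017", 46),
    ("c1", "MAR-2017", 83),
    ("c2", "JAN-2017", 59),
    ("c2", "MAY-2017", 60),
    ("c2", "JUN-2017", 49),
    ("c2", "JUL-2017", 73),
]

# Hypothetical helper table: month abbreviations in calendar order.
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def month_key(label):
    """Turn 'JAN-2017' into (2017, 0) so months sort chronologically."""
    name, year = label.split("-")
    return (int(year), MONTHS.index(name.upper()))


def add_previous_sale(rows):
    """Append the previous row's sale within each city (None for the first row)."""
    # Sort by city, then by chronological month within each city.
    ordered = sorted(rows, key=lambda r: (r[0], month_key(r[1])))
    result = []
    for _, group in groupby(ordered, key=lambda r: r[0]):
        previous = None  # no earlier row yet for this city
        for city, month, sale in group:
            result.append((city, month, sale, previous))
            previous = sale
    return result


for row in add_previous_sale(rows):
    print(row)
```

In Spark itself the same idea is usually expressed with the `lag` window function over a window partitioned by City and ordered by a parsed version of Month. If the calendar-previous month is wanted instead (NULL when a month is missing), the logic would need an extra check on the month gap.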