I have the following input data for a Spark job (in Parquet):
Person (millions of rows)
+---------+----------+---------------+---------------+
| name    | location | start         | end           |
+---------+----------+---------------+---------------+
| Person1 | 1230     | 1478630000001 | 1478630000010 |
| Person2 | 1230     | 1478630000002 | 1478630000012 |
| Person2 | 1230     | 1478630000013 | 1478630000020 |
| Person3 | 3450     | 1478630000001 | 1478630000015 |
+---------+----------+---------------+---------------+
Event (millions of rows)
+----------+----------+---------------+
| event    | location | start_time    |
+----------+----------+---------------+
| Biking   | 1230     | 1478630000005 |
| Skating  | 1230     | 1478630000014 | …