mlw*_*lwh 6 python timestamp dataframe pandas
The title may be a bit confusing, so here is an example.
From:
id | timestamp
1 | 2015-12-02 00:00:00
1 | 2015-12-03 00:00:00 <--- latest for id 1
2 | 2015-12-02 00:00:00
2 | 2015-12-04 00:00:00
2 | 2015-12-06 00:00:00 <--- latest for id 2
To:
id | timestamp
1 | 2015-12-03 00:00:00
2 | 2015-12-06 00:00:00
Use nth(-1) to take the last row of each group:
In [599]: df.groupby('id', as_index=False).nth(-1)
Out[599]:
id timestamp
1 1 2015-12-03 00:00:00
4 2 2015-12-06 00:00:00
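Note that nth(-1) picks the last row of each group by position, not by value, so it only returns the latest timestamp when each id's rows are already in chronological order. A minimal sketch, assuming the same two columns as in the question but with the rows deliberately shuffled, sorts by timestamp first:

```python
import pandas as pd

# Sample data from the question, shuffled on purpose so that the latest
# timestamp is not the last row of each id.
df = pd.DataFrame({
    "id": [2, 1, 2, 1, 2],
    "timestamp": pd.to_datetime([
        "2015-12-06", "2015-12-03", "2015-12-02", "2015-12-02", "2015-12-04",
    ]),
})

# Sort by timestamp so the last row of every group is the latest one,
# then take that last row per id.
latest = df.sort_values("timestamp").groupby("id", as_index=False).nth(-1)
print(latest)
```

If the frame is already sorted by timestamp, the sort_values call can be dropped.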
Ideally, use max, since you want the latest date:
In [601]: df.groupby('id', as_index=False).max()
Out[601]:
id timestamp
0 1 2015-12-03 00:00:00
1 2 2015-12-06 00:00:00
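One caveat: max aggregates every column independently, so if the frame has more columns than id and timestamp, the result can mix values from different rows. A sketch of one way around that, assuming a hypothetical extra column named value (not in the original question) and a unique row index, uses idxmax to locate the latest row and then selects it whole:

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 2, 2, 2],
    "timestamp": pd.to_datetime([
        "2015-12-02", "2015-12-03", "2015-12-02", "2015-12-04", "2015-12-06",
    ]),
    # Hypothetical extra column, not part of the original question.
    "value": [50, 10, 40, 30, 20],
})

# Row label of the maximum timestamp within each id.
latest_idx = df.groupby("id")["timestamp"].idxmax()

# Select those complete rows, so "value" comes from the same row
# as the latest timestamp rather than being the per-group maximum.
latest_rows = df.loc[latest_idx]
print(latest_rows)
```

This does not depend on the row order, because the latest row is found by value rather than by position.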
Alternatively, use tail, as mentioned in the comments:
In [602]: df.groupby('id').tail(1)
Out[602]:
id timestamp
1 1 2015-12-03 00:00:00
4 2 2015-12-06 00:00:00
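Like nth, tail(1) selects by position, and it keeps the original row labels (1 and 4 above). The following sketch, assuming unsorted input with the same columns as the question, orders the rows first and also shows that a larger argument keeps several recent rows per id:

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 1, 2, 2],
    "timestamp": pd.to_datetime([
        "2015-12-03", "2015-12-06", "2015-12-02", "2015-12-02", "2015-12-04",
    ]),
})

# Order rows by id and timestamp so the last row per id is the latest one.
ordered = df.sort_values(["id", "timestamp"])

# tail(1) keeps the last row of each group and preserves the row labels.
latest = ordered.groupby("id").tail(1)
print(latest)

# tail(2) keeps the two most recent rows per id instead.
latest_two = ordered.groupby("id").tail(2)
print(latest_two)
```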