Ton*_*Ton 3 python dataframe pandas
I am trying to work out how many times each entry in the "f" column is listed per minute.
import pandas as pd
import datetime as dt
f= ['f0001',
'f0001',
'f0001',
'f0001',
'f0020',
'f0008',
'f0001',
'f0005',
'f3203',
'f0002',
'f0002',
'f0001',
'f0201',
'f0001',
'f0439',
'f0233',
'f0008',
'f0003',
'f0009',
'f0005']
dates = ['20130101100103', '20130101100110',
'20130101100125', '20130101100133',
'20130101100100', '20130101100200',
'20130101100200', '20130101100200',
'20130101100200', '20130101100200',
'20130101100200', '20130101100300',
'20130101100300', '20130101100300',
'20130101100300', '20130101100400',
'20130101100400', '20130101100400',
'20130101100400', '20130101100400']
d = {'date': dates}
data = pd.DataFrame(d)
data['user'] = f
data.date = data.date.apply(str)
data.date = data.date.apply(lambda x:
dt.datetime.strptime(x,'%Y%m%d%H%M%S'))
s = data.groupby([data.date.map(lambda t: t.minute)]).count()
But so far all I get is the following:
s
      date  user
date
1        5     5
2        6     6
3        4     4
4        5     5
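You're almost there. You just need to add data['user'] to the groupby clause, so that rows are grouped by minute and by user rather than by minute alone.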
data.groupby([data.date.dt.minute, data['user']]).count().rename(columns={'date':'count'}).reset_index()
Output:
    date   user  count
0      1  f0001      4
1      1  f0020      1
2      2  f0001      1
3      2  f0002      2
4      2  f0005      1
5      2  f0008      1
6      2  f3203      1
7      3  f0001      2
8      3  f0201      1
9      3  f0439      1
10     4  f0003      1
11     4  f0005      1
12     4  f0008      1
13     4  f0009      1
14     4  f0233      1
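One caveat: grouping on the minute number alone would merge 10:01 and 11:01 into the same group if the data ever spans more than one hour. A minimal sketch of a variant that avoids this is shown below. It is not the approach from the answer above: it swaps in `dt.floor("min")` and `size()`, and the sample rows and the column name `minute` are made up for illustration.

```python
import pandas as pd

# Small hypothetical sample in the same shape as the question's data.
data = pd.DataFrame({
    "date": ["20130101100103", "20130101100110", "20130101100100",
             "20130101100200", "20130101100200", "20130101110105"],
    "user": ["f0001", "f0001", "f0020", "f0002", "f0002", "f0001"],
})

# Parse the timestamp strings in one vectorized call.
data["date"] = pd.to_datetime(data["date"], format="%Y%m%d%H%M%S")

# Truncate each timestamp to the start of its minute,
# so that 10:01 and 11:01 end up in different groups.
minute = data["date"].dt.floor("min").rename("minute")

# size() counts rows per (minute, user) pair;
# reset_index(name=...) turns the result into a frame with a named count column.
counts = data.groupby([minute, "user"]).size().reset_index(name="count")
print(counts)
```

If only the minute number is wanted, as in the question, `data["date"].dt.minute` can be used in place of the floored timestamps.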