Cin*_*ndy 5 python group-by crosstab dataframe pandas
I want to analyze statistics for each car, where each record marks the car as either repaired or new. A sample of the data:
Name IsItNew ControlDate
Car1 True 31/01/2018
Car2 True 28/02/2018
Car1 False 15/03/2018
Car2 True 16/04/2018
Car3 True 30/04/2018
Car2 False 25/05/2018
Car1 False 30/05/2018
So I need to group by Name. If a car has any False in the IsItNew column, its result should be False, together with the first date on which False occurred.
I tried groupby with nunique():
df = df.groupby(['Name','IsItNew', 'ControlDate' ])['Name'].nunique()
However, this returns the count of unique items in each group.
How can I get only the unique grouped items, without any count?
Actual result is:
Name  IsItNew  ControlDate
Car1  True     31/01/2018     1
      False    15/03/2018     1
               30/05/2018     1
Car2  True     28/02/2018     1
               16/04/2018     1
      False    25/05/2018     1
Car3  True     30/04/2018     1
Expected Result is:
Name IsItNew ControlDate
Car1 False 15/03/2018
Car2 False 25/05/2018
Car3 True 30/04/2018
I would appreciate any ideas. Thanks.
First convert the column to datetimes with to_datetime, then sort by the 3 columns with DataFrame.sort_values, and finally keep the first row per Name with DataFrame.drop_duplicates:
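import pandas as pd

# dates are day-first (dd/mm/yyyy), so state the format explicitly
df['ControlDate'] = pd.to_datetime(df['ControlDate'], format='%d/%m/%Y')
# False sorts before True, so the first row per Name is its earliest False, if any
df = df.sort_values(['Name','IsItNew', 'ControlDate']).drop_duplicates('Name')
print (df)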
Name IsItNew ControlDate
2 Car1 False 2018-03-15
5 Car2 False 2018-05-25
4 Car3 True 2018-04-30
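For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample from the question by hand; the variable name `first_rows` and the final `reset_index` are my own additions for illustration, not part of the answer.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Name": ["Car1", "Car2", "Car1", "Car2", "Car3", "Car2", "Car1"],
    "IsItNew": [True, True, False, True, True, False, False],
    "ControlDate": ["31/01/2018", "28/02/2018", "15/03/2018", "16/04/2018",
                    "30/04/2018", "25/05/2018", "30/05/2018"],
})

# The dates are day-first strings, so the format is spelled out explicitly.
df["ControlDate"] = pd.to_datetime(df["ControlDate"], format="%d/%m/%Y")

# False sorts before True, so within each Name any False row comes first,
# and among rows with the same flag the earliest ControlDate comes first.
first_rows = (
    df.sort_values(["Name", "IsItNew", "ControlDate"])
      .drop_duplicates("Name")   # keeps the first row per Name
      .reset_index(drop=True)    # hypothetical tidy-up: drop the original row labels
)
print(first_rows)
```

Note that a car with only True rows (Car3 here) ends up with its earliest True date, because the sort is ascending on every column.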
Edit (if a car with only True rows should get its last date instead):
print (df)
Name IsItNew ControlDate
0 Car1 True 31/01/2018
1 Car2 True 28/02/2018
2 Car1 False 15/03/2018
3 Car2 True 16/04/2018
4 Car3 True 30/04/2018
5 Car2 False 25/05/2018
6 Car1 False 30/05/2018
7 Car3 True 20/10/2019
8 Car3 True 30/04/2017
# convert to datetimes (dates are day-first, dd/mm/yyyy)
df['ControlDate'] = pd.to_datetime(df['ControlDate'], format='%d/%m/%Y')
# sort by 3 columns
df1 = df.sort_values(['Name','IsItNew', 'ControlDate'])
# Series mapping each Name to the ControlDate of its last sorted row
s = df1.drop_duplicates('Name', keep='last').set_index('Name')['ControlDate']
# first row per Name: the earliest False if there is one, otherwise the earliest True
df2 = df1.drop_duplicates('Name').copy()
# for Names that are still True, replace the date with the last timestamp
df2.loc[df2['IsItNew'], 'ControlDate'] = df2.loc[df2['IsItNew'], 'Name'].map(s)
print (df2)
Name IsItNew ControlDate
2 Car1 False 2018-03-15
5 Car2 False 2018-05-25
8 Car3 True 2019-10-20
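The same rule can also be expressed with groupby aggregations instead of sorting and mapping. This is not the approach used above, just an alternative sketch under the same assumptions (first False date if a car has any False, otherwise its last date); the names `first_false`, `last_any` and `summary` are made up for the example.

```python
import pandas as pd

# Extended sample from the edit above.
df = pd.DataFrame({
    "Name": ["Car1", "Car2", "Car1", "Car2", "Car3",
             "Car2", "Car1", "Car3", "Car3"],
    "IsItNew": [True, True, False, True, True, False, False, True, True],
    "ControlDate": ["31/01/2018", "28/02/2018", "15/03/2018", "16/04/2018",
                    "30/04/2018", "25/05/2018", "30/05/2018", "20/10/2019",
                    "30/04/2017"],
})
df["ControlDate"] = pd.to_datetime(df["ControlDate"], format="%d/%m/%Y")

# Earliest date among the False rows of each car (cars without False are absent).
first_false = df.loc[~df["IsItNew"]].groupby("Name")["ControlDate"].min()
# Latest date of each car over all of its rows.
last_any = df.groupby("Name")["ControlDate"].max()

# A car counts as new only if every one of its rows is True.
summary = pd.DataFrame({"IsItNew": df.groupby("Name")["IsItNew"].all()})
# Use the first False date where it exists, otherwise fall back to the last date.
summary["ControlDate"] = first_false.reindex(summary.index).fillna(last_any)
summary = summary.reset_index()
print(summary)
```

On this sample it should give the same three rows as df2, only with a fresh 0..2 index instead of the original row labels.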