OcM*_*RUS 6 python grouping mean pandas
I have a DataFrame:
import pandas as pd
from pandas import Timestamp, Timedelta
# Wrap the dict in pd.DataFrame so the column operations below work on a DataFrame
time_to_rent = pd.DataFrame({'rentId': {0: 43.0, 1: 87.0, 2: 140.0, 3: 454.0, 4: 1458.0}, 'creditCardId': {0: 40, 1: 40, 2: 40, 3: 40, 4: 40}, 'createdAt': {0: Timestamp('2020-08-24 16:13:11.850216'), 1: Timestamp('2020-09-10 10:47:31.748628'), 2: Timestamp('2020-09-13 15:29:06.077622'), 3: Timestamp('2020-09-24 08:08:39.852348'), 4: Timestamp('2020-10-19 08:54:09.891518')}, 'updatedAt': {0: Timestamp('2020-08-24 20:26:31.805939'), 1: Timestamp('2020-09-10 20:05:18.759421'), 2: Timestamp('2020-09-13 18:38:10.044112'), 3: Timestamp('2020-09-24 08:53:22.512533'), 4: Timestamp('2020-10-19 17:10:09.110038')}, 'rent_time': {0: Timedelta('0 days 04:13:19.955723'), 1: Timedelta('0 days 09:17:47.010793'), 2: Timedelta('0 days 03:09:03.966490'), 3: Timedelta('0 days 00:44:42.660185'), 4: Timedelta('0 days 08:15:59.218520')}})
The idea is to group the DataFrame by the "creditCardId" column and get the mean of "rent_time" for each group. The desired output is:
creditCardId rent_time mean
40 0 days 05:08:10.562342
If I run:
print (time_to_rent['rent_time'].mean())
it works fine and I get "0 days 05:08:10.562342" as output. But when I try to group it like this:
time_to_rent.groupby('creditCardId', as_index=False)[['rent_time']].mean()
I get an error:
~\anaconda3\lib\site-packages\pandas\core\groupby\generic.py in _cython_agg_blocks(self, how, alt, numeric_only, min_count)
1093
1094 if not (agg_blocks or split_frames):
-> 1095 raise DataError("No numeric types to aggregate")
1096
1097 if split_items:
DataError: No numeric types to aggregate
If I use the command:
time_to_rent = time_to_rent.groupby('creditCardId', as_index=False)[['rent_time']]
it only returns "<pandas.core.groupby.generic.DataFrameGroupBy object at 0x000000000B5F2EE0>".
Can you help me understand where my mistake is?
This is not your mistake. It is possibly a bug in pandas, since Timedelta values can be averaged. A workaround is to use apply:
time_to_rent.groupby('creditCardId')['rent_time'].apply(lambda x: x.mean())
Output:
creditCardId
40 0 days 05:08:10.562342200
Name: rent_time, dtype: timedelta64[ns]
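For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds only the two relevant columns from the question and applies the apply workaround. The second variant, which averages the durations as float seconds and converts back with pd.to_timedelta, is not part of the answer above; it is an assumed alternative, and the names by_apply, seconds and by_seconds are made up for illustration.

```python
import pandas as pd

# Stand-in for the question's data: one credit card, five rentals.
time_to_rent = pd.DataFrame({
    "creditCardId": [40, 40, 40, 40, 40],
    "rent_time": pd.to_timedelta([
        "0 days 04:13:19.955723",
        "0 days 09:17:47.010793",
        "0 days 03:09:03.966490",
        "0 days 00:44:42.660185",
        "0 days 08:15:59.218520",
    ]),
})

# Workaround from the answer: let each group's Series compute its own mean.
by_apply = time_to_rent.groupby("creditCardId")["rent_time"].apply(lambda x: x.mean())

# Assumed alternative (not from the answer): average plain float seconds,
# which groupby treats as numeric, then convert the result back to timedelta.
seconds = time_to_rent["rent_time"].dt.total_seconds()
by_seconds = pd.to_timedelta(
    seconds.groupby(time_to_rent["creditCardId"]).mean(), unit="s"
)

print(by_apply)
print(by_seconds)
```

Both results should agree with time_to_rent['rent_time'].mean() for card 40, up to floating-point rounding in the seconds variant. Newer pandas releases may also accept .mean() directly on the grouped timedelta column, so it is worth checking your installed version before reaching for a workaround.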