c.g*_*rey 10 python dictionary nested-lists
How do I sum duplicate elements in a list of dictionaries?
Sample list:
data = [
    [
        {'user': 1, 'rating': 0},
        {'user': 2, 'rating': 10},
        {'user': 1, 'rating': 20},
        {'user': 3, 'rating': 10}
    ],
    [
        {'user': 4, 'rating': 4},
        {'user': 2, 'rating': 80},
        {'user': 1, 'rating': 20},
        {'user': 1, 'rating': 10}
    ],
]
Expected output:
op = [
    [
        {'user': 1, 'rating': 20},
        {'user': 2, 'rating': 10},
        {'user': 3, 'rating': 10}
    ],
    [
        {'user': 4, 'rating': 4},
        {'user': 2, 'rating': 80},
        {'user': 1, 'rating': 30},
    ],
]
With pandas:
>>> import pandas as pd
>>> [pd.DataFrame(dicts).groupby('user', as_index=False, sort=False).sum().to_dict(orient='records') for dicts in data]
[[{'user': 1, 'rating': 20},
{'user': 2, 'rating': 10},
{'user': 3, 'rating': 10}],
[{'user': 4, 'rating': 4},
{'user': 2, 'rating': 80},
{'user': 1, 'rating': 30}]]
You can try:
from itertools import groupby

result = []
for lst in data:
    # groupby only merges adjacent items, so sort by user first
    sublist = sorted(lst, key=lambda d: d['user'])
    grouped = groupby(sublist, key=lambda d: d['user'])
    result.append([
        {'user': name, 'rating': sum([d['rating'] for d in group])}
        for name, group in grouped])

# Sort each sublist of `result` by `rating`:
result = [sorted(sub, key=lambda d: d['rating']) for sub in result]

# %%timeit
# 7.54 µs ± 220 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Update (a more efficient solution):
result = []
for lst in data:
    visited = {}  # user -> dict holding the running total for that user
    for d in lst:
        if d['user'] in visited:
            visited[d['user']]['rating'] += d['rating']
        else:
            # Store a copy so the dicts in `data` are not modified in place
            visited[d['user']] = dict(d)

    result.append(sorted(visited.values(), key=lambda d: d['rating']))

# %%timeit
# 2.5 µs ± 54 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Result:
# print(result)
[
    [
        {'user': 2, 'rating': 10},
        {'user': 3, 'rating': 10},
        {'user': 1, 'rating': 20}
    ],
    [
        {'user': 4, 'rating': 4},
        {'user': 1, 'rating': 30},
        {'user': 2, 'rating': 80}
    ]
]
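Note that the result above is sorted by `rating`, whereas the expected output in the question lists each user in the order it first appears. If that order matters, a minimal sketch along the following lines might work. It assumes Python 3.7+ (where a plain `dict` keeps insertion order); the function name `sum_ratings` is made up for illustration and is not part of the original answers.

```python
def sum_ratings(groups):
    """Sum 'rating' per 'user' within each inner list, keeping first-seen order."""
    result = []
    for records in groups:
        totals = {}  # user -> summed rating; insertion order is preserved
        for record in records:
            user = record['user']
            totals[user] = totals.get(user, 0) + record['rating']
        # Build fresh dicts so the input records are never mutated
        result.append([{'user': user, 'rating': rating}
                       for user, rating in totals.items()])
    return result


data = [
    [
        {'user': 1, 'rating': 0},
        {'user': 2, 'rating': 10},
        {'user': 1, 'rating': 20},
        {'user': 3, 'rating': 10},
    ],
    [
        {'user': 4, 'rating': 4},
        {'user': 2, 'rating': 80},
        {'user': 1, 'rating': 20},
        {'user': 1, 'rating': 10},
    ],
]

op = sum_ratings(data)
print(op)
```

Because only the totals are stored and new dicts are built at the end, `data` is left unchanged and the function can be called repeatedly with the same input.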