emh*_*emh 24 python list python-2.7
I'm using Python 2.7 and am trying to de-duplicate a list of lists, merging the values of the duplicates.
Right now I have:
original_list = [['a', 1], ['b', 1], ['a', 1], ['b', 1], ['b', 2], ['c', 2], ['b', 3]]
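I want to match on the first element of each nested list and then add up the values of the second element. I want to end up with this (the order of the final list doesn't matter):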
ideal_output = [['a', 2], ['b', 7], ['c', 2]]
So far I have some code that finds the duplicated values based on the first element of each nested list:
duplicates_list = []
for item in original_list:
    # start at -1 so that matching the item against itself counts as 0
    matches = -1
    for x in original_list:
        if (item[0] == x[0]):
            matches += 1
    if matches >= 1:
        if item[0] not in duplicates_list:
            duplicates_list.append(item[0])
From here I need to search original_list for every item in duplicates_list and add up the values, but I'm not sure what the best way to do that is.
Mar*_*eed 30
Lots of good answers, but they all use rather more code than I would for this, so here's my take, for what it's worth:
totals = {}
for k, v in original_list:
    totals[k] = totals.get(k, 0) + v
# totals = {'a': 2, 'c': 2, 'b': 7}
Once you have a dict like that, from any of these answers, you can use items() to get a list of tuples:
totals.items()
# => [('a', 2), ('c', 2), ('b', 7)]
And map list over the tuples to get a list of lists:
map(list, totals.items())
# => [['a', 2], ['c', 2], ['b', 7]]
If you want the result sorted:
sorted(map(list, totals.items()))
# => [['a', 2], ['b', 7], ['c', 2]]
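The steps above can be collected into one helper. This is only a minimal sketch: the name merge_pairs is hypothetical and does not appear in the answer, and the result is built with a comprehension rather than map() so that it behaves the same on Python 3, where items() and map() return lazy objects instead of lists.

```python
def merge_pairs(pairs):
    """Sum the second element of each pair, grouped by the first element."""
    totals = {}
    for key, value in pairs:
        # get() supplies 0 the first time a key is seen
        totals[key] = totals.get(key, 0) + value
    # sorted() makes the order deterministic; drop it if order is irrelevant
    return sorted([key, total] for key, total in totals.items())


original_list = [['a', 1], ['b', 1], ['a', 1], ['b', 1], ['b', 2], ['c', 2], ['b', 3]]
print(merge_pairs(original_list))
# [['a', 2], ['b', 7], ['c', 2]]
```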
Mac*_*Gol 14
>>> from collections import Counter
>>> lst = [['a', 1], ['b', 1], ['a', 1], ['b', 1], ['b', 2], ['c', 2], ['b', 3]]
>>> c = Counter(x for x, c in lst for _ in xrange(c))
>>> c
Counter({'b': 7, 'a': 2, 'c': 2})
>>> map(list, c.iteritems())
[['a', 2], ['c', 2], ['b', 7]]
Or, without repeating each item (a, b) b times (@hcwhsa):
>>> from collections import Counter
>>> lst = [['a', 1], ['b', 1], ['a', 1], ['b', 1], ['b', 2], ['c', 2], ['b', 3]]
>>> c = sum((Counter(**{k: v}) for k, v in lst), Counter())
>>> c
Counter({'b': 7, 'a': 2, 'c': 2})
>>> map(list, c.iteritems())
[['a', 2], ['c', 2], ['b', 7]]
alk*_*lko 13
Use collections.Counter:
from collections import Counter
original_list = [['a', 1], ['b', 1], ['a', 1], ['b', 1], ['b', 2], ['c', 2], ['b', 3]]
result = Counter()
for k, v in original_list:
    result.update({k: v})
map(list, result.items())
# [['a', 2], ['c', 2], ['b', 7]]
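The loop above works because Counter.update adds to existing counts, whereas dict.update overwrites them. A small sketch of the difference, with made-up values chosen only for illustration:

```python
from collections import Counter

plain = {'a': 1}
plain.update({'a': 5})    # dict.update replaces the stored value

counts = Counter({'a': 1})
counts.update({'a': 5})   # Counter.update adds to the stored count

print(plain['a'])   # 5
print(counts['a'])  # 6
```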
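So, lots of answers, views and upvotes. I even earned my first Nice answer badge out of nothing (while in the last two days I wrote plenty of answers that deserved more research and effort). Given that, I decided to do at least some research and test the performance of the solutions with a simple script written from scratch. The code is not included directly, for the sake of size.
Each function is named after its author, so it is easy to find in the question. thefourtheye's solution is now identical to Mark Reed's and is evaluated in its original form; thefourtheye2 stands for the itertools.groupby-based solution.
Each function was tested several times (samples), and each sample in turn called the function for several iterations. I evaluated the minimum, average, maximum and standard deviation of the sample times.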
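The original benchmark script is not shown, so the following is only a sketch of how such a measurement could be set up with the standard timeit module; it is not the script that produced the numbers in this answer. The benchmark helper and its samples/iterations parameters are hypothetical names, and only two of the tested solutions are reproduced here.

```python
import timeit
from collections import Counter


def reed(pairs):
    # dict.get accumulation, as in Mark Reed's answer
    totals = {}
    for k, v in pairs:
        totals[k] = totals.get(k, 0) + v
    return totals


def alko(pairs):
    # Counter.update accumulation, as in this answer
    result = Counter()
    for k, v in pairs:
        result.update({k: v})
    return result


def benchmark(func, data, samples=10, iterations=10):
    """Return (min, avg, max, stddev) of the per-sample times in seconds."""
    # each sample is the total time for `iterations` calls of func(data)
    times = timeit.repeat(lambda: func(data), repeat=samples, number=iterations)
    avg = sum(times) / len(times)
    # population standard deviation of the sample times
    stddev = (sum((t - avg) ** 2 for t in times) / len(times)) ** 0.5
    return min(times), avg, max(times), stddev


data = [['a', 1], ['b', 1], ['a', 1], ['b', 1], ['b', 2], ['c', 2], ['b', 3]]
for func in (reed, alko):
    print("%-8s %.5f %.5f %.5f %.5f" % ((func.__name__,) + benchmark(func, data)))
```

Absolute timings depend on the machine, the interpreter version and the input size, so only the relative ordering is worth comparing.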
Here we go, running a probing test with 10 samples.
testing: thefourtheye, kroolik2, void, kroolik, alko, reed, visser
10 samples
10 iterations each
author        min     avg     max     stddev
reed          0.00000 0.00000 0.00000 0.00000
visser        0.00000 0.00150 0.01500 0.00450
thefourtheye  0.00000 0.00160 0.01600 0.00480
thefourtheye2 0.00000 0.00310 0.01600 0.00620
alko          0.00000 0.00630 0.01600 0.00772
void          0.01500 0.01540 0.01600 0.00049
kroolik2      0.04700 0.06430 0.07800 0.00831
kroolik       0.32800 0.34380 0.37500 0.01716
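Look at the bottom two rows: at this point the kroolik solutions were disqualified, because with them any reasonable number of samples*iterations would run for hours. Here are the final tests. I manually added the number of upvotes to the output: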
testing: thefourtheye, kroolik2, void, kroolik, alko, reed, visser
100 samples
1000 iterations each
author        upvotes min     avg     max     stddev
reed          [20]    0.06200 0.08174 0.15600 0.01841
thefourtheye  [5]     0.06200 0.09971 0.20300 0.01911
visser        [6]     0.10900 0.12392 0.23500 0.02263
thefourtheye2         0.25000 0.29674 0.89000 0.07183
alko          [11]    0.56200 0.62309 1.04700 0.08438
void          [3]     1.50000 1.65480 2.39100 0.18721
kroolik       [14]    [DSQ]
the*_*eye 10
If the order doesn't matter, you can use this:
original_list = [['a', 1], ['b', 1], ['a', 1], ['b', 1], ['b', 2], ['c', 2], ['b', 3]]
myDict = {}
for first, second in original_list:
    myDict[first] = myDict.get(first, 0) + second
result = [[key, value] for key, value in myDict.items()]
print result
Or you can use groupby, and the code becomes a one-liner:
original_list = [['a', 1], ['b', 1], ['a', 1], ['b', 1], ['b', 2], ['c', 2], ['b', 3]]
from itertools import groupby
print [[key, sum(item[1] for item in list(group))]
       for key, group in groupby(sorted(original_list), lambda x: x[0])]
Output
[['a', 2], ['b', 7], ['c', 2]]
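The sorted() call in the groupby version matters, because itertools.groupby only merges runs of adjacent items that share a key. A minimal sketch of the difference follows; the helper name sum_groups is hypothetical, and operator.itemgetter is swapped in for the lambda.

```python
from itertools import groupby
from operator import itemgetter

original_list = [['a', 1], ['b', 1], ['a', 1], ['b', 1], ['b', 2], ['c', 2], ['b', 3]]


def sum_groups(pairs):
    # groupby starts a new group every time the key changes
    return [[key, sum(value for _, value in group)]
            for key, group in groupby(pairs, key=itemgetter(0))]


print(sum_groups(original_list))
# [['a', 1], ['b', 1], ['a', 1], ['b', 3], ['c', 2], ['b', 3]]
print(sum_groups(sorted(original_list)))
# [['a', 2], ['b', 7], ['c', 2]]
```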
You can use collections.defaultdict:
original_list = [['a', 1], ['b', 1], ['a', 1], ['b', 1], ['b', 2], ['c', 2], ['b', 3]]
import collections
data = collections.defaultdict(list)
for item in original_list:
    data[item[0]].append(item[1])
output = {key: sum(values) for key, values in data.items()}
print output
# gives: {'a': 2, 'c': 2, 'b': 7}
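A possible variation, not part of the answer above, is to use defaultdict(int) and add the values directly instead of collecting them in lists and summing afterwards. This sketch also converts the result to the list-of-lists shape the question asks for:

```python
from collections import defaultdict

original_list = [['a', 1], ['b', 1], ['a', 1], ['b', 1], ['b', 2], ['c', 2], ['b', 3]]

totals = defaultdict(int)  # missing keys start at int() == 0
for key, value in original_list:
    totals[key] += value

# sorted() only makes the order deterministic; the question does not require it
output = sorted([key, total] for key, total in totals.items())
print(output)
# [['a', 2], ['b', 7], ['c', 2]]
```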
I know it's ugly, but I had fun trying to implement it as a one-liner:
map(list, set(([(x[0], sum([i[1] for i in original_list if i[0]==x[0]])) for x in original_list])))
Output:
[['a', 2], ['b', 7], ['c', 2]]