Subtracting two lists in Python

wic*_*ich 44 python collections list

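In Python, how can one subtract two non-unique, unordered lists? Say we have a = [0, 1, 2, 1, 0] and b = [0, 1, 1]. I'd like to do something like c = a - b and have c be [2, 0] or [0, 2]; order doesn't matter to me. This should throw an exception if a does not contain all the elements of b.

Note that this is different from sets! I'm not interested in the difference between the sets of elements in a and b; I'm interested in the difference between the actual collections of elements in a and b.

I can do this with a for loop, looking up the first element of b in a and removing it from a, then doing the same for the next element of b, and so on. But this doesn't appeal to me: it would be very inefficient (O(n^2) time), whereas it should be no problem to do this in O(n log n) time.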
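The O(n log n) bound mentioned above can be reached by sorting both lists and walking them in parallel. The following is only a minimal sketch, assuming the elements are mutually comparable; the function name `subtract_sorted` is hypothetical and not part of the question.

```python
def subtract_sorted(a, b):
    """Return the multiset difference a - b as a sorted list.

    Raises ValueError if b contains an element that a cannot supply.
    Assumes all elements can be compared with < and ==.
    """
    a_sorted = sorted(a)
    b_sorted = sorted(b)
    result = []
    j = 0  # index of the next unmatched element of b_sorted
    for x in a_sorted:
        if j < len(b_sorted) and b_sorted[j] == x:
            j += 1  # x cancels one occurrence from b
        elif j < len(b_sorted) and b_sorted[j] < x:
            # every remaining element of a is >= x, so b_sorted[j] can never match
            raise ValueError("element %r of b is not available in a" % (b_sorted[j],))
        else:
            result.append(x)
    if j < len(b_sorted):
        raise ValueError("element %r of b is not available in a" % (b_sorted[j],))
    return result


print(subtract_sorted([0, 1, 2, 1, 0], [0, 1, 1]))  # [0, 2]
```

The two sorts dominate the cost; the merge step itself is linear.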

Dyn*_* Fu 56

I know "for" is not what you want, but it's simple and clear:

for x in b:
  a.remove(x)

Or, if members of b might not be in a, then use:

for x in b:
  if x in a:
    a.remove(x)

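  • If you add `c = list(a)` before the loop and then remove the items from `c`, that would be three lines in total. In my opinion that can be clear and readable. (9 upvotes)
  • But this is very inefficient for large lists, isn't it? (3 upvotes)
  • Actually @jkp, the list comprehension returns `[None, None, None]` (2 upvotes)
  • @Kimvais: true, but `a` will be `[2, 0]`. (2 upvotes)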
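The three-line variant suggested in the comments can be wrapped up as below. This is a sketch only; the name `subtract_by_removal` is made up for illustration. It leaves `a` untouched and relies on `list.remove` raising `ValueError` when an element of `b` is missing, which matches the exception the question asks for. It is still O(n^2) in the worst case.

```python
def subtract_by_removal(a, b):
    """Return a copy of a with one occurrence removed per element of b."""
    c = list(a)  # work on a copy so the caller's list is not mutated
    for x in b:
        c.remove(x)  # raises ValueError if x is no longer present in c
    return c


print(subtract_by_removal([0, 1, 2, 1, 0], [0, 1, 1]))  # [2, 0]
```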

Dav*_*rby 30

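Python 2.7 and 3.1 added the collections.Counter class, a dictionary subclass that maps elements to the number of times each element occurs. It can be used as a multiset.

According to the docs, you should be able to do something like this (untested, since I don't have either version installed).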

from collections import Counter
a = Counter([0, 1, 2, 1])  # Counter takes one iterable, not separate arguments
b = Counter([0, 1, 1])

print(a - b)  # ignores items in b missing in a

# check that a holds at least as many of each element as b does
# a[key] returns 0 if key not in a, instead of raising an exception
assert all(a[key] >= b[key] for key in b)
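Putting the two pieces together (the subtraction and the containment check), one possible list-in, list-out helper is sketched below. The function name `subtract_counted` is hypothetical. It uses `Counter.subtract`, which keeps negative counts (unlike the `-` operator), so missing elements can be detected afterwards. It assumes the elements are hashable and runs in roughly linear time.

```python
from collections import Counter


def subtract_counted(a, b):
    """Return the multiset difference a - b as a list.

    Raises ValueError if b contains elements that a cannot supply.
    """
    remaining = Counter(a)
    remaining.subtract(b)  # in-place; counts may drop below zero
    missing = [x for x, n in remaining.items() if n < 0]
    if missing:
        raise ValueError("b contains elements not available in a: %r" % (missing,))
    # elements() skips entries whose count is zero
    return list(remaining.elements())


print(sorted(subtract_counted([0, 1, 2, 1, 0], [0, 1, 1])))  # [0, 2]
```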

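Edit:

Since you are stuck on 2.5, you could try importing it and define your own version if that fails. That way you are sure to get the latest version when it is available, and you fall back to a working version when it is not. You will also benefit from speed improvements if it is converted to a C implementation in the future.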

try:
    from collections import Counter
except ImportError:
    class Counter(dict):
        ...

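You can find the current Python source here.

  • It should be `Counter([0,1,1])` rather than `Counter(0,1,1)`. (4 upvotes)
  • It should be `a[key] >= b[key]` rather than `a[key] > b[key]` (3 upvotes)
  • Just want to mention that I thoroughly revised this answer to make the code work (there were at least two bugs plus a subtler one: keys rather than elements), and updated it now that 10 years have passed and Python 2 is EOL. If you want to see what changed, check the [revision history](https://stackoverflow.com/posts/2071172/revisions). (2 upvotes)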

pcv*_*pcv 29

I would do it in an easier way:

a_b = [e for e in a if e not in b]

..as written, this is wrong: it only works if the items in the lists are unique. And if they are, it is better to use

a_b = list(set(a) - set(b))
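To make the limitation concrete, here is a small check using the lists from the question. The variable names are chosen only for illustration. Both expressions drop every 0 and 1 from `a`, so neither yields the `[2, 0]` / `[0, 2]` the question asks for.

```python
a = [0, 1, 2, 1, 0]
b = [0, 1, 1]

# Removes *every* occurrence of any value that appears in b
filtered = [e for e in a if e not in b]

# Set difference discards duplicates entirely
as_sets = list(set(a) - set(b))

print(filtered)  # [2]
print(as_sets)   # [2]
```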


jkp*_*jkp 6

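I'm not sure what the objection to a for loop is: there is no multiset in Python, so you can't use a built-in container to help you out.

It seems to me that anything on one line (if possible at all) will probably be hellishly hard to understand. Go for readability and KISS. Python is not C :)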


eph*_*ent 5

Python 2.7+ and 3.1+ have collections.Counter (a.k.a. multiset). The documentation links to Recipe 576611: Counter class for Python 2.5:

from operator import itemgetter
from heapq import nlargest
from itertools import repeat, ifilter

class Counter(dict):
    '''Dict subclass for counting hashable objects.  Sometimes called a bag
    or multiset.  Elements are stored as dictionary keys and their counts
    are stored as dictionary values.

    >>> Counter('zyzygy')
    Counter({'y': 3, 'z': 2, 'g': 1})

    '''

    def __init__(self, iterable=None, **kwds):
        '''Create a new, empty Counter object.  And if given, count elements
        from an input iterable.  Or, initialize the count from another mapping
        of elements to their counts.

        >>> c = Counter()                           # a new, empty counter
        >>> c = Counter('gallahad')                 # a new counter from an iterable
        >>> c = Counter({'a': 4, 'b': 2})           # a new counter from a mapping
        >>> c = Counter(a=4, b=2)                   # a new counter from keyword args

        '''        
        self.update(iterable, **kwds)

    def __missing__(self, key):
        return 0

    def most_common(self, n=None):
        '''List the n most common elements and their counts from the most
        common to the least.  If n is None, then list all element counts.

        >>> Counter('abracadabra').most_common(3)
        [('a', 5), ('r', 2), ('b', 2)]

        '''        
        if n is None:
            return sorted(self.iteritems(), key=itemgetter(1), reverse=True)
        return nlargest(n, self.iteritems(), key=itemgetter(1))

    def elements(self):
        '''Iterator over elements repeating each as many times as its count.

        >>> c = Counter('ABCABC')
        >>> sorted(c.elements())
        ['A', 'A', 'B', 'B', 'C', 'C']

        If an element's count has been set to zero or is a negative number,
        elements() will ignore it.

        '''
        for elem, count in self.iteritems():
            for _ in repeat(None, count):
                yield elem

    # Override dict methods where the meaning changes for Counter objects.

    @classmethod
    def fromkeys(cls, iterable, v=None):
        raise NotImplementedError(
            'Counter.fromkeys() is undefined.  Use Counter(iterable) instead.')

    def update(self, iterable=None, **kwds):
        '''Like dict.update() but add counts instead of replacing them.

        Source can be an iterable, a dictionary, or another Counter instance.

        >>> c = Counter('which')
        >>> c.update('witch')           # add elements from another iterable
        >>> d = Counter('watch')
        >>> c.update(d)                 # add elements from another counter
        >>> c['h']                      # four 'h' in which, witch, and watch
        4

        '''        
        if iterable is not None:
            if hasattr(iterable, 'iteritems'):
                if self:
                    self_get = self.get
                    for elem, count in iterable.iteritems():
                        self[elem] = self_get(elem, 0) + count
                else:
                    dict.update(self, iterable) # fast path when counter is empty
            else:
                self_get = self.get
                for elem in iterable:
                    self[elem] = self_get(elem, 0) + 1
        if kwds:
            self.update(kwds)

    def copy(self):
        'Like dict.copy() but returns a Counter instance instead of a dict.'
        return Counter(self)

    def __delitem__(self, elem):
        'Like dict.__delitem__() but does not raise KeyError for missing values.'
        if elem in self:
            dict.__delitem__(self, elem)

    def __repr__(self):
        if not self:
            return '%s()' % self.__class__.__name__
        items = ', '.join(map('%r: %r'.__mod__, self.most_common()))
        return '%s({%s})' % (self.__class__.__name__, items)

    # Multiset-style mathematical operations discussed in:
    #       Knuth TAOCP Volume II section 4.6.3 exercise 19
    #       and at http://en.wikipedia.org/wiki/Multiset
    #
    # Outputs guaranteed to only include positive counts.
    #
    # To strip negative and zero counts, add-in an empty counter:
    #       c += Counter()

    def __add__(self, other):
        '''Add counts from two counters.

        >>> Counter('abbb') + Counter('bcc')
        Counter({'b': 4, 'c': 2, 'a': 1})


        '''
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem in set(self) | set(other):
            newcount = self[elem] + other[elem]
            if newcount > 0:
                result[elem] = newcount
        return result

    def __sub__(self, other):
        ''' Subtract count, but keep only results with positive counts.

        >>> Counter('abbbc') - Counter('bccd')
        Counter({'b': 2, 'a': 1})

        '''
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem in set(self) | set(other):
            newcount = self[elem] - other[elem]
            if newcount > 0:
                result[elem] = newcount
        return result

    def __or__(self, other):
        '''Union is the maximum of value in either of the input counters.

        >>> Counter('abbb') | Counter('bcc')
        Counter({'b': 3, 'c': 2, 'a': 1})

        '''
        if not isinstance(other, Counter):
            return NotImplemented
        _max = max
        result = Counter()
        for elem in set(self) | set(other):
            newcount = _max(self[elem], other[elem])
            if newcount > 0:
                result[elem] = newcount
        return result

    def __and__(self, other):
        ''' Intersection is the minimum of corresponding counts.

        >>> Counter('abbb') & Counter('bcc')
        Counter({'b': 1})

        '''
        if not isinstance(other, Counter):
            return NotImplemented
        _min = min
        result = Counter()
        if len(self) < len(other):
            self, other = other, self
        for elem in ifilter(self.__contains__, other):
            newcount = _min(self[elem], other[elem])
            if newcount > 0:
                result[elem] = newcount
        return result


if __name__ == '__main__':
    import doctest
    print doctest.testmod()

Then you can write

a = Counter([0, 1, 2, 1, 0])
b = Counter([0, 1, 1])
c = a - b
print(list(c.elements()))  # [0, 2]