Python- Eratosthenes-紧凑型Python的筛选

Par*_*gue 3 python algorithm performance primes

这是我使用Eratosthenes筛选寻找素数的代码.

list = [i for i in range(2, int(raw_input("Compute primes up to what number? "))+1)]  

for i in list:
    for a in list:
            if a!=i and a%i == 0:
                list.remove(a)
Run Code Online (Sandbox Code Playgroud)

试图找到一种方法将那些嵌套的循环压缩成某种生成器或理解,但似乎你不能使用理解将函数应用于列表.我尝试使用地图和过滤器,但我似乎无法做到正确.

想到这样的事情:

print map(list.remove(a), filter(lambda a, i: (a%i ==0 and a!=i), [(a, i) for i in list for a in list])
Run Code Online (Sandbox Code Playgroud)

显然不会有十几个原因.如果我只是使用该代码的过滤器部分:

filter(lambda a, i: (a%i ==0 and a!=i), **[(a, i) for i in list for a in list]**
Run Code Online (Sandbox Code Playgroud)

将两个变量放入lambda的正确方法是什么?(a,i)使它成为一个元组,但我想提交'a'和'i'作为自变量放入lambda.

我最终解决了这个问题:

print sorted(set([i for i in range(2, int(raw_input("Compute primes up to what number? "))+1)]).difference(a for i in l for a in l if a!=i and a%i == 0))
Run Code Online (Sandbox Code Playgroud)

sen*_*rle 15

首先要注意的是,你所写的不是eratosthenes的筛子.看看有多少循环是一个完全天真的eratosthenes筛子执行:

def sieve1(n):
    loops = 0
    numbers = set(range(2, n))
    for i in range(2, int(n ** 0.5) + 1):
        for j in range(i * 2, n, i):
            numbers.discard(j)
            loops += 1
    return sorted(numbers), loops
Run Code Online (Sandbox Code Playgroud)

测试:

>>> sieve1(100)
([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 
  43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97], 
 178)
Run Code Online (Sandbox Code Playgroud)

包含100个数字的178个循环(不包括排序).这可以通过一些小的改变来改善:

def sieve2(n):
    loops = 0
    numbers = range(0, n)
    for prime in numbers:
        if prime < 2:
            continue
        elif prime > n ** 0.5:
            break
        for i in range(prime ** 2, n, prime):
            numbers[i] = 0
            loops += 1
    return [x for x in numbers if x > 1], loops
Run Code Online (Sandbox Code Playgroud)

测试:

>>> sieve2(100)
([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 
  43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97], 
 102)
Run Code Online (Sandbox Code Playgroud)

100个数字的102个循环(不包括末尾的过滤器).现在看看你的:

def sieve3(n):
    loops = 0
    numbers = range(2, n)
    for i in numbers:
        for j in numbers:
            if j != i and j % i == 0:
                numbers.remove(j)
            loops += 1
    return numbers, loops
Run Code Online (Sandbox Code Playgroud)

测试:

>>> sieve3(100)
([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 
  43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97], 
 663)
Run Code Online (Sandbox Code Playgroud)

它变得更糟:

>>> [sieve1(x)[1] for x in [100, 1000, 10000]]
[178, 2978, 41723]
>>> [sieve2(x)[1] for x in [100, 1000, 10000]]
[102, 1409, 16979]
>>> [sieve3(x)[1] for x in [100, 1000, 10000]]
[663, 28986, 1523699]
Run Code Online (Sandbox Code Playgroud)

n = 10000,你的实现几乎是100倍的工作!

我的建议是在使其"紧凑"之前创建一个合理的实现.代码高尔夫可以很有趣,但无论长度如何,它都远不如编写高效代码那样具有挑战性或启发性.

也就是说,考虑这个单线程(如果你不计算导入,你可以用它lambda x, y: x - y来代替operator.sub).这实现了第一个算法,但改进很小:

>>> from operator import sub
>>> reduce(sub, (set(range(x ** 2, 100, x)) for x in range(2, int(100 ** 0.5) + 1)), set(range(2, 100)))
set([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97])
Run Code Online (Sandbox Code Playgroud)


spi*_*piv 5

它并不完全是你的循环的直接翻译,但它非常接近和紧凑:

>>> l = range(2, 101)
>>> sorted(set(l).difference(a for i in l for a in l if a!=i and a%i == 0))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
Run Code Online (Sandbox Code Playgroud)

虽然我建议a > i而不是a != 0更短更快;)

  • 这种实施的运行时间非常短(因此不能合理地称为Eratosthenes筛选).当然,对于小数字来说,这是一种简洁的方法,这就是为什么我没有进行投票.=) (2认同)