Xav*_* Ho 39 optimization primes bit-manipulation sieve-of-eratosthenes python-3.x
我使用Sieve of Eratosthenes和Python 3.1 编写了一个素数生成器.代码在ideone.com上以0.32秒正确且优雅地运行,以生成高达1,000,000的素数.
# from bitstring import BitString
def prime_numbers(limit=1000000):
'''Prime number generator. Yields the series
2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ...
using Sieve of Eratosthenes.
'''
yield 2
sub_limit = int(limit**0.5)
flags = [False, False] + [True] * (limit - 2)
# flags = BitString(limit)
# Step through all the odd numbers
for i in range(3, limit, 2):
if flags[i] is False:
# if flags[i] is True:
continue
yield i
# Exclude further multiples of the current prime number
if i <= sub_limit:
for j in range(i*3, limit, i<<1):
flags[j] = False
# flags[j] = True
Run Code Online (Sandbox Code Playgroud)
问题是,当我尝试生成高达1,000,000,000的数字时,我的内存耗尽.
flags = [False, False] + [True] * (limit - 2)
MemoryError
Run Code Online (Sandbox Code Playgroud)
可以想象,在Python中分配10亿个布尔值(1个字节 4或8个字节(请参阅注释))实际上是不可行的,所以我查看了bitstring.我想,每个标志使用1位将更加节省内存.但是,该程序的性能急剧下降 - 运行时间为24秒,素数高达1,000,000.这可能是由于bitstring的内部实现.
您可以注释/取消注释这三行,以查看我更改为使用BitString的内容,如上面的代码片段.
我的问题是,有没有办法加快我的程序,有或没有bittring?
编辑:请在发布前自行测试代码.我自然不能接受比现有代码慢的答案.
再次编辑:
cas*_*evh 34
您的版本有几个小优化.通过反转True和False的角色,您可以将" if flags[i] is False:" 更改为" if flags[i]:".并且第二个range语句的起始值可以i*i代替i*3.您的原始版本在我的系统上需要0.166秒.通过这些更改,我的系统下面的版本需要0.156秒.
def prime_numbers(limit=1000000):
'''Prime number generator. Yields the series
2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ...
using Sieve of Eratosthenes.
'''
yield 2
sub_limit = int(limit**0.5)
flags = [True, True] + [False] * (limit - 2)
# Step through all the odd numbers
for i in range(3, limit, 2):
if flags[i]:
continue
yield i
# Exclude further multiples of the current prime number
if i <= sub_limit:
for j in range(i*i, limit, i<<1):
flags[j] = True
Run Code Online (Sandbox Code Playgroud)
不过,这对你的记忆问题没有帮助.
进入C扩展世界,我使用了gmpy的开发版本.(免责声明:我是维护者之一.)开发版本称为gmpy2,支持名为xmpz的可变整数.使用gmpy2和以下代码,我的运行时间为0.140秒.限制为1,000,000,000的运行时间为158秒.
import gmpy2
def prime_numbers(limit=1000000):
'''Prime number generator. Yields the series
2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ...
using Sieve of Eratosthenes.
'''
yield 2
sub_limit = int(limit**0.5)
# Actual number is 2*bit_position + 1.
oddnums = gmpy2.xmpz(1)
current = 0
while True:
current += 1
current = oddnums.bit_scan0(current)
prime = 2 * current + 1
if prime > limit:
break
yield prime
# Exclude further multiples of the current prime number
if prime <= sub_limit:
for j in range(2*current*(current+1), limit>>1, prime):
oddnums.bit_set(j)
Run Code Online (Sandbox Code Playgroud)
推动优化并牺牲清晰度,我使用以下代码获得0.107和123秒的运行时间:
import gmpy2
def prime_numbers(limit=1000000):
'''Prime number generator. Yields the series
2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ...
using Sieve of Eratosthenes.
'''
yield 2
sub_limit = int(limit**0.5)
# Actual number is 2*bit_position + 1.
oddnums = gmpy2.xmpz(1)
f_set = oddnums.bit_set
f_scan0 = oddnums.bit_scan0
current = 0
while True:
current += 1
current = f_scan0(current)
prime = 2 * current + 1
if prime > limit:
break
yield prime
# Exclude further multiples of the current prime number
if prime <= sub_limit:
list(map(f_set,range(2*current*(current+1), limit>>1, prime)))
Run Code Online (Sandbox Code Playgroud)
编辑:根据这个练习,我修改了gmpy2接受xmpz.bit_set(iterator).使用以下代码,所有素数少于1,000,000,000的运行时间对于Python 2.7为56秒,对于Python 3.2为74秒.(如评论中所述,xrange速度快于range.)
import gmpy2
try:
range = xrange
except NameError:
pass
def prime_numbers(limit=1000000):
'''Prime number generator. Yields the series
2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ...
using Sieve of Eratosthenes.
'''
yield 2
sub_limit = int(limit**0.5)
oddnums = gmpy2.xmpz(1)
f_scan0 = oddnums.bit_scan0
current = 0
while True:
current += 1
current = f_scan0(current)
prime = 2 * current + 1
if prime > limit:
break
yield prime
if prime <= sub_limit:
oddnums.bit_set(iter(range(2*current*(current+1), limit>>1, prime)))
Run Code Online (Sandbox Code Playgroud)
编辑#2:再试一次!我修改了gmpy2来接受xmpz.bit_set(slice).使用以下代码,Python 2.7和Python 3.2的所有质数小于1,000,000,000的运行时间约为40秒.
from __future__ import print_function
import time
import gmpy2
def prime_numbers(limit=1000000):
'''Prime number generator. Yields the series
2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ...
using Sieve of Eratosthenes.
'''
yield 2
sub_limit = int(limit**0.5)
flags = gmpy2.xmpz(1)
# pre-allocate the total length
flags.bit_set((limit>>1)+1)
f_scan0 = flags.bit_scan0
current = 0
while True:
current += 1
current = f_scan0(current)
prime = 2 * current + 1
if prime > limit:
break
yield prime
if prime <= sub_limit:
flags.bit_set(slice(2*current*(current+1), limit>>1, prime))
start = time.time()
result = list(prime_numbers(1000000000))
print(time.time() - start)
Run Code Online (Sandbox Code Playgroud)
编辑#3:我已经更新了gmpy2以正确支持xmpz位级别的切片.性能没有变化,但是API非常好.我做了一点调整,我的时间缩短了大约37秒.(请参阅编辑#4以了解gmpy2 2.0.0b1中的更改.)
from __future__ import print_function
import time
import gmpy2
def prime_numbers(limit=1000000):
'''Prime number generator. Yields the series
2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ...
using Sieve of Eratosthenes.
'''
sub_limit = int(limit**0.5)
flags = gmpy2.xmpz(1)
flags[(limit>>1)+1] = True
f_scan0 = flags.bit_scan0
current = 0
prime = 2
while prime <= sub_limit:
yield prime
current += 1
current = f_scan0(current)
prime = 2 * current + 1
flags[2*current*(current+1):limit>>1:prime] = True
while prime <= limit:
yield prime
current += 1
current = f_scan0(current)
prime = 2 * current + 1
start = time.time()
result = list(prime_numbers(1000000000))
print(time.time() - start)
Run Code Online (Sandbox Code Playgroud)
编辑#4:我在gmpy2 2.0.0b1中做了一些改变,打破了前面的例子.gmpy2不再将True视为提供1位无限来源的特殊值.应该使用-1代替.
from __future__ import print_function
import time
import gmpy2
def prime_numbers(limit=1000000):
'''Prime number generator. Yields the series
2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ...
using Sieve of Eratosthenes.
'''
sub_limit = int(limit**0.5)
flags = gmpy2.xmpz(1)
flags[(limit>>1)+1] = 1
f_scan0 = flags.bit_scan0
current = 0
prime = 2
while prime <= sub_limit:
yield prime
current += 1
current = f_scan0(current)
prime = 2 * current + 1
flags[2*current*(current+1):limit>>1:prime] = -1
while prime <= limit:
yield prime
current += 1
current = f_scan0(current)
prime = 2 * current + 1
start = time.time()
result = list(prime_numbers(1000000000))
print(time.time() - start)
Run Code Online (Sandbox Code Playgroud)
编辑#5:我对gmpy2 2.0.0b2做了一些改进.您现在可以迭代所有已设置或清除的位.运行时间提高了约30%.
from __future__ import print_function
import time
import gmpy2
def sieve(limit=1000000):
'''Returns a generator that yields the prime numbers up to limit.'''
# Increment by 1 to account for the fact that slices do not include
# the last index value but we do want to include the last value for
# calculating a list of primes.
sieve_limit = gmpy2.isqrt(limit) + 1
limit += 1
# Mark bit positions 0 and 1 as not prime.
bitmap = gmpy2.xmpz(3)
# Process 2 separately. This allows us to use p+p for the step size
# when sieving the remaining primes.
bitmap[4 : limit : 2] = -1
# Sieve the remaining primes.
for p in bitmap.iter_clear(3, sieve_limit):
bitmap[p*p : limit : p+p] = -1
return bitmap.iter_clear(2, limit)
if __name__ == "__main__":
start = time.time()
result = list(sieve(1000000000))
print(time.time() - start)
print(len(result))
Run Code Online (Sandbox Code Playgroud)
好了,这是我的第二个答案,但速度是最重要的我想我不得不提一下bitarray模块-即使它位串的克星:).它非常适合这种情况,因为它不仅是一个C扩展(并且比纯Python更有希望),但它也支持切片分配.但它尚不适用于Python 3.
我甚至没有尝试优化它,我只是重写了bitstring版本.在我的机器上,我得到0.16秒的素数低于一百万.
十亿,它运行得非常好,并在2分31秒内完成.
import bitarray
def prime_bitarray(limit=1000000):
yield 2
flags = bitarray.bitarray(limit)
flags.setall(False)
sub_limit = int(limit**0.5)
for i in xrange(3, limit, 2):
if not flags[i]:
yield i
if i <= sub_limit:
flags[3*i:limit:i*2] = True
Run Code Online (Sandbox Code Playgroud)
好的,这是一个(近乎完整的)全面的基准测试,我今晚已经完成,看看哪个代码运行速度最快.希望有人会发现这个列表很有用.我省略了在我的机器上完成任何需要超过30秒的事情.
我要感谢所有投入的人.我从你的努力中获得了很多洞察力,我希望你也有.
我的机器:AMD ZM-86,2.40 Ghz双核,4GB内存.这是一台HP Touchsmart Tx2笔记本电脑.请注意,虽然我可能已链接到pastebin,但我在自己的计算机上对以下所有内容进行了基准测试.
一旦我能够构建它,我将添加gmpy2基准测试.
所有基准测试都在Python 2.6 x86中进行了测试
返回最多1,000,000的素数列表:( 使用 Python生成器)
塞巴斯蒂安的numpy发电机版(更新) - 121 ms @
Mark's Sieve + Wheel - 154 ms
罗伯特的切片版本 - 159毫秒
我的改进版本切片 - 205毫秒
带有枚举的Numpy发生器 - 249 ms @
马克的基本筛分 - 317毫秒
casevh对我原始解决方案的改进 - 343毫秒
我修改的numpy发生器解决方案 - 407毫秒
我在问题中的原始方法 - 409毫秒
Bitarray解决方案 - 414毫秒@
带有bytearray的纯Python - 1394 ms @
Scott的BitString解决方案 - 6659毫秒@
'@'表示此方法在我的机器设置中最多可生成n <1,000,000,000.
此外,如果您不需要生成器,只需要立即获取整个列表:
来自RosettaCode的numpy解决方案 - 32毫秒@
(numpy解决方案也能够产生高达10亿,这需要61.6259秒.我怀疑内存被交换一次,因此是双倍时间.)
相关问题:在python中列出N以下所有素数的最快方法.
嗨,我也在寻找Python中的代码来尽可能快地生成高达10 ^ 9的质数,这很难因为内存问题.到目前为止,我想出了这个来生成高达10 ^ 6和10 ^ 7(在我的旧机器上分别为0,171s和1,764s)的质数,这似乎比"我的计算机"略快(至少在我的计算机中)改进版本与切片"从以前的帖子,可能是因为我使用n//i-i +1(见下文)而不是len(flags[i2::i<<1])其他代码中的类似.如果我错了,请纠正我.任何改进建议都是非常受欢迎的.
def primes(n):
""" Returns a list of primes < n """
sieve = [True] * n
for i in xrange(3,int(n**0.5)+1,2):
if sieve[i]:
sieve[i*i::2*i]=[False]*((n-i*i-1)/(2*i)+1)
return [2] + [i for i in xrange(3,n,2) if sieve[i]]
Run Code Online (Sandbox Code Playgroud)
在他的一个代码中,泽维尔使用flags[i2::i<<1]和len(flags[i2::i<<1]).我计算的大小明确,使用的事实,i*i..n我们(n-i*i)//2*i的倍数2*i,因为我们想算i*i还我们总结1这样len(sieve[i*i::2*i])等于(n-i*i)//(2*i) +1.这使代码更快.基于上述代码的基本生成器将是:
def primesgen(n):
""" Generates all primes <= n """
sieve = [True] * n
yield 2
for i in xrange(3,int(n**0.5)+1,2):
if sieve[i]:
yield i
sieve[i*i::2*i] = [False]*((n-i*i-1)/(2*i)+1)
for i in xrange(i+2,n,2):
if sieve[i]: yield i
Run Code Online (Sandbox Code Playgroud)
通过一些修改,可以编写一个稍慢版本的代码,以一半大小的筛子开始sieve = [True] * (n//2)并且工作相同n.不确定这将如何反映在内存问题上.作为一个实现的例子,这里是numpy rosetta代码的修改版本(可能更快)从一半大小的筛子开始.
import numpy
def primesfrom3to(n):
""" Returns a array of primes, 3 <= p < n """
sieve = numpy.ones(n/2, dtype=numpy.bool)
for i in xrange(3,int(n**0.5)+1,2):
if sieve[i/2]: sieve[i*i/2::i] = False
return 2*numpy.nonzero(sieve)[0][1::]+1
Run Code Online (Sandbox Code Playgroud)
一个更快,更内存的生成器将是:
import numpy
def primesgen1(n):
""" Input n>=6, Generates all primes < n """
sieve1 = numpy.ones(n/6+1, dtype=numpy.bool)
sieve5 = numpy.ones(n/6 , dtype=numpy.bool)
sieve1[0] = False
yield 2; yield 3;
for i in xrange(int(n**0.5)/6+1):
if sieve1[i]:
k=6*i+1; yield k;
sieve1[ ((k*k)/6) ::k] = False
sieve5[(k*k+4*k)/6::k] = False
if sieve5[i]:
k=6*i+5; yield k;
sieve1[ ((k*k)/6) ::k] = False
sieve5[(k*k+2*k)/6::k] = False
for i in xrange(i+1,n/6):
if sieve1[i]: yield 6*i+1
if sieve5[i]: yield 6*i+5
if n%6 > 1:
if sieve1[i+1]: yield 6*(i+1)+1
Run Code Online (Sandbox Code Playgroud)
或者使用更多代码:
import numpy
def primesgen(n):
""" Input n>=30, Generates all primes < n """
size = n/30 + 1
sieve01 = numpy.ones(size, dtype=numpy.bool)
sieve07 = numpy.ones(size, dtype=numpy.bool)
sieve11 = numpy.ones(size, dtype=numpy.bool)
sieve13 = numpy.ones(size, dtype=numpy.bool)
sieve17 = numpy.ones(size, dtype=numpy.bool)
sieve19 = numpy.ones(size, dtype=numpy.bool)
sieve23 = numpy.ones(size, dtype=numpy.bool)
sieve29 = numpy.ones(size, dtype=numpy.bool)
sieve01[0] = False
yield 2; yield 3; yield 5;
for i in xrange(int(n**0.5)/30+1):
if sieve01[i]:
k=30*i+1; yield k;
sieve01[ (k*k)/30::k] = False
sieve07[(k*k+ 6*k)/30::k] = False
sieve11[(k*k+10*k)/30::k] = False
sieve13[(k*k+12*k)/30::k] = False
sieve17[(k*k+16*k)/30::k] = False
sieve19[(k*k+18*k)/30::k] = False
sieve23[(k*k+22*k)/30::k] = False
sieve29[(k*k+28*k)/30::k] = False
if sieve07[i]:
k=30*i+7; yield k;
sieve01[(k*k+ 6*k)/30::k] = False
sieve07[(k*k+24*k)/30::k] = False
sieve11[(k*k+16*k)/30::k] = False
sieve13[(k*k+12*k)/30::k] = False
sieve17[(k*k+ 4*k)/30::k] = False
sieve19[ (k*k)/30::k] = False
sieve23[(k*k+22*k)/30::k] = False
sieve29[(k*k+10*k)/30::k] = False
if sieve11[i]:
k=30*i+11; yield k;
sieve01[ (k*k)/30::k] = False
sieve07[(k*k+ 6*k)/30::k] = False
sieve11[(k*k+20*k)/30::k] = False
sieve13[(k*k+12*k)/30::k] = False
sieve17[(k*k+26*k)/30::k] = False
sieve19[(k*k+18*k)/30::k] = False
sieve23[(k*k+ 2*k)/30::k] = False
sieve29[(k*k+ 8*k)/30::k] = False
if sieve13[i]:
k=30*i+13; yield k;
sieve01[(k*k+24*k)/30::k] = False
sieve07[(k*k+ 6*k)/30::k] = False
sieve11[(k*k+ 4*k)/30::k] = False
sieve13[(k*k+18*k)/30::k] = False
sieve17[(k*k+16*k)/30::k] = False
sieve19[ (k*k)/30::k] = False
sieve23[(k*k+28*k)/30::k] = False
sieve29[(k*k+10*k)/30::k] = False
if sieve17[i]:
k=30*i+17; yield k;
sieve01[(k*k+ 6*k)/30::k] = False
sieve07[(k*k+24*k)/30::k] = False
sieve11[(k*k+26*k)/30::k] = False
sieve13[(k*k+12*k)/30::k] = False
sieve17[(k*k+14*k)/30::k] = False
sieve19[ (k*k)/30::k] = False
sieve23[(k*k+ 2*k)/30::k] = False
sieve29[(k*k+20*k)/30::k] = False
if sieve19[i]:
k=30*i+19; yield k;
sieve01[ (k*k)/30::k] = False
sieve07[(k*k+24*k)/30::k] = False
sieve11[(k*k+10*k)/30::k] = False
sieve13[(k*k+18*k)/30::k] = False
sieve17[(k*k+ 4*k)/30::k] = False
sieve19[(k*k+12*k)/30::k] = False
sieve23[(k*k+28*k)/30::k] = False
sieve29[(k*k+22*k)/30::k] = False
if sieve23[i]:
k=30*i+23; yield k;
sieve01[(k*k+24*k)/30::k] = False
sieve07[(k*k+ 6*k)/30::k] = False
sieve11[(k*k+14*k)/30::k] = False
sieve13[(k*k+18*k)/30::k] = False
sieve17[(k*k+26*k)/30::k] = False
sieve19[ (k*k)/30::k] = False
sieve23[(k*k+ 8*k)/30::k] = False
sieve29[(k*k+20*k)/30::k] = False
if sieve29[i]:
k=30*i+29; yield k;
sieve01[ (k*k)/30::k] = False
sieve07[(k*k+24*k)/30::k] = False
sieve11[(k*k+20*k)/30::k] = False
sieve13[(k*k+18*k)/30::k] = False
sieve17[(k*k+14*k)/30::k] = False
sieve19[(k*k+12*k)/30::k] = False
sieve23[(k*k+ 8*k)/30::k] = False
sieve29[(k*k+ 2*k)/30::k] = False
for i in xrange(i+1,n/30):
if sieve01[i]: yield 30*i+1
if sieve07[i]: yield 30*i+7
if sieve11[i]: yield 30*i+11
if sieve13[i]: yield 30*i+13
if sieve17[i]: yield 30*i+17
if sieve19[i]: yield 30*i+19
if sieve23[i]: yield 30*i+23
if sieve29[i]: yield 30*i+29
i += 1
mod30 = n%30
if mod30 > 1:
if sieve01[i]: yield 30*i+1
if mod30 > 7:
if sieve07[i]: yield 30*i+7
if mod30 > 11:
if sieve11[i]: yield 30*i+11
if mod30 > 13:
if sieve13[i]: yield 30*i+13
if mod30 > 17:
if sieve17[i]: yield 30*i+17
if mod30 > 19:
if sieve19[i]: yield 30*i+19
if mod30 > 23:
if sieve23[i]: yield 30*i+23
Run Code Online (Sandbox Code Playgroud)
Ps:如果您对如何加速此代码有任何建议,从更改操作顺序到预先计算任何内容,请发表评论.
| 归档时间: |
|
| 查看次数: |
14389 次 |
| 最近记录: |