The input list can contain more than 1 million numbers. When I run the following code with a small number of 'repeats', it works fine:

    import random

    def sample(x):
        length = 1000000
        new_array = random.sample((list(x)),length)
        return (new_array)

    def repeat_sample(x):
        repeats = 100
        list_of_samples = []
        for i in range(repeats):
            list_of_samples.append(sample(x))
        return(list_of_samples)

    repeat_sample(large_array)

However, with a high number of repeats, such as the 100 above, it raises a MemoryError. The traceback is as follows:

    Traceback (most recent call last):
      File "C:\Python31\rnd.py", line 221, in <module>
        STORED_REPEAT_SAMPLE = repeat_sample(STORED_ARRAY)
      File "C:\Python31\rnd.py", line 129, in repeat_sample
        list_of_samples.append(sample(x))
      File "C:\Python31\rnd.py", line 121, in sample
        new_array = random.sample((list(x)),length)
      File "C:\Python31\lib\random.py", line 309, in sample
        result = [None] * k
    MemoryError

I assume I am running out of memory. I don't know how to solve this problem.
Thank you for your time!
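For a rough sense of scale, here is a minimal sketch that estimates how much memory the list containers alone would need when every sample is kept. `estimate_list_bytes` is a hypothetical helper, not part of the code above, and the figure is only a lower bound: it counts the per-slot pointers of each list (typically 4 bytes on a 32-bit build, 8 on a 64-bit build), not the numbers they refer to.

```python
import sys

def estimate_list_bytes(length, repeats):
    """Lower bound: bytes taken by `repeats` list containers of `length` slots."""
    # getsizeof reports the list header plus its pointer array, not the elements.
    one_list = sys.getsizeof([None] * length)
    return one_list * repeats

total = estimate_list_bytes(1000000, 100)
print("about %.0f MiB for the list containers alone" % (total / 2.0 ** 20))
```

On a 32-bit interpreter, a total of several hundred MiB is already a large share of the address space available to one process, which is consistent with the allocation failing inside `random.sample`.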
Expanding on my comment:
Let's say the processing you do on each sample is calculating its mean.

    def calc_means(samplelists):
        means = []
        n = float(len(samplelists[0]))
        for sample in samplelists:
            mean = sum(sample) / n
            means.append(mean)
        return means

    calc_means(repeat_sample(large_array))

This makes you keep all of those lists in memory at once. You can make it much lighter like this:

    import random

    def mean(sample, n):
        n = float(n)
        return sum(sample) / n

    def sample(x):
        length = 1000000
        new_array = random.sample(x, length)
        return new_array

    def repeat_means(x):
        repeats = 100
        list_of_means = []
        for i in range(repeats):
            # Each sample is discarded as soon as its mean has been computed.
            s = sample(x)
            list_of_means.append(mean(s, len(s)))
        return list_of_means

    repeat_means(large_array)

But that is still not good enough... you can do it all while building nothing but your list of results:

    import random

    def sampling_mean(population, k, times):
        # Part of this is lifted straight from random.py
        _int = int
        _random = random.random

        n = len(population)
        kf = float(k)
        result = []

        if not 0 <= k <= n:
            raise ValueError("sample larger than population")

        for t in range(times):
            selected = set()  # indices already drawn for this sample
            sum_ = 0
            selected_add = selected.add

            for i in range(k):
                j = _int(_random() * n)
                while j in selected:  # redraw until the index is unused
                    j = _int(_random() * n)
                selected_add(j)
                sum_ += population[j]

            mean = sum_ / kf
            result.append(mean)
        return result

    sampling_mean(large_array, 1000000, 100)

Now, can your algorithm be streamlined like this?
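If the per-sample processing really is a reduction like the mean, the same idea can also be sketched as a generator, so that only one sample exists at any moment. This is a minimal sketch rather than a drop-in replacement: `iter_sample_means` is a hypothetical name, and it assumes `population` is a sequence (such as a list) of numbers.

```python
import random

def iter_sample_means(population, k, times):
    """Yield the mean of each of `times` random samples of size `k`."""
    kf = float(k)
    for _ in range(times):
        # Only this one sample is referenced; the previous one can be freed.
        sample = random.sample(population, k)
        yield sum(sample) / kf

# Small demonstration with a population of 1000 integers.
population = list(range(1000))
means = list(iter_sample_means(population, 100, 5))
print(means)
```

Compared with `sampling_mean`, this still allocates one list of `k` items per repeat, but it leaves the sampling itself to `random.sample` and never holds more than one sample at a time.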