子集和问题

Alb*_*eal 7 python algorithm subset-sum

最近我对子集求和问题产生了兴趣,该问题是在超集中找到零和子集.我在SO上找到了一些解决方案,此外,我遇到了一个使用动态编程方法的特定解决方案.我根据他的定性描述在python中翻译了他的解决方案.我正在尝试对更大的列表进行优化,这会占用大量的内存.有人可以推荐优化或其他技术来解决这个特殊问题吗?这是我在python中的尝试:

import random
from time import time
from itertools import product

time0 = time()

# create a zero matrix of size a (row), b(col)
def create_zero_matrix(a,b):
    return [[0]*b for x in xrange(a)]

# generate a list of size num with random integers with an upper and lower bound
def random_ints(num, lower=-1000, upper=1000):
    return [random.randrange(lower,upper+1) for i in range(num)]

# split a list up into N and P where N be the sum of the negative values and P the sum of the positive values.
# 0 does not count because of additive identity
def split_sum(A):
    N_list = []
    P_list = []
    for x in A:
        if x < 0:
            N_list.append(x)
        elif x > 0:
            P_list.append(x)
    return [sum(N_list), sum(P_list)]

# since the column indexes are in the range from 0 to P - N
# we would like to retrieve them based on the index in the range N to P
# n := row, m := col
def get_element(table, n, m, N):
    if n < 0:
        return 0
    try:
        return table[n][m - N]
    except:
        return 0

# same definition as above
def set_element(table, n, m, N, value):
    table[n][m - N] = value

# input array
#A = [1, -3, 2, 4]
A = random_ints(200)

[N, P] = split_sum(A)

# create a zero matrix of size m (row) by n (col)
#
# m := the number of elements in A
# n := P - N + 1 (by definition N <= s <= P)
#
# each element in the matrix will be a value of either 0 (false) or 1 (true)
m = len(A)
n = P - N + 1;
table = create_zero_matrix(m, n)

# set first element in index (0, A[0]) to be true
# Definition: Q(1,s) := (x1 == s). Note that index starts at 0 instead of 1.
set_element(table, 0, A[0], N, 1)

# iterate through each table element
#for i in xrange(1, m): #row
#    for s in xrange(N, P + 1): #col
for i, s in product(xrange(1, m), xrange(N, P + 1)):
    if get_element(table, i - 1, s, N) or A[i] == s or get_element(table, i - 1, s - A[i], N):
        #set_element(table, i, s, N, 1)
        table[i][s - N] = 1

# find zero-sum subset solution
s = 0
solution = []
for i in reversed(xrange(0, m)):
    if get_element(table, i - 1, s, N) == 0 and get_element(table, i, s, N) == 1:
        s = s - A[i]
        solution.append(A[i])

print "Solution: ",solution

time1 = time()

print "Time execution: ", time1 - time0
Run Code Online (Sandbox Code Playgroud)

小智 5

我不太确定你的解决方案是精确的还是PTA(多时间近似).

但是,正如有人指出的那样,这个问题确实是NP-Complete.

意思是,每个已知(精确)算法都对输入的大小具有指数时间行为.

意思是,如果你可以在.01纳秒内处理1个操作,那么对于59个元素的列表,它将采取:

2^59 ops -->     2^59     seconds -->     2^26      years -->      1 year
            --------------           ---------------
            10.000.000.000           3600 x 24 x 365
Run Code Online (Sandbox Code Playgroud)

您可以找到启发式算法,它可以让您轻松找到多项式时间内的精确解.

另一方面,如果使用集合中数字值的边界来限制问题(到另一个),则问题复杂度会降低到多项式时间.但即使这样,消耗的内存空间也将是非常高阶的多项式.
消耗的内存将远远大于内存中的几千兆字节.甚至比硬盘上的几个tera字节大得多.

(这是针对集合中元素值的界限的小值)

可能是您的动态编程算法的情况.

在我看来,在构建初始化矩阵时,你使用的是1000的界限.

你可以尝试一个较小的界限.那就是......如果你的输入始终由小值组成.

祝好运!


sko*_*rks 5

Hacker News 上有人提出了以下解决方案,我非常喜欢。它恰好在 python 中:):

def subset_summing_to_zero (activities):
  subsets = {0: []}
  for (activity, cost) in activities.iteritems():
      old_subsets = subsets
      subsets = {}
      for (prev_sum, subset) in old_subsets.iteritems():
          subsets[prev_sum] = subset
          new_sum = prev_sum + cost
          new_subset = subset + [activity]
          if 0 == new_sum:
              new_subset.sort()
              return new_subset
          else:
              subsets[new_sum] = new_subset
  return []
Run Code Online (Sandbox Code Playgroud)

我花了几分钟的时间,它工作得很好。