将整数列表划分为总和相等的 K 个子列表

Question

将整数列表划分为总和相等的 K 个子列表

Bsh*_*Bsh 6 python algorithm combinations python-3.x

类似的问题有1和2，但答案没有帮助。假设我们有一个整数列表。我们希望找到K不相交的列表，使它们完全覆盖给定的列表，并且都具有相同的总和。例如，如果A = [4, 3, 5, 6, 4, 3, 1]然后K = 2答案应该是：

[[3, 4, 6], [1, 3, 4, 5]]
or
[[4, 4, 5], [1, 3, 3, 6]]

Run Code Online (Sandbox Code Playgroud)

我编写了一个代码，该代码仅在以下情况下才有效：K = 2对于较小的列表作为输入，但对于较大的列表，它可以正常工作，因为代码的复杂性很高，操作系统会终止任务。我的代码是：

def subarrays_equal_sum(l):
    from itertools import combinations

    if len(l) < 2 or sum(l) % 2 != 0:
        return []
    l = sorted(l)
    list_sum = sum(l)
    all_combinations = []
    for i in range(1, len(l)):
        all_combinations += (list(combinations(l, i)))

    combinations_list = [i for i in all_combinations if sum(i) == list_sum / 2]
    if not combinations_list:
        return []
    final_result = []
    for i in range(len(combinations_list)):
        for j in range(i + 1, len(combinations_list)):
            first = combinations_list[i]
            second = combinations_list[j]
            concat = sorted(first + second)
            if concat == l and [list(first), list(second)] not in final_result:
                final_result.append([list(first), list(second)])

    return final_result

Run Code Online (Sandbox Code Playgroud)

任何值的答案都K可以在这里找到。但是如果我们传递参数A = [4, 3, 5, 6, 4, 3, 1]和K = 2，他们的代码只返回[[5, 4, 3, 1],[4, 3, 6]]，而我的代码返回所有可能的列表，即

[[[3, 4, 6], [1, 3, 4, 5]], [[4, 4, 5], [1, 3, 3, 6]]]

我的问题是：

如何提高代码的复杂性和成本？
如何使我的代码可以处理任何值k？

Answer 1

bti*_*lly 4

这是处理重复项的解决方案。

首先，如前所述，寻找任何解决方案的问题都是 NP 完全问题。所以有些情况下，这会搅动很长时间才意识到没有。我已经应用合理的启发法来限制这种情况发生的频率。启发式方法可以改进。但请注意，有些情况下根本不起作用。

该解决方案的第一步是获取一个数字列表并将其转换为[(value1, repeat), (value2, repeat), ...]. 其中一种启发法要求首先按绝对值降序排序值，然后按值递减排序。这是因为我尝试首先使用第一个元素，并且我们期望一堆小的剩余数字仍然可以给我们总和。

接下来，我将尝试将其拆分为具有正确目标总和的可能最大子集以及所有剩余元素。

然后，我将把剩余的部分分成一个可能的最大剩余子集，该子集不大于第一个，以及之后产生的子集。

递归地执行此操作，我们就找到了解决方案。我们将其返回到链条上。

但是，这就是棘手的地方，我不会通过查看组合来进行分割。相反，我将像使用通常的子集和伪多项式算法一样使用动态编程，只不过我将使用它来构造一个可以进行分割的数据结构。该数据结构将包含以下字段：

value是该元素的值。
repeat是我们在子集和中使用它的次数。
skip是我们拥有但未在子集总和中使用它的副本数量。
tail是这些解决方案的尾部。
prev还有一些我们做了其他事情的其他解决方案。

这是一个构造此数据结构的类，它具有将元素拆分为子集的方法，并且元素仍可用于进一步拆分。

from collections import namedtuple

class RecursiveSums (
      namedtuple('BaseRecursiveSums',
                 ['value', 'repeat', 'skip', 'tail', 'prev'])):

    def sum_and_rest(self):
        if self.tail is None:
            if self.skip:
                yield ([self.value] * self.repeat, [(self.value, self.skip)])
            else:
                yield ([self.value] * self.repeat, [])
        else:
            for partial_sum, rest in self.tail.sum_and_rest():
                for _ in range(self.repeat):
                    partial_sum.append(self.value)
                if self.skip:
                    rest.append((self.value, self.skip))
                yield (partial_sum, rest)
        if self.prev is not None:
            yield from self.prev.sum_and_rest()

Run Code Online (Sandbox Code Playgroud)

您可能需要多看几次才能了解它是如何工作的。

接下来，请记住我说过我使用启发式方法尝试在小元素之前使用大元素。这是我们需要进行比较的一些代码。

class AbsComparator(int):
    def __lt__ (self, other):
        if abs(int(self)) < abs(int(other)):
            return True
        elif abs(other) < abs(self):
            return False
        else:
            return int(self) < int(other)

def abs_lt (x, y):
    return AbsComparator(x) < AbsComparator(y)

Run Code Online (Sandbox Code Playgroud)

我们需要这两种表格。用于直接比较的函数，用于Pythonkey函数参数的类sort。有关后者的更多信息，请参阅使用比较器函数进行排序。

现在是该方法的核心。这会找到所有分割成子集（不大于bound我们使用的比较指标）的方法，并将剩余元素分割成更多。

这个想法与子集和的动态规划方法相同https://www.geeksforgeeks.org/count-of-subsets-with-sum-equal-to-x/，除了两个主要区别。首先，我们不是在计算答案，而是在构建数据结构。第二个是我们的键是(partial_sum, bound_index)这样我们知道我们bound当前是否满足，如果不满足，我们知道接下来要比较哪个元素来测试它。

def lexically_maximal_subset_rest (elements, target, bound=None):
    """
        elements = [(value, count), (value, count), ...]
            with largest absolute values first.
        target = target sum
        bound = a lexical bound on the maximal subset.
    """
    # First let's deal with all of the trivial cases.
    if 0 == len(elements):
        if 0 == target:
            yield []
    elif bound is None or 0 == len(bound):
        # Set the bound to something that trivially works.
        yield from lexically_maximal_subset_rest(elements, target, [abs(elements[0][0]) + 1])
    elif abs_lt(bound[0], elements[0][0]):
        pass # we automatically use more than the bound.
    else:
        # The trivial checks are done.

        bound_satisfied = (bound[0] != elements[0][0])

        # recurse_by_sum will have a key of (partial_sum, bound_index).
        # If the bound_index is None, the bound is satisfied.
        # Otherwise it will be the last used index in the bound.
        recurse_by_sum = {}
        # Populate it with all of the ways to use the first element at least once.
        (init_value, init_count) = elements[0]
        for i in range(init_count):
            if not bound_satisfied:
                if len(bound) <= i or abs_lt(bound[i], init_value):
                    # Bound exceeded.
                    break
                elif abs_lt(init_value, bound[i]):
                    bound_satisfied = True
            if bound_satisfied:
                key = (init_value * (i+1), None)
            else:
                key = (init_value * (i+1), i)

            recurse_by_sum[key] = RecursiveSums(
                init_value, i+1, init_count-i-1, None, recurse_by_sum.get(key))

        # And now we do the dynamic programming thing.
        for j in range(1, len(elements)):
            value, repeat = elements[j]
            next_recurse_by_sum = {}
            for key, tail in recurse_by_sum.items():
                partial_sum, bound_index = key
                # Record not using this value at all.
                next_recurse_by_sum[key] = RecursiveSums(
                    value, 0, repeat, tail, next_recurse_by_sum.get(key))
                # Now record the rest.
                for i in range(1, repeat+1):
                    if bound_index is not None:
                        # Bounds check.
                        if len(bound) <= bound_index + i:
                            break # bound exceeded.
                        elif abs_lt(bound[bound_index + i], value):
                            break # bound exceeded.
                        elif abs_lt(value, bound[bound_index + i]):
                            bound_index = None # bound satisfied!
                    if bound_index is None:
                        next_key = (partial_sum + value * i, None)
                    else:
                        next_key = (partial_sum + value * i, bound_index + i)

                    next_recurse_by_sum[next_key] = RecursiveSums(
                        value, i, repeat - i, tail, next_recurse_by_sum.get(next_key))
            recurse_by_sum = next_recurse_by_sum

        # We now have all of the answers in recurse_by_sum, but in several keys.
        # Find all that may have answers.
        bound_index = len(bound)
        while 0 < bound_index:
            bound_index -= 1
            if (target, bound_index) in recurse_by_sum:
                yield from recurse_by_sum[(target, bound_index)].sum_and_rest()
        if (target, None) in recurse_by_sum:
            yield from recurse_by_sum[(target, None)].sum_and_rest()

Run Code Online (Sandbox Code Playgroud)

现在我们实施其余部分。

def elements_split (elements, target, k, bound=None):
    if 0 == len(elements):
        if k == 0:
            yield []
    elif k == 0:
        pass # still have elements left over.
    else:
        for (subset, rest) in lexically_maximal_subset_rest(elements, target, bound):
            for answer in elements_split(rest, target, k-1, subset):
                answer.append(subset)
                yield answer

def subset_split (raw_elements, k):
    total = sum(raw_elements)
    if 0 == (total % k):
        target = total // k
        counts = {}
        for e in sorted(raw_elements, key=AbsComparator, reverse=True):
            counts[e] = 1 + counts.get(e, 0)
        elements = list(counts.items())
        yield from elements_split(elements, target, k)

Run Code Online (Sandbox Code Playgroud)

这是使用您的列表进行的演示，加倍。我们将其分成 4 等份。在我的笔记本电脑上，它在 0.084 秒内找到所有 10 个解决方案。

n = 0
for s in subset_split([4, 3, 5, 6, 4, 3, 1]*2, 4):
    n += 1
    print(n, s)

Run Code Online (Sandbox Code Playgroud)

所以...没有性能保证。但这通常应该能够很快地找到每个分割的分割。当然，通常也存在指数数量的分裂。例如，如果您获取列表的 16 个副本并尝试将其分成 32 组，则在我的笔记本电脑上大约需要 8 分钟才能找到所有 224082 个解决方案。

如果我不尝试处理负面因素，这可能会加快很多。（使用更便宜的比较，删除所有超出的部分总和，target以避免计算动态规划表的大部分内容。）

这是加速版本。对于只有非负数的情况，速度大约是两倍。如果有负数，就会产生错误的结果。

from collections import namedtuple

class RecursiveSums (
      namedtuple('BaseRecursiveSums',
                 ['value', 'repeat', 'skip', 'tail', 'prev'])):

    def sum_and_rest(self):
        if self.tail is None:
            if self.skip:
                yield ([self.value] * self.repeat, [(self.value, self.skip)])
            else:
                yield ([self.value] * self.repeat, [])
        else:
            for partial_sum, rest in self.tail.sum_and_rest():
                for _ in range(self.repeat):
                    partial_sum.append(self.value)
                if self.skip:
                    rest.append((self.value, self.skip))
                yield (partial_sum, rest)
        if self.prev is not None:
            yield from self.prev.sum_and_rest()

def lexically_maximal_subset_rest (elements, target, bound=None):
    """
        elements = [(value, count), (value, count), ...]
            with largest absolute values first.
        target = target sum
        bound = a lexical bound on the maximal subset.
    """
    # First let's deal with all of the trivial cases.
    if 0 == len(elements):
        if 0 == target:
            yield []
    elif bound is None or 0 == len(bound):
        # Set the bound to something that trivially works.
        yield from lexically_maximal_subset_rest(elements, target, [abs(elements[0][0]) + 1])
    elif bound[0] < elements[0][0]:
        pass # we automatically use more than the bound.
    else:
        # The trivial checks are done.

        bound_satisfied = (bound[0] != elements[0][0])

        # recurse_by_sum will have a key of (partial_sum, bound_index).
        # If the bound_index is None, the bound is satisfied.
        # Otherwise it will be the last used index in the bound.
        recurse_by_sum = {}
        # Populate it with all of the ways to use the first element at least once.
        (init_value, init_count) = elements[0]
        for i in range(init_count):
            if not bound_satisfied:
                if len(bound) <= i or bound[i] < init_value:
                    # Bound exceeded.
                    break
                elif init_value < bound[i]:
                    bound_satisfied = True
            if bound_satisfied:
                key = (init_value * (i+1), None)
            else:
                key = (init_value * (i+1), i)

            recurse_by_sum[key] = RecursiveSums(
                init_value, i+1, init_count-i-1, None, recurse_by_sum.get(key))

        # And now we do the dynamic programming thing.
        for j in range(1, len(elements)):
            value, repeat = elements[j]
            next_recurse_by_sum = {}
            for key, tail in recurse_by_sum.items():
                partial_sum, bound_index = key
                # Record not using this value at all.
                next_recurse_by_sum[key] = RecursiveSums(
                    value, 0, repeat, tail, next_recurse_by_sum.get(key))
                # Now record the rest.
                for i in range(1, repeat+1):
                    if target < partial_sum + value * i:
                        break # these are too big.

                    if bound_index is not None:
                        # Bounds check.
                        if len(bound) <= bound_index + i:
                            break # bound exceeded.
                        elif bound[bound_index + i] < value:
                            break # bound exceeded.
                        elif value < bound[bound_index + i]:
                            bound_index = None # bound satisfied!
                    if bound_index is None:
                        next_key = (partial_sum + value * i, None)
                    else:
                        next_key = (partial_sum + value * i, bound_index + i)

                    next_recurse_by_sum[next_key] = RecursiveSums(
                        value, i, repeat - i, tail, next_recurse_by_sum.get(next_key))
            recurse_by_sum = next_recurse_by_sum

        # We now have all of the answers in recurse_by_sum, but in several keys.
        # Find all that may have answers.
        bound_index = len(bound)
        while 0 < bound_index:
            bound_index -= 1
            if (target, bound_index) in recurse_by_sum:
                yield from recurse_by_sum[(target, bound_index)].sum_and_rest()
        if (target, None) in recurse_by_sum:
            yield from recurse_by_sum[(target, None)].sum_and_rest()

def elements_split (elements, target, k, bound=None):
    if 0 == len(elements):
        if k == 0:
            yield []
    elif k == 0:
        pass # still have elements left over.
    else:
        for (subset, rest) in lexically_maximal_subset_rest(elements, target, bound):
            for answer in elements_split(rest, target, k-1, subset):
                answer.append(subset)
                yield answer

def subset_split (raw_elements, k):
    total = sum(raw_elements)
    if 0 == (total % k):
        target = total // k
        counts = {}
        for e in sorted(raw_elements, key=AbsComparator, reverse=True):
            counts[e] = 1 + counts.get(e, 0)
        elements = list(counts.items())
        yield from elements_split(elements, target, k)

n = 0
for s in subset_split([4, 3, 5, 6, 4, 3, 1]*16, 32):
    n += 1
    print(n, s)

Run Code Online (Sandbox Code Playgroud)

归档时间：	2 年，7 月前
查看次数：	758 次
最近记录：	2 年，7 月前