寻找最长重叠范围

Question

寻找最长重叠范围

我有一个列表中的范围，如：

ranges = [(1, 50), (49, 70), (75, 85), (84, 88), (87, 92)]

Run Code Online (Sandbox Code Playgroud)

我想找到可以从这些构建的最长范围（当它们相互重叠时）。

预期输出：

[(1, 70), (75, 92)]

Run Code Online (Sandbox Code Playgroud)

我有一个解决方案，但是它太复杂了，我相信一定有一个更简单的解决方案来解决这个问题

我的解决方案：

def overlap(x, y):
    return range(max(x[0], y[0]), min(x[-1], y[-1]) + 1)

ranges = [(1, 50), (49, 70), (75, 85), (84, 88), (87, 92)]

beg, end = min([x[0] for x in ranges]), 0
for i in ranges:
    if i[0] == beg:
        end = i[1]
while beg:
    for _ in ranges:
        for i in ranges:
            if i[1] > end and overlap(i, [beg, end]):
                end = i[1]
    print(beg, end)
    try:
        beg = min([x[0] for x in ranges if x[0] > end])
        for i in ranges:
            if i[0] == beg:
                end = i[1]
    except ValueError:
        beg = None

Run Code Online (Sandbox Code Playgroud)

输出：

1 70
75 92

Run Code Online (Sandbox Code Playgroud)

Answer 1

Pat*_*ugh 13

我认为您可以按范围的开头对输入进行排序，然后遍历它们。在每个项目中，它要么被添加到当前范围（如果开始小于当前范围的结束），要么我们产生当前范围并开始累积一个新范围：

def overlaps(ranges):
    ranges = sorted(ranges)  # If our inputs are garunteed sorted, we can skip this
    it = iter(ranges)
    try:
        curr_start, curr_stop = next(it)
        # overlaps = False  # If we want to exclude output ranges not produced by overlapping input ranges
    except StopIteration:
        return
    for start, stop in it:
        if curr_start <= start <= curr_stop:  # Assumes intervals are closed
            curr_stop = max(curr_stop, stop)
            # overlaps = True
        else:
            # if overlaps:
            yield curr_start, curr_stop
            curr_start, curr_stop = start, stop
            # overlaps = False
    # if overlaps:
    yield curr_start, curr_stop

print(list(overlaps([(1, 50), (49, 70), (75, 85), (84, 88), (87, 92)])))
# [(1, 70), (75, 92)]

print(list(overlaps([(20, 30), (5, 10), (1, 7), (12, 21)])))
# [(1, 10), (12, 30)]

Run Code Online (Sandbox Code Playgroud)

Answer 2

Chr*_*yle 8

您可以使用 zip 对每个范围对的所有起始值和结束值进行分组。如果起始值低于前一个结束值，则存在重叠，因此删除起始值和结束值。我们正在使用一个 int 来跟踪每个低和高列表中的哪个索引，我们正在寻找低索引总是比高索引高一个。


#split the numbers in to the low and high part of each range
#and set the index position for each of them
ranges = [(1, 50), (49, 70), (75, 85), (84, 88), (87, 92)]
low, high = [list(nums) for nums in zip(*ranges)] 
l, h = 1, 0

#Iterate over the ranges and remove when there is an overlap if no over lap move the pointers
while l < len(low) and h < len(high):
    if low[l] < high[h]:
        del low[l]
        del high[h]
    else:
        l +=1
        h +=1

#zip the low and high back into ranges
new_ranges = list(zip(low, high))
print(new_ranges)

Run Code Online (Sandbox Code Playgroud)

输出

[(1, 70), (75, 92)]

Run Code Online (Sandbox Code Playgroud)

Answer 3

Par*_*erD 6

可以使用functools.reduce：

from functools import reduce

ranges = [(1, 50), (45, 47), (49, 70), (75, 85), (84, 88), (87, 92)]

reducer = (
    lambda acc, el: acc[:-1:] + [(min(*acc[-1], *el), max(*acc[-1], *el))]
    if acc[-1][1] > el[0]
    else acc + [el]
)
print(reduce(reducer, ranges[1::], [ranges[0]]))

Run Code Online (Sandbox Code Playgroud)

给出：

[(1, 70), (75, 92)]

Run Code Online (Sandbox Code Playgroud)

很难用语言表达，但它用来reduce遍历范围。如果范围中的最后一个元组和下一个提供的元组重叠 ( if acc[-1][1] > el[0])，它会从(min, max)两者的中创建一个新范围，然后将这个新的组合范围替换为它后面的内容 ( acc[:-1:] + [(min, max)])，否则只需将新范围添加到末尾 ( acc + [el]) .

编辑：在查看其他答案后，更新为比较两个范围的最小值/最大值，而不仅仅是第一个和最后一个

归档时间：	5 年，8 月前
查看次数：	1323 次
最近记录：	5 年，8 月前