将多个列表一起排序

Bub*_*aya 11 python sorting

我有相同长度的列表a, b, ,... 。c我想按照排序获得的顺序对所有这些进行排序a,即我可以执行装饰-排序-取消装饰模式

a, b, c = map(list, zip(*sorted(zip(a, b, c))))
Run Code Online (Sandbox Code Playgroud)

或类似的东西。但是,我希望列表就地排序(我假设将sorted传递给它的临时迭代器中的所有内容拉到临时列表中,然后zip填充到三个输出列表中,因此输入中的每个数据都被不必要地复制两次)无需创建临时对象。所以我的意思不是

a_sorted, b_sorted, c_sorted = map(list, zip(*sorted(zip(a, b, c))))
a[:] = a_sorted
b[:] = b_sorted
c[:] = c_sorted
Run Code Online (Sandbox Code Playgroud)

我怎样才能做到这一点?

Kel*_*ndy 11

我认为“不创建临时对象”是不可能的,特别是因为Python中“一切都是对象”。

\n

如果您自己实现某种排序算法,您可以获得 O(1) 空间/对象数量,但如果您想要 O(n log n) 时间和稳定性,那就很困难。如果您不关心稳定性(似乎很可能,因为您说要排序,a但实际上却按a,bc排序),堆排序相当简单:

\n
def sort_together_heapsort(a, b, c):\n    n = len(a)\n    def swap(i, j):\n        a[i], a[j] = a[j], a[i]\n        b[i], b[j] = b[j], b[i]\n        c[i], c[j] = c[j], c[i]\n    def siftdown(i):\n        while (kid := 2*i+1) < n:\n            imax = kid if a[kid] > a[i] else i\n            kid += 1\n            if kid < n and a[kid] > a[imax]:\n                imax = kid\n            if imax == i:\n                return\n            swap(i, imax)\n            i = imax\n    for i in range(n // 2)[::-1]:\n        siftdown(i)\n    while n := n - 1:\n        swap(0, n)\n        siftdown(0)\n
Run Code Online (Sandbox Code Playgroud)\n

无论如何,如果有人只想节省一些内存,可以通过就地装饰(构建元组并将它们存储在 中a)来完成:

\n
def sort_together_decorate_in_a(a, b, c):\n    for i, a[i] in enumerate(zip(a, b, c)):\n        pass\n    a.sort()\n    for i, [a[i], b[i], c[i]] in enumerate(a):\n        pass\n
Run Code Online (Sandbox Code Playgroud)\n

或者,如果您相信它将list.sort按顺序请求元素的键(至少在 CPython 中确实如此,18 年前key引入参数时就已经这样做了,而且我怀疑会继续这样做):

\n
def sort_together_iter_key(a, b, c):\n    it = iter(a)\n    b.sort(key=lambda _: next(it))\n    it = iter(a)\n    c.sort(key=lambda _: next(it))\n    a.sort()\n
Run Code Online (Sandbox Code Playgroud)\n

使用三个包含 100,000 个元素的列表测试内存和时间:

\n
def sort_together_heapsort(a, b, c):\n    n = len(a)\n    def swap(i, j):\n        a[i], a[j] = a[j], a[i]\n        b[i], b[j] = b[j], b[i]\n        c[i], c[j] = c[j], c[i]\n    def siftdown(i):\n        while (kid := 2*i+1) < n:\n            imax = kid if a[kid] > a[i] else i\n            kid += 1\n            if kid < n and a[kid] > a[imax]:\n                imax = kid\n            if imax == i:\n                return\n            swap(i, imax)\n            i = imax\n    for i in range(n // 2)[::-1]:\n        siftdown(i)\n    while n := n - 1:\n        swap(0, n)\n        siftdown(0)\n
Run Code Online (Sandbox Code Playgroud)\n

笔记:

\n
    \n
  • 第二个解决方案是您的缩短/改进版本,不需要临时变量和列表转换。
  • \n
  • 带后缀的解决方案_X是采用任意多个列表作为参数的版本。
  • \n
  • @a_guest 来自他们的回答。在运行时方面,它目前受益于我的数据是随机的,因为这不会暴露该解决方案的最坏情况复杂度 O(m * n\xc2\xb2),其中 m 是列表的数量,n 是长度每个列表的。
  • \n
\n

使用 10 个包含 100,000 个元素的列表来测试内存和时间:

\n
def sort_together_decorate_in_a(a, b, c):\n    for i, a[i] in enumerate(zip(a, b, c)):\n        pass\n    a.sort()\n    for i, [a[i], b[i], c[i]] in enumerate(a):\n        pass\n
Run Code Online (Sandbox Code Playgroud)\n

整个代码(在线尝试!):

\n
import tracemalloc as tm\nfrom random import random\nfrom timeit import timeit\n\ndef sort_together_sorted_zip(a, b, c):\n    a_sorted, b_sorted, c_sorted = map(list, zip(*sorted(zip(a, b, c))))\n    a[:] = a_sorted\n    b[:] = b_sorted\n    c[:] = c_sorted\n\ndef sort_together_sorted_zip_2(a, b, c):\n    a[:], b[:], c[:] = zip(*sorted(zip(a, b, c)))\n\ndef sort_together_sorted_zip_X(*lists):\n    sorteds = zip(*sorted(zip(*lists)))\n    for lst, lst[:] in zip(lists, sorteds):\n        pass\n\ndef sort_together_decorate_in_a(a, b, c):\n    for i, a[i] in enumerate(zip(a, b, c)):\n        pass\n    a.sort()\n    for i, [a[i], b[i], c[i]] in enumerate(a):\n        pass\n\ndef sort_together_decorate_in_first_X(*lists):\n    first = lists[0]\n    for i, first[i] in enumerate(zip(*lists)):\n        pass\n    first.sort()\n    for i, values in enumerate(first):\n        for lst, lst[i] in zip(lists, values):\n            pass\n\ndef sort_together_iter_key(a, b, c):\n    it = iter(a)\n    b.sort(key=lambda _: next(it))\n    it = iter(a)\n    c.sort(key=lambda _: next(it))\n    a.sort()\n\ndef sort_together_iter_key_X(*lists):\n    for lst in lists[1:]:\n        it = iter(lists[0])\n        lst.sort(key=lambda _: next(it))\n    lists[0].sort()\n\ndef sort_together_heapsort(a, b, c):\n    n = len(a)\n    def swap(i, j):\n        a[i], a[j] = a[j], a[i]\n        b[i], b[j] = b[j], b[i]\n        c[i], c[j] = c[j], c[i]\n    def siftdown(i):\n        while (kid := 2*i+1) < n:\n            imax = kid if a[kid] > a[i] else i\n            kid += 1\n            if kid < n and a[kid] > a[imax]:\n                imax = kid\n            if imax == i:\n                return\n            swap(i, imax)\n            i = imax\n    for i in range(n // 2)[::-1]:\n        siftdown(i)\n    while n := n - 1:\n        swap(0, n)\n        siftdown(0)\n\ndef sort_together_heapsort_X(*lists):\n    a = lists[0]\n    n = len(a)\n    def swap(i, j):\n        for lst in lists:\n            lst[i], lst[j] = lst[j], lst[i]\n    def siftdown(i):\n        while (kid := 2*i+1) < n:\n            imax = kid if a[kid] > a[i] else i\n            kid += 1\n            if kid < n and a[kid] > a[imax]:\n                imax = kid\n            if imax == i:\n                return\n            swap(i, imax)\n            i = imax\n    for i in range(n // 2)[::-1]:\n        siftdown(i)\n    while n := n - 1:\n        swap(0, n)\n        siftdown(0)\n\ndef sort_together_heapsort_opti(a, b, c):\n    # Avoid inner functions and range-loop to minimize memory.\n    # Makes it faster, too. But duplicates code. Not recommended.\n    n = len(a)\n    i0 = n // 2 - 1\n    while i0 >= 0:\n        i = i0\n        while (kid := 2*i+1) < n:\n            imax = kid if a[kid] > a[i] else i\n            kid += 1\n            if kid < n and a[kid] > a[imax]:\n                imax = kid\n            if imax == i:\n                break\n            a[i], a[imax] = a[imax], a[i]\n            b[i], b[imax] = b[imax], b[i]\n            c[i], c[imax] = c[imax], c[i]\n            i = imax\n        i0 -= 1\n    while n := n - 1:\n        a[0], a[n] = a[n], a[0]\n        b[0], b[n] = b[n], b[0]\n        c[0], c[n] = c[n], c[0]\n        i = 0\n        while (kid := 2*i+1) < n:\n            imax = kid if a[kid] > a[i] else i\n            kid += 1\n            if kid < n and a[kid] > a[imax]:\n                imax = kid\n            if imax == i:\n                break\n            a[i], a[imax] = a[imax], a[i]\n            b[i], b[imax] = b[imax], b[i]\n            c[i], c[imax] = c[imax], c[i]\n            i = imax\n\ndef sort_together_heapsort_opti_X(*lists):\n    # Avoid inner functions and range-loop to minimize memory.\n    # Makes it faster, too. But duplicates code. Not recommended.\n    a = lists[0]\n    n = len(a)\n    i0 = n // 2 - 1\n    while i0 >= 0:\n        i = i0\n        while (kid := 2*i+1) < n:\n            imax = kid if a[kid] > a[i] else i\n            kid += 1\n            if kid < n and a[kid] > a[imax]:\n                imax = kid\n            if imax == i:\n                break\n            for lst in lists:\n                lst[i], lst[imax] = lst[imax], lst[i]\n            i = imax\n        i0 -= 1\n    while n := n - 1:\n        for lst in lists:\n            lst[0], lst[n] = lst[n], lst[0]\n        i = 0\n        while (kid := 2*i+1) < n:\n            imax = kid if a[kid] > a[i] else i\n            kid += 1\n            if kid < n and a[kid] > a[imax]:\n                imax = kid\n            if imax == i:\n                break\n            for lst in lists:\n                lst[i], lst[imax] = lst[imax], lst[i]\n            i = imax\n\ndef sort_multi_by_a_guest_X(a, *lists):\n    indices = list(range(len(a)))\n    indices.sort(key=lambda i: a[i])\n    a.sort()\n    for lst in lists:\n        for i, j in enumerate(indices):\n            while j < i:\n                j = indices[j]\n            lst[i], lst[j] = lst[j], lst[i]\n\nfuncs = [\n    sort_together_sorted_zip,\n    sort_together_sorted_zip_2,\n    sort_together_sorted_zip_X,\n    sort_together_decorate_in_a,\n    sort_together_decorate_in_first_X,\n    sort_multi_by_a_guest_X,\n    sort_together_iter_key,\n    sort_together_iter_key_X,\n    sort_together_heapsort,\n    sort_together_heapsort_X,\n    sort_together_heapsort_opti,\n    sort_together_heapsort_opti_X,\n]\n\nn = 100000\na0 = [random() for _ in range(n)]\nb0 = [x + 1 for x in a0]\nc0 = [x + 2 for x in a0]\n\nfor _ in range(3):\n    for func in funcs:\n\n        a, b, c = a0[:], b0[:], c0[:]\n        time = timeit(lambda: func(a, b, c), number=1)\n        assert a == sorted(a0)\n        assert b == sorted(b0)\n        assert c == sorted(c0)\n\n        a, b, c = a0[:], b0[:], c0[:]\n        tm.start()\n        func(a, b, c)\n        memory = tm.get_traced_memory()[1] \n        tm.stop()\n\n        print(f\'{memory:10,} bytes  {int(time * 1e3):4} ms  {func.__name__}\')\n    print()\n
Run Code Online (Sandbox Code Playgroud)\n