Zer*_*eus 54 python sorting python-2.x python-3.x
我试图复制(如果可能改善)的Python 2.x的在3.x的排序行为,使双方可订购的类型,如int,float等进行排序预期,并且互相unorderable类型的输出中进行分组.
这是我正在谈论的一个例子:
>>> sorted([0, 'one', 2.3, 'four', -5]) # Python 2.x
[-5, 0, 2.3, 'four', 'one']
Run Code Online (Sandbox Code Playgroud)
>>> sorted([0, 'one', 2.3, 'four', -5]) # Python 3.x
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: str() < int()
Run Code Online (Sandbox Code Playgroud)
我之前尝试过这个,使用一个关键参数的类sorted()(参见
为什么这个用于排序异构序列的关键类表现奇怪?)从根本上被打破了,因为它的方法是
如BrenBarn的出色答案所解释的那样,可以导致不及物处理.
我最初在没有尝试编码的情况下拒绝的一种天真的方法是使用返回(type, value)元组的键函数:
def motley(value):
return repr(type(value)), value
Run Code Online (Sandbox Code Playgroud)
但是,这不符合我的要求.首先,它打破了相互可订购类型的自然顺序:
>>> sorted([0, 123.4, 5, -6, 7.89])
[-6, 0, 5, 7.89, 123.4]
>>> sorted([0, 123.4, 5, -6, 7.89], key=motley)
[7.89, 123.4, -6, 0, 5]
Run Code Online (Sandbox Code Playgroud)
其次,当输入包含两个具有相同本质上不可共享类型的对象时,它会引发异常:
>>> sorted([{1:2}, {3:4}], key=motley)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: dict() < dict()
Run Code Online (Sandbox Code Playgroud)
...这无疑是Python 2.x和3.x中的标准行为 - 但理想情况下我希望将这些类型组合在一起(我并不特别关心它们的排序,但它似乎与Python保证稳定排序,保持原始顺序).
我可以通过特殊的方法解决数字类型中的第一个问题:
from numbers import Real
from decimal import Decimal
def motley(value):
numeric = Real, Decimal
if isinstance(value, numeric):
typeinfo = numeric
else:
typeinfo = type(value)
return repr(typeinfo), value
Run Code Online (Sandbox Code Playgroud)
......尽可能地发挥作用:
>>> sorted([0, 'one', 2.3, 'four', -5], key=motley)
[-5, 0, 2.3, 'four', 'one']
Run Code Online (Sandbox Code Playgroud)
...但不考虑可能存在其他可以相互订购的不同(可能是用户定义的)类型的事实,当然仍然会因本质上无法解决的类型而失败:
>>> sorted([{1:2}, {3:4}], key=motley)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: dict() < dict()
Run Code Online (Sandbox Code Playgroud)
是否有解决了另一种方法既任意的,不同的,但是,相互订购类型的问题,并认为本质unorderable类型的?
Bas*_*els 35
愚蠢的想法:制作第一个传递来划分可以在彼此之间进行比较的组中的所有不同项目,对各个组进行排序并最终将它们连接起来.我假设一个项目与一个组的所有成员相当,如果它与一个组的第一个成员相当.像这样的东西(Python3):
import itertools
def python2sort(x):
it = iter(x)
groups = [[next(it)]]
for item in it:
for group in groups:
try:
item < group[0] # exception if not comparable
group.append(item)
break
except TypeError:
continue
else: # did not break, make new group
groups.append([item])
print(groups) # for debugging
return itertools.chain.from_iterable(sorted(group) for group in groups)
Run Code Online (Sandbox Code Playgroud)
这将在可悲的情况下具有二次运行时间,没有任何项目具有可比性,但我想唯一的方法是确切地知道所有可能的组合.对于任何试图对一长串无法解决的项目(如复数)进行排序的人来说,将二次行为视为应得的惩罚.在一些混合了一些字符串和一些整数的情况下,速度应该与正常排序的速度相似.快速测试:
In [19]: x = [0, 'one', 2.3, 'four', -5, 1j, 2j, -5.5, 13 , 15.3, 'aa', 'zz']
In [20]: list(python2sort(x))
[[0, 2.3, -5, -5.5, 13, 15.3], ['one', 'four', 'aa', 'zz'], [1j], [2j]]
Out[20]: [-5.5, -5, 0, 2.3, 13, 15.3, 'aa', 'four', 'one', 'zz', 1j, 2j]
Run Code Online (Sandbox Code Playgroud)
它似乎也是一种"稳定的排序",因为这些组是按照遇到无法比较的项目的顺序形成的.
Mar*_*ers 31
这个答案旨在忠实地在Python 3中重新创建Python 2排序顺序.
实际的Python 2实现非常复杂,但是在实例有机会实现正常的比较规则之后,它object.c会default_3way_compare做最后的回退.这是在个别类型有机会比较之后(通过__cmp__或__lt__钩子).
在包装器中将该函数实现为纯Python,再加上模拟规则的异常(dict以及特定的复数),我们在Python 3中提供了相同的Python 2排序语义:
from numbers import Number
# decorator for type to function mapping special cases
def per_type_cmp(type_):
try:
mapping = per_type_cmp.mapping
except AttributeError:
mapping = per_type_cmp.mapping = {}
def decorator(cmpfunc):
mapping[type_] = cmpfunc
return cmpfunc
return decorator
class python2_sort_key(object):
_unhandled_types = {complex}
def __init__(self, ob):
self._ob = ob
def __lt__(self, other):
_unhandled_types = self._unhandled_types
self, other = self._ob, other._ob # we don't care about the wrapper
# default_3way_compare is used only if direct comparison failed
try:
return self < other
except TypeError:
pass
# hooks to implement special casing for types, dict in Py2 has
# a dedicated __cmp__ method that is gone in Py3 for example.
for type_, special_cmp in per_type_cmp.mapping.items():
if isinstance(self, type_) and isinstance(other, type_):
return special_cmp(self, other)
# explicitly raise again for types that won't sort in Python 2 either
if type(self) in _unhandled_types:
raise TypeError('no ordering relation is defined for {}'.format(
type(self).__name__))
if type(other) in _unhandled_types:
raise TypeError('no ordering relation is defined for {}'.format(
type(other).__name__))
# default_3way_compare from Python 2 as Python code
# same type but no ordering defined, go by id
if type(self) is type(other):
return id(self) < id(other)
# None always comes first
if self is None:
return True
if other is None:
return False
# Sort by typename, but numbers are sorted before other types
self_tname = '' if isinstance(self, Number) else type(self).__name__
other_tname = '' if isinstance(other, Number) else type(other).__name__
if self_tname != other_tname:
return self_tname < other_tname
# same typename, or both numbers, but different type objects, order
# by the id of the type object
return id(type(self)) < id(type(other))
@per_type_cmp(dict)
def dict_cmp(a, b, _s=object()):
if len(a) != len(b):
return len(a) < len(b)
adiff = min((k for k in a if a[k] != b.get(k, _s)), key=python2_sort_key, default=_s)
if adiff is _s:
# All keys in a have a matching value in b, so the dicts are equal
return False
bdiff = min((k for k in b if b[k] != a.get(k, _s)), key=python2_sort_key)
if adiff != bdiff:
return python2_sort_key(adiff) < python2_sort_key(bdiff)
return python2_sort_key(a[adiff]) < python2_sort_key(b[bdiff])
Run Code Online (Sandbox Code Playgroud)
我在Python 2中实现了处理字典排序,因为类型本身通过__cmp__钩子支持.我自然也坚持使用Python 2对键和值的排序.
我还为复数添加了特殊的大小写,因为当您尝试对这些进行排序时,Python 2会引发异常:
>>> sorted([0.0, 1, (1+0j), False, (2+3j)])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers
Run Code Online (Sandbox Code Playgroud)
如果要完全模拟Python 2行为,则可能必须添加更多特殊情况.
如果你想对复杂的数字进行排序,你需要始终将它们与非数字组放在一起; 例如:
# Sort by typename, but numbers are sorted before other types
if isinstance(self, Number) and not isinstance(self, complex):
self_tname = ''
else:
self_tname = type(self).__name__
if isinstance(other, Number) and not isinstance(other, complex):
other_tname = ''
else:
other_tname = type(other).__name__
Run Code Online (Sandbox Code Playgroud)
一些测试用例:
>>> sorted([0, 'one', 2.3, 'four', -5], key=python2_sort_key)
[-5, 0, 2.3, 'four', 'one']
>>> sorted([0, 123.4, 5, -6, 7.89], key=python2_sort_key)
[-6, 0, 5, 7.89, 123.4]
>>> sorted([{1:2}, {3:4}], key=python2_sort_key)
[{1: 2}, {3: 4}]
>>> sorted([{1:2}, None, {3:4}], key=python2_sort_key)
[None, {1: 2}, {3: 4}]
Run Code Online (Sandbox Code Playgroud)
Fre*_*d S 10
这里没有运行Python 3,但也许这样的东西可行.测试是否在"值"上进行"小于"比较会创建异常,然后执行"某事"来处理该情况,例如将其转换为字符串.
当然,如果列表中的其他类型不是同一类型但可以相互订购,那么您仍需要更多特殊处理.
from numbers import Real
from decimal import Decimal
def motley(value):
numeric = Real, Decimal
if isinstance(value, numeric):
typeinfo = numeric
else:
typeinfo = type(value)
try:
x = value < value
except TypeError:
value = repr(value)
return repr(typeinfo), value
>>> print sorted([0, 'one', 2.3, 'four', -5, (2+3j), (1-3j)], key=motley)
[-5, 0, 2.3, (1-3j), (2+3j), 'four', 'one']
Run Code Online (Sandbox Code Playgroud)