How to correctly sort a string with a number inside?

Mic*_*hal 101 python regex sorting string

Possible duplicate:
Does Python have a built-in function for string natural sort?

I have a list of strings containing numbers, and I cannot find a good way to sort them.
For example, I get something like this:

something1
something12
something17
something2
something25
something29

when I use the sort() method.

I know that I probably need to extract the numbers somehow and then sort the list, but I have no idea how to do it in the simplest way.

unu*_*tbu 199

Perhaps you are looking for human sorting (also known as natural sorting):

import re

def atoi(text):
    return int(text) if text.isdigit() else text

def natural_keys(text):
    '''
    alist.sort(key=natural_keys) sorts in human order
    http://nedbatchelder.com/blog/200712/human_sorting.html
    (See Toothy's implementation in the comments)
    '''
    return [ atoi(c) for c in re.split(r'(\d+)', text) ]

alist=[
    "something1",
    "something12",
    "something17",
    "something2",
    "something25",
    "something29"]

alist.sort(key=natural_keys)
print(alist)

yields

['something1', 'something2', 'something12', 'something17', 'something25', 'something29']

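PS. I've changed my answer to use Toothy's implementation of natural sorting (posted in the comments here), since it is significantly faster than my original answer.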
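To see why this key works, it can help to print what it produces. The sketch below repeats the atoi/natural_keys functions from above and compares the plain sort with the keyed sort. Because re.split with a single capturing group always alternates text, digits, text, ..., two keys have the same type at each position, so Python 3 never ends up comparing an int with a str here.

```python
import re

def atoi(text):
    # Digit runs become ints so they compare numerically; other text is kept.
    return int(text) if text.isdigit() else text

def natural_keys(text):
    # The capturing group makes re.split keep the digit runs in the result.
    return [atoi(c) for c in re.split(r'(\d+)', text)]

names = ["something12", "something2", "something1"]

print(natural_keys("something12"))      # ['something', 12, '']
print(natural_keys("something2"))       # ['something', 2, '']

print(sorted(names))                    # plain string comparison: '1' < '12' < '2'
print(sorted(names, key=natural_keys))  # numeric comparison: 1 < 2 < 12
```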


If you wish to sort text containing floats, then you'll need to change the regex from one that matches ints (i.e. (\d+)) to one that matches floats:

import re

def atof(text):
    try:
        retval = float(text)
    except ValueError:
        retval = text
    return retval

def natural_keys(text):
    '''
    alist.sort(key=natural_keys) sorts in human order
    http://nedbatchelder.com/blog/200712/human_sorting.html
    (See Toothy's implementation in the comments)
    float regex comes from https://stackoverflow.com/a/12643073/190597
    '''
    return [ atof(c) for c in re.split(r'[+-]?([0-9]+(?:[.][0-9]*)?|[.][0-9]+)', text) ]

alist=[
    "something1",
    "something2",
    "something1.0",
    "something1.25",
    "something1.105"]

alist.sort(key=natural_keys)
print(alist)

yields

['something1', 'something1.0', 'something1.105', 'something1.25', 'something2']
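One thing to be aware of: in the regex above the optional [+-] sits outside the capturing group, so re.split discards the sign and "x-2" gets the same key as "x2". If your data really has signed numbers, a possible variant is sketched below. It is only a sketch under assumptions that are mine, not part of the answer: the names FLOAT_RE and signed_natural_keys are made up for illustration, and a + or - directly before a number is treated as a sign, which is the wrong call if your strings use "-" as a separator. It also converts only the odd positions of the split result, so a text chunk such as "nan" or "inf" is never turned into a float by accident.

```python
import re

# Assumption: a +/- immediately before a number is a sign, not a separator.
# The sign is inside the capturing group, so re.split keeps it.
FLOAT_RE = re.compile(r'([+-]?(?:[0-9]+(?:[.][0-9]*)?|[.][0-9]+))')

def signed_natural_keys(text):
    parts = FLOAT_RE.split(text)
    # With one capturing group, re.split alternates text, number, text, ...
    # Only odd positions hold matched numbers, so only those are converted.
    return [float(p) if i % 2 else p for i, p in enumerate(parts)]

items = ["t-2.5", "t10", "t-10", "t.5", "t3"]
print(sorted(items, key=signed_natural_keys))
# ['t-10', 't-2.5', 't.5', 't3', 't10']
```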

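  • @painfulenglish: I've modified the post above to show how to sort text naturally with floats. (4 upvotes)
  • Do you know how to extend this to the case where the numbers are floats? For example, something1.0, something 1.25, something2.0. (2 upvotes)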