How do I split a list into sublists that each start with a delimiter?

Ros*_*aha 5 python split list

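  • I want to split a list into sublists that each start with a delimiter
    • The delimiter must be kept
    • The delimiter must be the first element of each sublist

Example: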

delimiter = "x" 
input = ["x","a","x","x",1,2,3,"a","a","x","e"]
output = [["x","a"], ["x"], ["x",1,2,3,"a","a"], ["x","e"]]
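  • In addition to the solutions below, see Python splitting a list based on a delimiter word
    • That question is similar, but the expected output is slightly different. For example, the top solution there returns an empty list at index 0.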

Sim*_*ink 5

Step 1: Find the indices at which delimiter occurs in the list:

idx = [ix for ix, val in enumerate(input) if val == delimiter]

Step 2: Use list slicing to build the sublists:

res = [input[i:j] for i,j in zip(idx, idx[1:]+[len(input)])]

Output

print(res)
# [['x', 'a'], ['x'], ['x', 1, 2, 3, 'a', 'a'], ['x', 'e']]
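
The two steps can be wrapped into one function. This is only a sketch: the name split_at_delimiter and the keep_head parameter are made up for illustration and are not part of the answer. With the two steps as given, any elements before the first delimiter are dropped, and a list without a delimiter gives an empty result. Setting keep_head=True is one possible way to keep those leading elements as their own sublist.

```python
def split_at_delimiter(items, delimiter, keep_head=False):
    """Split items into sublists that each start at a delimiter.

    keep_head is a hypothetical option: when True, elements that come
    before the first delimiter are returned as a leading sublist.
    """
    # Step 1: positions of every delimiter
    idx = [i for i, val in enumerate(items) if val == delimiter]
    # Optional: also start a chunk at position 0 so leading elements survive
    if keep_head and items and (not idx or idx[0] != 0):
        idx = [0] + idx
    # Step 2: slice from each start to the next start (or to the end)
    return [items[i:j] for i, j in zip(idx, idx[1:] + [len(items)])]


data = ["x", "a", "x", "x", 1, 2, 3, "a", "a", "x", "e"]
print(split_at_delimiter(data, "x"))
# [['x', 'a'], ['x'], ['x', 1, 2, 3, 'a', 'a'], ['x', 'e']]
print(split_at_delimiter(["a", "x", "b"], "x"))
# [['x', 'b']]
print(split_at_delimiter(["a", "x", "b"], "x", keep_head=True))
# [['a'], ['x', 'b']]
```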


Tre*_*ney 2

  • input is a Python built-in function; don't use it as a variable name
  • This solution also works if the first element of the list is not the delimiter
    • Given: ['a', 'b', 'c', 'x', 'a', 'x', 'x', 1, 2, 3, 'a', 'a', 'x', 'e', 'x']
    • Returns: [['a', 'b', 'c'], ['x', 'a'], ['x'], ['x', 1, 2, 3, 'a', 'a'], ['x', 'e'], ['x']]

from typing import List  # for type annotations


def sublist_by_delimiter(flat_list: list, delimiter: str) -> List[list]:
    result = []  # main list
    chunk = []  # inner list currently being built
    len_flat_list = len(flat_list)
    for i, v in enumerate(flat_list, 1):  # iterate through flat_list, enumerating from 1
        if v == delimiter and i != 1:  # a delimiter anywhere except the first position
            result.append(chunk)  # append the finished chunk to result
            chunk = [v]  # create a new chunk beginning with v
            if i == len_flat_list:  # if the last value in the list is the delimiter
                result.append(chunk)
        elif i == len_flat_list:  # the last value in the list
            chunk.append(v)  # append it to the current chunk
            result.append(chunk)  # append the final chunk to result
        else:
            chunk.append(v)  # append v to the current chunk

    return result


t = ['x', 'a', 'x', 'x', 1, 2, 3, 'a', 'a', 'x', 'e', 'x']  # an extra x has been added at the end for testing
delim = 'x'
sublist_by_delimiter(t, delim)
# [['x', 'a'], ['x'], ['x', 1, 2, 3, 'a', 'a'], ['x', 'e'], ['x']]

Using collections.defaultdict

  • As of Python 3.7, dicts are guaranteed to preserve insertion order, so dict.values() returns the chunks in the order they were created.
  • This solution is a good option for anyone who wants a dictionary of segments
    • Change return list(dd.values()) to return dd
  • This solution also works if the first element of the list is not the delimiter
    • Given: ['a', 'b', 'c', 'x', 'a', 'x', 'x', 1, 2, 3, 'a', 'a', 'x', 'e', 'x']
    • Returns: [['a', 'b', 'c'], ['x', 'a'], ['x'], ['x', 1, 2, 3, 'a', 'a'], ['x', 'e'], ['x']]

from collections import defaultdict
from typing import List


def sublist_by_delimiter(flat_list: list, delimiter: str) -> List[list]:
    dd = defaultdict(list)
    counter = 0
    for v in flat_list:
        if v == delimiter:
            counter += 1  # each delimiter starts a new chunk
        dd[counter].append(v)  # missing keys are created as empty lists
    return list(dd.values())


sublist_by_delimiter(t, 'x')
# [['x', 'a'], ['x'], ['x', 1, 2, 3, 'a', 'a'], ['x', 'e'], ['x']]
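
The note above about returning the dictionary instead of a list could look roughly like this. The function name segments_by_delimiter is invented here, and converting to a plain dict before returning is a choice of this sketch (so that later lookups of missing keys do not silently create empty segments), not something the answer prescribes.

```python
from collections import defaultdict
from typing import Dict


def segments_by_delimiter(flat_list: list, delimiter) -> Dict[int, list]:
    """Return the chunks keyed by the number of delimiters seen so far."""
    dd = defaultdict(list)
    counter = 0
    for v in flat_list:
        if v == delimiter:
            counter += 1  # each delimiter starts a new segment
        dd[counter].append(v)
    # Assumption of this sketch: hand back a plain dict, not the defaultdict
    return dict(dd)


segments = segments_by_delimiter(['a', 'x', 'b', 'x'], 'x')
print(segments)
# {0: ['a'], 1: ['x', 'b'], 2: ['x']}
```

Key 0 only appears when the list has elements before its first delimiter.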

Using dict

  • 3.61 s ± 9.85 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) for a list of 25M elements
    • defaultdict: 3.74 s ± 53.7 ms
  • As written, this solution raises a KeyError if the first element of the list is not the delimiter.

from typing import List


def sublist_by_delimiter(flat_list: list, delimiter: str) -> List[list]:
    dd = {}  # plain dict; dict(list) would raise a TypeError
    counter = 0
    for v in flat_list:
        if v == delimiter:
            counter += 1
            dd[counter] = [v]  # counter is always a new key here
        else:
            dd[counter].append(v)  # KeyError if no delimiter has been seen yet
    return list(dd.values())
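
If the KeyError is a concern, one way to avoid it while staying with a plain dict is dict.setdefault. This is a sketch rather than the answer's code: the name sublist_by_delimiter_safe is hypothetical, and it has not been timed against the figures quoted above.

```python
from typing import List


def sublist_by_delimiter_safe(flat_list: list, delimiter) -> List[list]:
    """Plain-dict variant that tolerates elements before the first delimiter."""
    chunks = {}
    counter = 0
    for v in flat_list:
        if v == delimiter:
            counter += 1  # each delimiter starts a new chunk
        # setdefault creates the list on first use, so no KeyError is raised
        chunks.setdefault(counter, []).append(v)
    return list(chunks.values())


print(sublist_by_delimiter_safe(['a', 'b', 'x', 'c'], 'x'))
# [['a', 'b'], ['x', 'c']]
```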