Python: splitting a list into sublists between start and end keyword patterns

Leo*_*ead 10 python split list sublist

Suppose I have a list, say:

lst = ['foo', 'bar', '!test', 'hello', 'world!', 'word']

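How would I turn it into the following list, where the elements from one starting with ! through one ending with ! are grouped into a sublist: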

lst = ['foo', 'bar', ['test', 'hello', 'world'], 'word']

I'm having a hard time finding a solution. Here is one approach I tried:

def define(lst):
    for index, item in enumerate(lst):
        if item[0] == '!' and lst[index+2][-1] == '!':
            temp = lst[index:index+3]
            del lst[index+1:index+2]
            lst[index] = temp
    return lst

Any help would be greatly appreciated.

Aza*_*kov 12

This assumes that no element both starts and ends with !, such as '!foo!'.

First, we can write helper predicates:

def is_starting_element(element):
    return element.startswith('!')


def is_ending_element(element):
    return element.endswith('!')

Then we can write a generator function (because they are awesome):

def walk(elements):
    elements = iter(elements)  # making iterator from passed iterable
    for element in elements:
        if is_starting_element(element):
            # the recursive call consumes the same iterator until it
            # meets the matching ending element
            yield [element[1:], *walk(elements)]
        elif is_ending_element(element):
            yield element[:-1]
            return  # close the current nesting level
        else:
            yield element

Tests:

>>> lst = ['foo', 'bar', '!test', 'hello', 'world!', 'word']
>>> list(walk(lst))
['foo', 'bar', ['test', 'hello', 'world'], 'word']
>>> lst = ['foo', 'bar', '!test', '!hello', 'world!', 'word!']
>>> list(walk(lst))
['foo', 'bar', ['test', ['hello', 'world'], 'word']]
>>> lst = ['hello!', 'world!']
>>> list(walk(lst))
['hello']

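As the last example shows, if there are more closing elements than opening ones, the remaining closing elements are ignored (this is because we return from the generator). So if lst has an invalid signature (the difference between the number of opening and closing elements is not zero), we may get unpredictable behavior. One way out of this is to validate the given data before processing it and raise an error if the data is invalid.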
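To make the failure mode concrete, here is a small self-contained sketch. It repeats the generator with the two predicates inlined so it runs on its own; the sample inputs and the variable names truncated and swallowed are illustrative only.

```python
def walk(elements):
    """Same generator as above, with the predicates inlined."""
    elements = iter(elements)
    for element in elements:
        if element.startswith('!'):
            yield [element[1:], *walk(elements)]
        elif element.endswith('!'):
            yield element[:-1]
            return  # ends the current level, even when it is the top level
        else:
            yield element


# Unmatched closing element at the top level: 'c' is silently dropped.
truncated = list(walk(['a', 'b!', 'c']))
print(truncated)

# Unmatched opening element: the rest of the list ends up inside the sublist.
swallowed = list(walk(['a', '!b', 'c']))
print(swallowed)
```

Neither call raises an error, which is why a separate validation step is useful.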

We can write a validator like this:

def validate_elements(elements):
    def get_sign(element):
        if is_starting_element(element):
            return 1
        elif is_ending_element(element):
            return -1
        else:
            return 0

    signature = sum(map(get_sign, elements))
    are_elements_valid = signature == 0
    if not are_elements_valid:
        error_message = 'Data is invalid: '
        if signature > 0:
            error_message += ('there are more opening elements '
                              'than closing ones.')
        else:
            error_message += ('there are more closing elements '
                              'than opening ones.')
        raise ValueError(error_message)

Tests:

>>> lst = ['!hello', 'world!']
>>> validate_elements(lst)  # no exception raised, data is valid
>>> lst = ['!hello', '!world']
>>> validate_elements(lst)
...
ValueError: Data is invalid: there are more opening elements than closing ones.
>>> lst = ['hello!', 'world!']
>>> validate_elements(lst)
...
ValueError: Data is invalid: there are more closing elements than opening ones.
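One gap worth noting: a sum-based check only compares counts, so it cannot notice a closing element that appears before its opener. For example, ['hello!', '!world'] has a signature of zero. The sketch below is an alternative that is not part of the original answer: validate_order is a hypothetical helper that tracks the running nesting depth instead of a total, and it keeps the assumption that no element both starts and ends with !.

```python
def validate_order(elements):
    """Raise ValueError unless every closing element has an earlier opener.

    Hypothetical helper, not from the answer; assumes no element both
    starts and ends with '!'.
    """
    depth = 0  # number of currently open sublists
    for position, element in enumerate(elements):
        if element.startswith('!'):
            depth += 1
        elif element.endswith('!'):
            depth -= 1
            if depth < 0:
                raise ValueError(
                    f'closing element at position {position} has no opener')
    if depth > 0:
        raise ValueError(f'{depth} opening element(s) were never closed')


validate_order(['!hello', 'world!'])  # balanced and well ordered: no error

try:
    validate_order(['hello!', '!world'])  # counts match, order does not
except ValueError as error:
    print(error)
```

Whether this stricter check is wanted depends on how malformed input should be treated; the count-based validator stays simpler if order errors cannot occur in practice.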

Finally, we can write a function that validates before walking:

def to_sublists(elements):
    validate_elements(elements)
    return list(walk(elements))

Tests:

>>> lst = ['foo', 'bar', '!test', 'hello', 'world!', 'word']
>>> to_sublists(lst)
['foo', 'bar', ['test', 'hello', 'world'], 'word']
>>> lst = ['foo', 'bar', '!test', '!hello', 'world!', 'word!']
>>> to_sublists(lst)
['foo', 'bar', ['test', ['hello', 'world'], 'word']]
>>> lst = ['hello!', 'world!']
>>> to_sublists(lst)
...
ValueError: Data is invalid: there are more closing elements than opening ones.
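A caveat when the input is a one-shot iterator (for example a generator) rather than a list: validation reads every element, so nothing is left for walk. The sketch below illustrates this under stated assumptions. The validator is a condensed stand-in that computes the same signature sum, and to_sublists_materialized is a hypothetical variant that copies the input first.

```python
def walk(elements):
    elements = iter(elements)
    for element in elements:
        if element.startswith('!'):
            yield [element[1:], *walk(elements)]
        elif element.endswith('!'):
            yield element[:-1]
            return
        else:
            yield element


def validate_elements(elements):
    # Condensed stand-in for the validator above: same signature sum
    # (True/False behave as 1/0 in arithmetic).
    signature = sum(e.startswith('!') - e.endswith('!') for e in elements)
    if signature != 0:
        raise ValueError('Data is invalid: unbalanced elements.')


def to_sublists(elements):
    validate_elements(elements)
    return list(walk(elements))


def to_sublists_materialized(elements):
    # Hypothetical variant: copy the input so that validation and
    # walking each see every element.
    elements = list(elements)
    validate_elements(elements)
    return list(walk(elements))


data = ['foo', '!bar', 'baz!']
from_iterator = to_sublists(iter(data))  # validation exhausts the iterator
from_copy = to_sublists_materialized(iter(data))
print(from_iterator)
print(from_copy)
```

Copying costs memory proportional to the input, which is usually acceptable here because the result is a list anyway.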

Edit

If we want to handle elements that both start and end with !, such as '!bar!', we can modify the walk function to use itertools.chain, like this:

from itertools import chain


def walk(elements):
    elements = iter(elements)
    for element in elements:
        if is_starting_element(element):
            # push the element back without its leading '!' so the nested
            # call can also see whether it ends with '!'
            yield list(walk(chain([element[1:]], elements)))
        elif is_ending_element(element):
            yield element[:-1]
            return
        else:
            yield element

We also need to update the validation by modifying the get_sign function (nested inside validate_elements):

def get_sign(element):
    if is_starting_element(element):
        if is_ending_element(element):
            return 0
        return 1
    if is_ending_element(element):
        return -1
    return 0

Tests:

>>> lst = ['foo', 'bar', '!test', '!baz!', 'hello', 'world!', 'word']
>>> to_sublists(lst)
['foo', 'bar', ['test', ['baz'], 'hello', 'world'], 'word']