Extract text using multiple separators

jez*_*ael 18 python regex string list

I have a list of strings with the separators A or B:

L = ['sgfgfqds A aaa','sderas B ffff','eeee','sdsdfd A rrr']

and I need:

L1 = [['aaa'], ['ffff'], ['eeee'], ['rrr']] 

I tried:

import re

L1 = [re.findall(r'(?<=A)(.*)$', inputtext) for inputtext in L]
print(L1)

However, it returns the following:

[[' aaa'], [], [], [' rrr']] 

How can I get the desired output?

wil*_*elm 21

You can use re.split to split your strings on 'A' or 'B':

>>> L1 = [re.split(r'[AB] *', inputtext)[-1] for inputtext in L]
>>> L1
['aaa', 'ffff', 'eeee', 'rrr']

  • I think that if your 'A' and 'B' are always surrounded by spaces, you would be better off using `r' +[AB] +'` (or `r'\s+[AB]\s+'`). (5 upvotes)
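
The variant suggested in the comment could be sketched as follows, assuming the separators are always a standalone A or B with whitespace on both sides. Wrapping each result in a list gives the nested shape asked for in the question. The name `SEP` is only illustrative and does not appear in the original answer.

```python
import re

L = ['sgfgfqds A aaa', 'sderas B ffff', 'eeee', 'sdsdfd A rrr']

# Assumption: a separator is a lone A or B with whitespace on both sides.
SEP = re.compile(r'\s+[AB]\s+')

# split() returns the whole string as a one-element list when nothing
# matches, so taking [-1] is safe for 'eeee' as well.
L1 = [[SEP.split(text)[-1]] for text in L]
print(L1)  # [['aaa'], ['ffff'], ['eeee'], ['rrr']]
```

Unlike `r'[AB] *'`, this pattern leaves an A or B inside a word alone: `SEP.split('xx A aBc')[-1]` gives `'aBc'`, whereas the looser pattern would cut at the inner `B` and give `'c'`.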

Rah*_*K P 6

An alternative suggestion without regex.

[[i] for i in ' '.join(L).split(' ') if i.count(i[0]) == len(i) and len(i) > 1]

Result

 [['aaa'], ['ffff'], ['eeee'], ['rrr']]


tri*_*eee 6

You can exploit the fact that split returns a list even when it does not find the separator.

 L1 = [[x.split(' A ')[-1].split(' B ')[-1]] for x in L]
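
The same chained-split idea might be generalized to any number of separators. This is only a sketch: the helper name `last_field` and its `separators` parameter are hypothetical and not part of the answer above.

```python
def last_field(text, separators=(' A ', ' B ')):
    """Return the text that follows the last separator, or text itself."""
    for sep in separators:
        # str.split returns [text] unchanged when sep is absent,
        # so indexing with [-1] never fails.
        text = text.split(sep)[-1]
    return text


L = ['sgfgfqds A aaa', 'sderas B ffff', 'eeee', 'sdsdfd A rrr']
L1 = [[last_field(x)] for x in L]
print(L1)  # [['aaa'], ['ffff'], ['eeee'], ['rrr']]
```

Because the separators include the surrounding spaces, an A or B inside a word is not treated as a separator.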