Splitting a string in Python using a separator variable

bsi*_*qui 2 python regex

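I'm trying to write a function that splits a string on given separators. I've seen answers to similar questions that use a regular expression to ignore all special characters, but I want to be able to pass in the separators as a variable.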

So far I have:

def split_string(source, separators):
    source_list = source
    for separator in separators:
        if separator in source_list:
            source_list.replace(separator, ' ')
    return source_list.split()

But it doesn't remove the separators.
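
The likely cause is that Python strings are immutable: `str.replace` returns a new string rather than modifying the original, and the function above discards that return value. A minimal sketch of a non-regex fix follows, assuming it is acceptable that whitespace also acts as a separator (every separator is turned into a space before `split()` is called). The name `result` is only illustrative.

```python
def split_string(source, separators):
    """Split source on any single character in separators, without regex."""
    result = source
    for separator in separators:
        # str.replace returns a new string, so the result must be reassigned.
        result = result.replace(separator, ' ')
    # split() with no arguments splits on runs of whitespace.
    return result.split()


print(split_string("the;foo: went to the store", ":;"))
```

Note that this version cannot keep ' went to the store' together as one piece, because spaces are always treated as separators.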

mgi*_*son 5

A regular expression solution seems pretty easy (to me):

import re
def split_string(source,separators):
    return re.split('[{0}]'.format(re.escape(separators)),source)

Example:

>>> import re
>>> def split_string(source,separators):
...     return re.split('[{0}]'.format(re.escape(separators)),source)
... 
>>> split_string("the;foo: went to the store",':;')
['the', 'foo', ' went to the store']

The reason to use a regular expression here is that it still works if you want ' ' (a space) to be one of your separators...
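
To illustrate that point, here is a small sketch using the same character-class function. The sample input string is made up for the example. A space is only treated as a separator when it is explicitly included in `separators`.

```python
import re


def split_string(source, separators):
    # Build a character class such as [;\ ] from the escaped separators.
    return re.split('[{0}]'.format(re.escape(separators)), source)


# Space is not a separator here, so 'a b' stays in one piece.
print(split_string("a b;c d", ";"))
# Space is included, so the string is split on both ';' and ' '.
print(split_string("a b;c d", "; "))
```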


An alternative (which I think I prefer) that lets you have multi-character separators:

def split_string(source,separators):
    return re.split('|'.join(re.escape(x) for x in separators),source)

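In this case, multi-character separators are passed in as some non-string iterable (e.g. a tuple or list), while single-character separators can still be passed in as a single string.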

>>> def split_string(source,separators):
...     return re.split('|'.join(re.escape(x) for x in separators),source)
... 
>>> split_string("the;foo: went to the store",':;')
['the', 'foo', ' went to the store']
>>> split_string("the;foo: went to the store",['foo','st'])
['the;', ': went to the ', 'ore']
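
One caveat with the alternation approach: Python's `re` tries alternatives left to right, so if one separator is a prefix of another (say `'='` and `'=='`), the shorter one can match first and leave an empty piece behind. A possible variant, not part of the original answer, sorts the separators longest-first before joining them. The name `ordered` is only illustrative.

```python
import re


def split_string(source, separators):
    # Put longer separators first so that '==' is tried before '='.
    ordered = sorted(separators, key=len, reverse=True)
    return re.split('|'.join(re.escape(x) for x in ordered), source)


print(split_string("a==b=c", ['=', '==']))
```

Without the sort, the pattern would be `=|==`, and `"a==b=c"` would be split at each `=` separately, producing an empty string between the two.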

Or, finally, if you want consecutive runs of separators to count as a single split point:

def split_string(source,separators):
    return re.split('(?:'+'|'.join(re.escape(x) for x in separators)+')+',source)

This gives:

>>> split_string("Before the rain ... there was lightning and thunder.", " .")
['Before', 'the', 'rain', 'there', 'was', 'lightning', 'and', 'thunder', '']
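
The trailing `''` in that result appears because the source string ends with a separator (the final `.`), and `re.split` reports the empty piece after it. If those empty pieces are unwanted, one option is to filter them out. This is a sketch of that variation rather than part of the original answer, and `pattern` is just a local name for readability.

```python
import re


def split_string(source, separators):
    # Non-capturing group repeated one or more times: a run of separators
    # counts as a single split point.
    pattern = '(?:' + '|'.join(re.escape(x) for x in separators) + ')+'
    # Drop empty strings caused by leading or trailing separators.
    return [piece for piece in re.split(pattern, source) if piece]


print(split_string("Before the rain ... there was lightning and thunder.", " ."))
```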