Remove text between () and [] in Python

Tic*_*Tic 29 python python-2.7

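I have a very long string of text with () and [] in it. I'm trying to remove the characters between the parentheses and between the square brackets, but I can't figure out how.

The string is similar to this: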

x = "This is a sentence. (once a day) [twice a day]"

This isn't the string I'm actually working with, but it is very similar and much shorter.

Thanks for your help.

jva*_*ver 58

You can use the re.sub function.

>>> import re 
>>> x = "This is a sentence. (once a day) [twice a day]"
>>> re.sub(r"([(\[]).*?([)\]])", r"\g<1>\g<2>", x)
'This is a sentence. () []'

If you want to remove the [] and () themselves as well, you can use this code:

>>> import re 
>>> x = "This is a sentence. (once a day) [twice a day]"
>>> re.sub(r"[(\[].*?[)\]]", "", x)
'This is a sentence.  '

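Important: this code does not work with nested brackets.

  • @markroxor The first regex captures '(' and '[' in group 1 (by wrapping them in parentheses) and ')' and ']' in group 2, and matches those groups plus every character between them. After matching, the matched portion is replaced by groups 1 and 2, so the final string has nothing left inside the brackets. The second regex is self-explanatory: match everything, brackets included, and replace it with an empty string. Hope that helps. (3 upvotes)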
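
As a rough illustration of the nesting caveat in this answer, the sketch below runs the second pattern on a flat, a nested, and a mismatched input. The helper name `strip_bracketed` and the two extra sample strings are made up for this example; they do not come from the answer.

```python
import re

def strip_bracketed(text):
    # Hypothetical helper: wraps the answer's second pattern (written as a raw string).
    return re.sub(r"[(\[].*?[)\]]", "", text)

# Flat input from the question: both bracketed spans are removed.
flat = strip_bracketed("This is a sentence. (once a day) [twice a day]")
print(repr(flat))    # 'This is a sentence.  '

# Nested input: the non-greedy match stops at the FIRST closer,
# so the outer span is only partly removed.
nested = strip_bracketed("keep (outer (inner) tail) end")
print(repr(nested))  # 'keep  tail) end'

# Mismatched pair: the character classes accept any opener with any closer.
mixed = strip_bracketed("keep (a] end")
print(repr(mixed))   # 'keep  end'
```

The pattern has no notion of depth or of which closer belongs to which opener, which is why the nested and mismatched cases behave this way.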

pra*_*nsg 16

Run this script; it works even with nested brackets.
It only uses basic logical tests.

def a(test_str):
    ret = ''
    skip1c = 0
    skip2c = 0
    for i in test_str:
        if i == '[':
            skip1c += 1
        elif i == '(':
            skip2c += 1
        elif i == ']' and skip1c > 0:
            skip1c -= 1
        elif i == ')' and skip2c > 0:
            skip2c -= 1
        elif skip1c == 0 and skip2c == 0:
            ret += i
    return ret

x = "ewq[a [(b] ([c))]] This is a sentence. (once a day) [twice a day]"
x = a(x)
print(x)
print(repr(x))

In case you don't run it,
here is the output:

>>> 
ewq This is a sentence.  
'ewq This is a sentence.  ' 
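
To make the behavior of the counter approach on unbalanced input a bit more concrete, here is a minimal sketch. It restates the same loop under the hypothetical name `strip_nested`, with two changes that are assumptions of this example and not part of the answer: descriptive variable names, and a list joined at the end in place of repeated string concatenation.

```python
def strip_nested(text):
    """Drop everything inside (...) and [...], tracking depth per bracket type."""
    kept = []
    square_depth = 0
    round_depth = 0
    for ch in text:
        if ch == '[':
            square_depth += 1
        elif ch == '(':
            round_depth += 1
        elif ch == ']' and square_depth > 0:
            square_depth -= 1
        elif ch == ')' and round_depth > 0:
            round_depth -= 1
        elif square_depth == 0 and round_depth == 0:
            # Outside every bracket (this also keeps a stray closer).
            kept.append(ch)
    return ''.join(kept)

print(repr(strip_nested("a (b (c) d) e")))  # nested span removed: 'a  e'
print(repr(strip_nested("a ) b")))          # stray closer is kept: 'a ) b'
print(repr(strip_nested("a (b c")))         # stray opener hides the rest: 'a '
```

The last case is worth keeping in mind: an opening bracket that is never closed causes everything after it to be dropped.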


mbo*_*den 13

This works for parens. A regex "consumes" the text it matches, so it won't work for nested parens.

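import re
mystring = "This is a sentence. (once a day) [twice a day]"  # sample input
regex = re.compile(r".*?\((.*?)\)")
result = re.findall(regex, mystring)  # the text found inside each pair of parens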

Or this will find one set of parens... just loop to find more:

start = mystring.find('(')
end = mystring.find(')', start + 1)  # only look for ')' after the '('
if start != -1 and end != -1:
    result = mystring[start+1:end]

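  • I don't know why this answer is marked as correct. The question asks to *remove* the text, not return it. I had the same need (removing text between certain characters) and @jvallver's answer helped me. (9 upvotes)
  • This is the opposite of what the OP asked for. (2 upvotes)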
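
For what it's worth, the "just loop to find more" suggestion in this answer could look roughly like the sketch below. The function name `find_all_parenthesized` and the sample string are hypothetical. As the comments point out, this extracts the parenthesized text and does not remove it, and it does not handle nesting.

```python
def find_all_parenthesized(text):
    """Collect the text inside each non-nested (...) pair using str.find."""
    results = []
    pos = 0
    while True:
        start = text.find('(', pos)
        if start == -1:
            break                       # no more opening parens
        end = text.find(')', start + 1)
        if end == -1:
            break                       # opening paren without a closer
        results.append(text[start + 1:end])
        pos = end + 1                   # continue searching after this pair
    return results

found = find_all_parenthesized("one (two) three (four) five")
print(found)  # ['two', 'four']
```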

jfs*_*jfs 9

Here is a solution similar to @pradyunsg's answer (it works for arbitrarily nested brackets):

def remove_text_inside_brackets(text, brackets="()[]"):
    count = [0] * (len(brackets) // 2) # count open/close brackets
    saved_chars = []
    for character in text:
        for i, b in enumerate(brackets):
            if character == b: # found bracket
                kind, is_close = divmod(i, 2)
                count[kind] += (-1)**is_close # `+1`: open, `-1`: close
                if count[kind] < 0: # unbalanced bracket
                    count[kind] = 0  # keep it
                else:  # found bracket to remove
                    break
        else: # character is not a [balanced] bracket
            if not any(count): # outside brackets
                saved_chars.append(character)
    return ''.join(saved_chars)

print(repr(remove_text_inside_brackets(
    "This is a sentence. (once a day) [twice a day]")))
# -> 'This is a sentence.  '

  • Looks complicated at first glance, but it's better than mine (and definitely better than the accepted one, in my opinion). (2 upvotes)
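
A short usage sketch for the function in this answer, assuming it is copied as posted: the `brackets` parameter takes opener/closer pairs laid out consecutively, so other delimiters can be passed in. The sample strings below are invented for illustration.

```python
def remove_text_inside_brackets(text, brackets="()[]"):
    # Same function as in the answer, repeated so this sketch runs on its own.
    count = [0] * (len(brackets) // 2)  # one depth counter per bracket pair
    saved_chars = []
    for character in text:
        for i, b in enumerate(brackets):
            if character == b:  # found bracket
                kind, is_close = divmod(i, 2)
                count[kind] += (-1)**is_close  # +1: open, -1: close
                if count[kind] < 0:  # unbalanced closing bracket
                    count[kind] = 0  # keep it
                else:  # found bracket to remove
                    break
        else:  # character is not a [balanced] bracket
            if not any(count):  # outside brackets
                saved_chars.append(character)
    return ''.join(saved_chars)

# Mixed nesting of the default bracket types.
print(repr(remove_text_inside_brackets("x (a [b (c)] d) y")))  # 'x  y'

# Custom pairs: only {} and <> are treated as brackets, so (d) survives.
print(repr(remove_text_inside_brackets("a {b} <c> (d)", brackets="{}<>")))  # 'a   (d)'

# An unbalanced closing bracket is kept.
print(repr(remove_text_inside_brackets("a ) b")))  # 'a ) b'
```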

小智 8

You can split, filter, and join the string again. If your brackets are well-formed, the following code should do it.

import re
x = "".join(re.split(r"\(|\)|\[|\]", x)[::2])

  • Very late, but much better. :-P (2 upvotes)
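
To show why the `[::2]` slice in this answer works, here is a small sketch that prints the intermediate list. Splitting on every bracket character makes the pieces alternate between "outside" and "inside" text, so the even positions are the parts to keep. The variable names and the nested sample string are assumptions of this example.

```python
import re

x = "This is a sentence. (once a day) [twice a day]"

# Split on every bracket character; pieces alternate outside / inside.
parts = re.split(r"\(|\)|\[|\]", x)
print(parts)  # ['This is a sentence. ', 'once a day', ' ', 'twice a day', '']

# Even indexes are the text outside the brackets.
result = "".join(parts[::2])
print(repr(result))  # 'This is a sentence.  '

# With nested brackets the alternation breaks down and inner text leaks through.
nested_parts = re.split(r"\(|\)|\[|\]", "a (b (c) d) e")
nested_result = "".join(nested_parts[::2])
print(repr(nested_result))  # 'a c e'
```

This is why the answer is limited to well-formed, non-nested brackets.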

Avi*_*aut 5

You can try this. It removes the brackets along with the content inside them.

import re
x = "This is a sentence. (once a day) [twice a day]"
x = re.sub(r"\(.*?\)|\[.*?\]", "", x)
print(x)

Expected output:

This is a sentence. 