AndiDog 26
Python's regular expression module does not default to multi-line ^ matching, so you need to specify that flag explicitly.
import re
r = re.compile(r"^\s+", re.MULTILINE)
r.sub("", "a\n b\n c")  # "a\nb\nc"
# or without compiling (only possible for Python 2.7+ because the flags option
# didn't exist in earlier versions of re.sub)
re.sub(r"^\s+", "", "a\n b\n c", flags=re.MULTILINE)
# but mind that \s includes newlines, so blank lines are swallowed too:
r.sub("", "a\n\n\n\n b\n c")  # "a\nb\nc"
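To make the effect of the flag concrete, here is a small sketch that runs the same pattern with and without re.MULTILINE. The variable names (text, only_first, every_line) and the sample string are illustrative assumptions, not part of the original answer.

```python
import re

text = "  a\n  b\n  c"

# Without re.MULTILINE, ^ anchors only at the very start of the string,
# so only the first line loses its indentation.
only_first = re.sub(r"^\s+", "", text)

# With re.MULTILINE, ^ also anchors immediately after every newline.
every_line = re.sub(r"^\s+", "", text, flags=re.MULTILINE)

print(repr(only_first))   # 'a\n  b\n  c'
print(repr(every_line))   # 'a\nb\nc'
```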
You can also include the flag inline in the pattern:
re.sub(r"(?m)^\s+", "", "a\n b\n c")
A simpler solution is to avoid regular expressions altogether, because the original problem is very simple:
content = 'a\n b\n\n c'
stripped_content = ''.join(line.lstrip(' \t') for line in content.splitlines(True))
# stripped_content == 'a\nb\n\nc'
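The same idea can be wrapped in a small helper. This is only a sketch: the function name strip_leading_blanks is hypothetical, and the sample input mixes "\n" and "\r\n" line endings to show that splitlines(True) keeps each terminator attached to its line.

```python
def strip_leading_blanks(content):
    """Remove leading spaces and tabs from every line, keeping line endings."""
    # splitlines(True) keeps each line's terminator attached to the line,
    # so joining the pieces reproduces the original line breaks unchanged.
    return "".join(line.lstrip(" \t") for line in content.splitlines(True))


result = strip_leading_blanks("a\n b\r\n\n\tc\n")
print(repr(result))  # 'a\nb\r\n\nc\n'
```

Because lstrip(" \t") is given an explicit character set, it never touches the newline of an otherwise empty line, which is why blank lines survive here.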
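@AndiDog acknowledges in his (currently accepted) answer that it swallows consecutive newlines.
Here is how to fix that deficiency, which is caused by the fact that \n is both whitespace and a line separator. What we need is an re character class that contains only the whitespace characters other than newline.
What we want is "whitespace and not newline", which cannot be expressed directly in an re class. Let's rewrite that as not not (whitespace and not newline), i.e. not (not whitespace or not not newline) (thanks, Augustus), i.e. not (not whitespace or newline), i.e. [^\S\n] in re notation.
So: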
>>> re.sub(r"(?m)^[^\S\n]+", "", " a\n\n \n\n b\n c\nd e")
'a\n\n\n\nb\nc\nd e'
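As a sanity check on the De Morgan rewrite above, the following sketch tests a handful of characters and confirms that [^\S\n] matches exactly those that are whitespace but not a newline. The sample list and the names blank_not_newline, samples and cleaned are illustrative assumptions.

```python
import re

blank_not_newline = re.compile(r"[^\S\n]")

samples = [" ", "\t", "\r", "\f", "\v", "\n", "a", "_"]
for ch in samples:
    is_whitespace = re.fullmatch(r"\s", ch) is not None
    expected = is_whitespace and ch != "\n"
    actual = blank_not_newline.fullmatch(ch) is not None
    # The class should agree with "whitespace and not newline" for every sample.
    assert actual == expected, repr(ch)

# Blank lines are preserved because the class can never consume a newline.
cleaned = re.sub(r"(?m)^[^\S\n]+", "", "  a\n\n  \n\n  b\n")
print(repr(cleaned))  # 'a\n\n\n\nb\n'
```

Note that the class still matches "\r", so with "\r\n" line endings a line containing only "\r" would lose it; whether that matters depends on the input.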
You can try strip() if you want to remove whitespace from both the front and the back, or lstrip() for the front only:
>>> s=" string with front spaces and back "
>>> s.strip()
'string with front spaces and back'
>>> s.lstrip()
'string with front spaces and back '
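# Python 3: strip leading whitespace from each line of a file
with open("file") as f:
    for line in f:
        print(line.lstrip(), end="")  # each line already ends with its own newline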
If you really want to use a regular expression:
>>> import re
>>> re.sub(r"^\s+", "", s)  # remove the front
'string with front spaces and back '
>>> re.sub(r"\s+\Z", "", s)  # remove the back
' string with front spaces and back'
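For a single string, the regular expressions and the string methods should give the same result. The sketch below checks this on one sample; the string s here is an assumed example with two spaces on each side, and the names front_removed and back_removed are illustrative.

```python
import re

s = "  string with front spaces and back  "

# ^\s+ without re.MULTILINE only matches at the start of the whole string.
front_removed = re.sub(r"^\s+", "", s)
# \Z anchors at the very end of the string.
back_removed = re.sub(r"\s+\Z", "", s)

print(front_removed == s.lstrip())  # True
print(back_removed == s.rstrip())   # True
```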