How do I use a captured group in a Python regex replacement?

Question

How do I do a Python regex replacement that reuses a captured group?

Suppose I want to change `the blue dog and blue cat wore blue hats` to `the gray dog and gray cat wore blue hats`.

With sed I can do this as follows:

$ echo 'the blue dog and blue cat wore blue hats' | sed 's/blue \(dog\|cat\)/gray \1/g'


How can I do a similar replacement in Python? I tried:

>>> import re
>>> s = "the blue dog and blue cat wore blue hats"
>>> p = re.compile(r"blue (dog|cat)")
>>> p.sub('gray \1',s)
'the gray \x01 and gray \x01 wore blue hats'


Answer 1

mac*_*mac 63

You need to escape the backslash:

p.sub('gray \\1', s)


Or you can use a raw string, just as you already do for the regular expression itself:

p.sub(r'gray \1', s)
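
To make the difference concrete, here is a minimal sketch that reuses the `s` and `p` from the question. The variable names `broken`, `escaped`, and `raw` are only illustrative. It shows that in an ordinary string literal `'\1'` is already the single character `\x01` before `re` ever sees it, which is why the original attempt inserted `\x01`.

```python
import re

s = "the blue dog and blue cat wore blue hats"
p = re.compile(r"blue (dog|cat)")

# In a normal (non-raw) string literal, '\1' is an octal escape for the
# character U+0001, so no backslash reaches the regex engine.
print('\1' == '\x01')

broken = p.sub('gray \1', s)     # replacement contains the character \x01
escaped = p.sub('gray \\1', s)   # backslash escaped: re sees \1
raw = p.sub(r'gray \1', s)       # raw string: re sees \1

print(repr(broken))
print(escaped)
print(raw)
```

The escaped and raw forms pass the same two characters, a backslash and `1`, to `re`, so they behave identically.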


As a one-liner: re.sub("blue (dog|cat)", "gray \\1", s) (3 upvotes)

Answer 2

jus*_*ile 21

I was looking for a similar answer but wanted to use a named group in the replacement, so I thought I'd add the code for others:

p = re.compile(r'blue (?P<animal>dog|cat)')
p.sub(r'gray \g<animal>',s)
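
A self-contained version of the snippet above, as a sketch: it assumes the same input string `s` as in the question, and the names `by_name` and `by_number` are only illustrative. It also shows that a named group still has a number, so `\1` keeps working alongside `\g<animal>`.

```python
import re

s = "the blue dog and blue cat wore blue hats"
p = re.compile(r"blue (?P<animal>dog|cat)")

# Refer to the group by its name in the replacement template.
by_name = p.sub(r"gray \g<animal>", s)

# A named group is also numbered, so \1 refers to the same group here.
by_number = p.sub(r"gray \1", s)

print(by_name)
print(by_number)
print(dict(p.groupindex))  # mapping of group names to group numbers
```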


Answer 3

CAB*_*CAB 6

Try this:

p.sub('gray \g<1>',s)


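Nice alternative (+1), but it works only because `\g` is not a valid escape code. The safe way to write it would still be `p.sub('gray \\g<1>',s)`. (4 upvotes)
@mac Consider adding your comment here to your answer. It's the only thing that works reliably in an IPython notebook. (2 upvotes)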
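
One place where `\g<1>` is more than a matter of taste is when the group reference is followed directly by a digit. The sketch below uses a hypothetical pattern and input that are not from the thread, and writes the template as a raw string, which also avoids the warning recent Python versions emit for unrecognized escapes such as `\g` in ordinary string literals.

```python
import re

digit = re.compile(r"(\d)")

# \g<1>0 means "group 1 followed by a literal 0".
appended = digit.sub(r"\g<1>0", "5 apples")
print(appended)

# \10 is read as a reference to group 10, which this pattern does not have,
# so re rejects the template.
rejected = False
try:
    digit.sub(r"\10", "5 apples")
except re.error:
    rejected = True
print(rejected)
```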

Answer 4

Tho*_*ner 5

Slightly off topic, for numbered capture groups:

#!/usr/bin/env python
import re

# Print the result so the script produces the output shown below.
print(re.sub(
    pattern=r'(\d)(\w+)',
    repl='word: \\2, digit: \\1',
    string='1asdf'
))


word: asdf, digit: 1

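Python uses a literal backslash followed by a one-based index for numbered capture group substitution, as this example shows. So `\1`, entered as `'\\1'`, refers to the first capture group `(\d)`, and `\2` refers to the second capture group `(\w+)`.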
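
The one-based numbering can be checked directly on a match object. This is a small sketch using the same pattern and input as the example above; the names `m`, `swapped`, and `whole` are only illustrative. Index 0 is reserved for the whole match, which a replacement template can reference as `\g<0>`.

```python
import re

pattern = r'(\d)(\w+)'
m = re.search(pattern, '1asdf')

print(m.group(0))  # the whole match
print(m.group(1))  # first capture group, (\d)
print(m.group(2))  # second capture group, (\w+)

# Swap the two groups in the replacement.
swapped = re.sub(pattern, r'\2\1', '1asdf')

# \g<0> inserts the entire match.
whole = re.sub(pattern, r'[\g<0>]', '1asdf')

print(swapped)
print(whole)
```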

Archived:	14 years, 5 months ago
Views:	39993
Last recorded:	7 years, 11 months ago