Related questions and solutions (0)

RegEx match open tags except XHTML self-contained tags

I need to match all of these opening tags:

<p>
<a href="foo">

But not these:

<br />
<hr class="foo" />

I came up with this and want to make sure I've got it right. I am only capturing a-z.

<([a-z]+) *[^/]*?>

I believe it says:

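Find a less-than, then
Find (and capture) a-z one or more times, then
Find zero or more spaces, then
Find any character zero or more times, greedy, except /, then
Find a greater-than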

Do I have that right? And more importantly, what do you think?
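
For anyone who wants to check the reading above, here is a minimal Python sketch that runs the pattern from the question, unchanged, against the four sample tags. The helper name `matched_tag_names` and the extra sample with a URL in it are my own additions for illustration, not part of the question.

```python
import re

# The pattern exactly as written in the question.
OPEN_TAG = re.compile(r'<([a-z]+) *[^/]*?>')


def matched_tag_names(text):
    """Return the captured tag name for every match of OPEN_TAG in text."""
    return [m.group(1) for m in OPEN_TAG.finditer(text)]


samples = [
    '<p>',
    '<a href="foo">',
    '<br />',
    '<hr class="foo" />',
    # Added sample (assumption): an opening tag with a slash in an attribute.
    '<a href="http://example.com">',
]

for sample in samples:
    print(sample, matched_tag_names(sample))
```

With these inputs the pattern accepts `<p>` and `<a href="foo">` and rejects both self-closing tags. It also rejects the opening tag whose attribute value contains a slash, because `[^/]` cannot step over any `/` that comes before the closing `>`.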

html regex xhtml

Jef*_*eff

2012 05-27

1323
Score

36
Answers

2.7 million
Views

Python regex - do not match a word when it is inside an HTML tag

I need to write a regex that does not match a word if it is inside an HTML tag.

Here is a sample text:

asdd qwe <a href="http://example.com" title="Some title with word qwe" class="external-link" rel="nofollow">  qwe

My regex currently looks like this:

(?!(\<.+))[^a-zA-ZąćęłńóśźżĄĆĘŁŃÓŚŹŻ](<class="bad-word"(?: style="[^"]+")?>)?(qwe)(<>)?[^a-zA-ZąćęłńóśźżĄĆĘŁŃÓŚŹŻ](?!.+\>)

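It is a bit complicated, but everything works, except that when I test it on regex101.com and regexr.com it matches only the word after the HTML tag.

Any idea why?

Edit:

I don't want to use an HTML parser or DOM manipulation; I don't want to change that much code.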

def test_tagged_word_present(self):
    input = 'words <a href="example.com" title="title with word qwe" class="external-link" rel="nofollow"> qwe some other words'
    expected = 'words <a href="example.com" title="title with word qwe" class="external-link" rel="nofollow"><strong class="bad-word" style="color:red">qwe</strong> some other words'
    parser = self.get_test_parser(input, search_word='qwe')
    text = parser.mark_words()
    self.assertEqual(text, expected)

Everything works fine, except that the regex still catches qwe in the title.
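
One way to stay with plain regex, sketched here under stated assumptions: match whole tags and the search word in a single alternation, then hand tags back untouched and wrap only the bare word. The function name `mark_word` is hypothetical and is not the `parser.mark_words()` from the test above. The sketch also assumes that no attribute value contains a literal `>`, and it is not a substitute for a real HTML parser.

```python
import re

# Wrapper markup copied from the expected string in the test above.
WRAP_OPEN = '<strong class="bad-word" style="color:red">'
WRAP_CLOSE = '</strong>'


def mark_word(text, word):
    """Wrap standalone occurrences of word that lie outside HTML tags.

    Hypothetical helper for illustration only.
    """
    # First alternative consumes a whole tag, so a word inside the tag
    # (for example in a title attribute) is never seen by the second one.
    pattern = re.compile(r'<[^>]*>|\b' + re.escape(word) + r'\b')

    def replace(match):
        token = match.group(0)
        if token.startswith('<'):
            return token  # a tag: return it unchanged
        return WRAP_OPEN + token + WRAP_CLOSE

    return pattern.sub(replace, text)


sample = 'asdd qwe <a href="http://example.com" title="Some title with word qwe">  qwe'
print(mark_word(sample, 'qwe'))
```

Unlike the `expected` string in the test above, this sketch keeps the whitespace between the tag and the word, so the assertion would need adjusting before it could pass.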

python regex

Cos*_*uee

2015 10-12

-1
Score

1
Answers

1120
Views