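I've been compiling a list of pages that we need to update with new content (we're switching media formats). In the process, I'm also cataloging the pages that already have the new content correctly in place.
The general idea of what I'm doing is shown in the code below.
Everything works fine until the third regex pattern is matched, at which point I get the following error: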
'NoneType' object has no attribute 'group'
import re

# only interested in embedded content
pattern = "(<embed .*?</embed>)"

# matches content pointing to our old root
pattern2 = 'data="(http://.*?/media/.*?")'

# matches content pointing to our new root
pattern3 = 'data="(http://.*?/content/.*?")'

matches = re.findall(pattern, filebuffer)
for match in matches:
    if len(match) > 0:

        urla = re.search(pattern2, match)
        if urla.group(1) is not None:
            print filename, urla.group(1)

        urlb = re.search(pattern3, match)
        if urlb.group(1) is not None:
            print filename, …
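A likely cause of the error: `re.search` returns `None` when a pattern does not match, so calling `.group(1)` on the result fails for any embed tag that matches only one of the two URL patterns. The following is a minimal sketch of guarding against that, not a definitive fix. It reuses the three patterns unchanged; `classify_embeds` and the two returned lists are hypothetical names introduced only for illustration, and the results are collected rather than printed so the sketch does not depend on `filename`.

```python
import re

# Patterns copied unchanged from the snippet above.
pattern = "(<embed .*?</embed>)"
pattern2 = 'data="(http://.*?/media/.*?")'
pattern3 = 'data="(http://.*?/content/.*?")'


def classify_embeds(filebuffer):
    """Return (old_urls, new_urls) found inside the embed tags of filebuffer.

    Hypothetical helper, not part of the original script.
    """
    old_urls, new_urls = [], []
    for match in re.findall(pattern, filebuffer):
        # re.search returns None when nothing matches, so test the match
        # object itself before calling .group() on it.
        urla = re.search(pattern2, match)
        if urla is not None:
            old_urls.append(urla.group(1))

        urlb = re.search(pattern3, match)
        if urlb is not None:
            new_urls.append(urlb.group(1))
    return old_urls, new_urls
```

The only behavioral change from the original loop is that the check is made on `urla` and `urlb` directly instead of on `urla.group(1)` and `urlb.group(1)`.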