I have an array of strings:
urls_parts=['week', 'weeklytop', 'week/day']
I need to watch for these strings appearing in my URL, so this example should be triggered only by the weeklytop part:
url = 'www.mysite.com/weeklytop/2'
for part in urls_parts:
    if part in url:
        print(part)
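But of course it is also triggered by 'week'. What is the right way to do this?
Oops, let me clarify my question. When url = 'www.mysite.com/week/day/2' and part = 'week', that code must not trigger. For part = 'week', the only URLs that should trigger are ones like 'www.mysite.com/week/2' or 'www.mysite.com/week/2-second'.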
This is how I did it.
import re

urls_parts = ['week', 'weeklytop', 'week/day']
# Try longer parts first, so 'week/day' is tested before 'week'.
urls_parts = sorted(urls_parts, key=len, reverse=True)
# \b requires a word boundary after the part, so 'week' does not match 'weeklytop'.
rexes = [re.compile(re.escape(part) + r'\b') for part in urls_parts]
urls = ['www.mysite.com/weeklytop/2', 'www.mysite.com/week/day/2', 'www.mysite.com/week/4']
for url in urls:
    for i, rex in enumerate(rexes):
        if rex.search(url):
            print(url)
            print(urls_parts[i])
            print()
            break  # report only the first (longest) matching part
Output:
www.mysite.com/weeklytop/2
weeklytop

www.mysite.com/week/day/2
week/day

www.mysite.com/week/4
week
The suggestion to sort by length came from @Roman.
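For reuse, the same idea can be wrapped in a small helper. This is only a sketch: the function name `first_matching_part` is made up for illustration, and it assumes Python 3 and that the longest matching part should win.

```python
import re


def first_matching_part(url, parts):
    """Return the longest entry of `parts` found in `url` before a word boundary, or None."""
    # Longest first, so 'week/day' is tried before 'week'.
    for part in sorted(parts, key=len, reverse=True):
        # re.escape guards against parts that contain regex metacharacters.
        if re.search(re.escape(part) + r'\b', url):
            return part
    return None


urls_parts = ['week', 'weeklytop', 'week/day']
for url in ['www.mysite.com/weeklytop/2',
            'www.mysite.com/week/day/2',
            'www.mysite.com/week/2-second']:
    print(url, '->', first_matching_part(url, urls_parts))
```

Note that the pattern only checks the boundary after the part, so a URL such as 'www.mysite.com/biweek/2' would still match 'week'. If that matters, a leading `/` could be added to the pattern.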