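I'm writing a program that searches job postings for keywords. I already have code that converts the whole job description into a list of single words, strips whitespace and special characters, lowercases everything, and so on.
I want to do something like "print something if this list contains python, but don't print it if it contains both python and VBA". This is what I have: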
def query_job_posting(url, query_list_include, query_list_exclude):
    soup = create_soup(url)
    # ...list formatting functions...
    for i in job_description_list:
        if any(word in i for word in query_list_include) and not any(exclude in i for exclude in query_list_exclude):
            print(url)
job_description_list looks like this:
['this',
'is',
'a',
'vba',
'job',
'python']
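But it doesn't seem to work.
If query_list_include=['python'] and query_list_exclude=[], the URL prints.
If query_list_exclude=['vba'] and query_list_include=[], the URL does not print.
But if I leave python in the include list and vba in the exclude list, the URL still prints, even though I manually verified that both vba and python are in job_description_list.
Where am I going wrong?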
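You are actually looking for each word inside each individual element of the list. Each element is a single string, so the condition is tested against one word at a time, never against the list as a whole: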
for e in list:
    if any(w in e for w in include) and not any(w in e for w in exclude):
        print(url)
Element by element, this does the following:
'this' # do nothing
'is' # do nothing
'a' # do nothing
'job' # do nothing
'python' # print url
You can verify this with:
for e in list:
    if any(w in e for w in include) and not any(w in e for w in exclude):
        print(e, url)
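which should print python <url>. In this case, having 'VBA' elsewhere in the list changes nothing, because the exclude test only looks at the element currently being checked.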
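To make the per-element behaviour concrete, here is a small self-contained sketch. The helper name matching_elements is hypothetical and not part of the original code; it simply collects the elements for which the question's condition is true, using the sample list from the question.

```python
# Sample data taken from the question; the words are already lowercased.
job_description_list = ['this', 'is', 'a', 'vba', 'job', 'python']
include = ['python']
exclude = ['vba']


def matching_elements(words, include, exclude):
    """Return the elements for which the per-element condition holds.

    Hypothetical helper for illustration: `w in e` is a substring test
    against ONE element, so the exclude check never sees other elements.
    """
    return [
        e for e in words
        if any(w in e for w in include) and not any(w in e for w in exclude)
    ]


hits = matching_elements(job_description_list, include, exclude)
print(hits)  # ['python']
```

The element 'vba' fails the include test, and the element 'python' passes both tests, so the URL is printed once even though 'vba' is present in the list.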
From your explanation, what you want is:
url = ...
list = ['this', 'is', 'a', 'job', 'python']
include = ['python']
exclude = ['VBA']
if any(w in list for w in include) and not any(w in list for w in exclude):
    print(url)
Out[]: <url>
It evaluates the condition of the if statement:
'python' in list --> True
'VBA' not in list --> True
and then executes print(url).
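One way to package the list-level check is sketched below. The function name should_print is hypothetical. The sketch also lowercases both sides, on the assumption that the job description words were lowercased as the question describes, so that an exclude entry such as 'VBA' still matches 'vba'. Note that it matches whole words rather than substrings, which may or may not be what you want.

```python
def should_print(words, include, exclude):
    """Return True if any include word is present and no exclude word is.

    Hypothetical helper: compares whole words, case-insensitively,
    against the list as a whole instead of one element at a time.
    """
    word_set = {w.lower() for w in words}
    include_set = {w.lower() for w in include}
    exclude_set = {w.lower() for w in exclude}

    has_included = not include_set.isdisjoint(word_set)
    has_excluded = not exclude_set.isdisjoint(word_set)
    return has_included and not has_excluded


words = ['this', 'is', 'a', 'vba', 'job', 'python']
print(should_print(words, ['python'], ['VBA']))  # False
print(should_print(words, ['python'], []))       # True
```

With an empty include list the function returns False, which mirrors how any() behaves on an empty sequence in the original condition.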