Jer*_*hez 5 multithreading screen-scraping flask python-3.x
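I am trying to run the html.render() method from the requests_html Python module inside a Flask application. However, whenever my application code calls the function, I get this error: RuntimeError: There is no current event loop in thread 'Thread-1'.
Here is the function that uses html.render():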
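# Imports this snippet relies on (assumed; not shown in the original post).
import bs4
from urllib.parse import urljoin
from requests_html import HTMLSession

# privacy_regex and rank_url are defined elsewhere in the application.
def extractor(url):
    session = HTMLSession()
    r = session.get(url)
    soup = bs4.BeautifulSoup(r.text)
    found = soup.find_all("a", href=privacy_regex)
    if found:
        # Fast path: the link is present in the static HTML.
        print("Using Default Web Scraping bs4+regex")
        found = [tag['href'] for tag in found]
        uri = sorted(found, key=rank_url)[-1]
        return urljoin(url, uri)
    else:
        # Fallback: render the page's JavaScript, then search the links.
        print('Using HTML Rendering')
        r.html.render()  # this call raises the RuntimeError under Flask
        links = r.html.absolute_links
        privacy_links = [x for x in links if privacy_regex.search(x)]
        uri = sorted(privacy_links, key=rank_url)[-1]
        return urljoin(url, uri)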
Here is my application code:
@app.route('/api', methods=['POST', 'GET'])
def text_output():
    url = request.form['url_text']
    print(url)
    text, domain = url_input_parser(url)
    print(text, domain)
Any help is appreciated! Thanks a lot!
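For context, the error message itself can be reproduced with the standard library alone. The sketch below assumes the cause is that the code runs in a worker thread (such as 'Thread-1') that has no asyncio event loop, which is what the message suggests. The helper names loop_status and run_in_thread are hypothetical and not part of requests_html or Flask. It only shows that giving the thread its own loop makes the asyncio lookup succeed; whether that alone is enough for html.render() under Flask is not confirmed here.

```python
import asyncio
import threading


def loop_status(create_loop):
    """Return "ok" if this thread can obtain an event loop, else the error text."""
    if create_loop:
        # Give the current thread its own event loop before anything asks for one.
        asyncio.set_event_loop(asyncio.new_event_loop())
    try:
        loop = asyncio.get_event_loop()
    except RuntimeError as exc:
        # Raised in a non-main thread that has no event loop set.
        return str(exc)
    # Clean up the loop that was created for this thread.
    loop.close()
    asyncio.set_event_loop(None)
    return "ok"


def run_in_thread(create_loop):
    """Run loop_status in a fresh worker thread and return its result."""
    result = []
    worker = threading.Thread(target=lambda: result.append(loop_status(create_loop)))
    worker.start()
    worker.join()
    return result[0]


if __name__ == "__main__":
    print(run_in_thread(False))  # the "no current event loop" error text
    print(run_in_thread(True))
```

If the assumption holds, the same two calls (asyncio.new_event_loop() and asyncio.set_event_loop()) would go at the start of the code path that calls r.html.render(), so the request-handling thread has a loop before rendering starts.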