bin*_*gru 7 python twisted scrapy python-asyncio
I am learning web scraping in Python with Scrapy. I did exactly what the tutorial showed, but I get an error. Please help!
My Python code:
import scrapy


class BookSpider(scrapy.Spider):
    name = "books"
    allowed_domains = ["books.toscrape.com"]
    start_urls = ["https://books.toscrape.com"]

    def parse(self, response):
        books = response.css("article.product_pod")
        for book in books:
            yield {
                "name": book.css("h3 a::text").get(),
                "price": book.css(".product_price .price_color::text").get(),
                "url": book.css("h3 a").attrib["href"],
            }
The terminal shows: (the error output is missing from this copy of the post)
The ossignal.py file:
import signal

signal_names = {}
for signame in dir(signal):
    if signame.startswith("SIG") and not signame.startswith("SIG_"):
        signum = getattr(signal, signame)
        if isinstance(signum, int):
            signal_names[signum] = signame


def install_shutdown_handlers(function, override_sigint=True):
    """Install the given function as a signal handler for all common shutdown
    signals (such as SIGINT, SIGTERM, etc). If override_sigint is ``False`` the
    SIGINT handler won't be install if there is already a handler in place
    (e.g. Pdb)
    """
    from twisted.internet import reactor

    reactor._handleSignals()
    signal.signal(signal.SIGTERM, function)
    if signal.getsignal(signal.SIGINT) == signal.default_int_handler or override_sigint:
        signal.signal(signal.SIGINT, function)
    # Catch Ctrl-Break in windows
    if hasattr(signal, "SIGBREAK"):
        signal.signal(signal.SIGBREAK, function)
小智 19
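As I pointed out in the comments, the problem you describe has already been raised with the Scrapy project and is related to one of its dependencies, Twisted (a new version, 23.8.0, was released a day ago and seems to cause the problem).
Another user solved it by installing the previous version of Twisted (see here).
Basically, he installed the following version of Twisted, which solved his problem.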
pip install Twisted==22.10.0
Until the issue is fixed and a new release is out, I recommend staying on the previous version.
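To check whether an environment is on the problematic Twisted release before running a spider, something like the following may help. It is only a minimal sketch: the helper names (`parse_version`, `is_affected`, `installed_twisted_version`) are made up for illustration, and the 23.8.0 threshold is taken from the answer above as an assumption. It cannot tell whether a later Scrapy release already copes with newer Twisted versions.

```python
import re
from importlib import metadata

# Assumption from the answer above: 23.8.0 is the first Twisted release
# that triggers the error with the Scrapy version used in the question.
FIRST_AFFECTED = (23, 8, 0)


def parse_version(text):
    """Turn a string such as '22.10.0' into (22, 10, 0).

    Only the first three numeric components are kept, so a suffix
    such as 'rc1' is ignored.
    """
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def is_affected(twisted_version):
    """Return True if the version is at or above the assumed threshold."""
    return parse_version(twisted_version) >= FIRST_AFFECTED


def installed_twisted_version():
    """Return the installed Twisted version string, or None if not installed."""
    try:
        return metadata.version("Twisted")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_twisted_version()
    if version is None:
        print("Twisted is not installed in this environment.")
    elif is_affected(version):
        print(f"Twisted {version} may trigger the error; "
              "consider: pip install Twisted==22.10.0")
    else:
        print(f"Twisted {version} predates the release named in the answer.")
```

Running the script inside the same virtual environment as Scrapy shows which Twisted version that environment actually resolves.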