I'm learning web scraping with Python. When I run scrapy crawl on my spider, it shows an AttributeError

bin*_*gru 7 python twisted scrapy python-asyncio

I'm learning to scrape with Python using Scrapy. I did exactly what the tutorial showed, but I get an error. Please help!

My Python code:

import scrapy


class BookSpider(scrapy.Spider):
    name = "books"
    allowed_domains = ["books.toscrape.com"]
    start_urls = ["https://books.toscrape.com"]

    def parse(self, response):
        books = response.css("article.product_pod")

        for book in books:
            yield {
                "name": book.css("h3 a::text").get(),
                "price": book.css(".product_price .price_color::text").get(),
                "url": book.css("h3 a").attrib["href"],
            }

The terminal shows an AttributeError.

The ossignal.py file:

import signal

signal_names = {}
for signame in dir(signal):
    if signame.startswith("SIG") and not signame.startswith("SIG_"):
        signum = getattr(signal, signame)
        if isinstance(signum, int):
            signal_names[signum] = signame


def install_shutdown_handlers(function, override_sigint=True):
    """Install the given function as a signal handler for all common shutdown
    signals (such as SIGINT, SIGTERM, etc). If override_sigint is ``False`` the
    SIGINT handler won't be install if there is already a handler in place
    (e.g.  Pdb)
    """
    from twisted.internet import reactor

    reactor._handleSignals()
    signal.signal(signal.SIGTERM, function)
    if signal.getsignal(signal.SIGINT) == signal.default_int_handler or override_sigint:
        signal.signal(signal.SIGINT, function)
    # Catch Ctrl-Break in windows
    if hasattr(signal, "SIGBREAK"):
        signal.signal(signal.SIGBREAK, function)

小智 19

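As I pointed out in the comments, the problem you describe has already been reported to the Scrapy project and is related to one of its dependencies, Twisted (a new version, 23.8.0, was released a day ago and seems to cause the problem).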
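To illustrate how a dependency upgrade alone can produce this kind of error, here is a minimal sketch. It assumes the AttributeError comes from the unguarded `reactor._handleSignals()` call in ossignal.py no longer finding that private method in the newer Twisted. `OldReactor`, `NewReactor`, `install_handlers` and `describe` are hypothetical stand-ins for illustration, not Twisted or Scrapy APIs.

```python
# Hypothetical stand-ins (not Twisted's real reactors) that mimic a private
# method disappearing between two releases of a dependency.
class OldReactor:
    def _handleSignals(self):
        return "signal handlers installed"


class NewReactor:
    pass  # the private method no longer exists here


def install_handlers(reactor):
    # Mirrors the unguarded private call made in ossignal.py.
    return reactor._handleSignals()


def describe(reactor):
    # Report either the result or the AttributeError raised by the call.
    try:
        return install_handlers(reactor)
    except AttributeError as exc:
        return f"AttributeError: {exc}"


if __name__ == "__main__":
    print(describe(OldReactor()))
    print(describe(NewReactor()))
```

Because the calling code relies on a private attribute, nothing in the spider itself needs to be wrong for the crawl to fail.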

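Another user solved the problem by installing the previous version of Twisted (see here).

Basically, he installed the following version of Twisted, which fixed the problem for him.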

pip install Twisted==22.10.0

Until the issue is fixed and a new release is published, I recommend using the previous version.
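If you are unsure which Twisted version your environment actually has, a small check like the one below may help. It is only a sketch: it assumes that any Twisted release at or above 23.8.0 is affected (the thread only names 23.8.0 itself), and the helper names `major_minor`, `is_affected` and `installed_twisted_version` are made up for this example.

```python
import re
from importlib import metadata

# Assumption: the first release reported as problematic in this thread.
FIRST_AFFECTED = (23, 8)


def major_minor(version):
    """Return (major, minor) from a version string such as '22.10.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected(version):
    """True if the version is at or above the release named in this answer."""
    return major_minor(version) >= FIRST_AFFECTED


def installed_twisted_version():
    """Return the installed Twisted version, or None if it is not installed."""
    try:
        return metadata.version("Twisted")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_twisted_version()
    if found is None:
        print("Twisted is not installed in this environment")
    elif is_affected(found):
        print(f"Twisted {found} may trigger the error; consider pinning 22.10.0")
    else:
        print(f"Twisted {found} predates 23.8.0")
```

Run it with the same interpreter (or virtual environment) that you use for `scrapy crawl`, otherwise it may report a different installation.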