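I'm having trouble scraping a website with Scrapy. I followed the Scrapy tutorial to learn how to crawl sites, and I wanted to test it on https://www.leboncoin.fr, but the spider doesn't work. So I tried: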
scrapy shell 'https://www.leboncoin.fr'
However, I get no response from the site:
$ scrapy shell 'https://www.leboncoin.fr'
2017-05-16 08:31:26 [scrapy.utils.log] INFO: Scrapy 1.3.3 started (bot: all_cote)
2017-05-16 08:31:26 [scrapy.utils.log] INFO: Overridden settings: {'BOT_NAME': 'all_cote', 'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter', 'LOGSTATS_INTERVAL': 0, 'NEWSPIDER_MODULE': 'all_cote.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['all_cote.spiders']}
2017-05-16 08:31:27 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole']
2017-05-16 08:31:27 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-05-16 08:31:27 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware'] …

I'm running into a problem installing pycocotools in a Python 3.6.9 environment.

I'm running Ubuntu 18.04 on Windows 10. I created an environment and activated it. I want to install packages using:

pip install <package> --user

Installing cython this way works fine, but when I install pycocotools:

pip install pycocotools --user

I get the following error:

Downloading https://files.pythonhosted.org/packages/96/84/9a07b1095fd8555ba3f3d519517c8743c2554a245f9476e5e39869f948d2/pycocotools-2.0.0.tar.gz (1.5MB)
100% |████████████████████████████████| 1.5MB 1.9MB/s
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-build-mut7_bkf/pycocotools/setup.py", line 2, in <module>
    from Cython.Build import cythonize
ModuleNotFoundError: No module named 'Cython'

----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-mut7_bkf/pycocotools/

I don't understand this error, because I had already installed …
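
The traceback says that the interpreter running setup.py could not import Cython. One way to narrow this down is to check what a given interpreter can actually see. The following is a minimal diagnostic sketch, not a fix. The helper names `module_visible` and `report` are made up for illustration, and it assumes the script is run with the same `python` that the failing `pip` belongs to.

```python
import importlib.util
import sys


def module_visible(name):
    """Return True if the running interpreter can locate top-level module `name`."""
    return importlib.util.find_spec(name) is not None


def report(names):
    """Map each module name to whether this interpreter can import it."""
    return {name: module_visible(name) for name in names}


if __name__ == "__main__":
    # Show which interpreter is answering, so it can be compared with pip's.
    print("interpreter:", sys.executable)
    for name, found in report(["Cython", "pycocotools"]).items():
        print(f"{name}: {'found' if found else 'NOT found'}")
```

Comparing the printed interpreter path with the output of `python -m pip --version` shows whether pip and the interpreter point at the same installation. If Cython is reported as not found here, the earlier cython install probably went to a different location than the one this interpreter searches.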