The following code
class SiteSpider(BaseSpider):
    name = "some_site.com"
    allowed_domains = ["some_site.com"]
    start_urls = [
        "some_site.com/something/another/PRODUCT-CATEGORY1_10652_-1__85667",
    ]
    rules = (
        Rule(SgmlLinkExtractor(allow=('some_site.com/something/another/PRODUCT-CATEGORY_(.*)', ))),
        # Extract links matching 'item.php' and parse them with the spider's method parse_item
        Rule(SgmlLinkExtractor(allow=('some_site.com/something/another/PRODUCT-DETAIL(.*)', )), callback="parse_item"),
    )

    def parse_item(self, response):
        .... parse stuff
raises the following error:
Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 1174, in mainLoop
    self.runUntilCurrent()
  File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 796, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "/usr/lib/python2.6/dist-packages/twisted/internet/defer.py", line 318, in callback
    self._startRunCallbacks(result)
  File "/usr/lib/python2.6/dist-packages/twisted/internet/defer.py", line 424, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught …

When I run the spider from the Scrapy tutorial, I get the following error message:
  File "C:\Python26\lib\site-packages\twisted\internet\base.py", line 374, in fireEvent
    DeferredList(beforeResults).addCallback(self._continueFiring)
  File "C:\Python26\lib\site-packages\twisted\internet\defer.py", line 195, in addCallback
    callbackKeywords=kw)
  File "C:\Python26\lib\site-packages\twisted\internet\defer.py", line 186, in addCallbacks
    self._runCallbacks()
  File "C:\Python26\lib\site-packages\twisted\internet\defer.py", line 328, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
--- <exception caught here> ---
  File "C:\Python26\lib\site-packages\twisted\internet\base.py", line 387, in _continueFiring
    callable(*args, **kwargs)
  File "C:\Python26\lib\site-packages\twisted\internet\posixbase.py", line 356, in listenTCP
    p.startListening()
  File "C:\Python26\lib\site-packages\twisted\internet\tcp.py", line 858, in startListening
    raise CannotListenError, (self.interface, self.port, le)
twisted.internet.error.CannotListenError: Couldn't listen on any:6023: [Errno 10048]
Only one usage of each socket address (protocol/network address/port) is …
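
The port 6023 in the error above matches the default port of Scrapy's telnet console, so one plausible reading is that some other process (for instance an earlier Scrapy run that is still alive) already holds that port. This is a guess from the truncated message, not a confirmed diagnosis. A minimal sketch for checking it before starting the spider, assuming plain Python sockets and a hypothetical helper `port_is_free` that is not part of Scrapy or Twisted:

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port) right now.

    Hypothetical helper for illustration only; not a Scrapy or Twisted API.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # bind() fails with "address already in use" (Errno 10048 on Windows)
        # when another socket already holds the same address and port.
        sock.bind((host, port))
        return True
    except socket.error:  # alias of OSError on Python 3
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    port = 6023  # port number taken from the error message above
    print("port %d free: %s" % (port, port_is_free(port)))
```

If the port turns out to be taken, the usual options would be to stop the process holding it, or to change or disable the telnet console through Scrapy's `TELNETCONSOLE_PORT` and `TELNETCONSOLE_ENABLED` settings.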