Why is my scrapy spider not following the Request callback in my item parse function?

Abe*_*ker 6 python scrapy

I'm scraping a site to check in-stock status of various products. Unfortunately this requires actually clicking "Add to Cart" on the product page and checking the next page's message to determine if stock is available (i.e. it requires parsing two responses).

I followed the excellent documentation for this scenario and wrote my parse function to return a Request object with a callback to my secondary parse function. However, this function rarely gets called. Most products result in only seeing "Before return request" show up in the log, but it does get called properly for a small portion of products.

Any clue what is going wrong here? I've run out of ideas.

foo/spiders/atlantic_firearms_spider.py:

from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.selector import HtmlXPathSelector
from scrapy.http import FormRequest
from foo.items import AtlanticFirearmsItem

import datetime
import re

class AtlanticFirearmsSpider(CrawlSpider):
    name = "atlantic_firearms"
    allowed_domains = ["atlanticfirearms.com"]
    start_urls = [
        "http://www.atlanticfirearms.com"
    ]

    rules = (
        Rule(SgmlLinkExtractor(allow=['detail.html']), callback='parse_product'),
        Rule(SgmlLinkExtractor(allow=[], deny=['/bro', '/news', '/howtobuy', '/component/search', 'askquestion'])),
    )

    def parse_product(self, response):
      hxs = HtmlXPathSelector(response)
      product = AtlanticFirearmsItem()
      add_to_cart = any([hxs.select("descendant-or-self::input[@name = 'addtocart']"),
                         hxs.select("descendant-or-self::input[@value = 'Add to Cart']"),
                         hxs.select("//a[text() = 'Add to Cart']")])
      product['url'] = response.url
      product['as_of_time'] = datetime.datetime.now()

      if add_to_cart:
          # attempt to add to cart to verify availability
          request = FormRequest.from_response(response, formname="addtocartForm", callback=self.parse_add_to_cart)
          request.meta['product'] = product
          print "Before return request"
          return request
      else:
          product['in_stock'] = False
          return product

    def parse_add_to_cart(self, response):
        print "Inside parse_add_to_cart"
        product = response.meta['product']
        hxs = HtmlXPathSelector(response)
        product['in_stock'] = not(hxs.select("//text()[contains(.,'We regret to inform you that this product')]"))
        return product

foo/items.py:

from scrapy.item import Item, Field

class AtlanticFirearmsItem(Item):
    in_stock = Field()
    url = Field()
    as_of_time = Field()

Edit: adding the log file as requested:

2013-09-21 07:25:14-0500 [scrapy] INFO: Scrapy 0.18.2 started (bot: foo)
2013-09-21 07:25:14-0500 [scrapy] DEBUG: Optional features available: ssl, http11
2013-09-21 07:25:14-0500 [scrapy] DEBUG: Overridden settings: {'SPIDER_MODULES': ['foo.spiders'], 'BOT_NAME': 'foo'}
2013-09-21 07:25:14-0500 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2013-09-21 07:25:14-0500 [scrapy] DEBUG: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2013-09-21 07:25:14-0500 [scrapy] DEBUG: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2013-09-21 07:25:14-0500 [scrapy] DEBUG: Enabled item pipelines: 
2013-09-21 07:25:14-0500 [atlantic_firearms] INFO: Spider opened
2013-09-21 07:25:14-0500 [atlantic_firearms] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2013-09-21 07:25:14-0500 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023
2013-09-21 07:25:14-0500 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080
2013-09-21 07:25:16-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com> (referer: None)
2013-09-21 07:25:16-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.cloudflare.com': <GET http://www.cloudflare.com/email-protection>
2013-09-21 07:25:16-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.constantcontact.com': <GET http://www.constantcontact.com/jmml/email-marketing.jsp>
2013-09-21 07:25:16-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.fdicreative.com': <GET http://www.fdicreative.com/>
2013-09-21 07:25:16-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.redjacketfirearms.com': <GET https://www.redjacketfirearms.com/>
2013-09-21 07:25:17-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/featured-not-published/wolf-ammunition-45acp-500-round-case-detail.
html?Itemid=0> (referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:18-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/featured-not-published/vector-arms-sp89-k-style-pistol-9mm-detail.h
tml?Itemid=0> (referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:18-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/> (referer: http://www.atlanticfirearms.com)
2013-09-21 07:25:18-0500 [atlantic_firearms] DEBUG: Filtered duplicate request: <GET http://www.atlanticfirearms.com/component/virtuemart/featured-not-published/vector-arms-sp89-k-style-pistol-9mm-detail.html?Itemid=0> - no more duplicates will be shown (see DUPEFILTER_CLASS)
2013-09-21 07:25:18-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/featured-not-published/wolf-223-ar15-rifle-ammo-500-round-case-deta
il.html?Itemid=0> (referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:18-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/featured-not-published/us-palm-air-save-plate-carrier-detail.html?I
temid=0> (referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:19-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/featured-not-published/545-x-39-russian-ak74-ammo-1080-round-case-d
etail.html?Itemid=0> (referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:19-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/featured-not-published/red-army-standard-7-62x39mm-360-round-range-
pack-detail.html?Itemid=0> (referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:19-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/shipping-rifles/vector-arms-mp5-style-rifle-detail.html?Itemid=0> (
referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:19-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/shipping-accessories/wolf-ammunition-for-sale-ak47-detail.html?Item
id=0> (referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:20-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/shipping-rifles/dsa-zm4-flat-top-ar15-carbine-dszm4cv1r-detail.html
?Itemid=0> (referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:20-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/shipping-accessories/m92-ak47-yugoslavian-7-62x39mm-bolt-hold-open-
metal-mags-pack-of-two-detail.html?Itemid=0> (referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:21-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/shipping-rifles/vector-arms-v94-9mm-mp5-style-pistol-full-size-deta
il.html?Itemid=0> (referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:21-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/shipping-rifles/zastava-ak-47-m70b1-pap-7-62x39mm-rifles-w-2-hi-cap
-mags-detail.html?Itemid=0> (referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:21-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/shipping-rifles/ptr-91-gi-rifle-939-atlanticfirearms-com-detail.htm
l?Itemid=0> (referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:21-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/shipping-rifles/pap-m92-7-62x39-pistol-detail.html?Itemid=0> (refer
er: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/content/article/86-static-pages/159-resources.html> (referer: http://www.atlan
ticfirearms.com)
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.atsconsultingcorp.com': <GET http://www.atsconsultingcorp.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.bullseyemarket.com': <GET http://www.bullseyemarket.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.corilam.com': <GET http://www.corilam.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'chancebrownrealestate.com': <GET http://chancebrownrealestate.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.delsolservices.com': <GET http://www.delsolservices.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.elkhornoutfitters.com': <GET http://www.elkhornoutfitters.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.frontierlogistics.com': <GET http://www.frontierlogistics.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.gpstrackingkey.com': <GET http://www.gpstrackingkey.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.hanshawkennedy.com': <GET http://www.hanshawkennedy.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'worldenv.com': <GET http://worldenv.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'purgexonline.com': <GET http://purgexonline.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'bumpfirestocks.com': <GET http://bumpfirestocks.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.texrestaurantequipment.com': <GET http://www.texrestaurantequipment.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.houston-refinance.com': <GET http://www.houston-refinance.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'johnson-bryan.com': <GET http://johnson-bryan.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'kanesforms.com': <GET http://kanesforms.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.markfoxrealestate.com': <GET http://www.markfoxrealestate.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.mphoa.org': <GET http://www.mphoa.org/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.outfitterwebsites.com': <GET http://www.outfitterwebsites.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'outdoortrailsnetwork.com': <GET http://outdoortrailsnetwork.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.psychologicalriskservices.com': <GET http://www.psychologicalriskservices.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.rcshouston.com': <GET http://www.rcshouston.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.rollingcreekcarwash.com': <GET http://www.rollingcreekcarwash.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'slammc.com': <GET http://slammc.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.texassaltwaterfishingguide.com': <GET http://www.texassaltwaterfishingguide.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.waynepigment.com': <GET http://www.waynepigment.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'bancroftfeldman.com': <GET http://bancroftfeldman.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'elilanddesign.com': <GET http://elilanddesign.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'dpharms.com': <GET http://dpharms.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'contractlandstaff.com': <GET http://contractlandstaff.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'knightsplumbing.com': <GET http://knightsplumbing.com/>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/featured-not-published/index.php>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/featured-not-published/index.php>
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/component/virtuemart/shipping-rifles/ati-omni-5-56-poly-competition-m4-carbine-detail.ht
ml?Itemid=0> (referer: http://www.atlanticfirearms.com)
Before return request
2013-09-21 07:25:22-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/featured-not-published/index.php>
2013-09-21 07:25:23-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/featured-not-published/index.php>
2013-09-21 07:25:23-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/featured-not-published/index.php>
2013-09-21 07:25:23-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/shipping-rifles/index.php>
2013-09-21 07:25:23-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/featured-not-published/index.php>
2013-09-21 07:25:23-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/shipping-rifles/index.php>
2013-09-21 07:25:23-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/shipping-accessories/index.php>
2013-09-21 07:25:23-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/dallas-gun-shop.html> (referer: http://www.atlanticfirearms.com)
2013-09-21 07:25:24-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/shipping-accessories/index.php>
2013-09-21 07:25:24-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/shipping-rifles/index.php>
2013-09-21 07:25:24-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/shipping-rifles/index.php>
2013-09-21 07:25:24-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/shipping-rifles/index.php>
2013-09-21 07:25:24-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/shipping-rifles/index.php>
2013-09-21 07:25:24-0500 [atlantic_firearms] DEBUG: Crawled (404) <GET http://www.atlanticfirearms.com/component/content/?Itemid=803&id=148> (referer: http://www.atlanticfirearms.com)
2013-09-21 07:25:25-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/houston-texas-gun-shop.html> (referer: http://www.atlanticfirearms.com)
2013-09-21 07:25:25-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/california-gun-shop.html> (referer: http://www.atlanticfirearms.com)
2013-09-21 07:25:25-0500 [atlantic_firearms] DEBUG: Redirecting (303) to <GET http://www.atlanticfirearms.com/browse-our-products.html> from <POST http://www.atlanticfirearms.com/component/vi
rtuemart/shipping-rifles/index.php>
2013-09-21 07:25:25-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/browse-our-products.html> (referer: http://www.atlanticfirearms.com/component/virtuemart/featured-not-published/vector-arms-sp89-k-style-pistol-9mm-detail.html?Itemid=0)
Inside parse_add_to_cart
2013-09-21 07:25:25-0500 [atlantic_firearms] DEBUG: Scraped from <200 http://www.atlanticfirearms.com/browse-our-products.html>
        {'as_of_time': datetime.datetime(2013, 9, 21, 7, 25, 18, 365559),
         'in_stock': True,
         'url': 'http://www.atlanticfirearms.com/component/virtuemart/featured-not-published/vector-arms-sp89-k-style-pistol-9mm-detail.html?Itemid=0'}
2013-09-21 07:25:25-0500 [atlantic_firearms] DEBUG: Crawled (404) <GET http://www.atlanticfirearms.com/www.atlanticfirearms.com> (referer: http://www.atlanticfirearms.com/dallas-gun-shop.html
)
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/login-or-register/editaddress.html> (referer: http://www.atlanticfirearms.com)
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/privacy-policy.html> (referer: http://www.atlanticfirearms.com)
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/subscribe.html> (referer: http://www.atlanticfirearms.com)
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Crawled (200) <GET http://www.atlanticfirearms.com/links.html> (referer: http://www.atlanticfirearms.com)
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.gunbroker.com': <GET http://www.gunbroker.com/user/dealernetwork.asp>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.auctionarms.com': <GET http://www.auctionarms.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.gunsamerica.com': <GET http://www.gunsamerica.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.ar15.com': <GET http://www.ar15.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.olyarms.com': <GET http://www.olyarms.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.cheaperthandirt.com': <GET http://www.cheaperthandirt.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.ammoman.com': <GET http://www.ammoman.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.ak47.net': <GET http://www.ak47.net/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.atf.treas.gov': <GET http://www.atf.treas.gov/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'caag.state.ca.us': <GET http://caag.state.ca.us/firearms/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.nra.org': <GET http://www.nra.org/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.masterpiecearms.com': <GET http://www.masterpiecearms.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'atlantic1.readyhosting.com': <GET http://atlantic1.readyhosting.com/programming/listview.asp?CatId=2>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.vulcanarmament.com': <GET http://www.vulcanarmament.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.bushmaster.com': <GET http://www.bushmaster.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.rockriverarms.com': <GET http://www.rockriverarms.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'dpmsinc.com': <GET http://dpmsinc.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.colt.com': <GET http://www.colt.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.armalite.com': <GET http://www.armalite.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.redstick-firearms.com': <GET http://www.redstick-firearms.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.vectorarms.com': <GET http://www.vectorarms.com/indexframe.html>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.arsenalinc.com': <GET http://www.arsenalinc.com/about.htm>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.ak47.com': <GET http://www.ak47.com/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.jldenter.com': <GET http://www.jldenter.com/store/>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.springfield-armory.com': <GET http://www.springfield-armory.com/index.shtml>
2013-09-21 07:25:26-0500 [atlantic_firearms] DEBUG: Filtered offsite request to 'www.dsarms.com': <GET http://www.dsarms.com/>
^C2013-09-21 07:25:26-0500 [scrapy] INFO: Received SIGINT, shutting down gracefully. Send again to force 
^C2013-09-21 07:25:26-0500 [scrapy] INFO: Received SIGINT twice, forcing unclean shutdown

pau*_*rth 24

Posting my earlier comment as an answer.

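All of your POST requests (from FormRequest.from_response()) are being redirected to http://www.atlanticfirearms.com/browse-our-products.html. Your log shows the duplicate filter at work ("Filtered duplicate request ... no more duplicates will be shown"), so once that URL has been fetched for the first product, the redirected requests for the other products are most likely dropped as duplicates and their callbacks never run. You should set dont_filter=True: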

    if add_to_cart:
        # attempt to add to cart to verify availability
        request = FormRequest.from_response(response, formname="addtocartForm",
                      callback=self.parse_add_to_cart, dont_filter=True)

See the Scrapy documentation for Request:

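dont_filter (boolean) - indicates that this request should not be filtered by the scheduler. This is used when you want to perform an identical request multiple times, to ignore the duplicates filter.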
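To make the effect concrete, here is a toy model of a duplicate filter. It is not Scrapy's implementation: ToyRequest, ToyScheduler and count_scheduled are hypothetical names made up for illustration, and the (method, URL) fingerprint is a simplification. It only sketches why several redirects to the same URL collapse into a single callback unless dont_filter is set.

```python
# Toy model of a scheduler-level duplicate filter; NOT Scrapy's real code.
# ToyRequest, ToyScheduler and count_scheduled are hypothetical names.

class ToyRequest(object):
    def __init__(self, method, url, dont_filter=False):
        self.method = method
        self.url = url
        self.dont_filter = dont_filter


class ToyScheduler(object):
    """Drops a request whose (method, url) was already seen,
    unless the request sets dont_filter=True."""

    def __init__(self):
        self.seen = set()

    def enqueue(self, request):
        if request.dont_filter:
            return True  # bypasses the duplicate check entirely
        fingerprint = (request.method, request.url)
        if fingerprint in self.seen:
            return False  # filtered as a duplicate; its callback never runs
        self.seen.add(fingerprint)
        return True


def count_scheduled(dont_filter):
    """Simulate three add-to-cart POSTs that all redirect to the same GET URL."""
    scheduler = ToyScheduler()
    target = "http://www.atlanticfirearms.com/browse-our-products.html"
    redirected = [ToyRequest("GET", target, dont_filter=dont_filter)
                  for _ in range(3)]
    return sum(1 for r in redirected if scheduler.enqueue(r))


print(count_scheduled(dont_filter=False))  # 1
print(count_scheduled(dont_filter=True))   # 3
```

With filtering on, only the first of the three redirected requests is scheduled, which matches the single "Inside parse_add_to_cart" line in the question's log.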

Also, you may want to set CONCURRENT_REQUESTS = 1 so that items are added to the cart one at a time (I wonder how the server handles parallel cart additions).
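As a minimal sketch, the setting could go in the project's settings module. The file path foo/settings.py is an assumption based on the BOT_NAME and SPIDER_MODULES values shown in the question's log:

```python
# foo/settings.py (path assumed from the overridden settings in the log)
BOT_NAME = 'foo'
SPIDER_MODULES = ['foo.spiders']

# Let the downloader handle one request at a time, so add-to-cart
# POSTs are not sent to the server in parallel.
CONCURRENT_REQUESTS = 1
```

This slows the crawl considerably, so it is a trade-off worth making only if parallel cart additions turn out to confuse the server.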