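I am new to Python and also new to Scrapy. I want to scrape data from Wikipedia, but it is not working. Every time I run `scrapy crawl wiki`, I get "TypeError: 'WikipediaItem' object does not support item assignment". How can I fix this so that I can successfully get the details from Wikipedia?
Anyway, here is my code: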
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from wikipedia.items import WikipediaItem

class WikipediaItem(BaseSpider):
    name = "wiki"
    allowed_domains = ["wikipedia.org"]
    start_urls = ["http://en.wikipedia.org/wiki/Main_Page"]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        sites = hxs.select('//table[@id="mp-upper"]/tr')
        items = []
        for site in sites:
            item = WikipediaItem()
            item['title'] = site.select('.//a[@class="MainPageBG"]/text()').extract()
            item['link'] = site.select('.//a[@class="MainPageBG"]').extract()
            item['details'] = site.select('.//p/text()').extract()
            items.append(item)
        return items
This is the output I get:
2013-04-18 23:56:54+0800 [scrapy] INFO: Scrapy 0.14.4 started (bot: wikipedia)
2013-04-18 23:56:54+0800 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, MemoryUsage, SpiderState …
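A likely cause, judging only from the code in the question: the spider class is itself named `WikipediaItem`, the same name as the item class imported from `wikipedia.items`. The `class` statement rebinds that name, so `WikipediaItem()` inside `parse` would create a spider instance rather than an item, and a spider does not support `item['title'] = ...`. The sketch below reproduces the effect in plain Python without Scrapy. `FakeItem`, `FakeSpider`, and `try_assign` are hypothetical stand-ins introduced for illustration; they are not Scrapy classes.

```python
class FakeItem(dict):
    """Hypothetical stand-in for the item class: supports obj['key'] = value."""


class FakeSpider:
    """Hypothetical stand-in for the spider base class: no __setitem__."""


# Simulates `from wikipedia.items import WikipediaItem`.
WikipediaItem = FakeItem


# Simulates `class WikipediaItem(BaseSpider):`, which rebinds the same name
# and hides the imported item class from the rest of the module.
class WikipediaItem(FakeSpider):
    pass


def try_assign(cls):
    """Instantiate cls and attempt item assignment; return 'ok' or the error text."""
    obj = cls()
    try:
        obj["title"] = "Main Page"
    except TypeError as exc:
        return str(exc)
    return "ok"


shadowed_result = try_assign(WikipediaItem)  # name now refers to the spider-like class
unshadowed_result = try_assign(FakeItem)     # the original item-like class still works

print(shadowed_result)
print(unshadowed_result)
```

If this is indeed the cause, giving the spider a different class name (for example `WikipediaSpider`, a name chosen here only for illustration) would leave the imported `WikipediaItem` intact. The item class would still need to declare the `title`, `link`, and `details` fields in `items.py`, which the question does not show.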