jac*_*kkk 5 ruby web-scraping web
I'm trying to write a simple web scraper in Ruby. It works up to the 29th URL, and then I get this error message:
C:/Ruby193/lib/ruby/1.9.1/open-uri.rb:346:in `open_http': 500 Internal Server Error (OpenURI::HTTPError)
    from C:/Ruby193/lib/ruby/1.9.1/open-uri.rb:775:in `buffer_open'
    from C:/Ruby193/lib/ruby/1.9.1/open-uri.rb:203:in `block in open_loop'
    from C:/Ruby193/lib/ruby/1.9.1/open-uri.rb:201:in `catch'
    from C:/Ruby193/lib/ruby/1.9.1/open-uri.rb:201:in `open_loop'
    from C:/Ruby193/lib/ruby/1.9.1/open-uri.rb:146:in `open_uri'
    from C:/Ruby193/lib/ruby/1.9.1/open-uri.rb:677:in `open'
    from C:/Ruby193/lib/ruby/1.9.1/open-uri.rb:33:in `open'
    from test.rb:24:in `block (2 levels) in <main>'
    from test.rb:18:in `each'
    from test.rb:18:in `block in <main>'
    from test.rb:14:in `each'
    from test.rb:14:in `<main>'
My code:
require 'rubygems'
require 'nokogiri'
require 'open-uri'
aFile=File.new('data.txt', 'w')
ag = 0
for i in 1..40 do
  agenzie = ag + 1
  #change url parameter
  url = "http://www.infotrav.it/dettaglio.do?sort=*RICOVIAGGI*&codAgenzia=" + "#{ ag }"
  doc = Nokogiri::HTML(open(url))
  aFile=File.open('data.txt', 'a')
  aFile.write(doc.at_css("table").text)
  aFile.close
end
Do you have any idea how to fix it? Thanks!
Here, let me clean that up for you:
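require 'nokogiri'
require 'open-uri'

# The block form of File.open closes the file automatically.
File.open('data.txt', 'w') do |aFile|
  (1..40).each do |ag|
    url = "http://www.infotrav.it/dettaglio.do?sort=*RICOVIAGGI*&codAgenzia=#{ag}"
    # Skip any URL whose request raises (e.g. the 500 Internal Server Error).
    # On Ruby 3.0+, call URI.open(url) instead of open(url).
    response = open(url) rescue nil
    next unless response
    doc = Nokogiri::HTML(response)
    aFile << doc.at_css("table").text
  end
end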
Notes:
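The inline `rescue nil` swallows every exception, not just the HTTP 500 from the question. A narrower variant, sketched here under some assumptions, rescues only `OpenURI::HTTPError` and logs what it skipped. The helper names `fetch_body` and `fetch_all` and the `opener` parameter are hypothetical (they are not part of the original answer), and `URI.open` assumes Ruby 2.5 or later; on the Ruby 1.9.3 from the question, plain `open` plays that role.

```ruby
require 'open-uri'

# Hypothetical helper (not from the original answer): returns the page body,
# or nil when the server answers with an HTTP error such as 500.
# `opener` is injectable so the helper can be exercised without network access.
def fetch_body(url, opener: URI.method(:open))
  opener.call(url).read
rescue OpenURI::HTTPError => e
  warn "Skipping #{url}: #{e.message}"
  nil
end

# Hypothetical helper: collects the bodies for the given agency ids,
# leaving out any id whose request failed.
def fetch_all(ids, opener: URI.method(:open))
  base = 'http://www.infotrav.it/dettaglio.do?sort=*RICOVIAGGI*&codAgenzia='
  ids.each_with_object({}) do |id, bodies|
    body = fetch_body("#{base}#{id}", opener: opener)
    bodies[id] = body if body
  end
end
```

Each body returned by `fetch_all(1..40)` could then be passed to `Nokogiri::HTML` as in the code above. Other exceptions (timeouts, DNS failures) still propagate, which is probably what you want while debugging.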