Vla*_*anu 3 ruby ruby-on-rails http download
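I'm working with various XML-over-HTTP web services that return large XML files (> 2MB). What is the fastest Ruby HTTP library for reducing the 'download' time?
Required features:
GET and POST requests
gzip/deflate downloads (Accept-Encoding: deflate, gzip) - very important
I'm considering:
open-uri
Net::HTTP
curb
But you're welcome to suggest others.
PS: To parse the response I use Nokogiri's pull parser, so I don't need an integrated solution like rest-client or hpricot.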
The*_*heo 17
You can use EventMachine and em-http to stream the XML:
require 'rubygems'
require 'eventmachine'
require 'em-http'
require 'nokogiri'

# this is your SAX handler, I'm not very familiar with
# Nokogiri, so I just took an example from the RDoc
class StreamingDocument < Nokogiri::XML::SAX::Document
  def start_element(name, attrs = [])
    puts "starting: #{name}"
  end

  def end_element(name)
    puts "ending: #{name}"
  end
end

document = StreamingDocument.new
url = 'http://stackoverflow.com/feeds/question/2833829'

# run the EventMachine reactor, this call will block until
# EventMachine.stop is called
EventMachine.run do
  # Nokogiri wants an IO to read from, so create a pipe that it
  # can read from, and we can write to
  io_read, io_write = IO.pipe

  # run the parser in its own thread so that it can block while
  # reading from the pipe; stop EventMachine once parsing is done
  parse = proc do
    parser = Nokogiri::XML::SAX::Parser.new(document)
    parser.parse_io(io_read)
  end
  EventMachine.defer(parse, proc { EventMachine.stop })

  # use em-http to stream the XML document, feeding the pipe with
  # each chunk as it becomes available
  http = EventMachine::HttpRequest.new(url).get
  http.stream { |chunk| io_write << chunk }

  # when the HTTP request finishes (or fails), close the write end
  # of the pipe so the parser sees EOF and can return
  http.callback { io_write.close }
  http.errback { io_write.close }
end
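It may be a bit low-level, but it is probably the most performant option for any document size. Feed it hundreds of megabytes and it won't fill up your memory the way any non-streaming solution would (as long as you don't hold on to most of the document you're loading, but that part is up to you).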
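Since the question calls gzip/deflate very important, here is a minimal stdlib-only sketch of the same chunk-by-chunk idea applied to a compressed body. It is not part of the em-http code above: `stream_inflate` is a hypothetical helper, the in-memory `chunks` array stands in for the pieces an HTTP client would hand you, and the consumer thread just counts bytes where a SAX parser's `parse_io` would sit. It assumes the body is gzip- or zlib-wrapped; servers that send raw deflate would need a different window setting.

```ruby
require 'zlib'

# Hypothetical helper: inflates compressed chunks as they arrive and
# appends the decompressed bytes to `sink` (anything responding to <<).
def stream_inflate(chunks, sink)
  # MAX_WBITS + 32 lets zlib auto-detect a gzip or zlib header
  inflater = Zlib::Inflate.new(Zlib::MAX_WBITS + 32)
  chunks.each { |chunk| sink << inflater.inflate(chunk) }
  sink << inflater.finish
ensure
  inflater.close if inflater
end

# Stand-in for a large compressed HTTP body, split into 64-byte chunks
xml = "<items>" + ("<item>x</item>" * 1_000) + "</items>"
compressed = Zlib.gzip(xml)
chunks = compressed.bytes.each_slice(64).map { |bytes| bytes.pack('C*') }

reader, writer = IO.pipe
reader.binmode
writer.binmode

# Consumer thread: blocks on the pipe and only ever holds one piece
# in memory. A SAX parser reading from `reader` would go here.
consumer = Thread.new do
  total = 0
  while (piece = reader.read(4096))
    total += piece.bytesize
  end
  total
end

stream_inflate(chunks, writer)
writer.close # signals EOF so the consumer's read loop can end
bytes_seen = consumer.value

puts "decompressed #{bytes_seen} bytes from #{compressed.bytesize} compressed bytes"
```

Closing the write end is the step that is easy to forget: without it the reader never sees EOF and the consumer thread blocks forever.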