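Ruby Net::HTTP: read the headers before the body (without a HEAD request)?

I'm using Net::HTTP with Ruby to crawl a URL.

I don't want to crawl streaming audio, for example: http://listen2.openstream.co/334

In fact, I only want to crawl HTML content, so no PDFs, video, txt files, and so on.

Right now I have both open_timeout and read_timeout set to 10, so even when I crawl one of these streaming audio pages, the request eventually times out.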

require 'net/http'
require 'addressable/uri'

url = 'http://listen2.openstream.co/334'
uri = Addressable::URI.parse(url)  # parse the URL before anything reads from uri
path = uri.path

req = Net::HTTP::Get.new(path, {'Accept' => '*/*', 'Content-Type' => 'text/plain; charset=utf-8', 'Connection' => 'keep-alive', 'Accept-Encoding' => 'Identity'})

# Timeouts are passed to start so that open_timeout also covers the initial connect.
resp = Net::HTTP.start(uri.host, uri.inferred_port, open_timeout: 10, read_timeout: 10) do |http|
  # how can I read the headers here before it's streaming the body and then exit b/c the content type is audio?
  http.request(req)
end

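However, is there a way to check the headers before I read the body of the HTTP response, to see whether it is audio? I'd like to do this without sending a separate HEAD request.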
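One possible approach, sketched below, relies on the block form of Net::HTTP#request: the block receives the response once the status line and headers have been parsed, while the body is still unread on the socket. The helper name fetch_html and its timeout parameter are hypothetical, and the sketch uses the standard library's URI instead of Addressable so that it runs without extra gems. It assumes the server sends an accurate Content-Type header.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: returns the response body for HTML pages and nil
# for every other content type, without reading the non-HTML body.
def fetch_html(url, timeout: 10)
  uri = URI.parse(url)
  req = Net::HTTP::Get.new(uri.request_uri, 'Accept' => 'text/html')

  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == 'https',
                  open_timeout: timeout,
                  read_timeout: timeout) do |http|
    body = nil
    http.request(req) do |response|
      # Headers are available here; the body has not been read yet.
      content_type = response['Content-Type'].to_s.downcase

      # Leaving the block with `break` skips the body read that Net::HTTP
      # would otherwise perform when the block returns normally. The
      # connection is closed when the surrounding start block finishes.
      break unless content_type.start_with?('text/html')

      body = response.read_body
    end
    body
  end
end
```

With this sketch, fetch_html('http://listen2.openstream.co/334') would be expected to return nil as soon as the headers arrive, rather than waiting for read_timeout, provided the stream declares an audio content type. The connection should not be reused after the early exit, since unread body data is still pending on it.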

ruby ruby-on-rails

Score: 9 · Answers: 1 · Views: 1083