I have a Sinatra app with a long-running process (a web scraper). I'd like the app to flush the crawler's progress to the browser while the crawler is running instead of at the end.
I've considered forking the request and doing something fancy with ajax but this is a really basic one-pager app that really just needs to output a log to a browser as it's happening. Any suggestions?
Starting with Sinatra 1.3.0, you can use the new streaming API:
get '/' do
  stream do |out|
    out << "foo\n"
    sleep 10
    out << "bar\n"
  end
end
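Unfortunately, you don't have a stream you can simply flush to (that would not work with Rack middleware). Instead, the result returned from a route block only has to respond to each. The Rack handler will then call each with a block, and inside that block it flushes the given part of the body to the client.
All Rack responses must always respond to each and must always hand strings to the given block. If you just return a string, Sinatra takes care of this for you.
A simple streaming example would be: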
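To make the each contract concrete, here is a minimal sketch in plain Ruby with no Sinatra or Rack involved. The names CountdownBody and drain are hypothetical: drain only stands in for what a Rack handler does with a body, and is not Rack's actual code.

```ruby
require 'stringio'

# Hypothetical body object: it only needs to respond to #each
# and yield Strings to the block it is given.
class CountdownBody
  def initialize(from)
    @from = from
  end

  def each
    @from.downto(1) { |n| yield "#{n}\n" }
    yield "done\n"
  end
end

# Stand-in for a Rack handler (an assumption for illustration):
# call #each and write every chunk out as soon as it arrives.
def drain(body, io)
  body.each { |chunk| io << chunk }
  io
end

out = drain(CountdownBody.new(3), StringIO.new)
```

Each yield hands one chunk to the consumer, so a slow step between two yields delays only the chunks that come after it.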
require 'sinatra'

get '/' do
  result = ["this", " takes", " some", " time"]
  class << result
    def each
      super do |str|
        yield str
        sleep 0.3
      end
    end
  end
  result
end
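If reopening the array's singleton class feels heavy, an Enumerator gives you an object that responds to each in the same way. This is only a sketch: slow_chunks is a hypothetical helper, and the short delay is chosen so the snippet runs quickly outside a web server.

```ruby
# Hypothetical helper: wraps an array of strings in an Enumerator
# that pauses after handing over each chunk.
def slow_chunks(parts, delay)
  Enumerator.new do |yielder|
    parts.each do |part|
      yielder << part   # hand one chunk to whoever is iterating
      sleep delay       # simulate slow work between chunks
    end
  end
end

body = slow_chunks(["this", " takes", " some", " time"], 0.01)
joined = body.to_a.join
```

Since the Enumerator responds to each and yields only strings, it should be usable as the return value of a route block in the same role as result in the example above.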
Now you simply have to put all of your crawling into the each method:
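require 'sinatra'
require 'open-uri' # provides URI.open for fetching the page (Ruby 2.5+)

class Crawler
  def initialize(url)
    @url = url
  end

  # The Rack handler calls each with a block; every yielded string
  # is flushed to the client as soon as it is produced.
  def each
    yield "opening url\n"
    result = URI.open(@url).read # read the body so include? works on a String
    yield "searching for foo\n"
    if result.include? "foo"
      yield "found it\n"
    else
      yield "not there, sorry\n"
    end
  end
end

get '/' do
  Crawler.new 'http://mysite'
end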
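To check the messages a crawler like this emits without a browser or network access, one option is to inject the fetching step. The sketch below is a variant, not the code above: LoggingCrawler and its fetcher argument are assumptions made for illustration.

```ruby
# Hypothetical variant of the crawler: the page fetch is passed in
# as a callable so the progress messages can be exercised offline.
class LoggingCrawler
  include Enumerable # adds to_a, first, etc. on top of #each

  def initialize(url, fetcher)
    @url = url
    @fetcher = fetcher
  end

  def each
    return to_enum(:each) unless block_given?

    yield "opening url\n"
    page = @fetcher.call(@url)
    yield "searching for foo\n"
    yield(page.include?("foo") ? "found it\n" : "not there, sorry\n")
  end
end

# A fake fetcher that returns a fixed page instead of hitting the network.
fake_fetcher = ->(_url) { "<html>foo</html>" }
chunks = LoggingCrawler.new('http://mysite', fake_fetcher).to_a
```

In the real app the fetcher could be something like ->(url) { URI.open(url).read }, which keeps the route itself a one-liner.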