Ruby readpartial and read_nonblock not raising EOFError

Sid*_*Sid 12 ruby unix nonblocking preforking unicorn

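I'm trying to understand and recreate a simple preforking server along the lines of unicorn: on startup the server forks 4 processes, which all wait (in accept) on the control socket.

The control socket @control_socket is bound to port 9799, and 4 workers are spawned that wait to accept a connection. The work done in each worker is as follows: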


        def spawn_child
            fork do
                $STDOUT.puts "Forking child #{Process.pid}"
                loop do 
                    @client = @control_socket.accept                                        
                    loop do                     
                        request = gets              

                        if request                          
                            respond(@inner_app.call(request))                           
                        else
                            $STDOUT.puts("No Request")
                            @client.close                           
                        end
                    end
                end
            end
        end

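I'm using a very simple Rack app that just returns a string with status code 200 and a content type of text/html.

The problem I'm facing is that my server works fine when I read incoming requests (by hitting the URL http://localhost:9799) using gets, but not when I use something like read, readpartial or read_nonblock. When I use non-blocking reads, EOFError never seems to be raised, which as I understand it means the socket never reaches an EOF state.

This causes the read loop to never finish. Here is the code snippet that does the reading.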


        # Reads a file using IO.read_nonblock
        # Returns end of file when using get but doesn't seem to return 
        # while using read_nonblock or readpartial
        # The fact that the method is named gets is just bad naming, please ignore
        def gets
            buffer = ""         
            i =0
            loop do
                puts "loop #{i}"
                i += 1
                begin
                    buffer << @client.read_nonblock(READ_CHUNK)
                    puts "buffer is #{buffer}"
                rescue  Errno::EAGAIN => e
                    puts "#{e.message}"
                    puts "#{e.backtrace}"
                    IO.select([@client])
                    retry
                rescue EOFError
                    $STDOUT.puts "-" * 50
                    puts "request data is #{buffer}"    
                    $STDOUT.puts "-" * 50
                    break           
                end
            end
            puts "returning buffer"
            buffer
        end


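However, the code works perfectly if I replace read or read_nonblock with a plain gets, or if I replace IO.select([@client]) with break.

This is the version with which the code works and returns a response. The reason I want to use read_nonblock is that unicorn uses an equivalent from the kgio library to implement non-blocking reads.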


def gets
  @client.gets
end

The entire code is pasted below.


require 'socket'
require 'builder'
require 'rack'
require 'pry'

module Server   
    class Prefork
        # line break 
        CRLF  = "\r\n"
        # number of workers process to fork
        CONCURRENCY = 4
        # size of each non_blocking read
        READ_CHUNK = 1024

        $STDOUT = STDOUT
        $STDOUT.sync = true

        # creates a control socket which listens to port 9799
        def initialize(port = 9799)
            @control_socket = TCPServer.new(port)
            puts "Starting server..."
            trap(:INT) {
                exit
            }
        end

        # Reads a file using IO.read_nonblock
        # Returns end of file when using get but doesn't seem to return 
        # while using read_nonblock or readpartial
        def gets
            buffer = ""         
            i =0
            loop do
                puts "loop #{i}"
                i += 1
                begin
                    buffer << @client.read_nonblock(READ_CHUNK)
                    puts "buffer is #{buffer}"
                rescue  Errno::EAGAIN => e
                    puts "#{e.message}"
                    puts "#{e.backtrace}"
                    IO.select([@client])
                    retry
                rescue EOFError
                    $STDOUT.puts "-" * 50
                    puts "request data is #{buffer}"    
                    $STDOUT.puts "-" * 50
                    break           
                end
            end
            puts "returning buffer"
            buffer
        end

        # responds with the data and closes the connection
        def respond(data)
            puts "request 2 Data is #{data.inspect}"
            status, headers, body = data
            puts "message is #{body}"
            buffer = "HTTP/1.1 #{status}\r\n" \
                     "Date: #{Time.now.utc}\r\n" \
                     "Status: #{status}\r\n" \
                     "Connection: close\r\n"            
            headers.each {|key, value| buffer << "#{key}: #{value}\r\n"}          
            @client.write(buffer << CRLF)
            body.each {|chunk| @client.write(chunk)}            
        ensure 
            $STDOUT.puts "*" * 50
            $STDOUT.puts "Closing..."
            @client.respond_to?(:close) and @client.close
        end

        # The main method which triggers the creation of workers processes
        # The workers processes all wait to accept the socket on the same
        # control socket allowing the kernel to do the load balancing.
        # 
        # Working with a dummy rack app which returns a simple text message
        # hence the config.ru file read.
        def run         
            # copied from unicorn-4.2.1
            # refer unicorn.rb and lib/unicorn/http_server.rb           
            raw_data = File.read("config.ru")           
            app = "::Rack::Builder.new {\n#{raw_data}\n}.to_app"
            @inner_app = eval(app, TOPLEVEL_BINDING)
            child_pids = []
            CONCURRENCY.times do
                child_pids << spawn_child
            end

            trap(:INT) {
                child_pids.each do |cpid|
                    begin 
                        Process.kill(:INT, cpid)
                    rescue Errno::ESRCH
                    end
                end

                exit
            }

            loop do
                pid = Process.wait
                puts "Process quit unexpectedly #{pid}"
                child_pids.delete(pid)
                child_pids << spawn_child
            end
        end

        # This is where the real work is done.
        def spawn_child
            fork do
                $STDOUT.puts "Forking child #{Process.pid}"
                loop do 
                    @client = @control_socket.accept                                        
                    loop do                     
                        request = gets              

                        if request                          
                            respond(@inner_app.call(request))                           
                        else
                            $STDOUT.puts("No Request")
                            @client.close                           
                        end
                    end
                end
            end
        end
    end
end

p = Server::Prefork.new(9799)
p.run

Can someone explain to me why reading fails with readpartial, read_nonblock or read? I'd really appreciate some help.

Thanks.

小智 9

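First, some basics. EOF means end of file. It acts like a signal sent to the caller when there is no more data to read from the data source: for example, you get EOF after opening a file and reading all of it, or simply when the IO stream is closed.

Then there are some differences between these four methods: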
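To make this concrete for sockets, here is a minimal sketch. It assumes a Unix-like system and uses UNIXSocket.pair as a stand-in for the browser and the worker; the method name observe_socket_eof is made up for this illustration. It suggests that a reader only sees EOF once the peer closes its end: while the peer stays connected, a non-blocking read with nothing to read reports "would block" instead.

```ruby
require 'socket'

# Hypothetical helper for illustration: records what the "server" end of a
# socket pair observes before and after the "client" end is closed.
def observe_socket_eof
  client, server = UNIXSocket.pair
  events = []

  # The client sends one request and keeps the connection open.
  client.write("GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
  events << server.read_nonblock(1024)   # the request bytes

  begin
    server.read_nonblock(1024)           # nothing left, but no EOF either
  rescue IO::WaitReadable                # mixed into the Errno::EAGAIN error raised here
    events << :would_block
  end

  client.close                           # only now does the reader reach EOF
  begin
    server.read_nonblock(1024)
  rescue EOFError
    events << :eof
  end

  events
ensure
  server.close if server && !server.closed?
  client.close if client && !client.closed?
end

p observe_socket_eof
```

The order of the recorded events (data, then "would block", then EOF) is the point: the second read fails with EAGAIN rather than EOFError because the client is still connected.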

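  • gets reads one line from the stream. In Ruby it uses $/ as the default line separator, but you can pass a different separator as an argument, which matters because the client and server may run on different operating systems and use different line endings. It is a blocking method: it blocks until it sees a line separator or EOF, and it returns nil at EOF, so gets never raises EOFError.

  • read(length) reads length bytes from the stream. It is a blocking method: if length is omitted it blocks until EOF, and if length is given it returns only once it has read that many bytes or hit EOF. At EOF it returns nil when a positive length is given and an empty string when length is omitted, so read never raises EOFError.

  • readpartial(maxlen) reads at most maxlen bytes from the stream. It returns whatever data is available right away, like an eager version of read, so when the data is large you can use readpartial instead of read to avoid blocking until everything has arrived. It is still a blocking method, though: it blocks when no data is immediately available. readpartial raises EOFError at EOF.

  • read_nonblock(maxlen) is similar to readpartial but, as the name says, non-blocking. If no data is available it immediately raises Errno::EAGAIN, which means "no data right now". You need to handle this error: the usual pattern is to call IO.select([conn]) in the rescue clause for Errno::EAGAIN, which blocks until conn becomes readable and so avoids a busy loop, and then retry. read_nonblock raises EOFError at EOF.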
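The EOF behaviour described in the list above can be checked with a small sketch. It uses a pipe whose write end is already closed, so every read starts at EOF; the helper names at_eof and eof_behavior are made up for this illustration.

```ruby
# Runs the block against the read end of a pipe whose write end is already
# closed, so the very first read hits EOF. Returns the block's value, or
# EOFError (the class) if the read raised it.
def at_eof
  reader, writer = IO.pipe
  writer.close
  yield reader
rescue EOFError => e
  e.class
ensure
  reader.close if reader && !reader.closed?
end

# Collects what each of the four read styles does at EOF.
def eof_behavior
  {
    gets:          at_eof { |io| io.gets },               # nil
    read_length:   at_eof { |io| io.read(16) },           # nil
    read_all:      at_eof { |io| io.read },               # ""
    readpartial:   at_eof { |io| io.readpartial(16) },    # EOFError
    read_nonblock: at_eof { |io| io.read_nonblock(16) }   # EOFError
  }
end

eof_behavior.each { |name, result| puts "#{name}: #{result.inspect}" }
```

Only readpartial and read_nonblock report EOF by raising, which is why a loop that waits for EOFError depends on the peer actually closing the connection.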

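Now let's look at your example. As I see it, you are reading data produced by "hitting the URL", which is just an HTTP GET request: some text like "GET / HTTP/1.1\r\n". Connections are keep-alive by default in HTTP/1.1, so readpartial or read_nonblock will never see EOF unless you add a Connection: close header to your request, or change your gets method as follows: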

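def gets
  buffer = ""
  loop do
    if m = @client.gets
      buffer << m
      # A blank line marks the end of the HTTP request headers.
      break if m.strip == ""
    else
      # nil means the client closed the connection (EOF).
      break
    end
  end
  buffer
end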

You can't use read here, because you don't know the exact length of the request: using a large length, or simply omitting it, will block.
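If the goal is to keep read_nonblock, as in the question, the same idea can be applied there: stop at the blank line that ends the request headers instead of waiting for EOF. The following is only a sketch under that assumption. The name read_request_head is hypothetical, it is not how unicorn or kgio do it, and it reads the request head only (a request with a body would also need its Content-Length handled).

```ruby
require 'socket'

# Hypothetical helper: reads with read_nonblock until the end of the HTTP
# request headers ("\r\n\r\n") has been seen, or until the peer closes.
def read_request_head(io, chunk_size = 1024)
  buffer = String.new
  until buffer.include?("\r\n\r\n")
    begin
      buffer << io.read_nonblock(chunk_size)
    rescue IO::WaitReadable      # covers Errno::EAGAIN from read_nonblock
      IO.select([io])            # sleep until the socket is readable again
      retry
    rescue EOFError              # peer closed before finishing the headers
      break
    end
  end
  buffer
end

# Usage with a socket pair standing in for browser and worker.
client, server = UNIXSocket.pair
client.write("GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
head = read_request_head(server)   # returns although the client is still open
puts head.lines.first
client.close
server.close
```

The loop condition is what differs from the version in the question: it ends on the header terminator, so a keep-alive client that never closes the connection no longer stalls the worker in IO.select.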