Unicorn exit timeout on Heroku after trapping TERM and sending QUIT

mid*_*idd 90 heroku unicorn

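I'm getting R12 (Exit timeout) errors for a Heroku app running Unicorn and Sidekiq. These errors occur 1-2 times a day and whenever I deploy. I understand that I need to convert the shutdown signal from Heroku so that Unicorn responds correctly, but I thought I had already done that in the Unicorn config below: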

worker_processes 3
timeout 30
preload_app true

before_fork do |server, worker|
  Signal.trap 'TERM' do
    puts "Unicorn master intercepting TERM and sending myself QUIT instead. My PID is #{Process.pid}"
    Process.kill 'QUIT', Process.pid
  end

  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.connection.disconnect!
    Rails.logger.info('Disconnected from ActiveRecord')
  end
end

after_fork do |server, worker|
  Signal.trap 'TERM' do
    puts "Unicorn worker intercepting TERM and doing nothing. Wait for master to sent QUIT. My PID is #{Process.pid}"
  end

  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.establish_connection
    Rails.logger.info('Connected to ActiveRecord')
  end

  Sidekiq.configure_client do |config|
    config.redis = { :size => 1 }
  end
end

The logs around the error look like this:

Stopping all processes with SIGTERM
Unicorn worker intercepting TERM and doing nothing. Wait for master to sent QUIT. My PID is 7
Unicorn worker intercepting TERM and doing nothing. Wait for master to sent QUIT. My PID is 11
Unicorn worker intercepting TERM and doing nothing. Wait for master to sent QUIT. My PID is 15
Unicorn master intercepting TERM and sending myself QUIT instead. My PID is 2
Started GET "/manage"
reaped #<Process::Status: pid 11 exit 0> worker=1
reaped #<Process::Status: pid 7 exit 0> worker=0
reaped #<Process::Status: pid 15 exit 0> worker=2
master complete
Error R12 (Exit timeout) -> At least one process failed to exit within 10 seconds of SIGTERM
Stopping remaining processes with SIGKILL
Process exited with status 137

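It looks like all of the child processes were successfully reaped before the timeout. Is the master still alive? Also, should the router still be sending web requests to the dyno during shutdown, as shown in the logs?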

FWIW, I'm using Heroku's zero-downtime deployment plugin (https://devcenter.heroku.com/articles/labs-preboot/).

Win*_*eld 4

I think your custom signal handling is what's causing the timeouts here.

EDIT: I'm getting downvoted for disagreeing with Heroku's documentation, and I'd like to address that.

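Configuring your Unicorn application to catch and swallow the TERM signal is the most likely cause of your application hanging and not shutting down correctly.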

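Heroku seems to argue that catching the TERM signal and converting it into a QUIT signal is the correct way to turn a hard shutdown into a graceful shutdown.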
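To make the TERM-to-QUIT conversion concrete, here is a minimal sketch in plain Ruby, outside of Unicorn. The helper name `run_child_with_term_to_quit` is hypothetical, and the `exit 0` in the QUIT handler is only a stand-in for Unicorn's graceful shutdown, which this sketch does not reproduce.

```ruby
# Minimal sketch, assuming a POSIX system where fork and signals are available.
# run_child_with_term_to_quit is a hypothetical helper, not part of Unicorn.
def run_child_with_term_to_quit
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    # Stand-in for a graceful shutdown on QUIT: exit cleanly.
    Signal.trap('QUIT') { exit 0 }
    # Same idea as the before_fork handler in the question:
    # re-send an incoming TERM to ourselves as QUIT.
    Signal.trap('TERM') { Process.kill('QUIT', Process.pid) }
    writer.puts 'ready' # tell the parent that the handlers are installed
    writer.close
    loop { sleep 1 }    # idle until a signal arrives
  end

  writer.close
  reader.gets           # wait until the child has installed its traps
  reader.close

  Process.kill('TERM', pid)      # what the platform sends on shutdown
  _, status = Process.wait2(pid) # reap the child and collect its status
  status
end
```

Calling `run_child_with_term_to_quit` returns a `Process::Status`. In this sketch the child should exit normally with status 0, because the TERM it receives is converted into QUIT and the QUIT handler exits cleanly.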

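However, doing this seems to introduce the risk of never shutting down at all in some cases, which is the root of this error. Users experiencing hanging dynos running Unicorn should weigh the evidence and make their own decision based on first principles, not just on the documentation.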
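The risk can be illustrated with a small sketch, again in plain Ruby rather than Unicorn itself: a process that traps TERM and does nothing will outlive the grace period unless something else tells it to stop, and then it can only be ended with KILL. The helper name `stop_child_that_swallows_term` and the one-second grace period are assumptions for illustration; the logs in the question show a 10-second limit.

```ruby
# Minimal sketch, assuming a POSIX system where fork and signals are available.
# stop_child_that_swallows_term is a hypothetical helper, not part of Unicorn.
def stop_child_that_swallows_term(grace_seconds: 1)
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    Signal.trap('TERM') { } # swallow TERM and do nothing
    writer.puts 'ready'     # tell the parent that the handler is installed
    writer.close
    loop { sleep 1 }        # keep running; nothing ever sends QUIT here
  end

  writer.close
  reader.gets               # wait until the child has installed its trap
  reader.close

  Process.kill('TERM', pid)

  # Poll until the child exits or the grace period runs out.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + grace_seconds
  reaped = nil
  until reaped || Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
    reaped = Process.wait2(pid, Process::WNOHANG)
    sleep 0.05 unless reaped
  end
  return reaped[1] if reaped

  # The grace period expired, so fall back to an uncatchable KILL.
  Process.kill('KILL', pid)
  _, status = Process.wait2(pid)
  status
end
```

In this sketch the returned status should report that the child was terminated by a signal, with `termsig` equal to `Signal.list['KILL']`. On Linux that is 9, and shells conventionally report such a death as 128 + 9 = 137, which matches the "Process exited with status 137" line in the question's logs.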

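  • The Heroku docs still cover "[Graceful shutdown with SIGTERM](https://devcenter.heroku.com/articles/dynos#graceful-shutdown-with-sigterm)", and I don't see any mention that this is no longer necessary on the Cedar stack. Do you have a reference for where this can be found? (2 upvotes)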