Unicorn USR2重启悬挂问题

axs*_*uul 10 ruby-on-rails unicorn ruby-on-rails-4

尝试使用USR2信号重启Unicorn时,我遇到了这个特殊问题.在干净地重新启动VPS时,我没有遇到向Unicorn发送USR2信号并让它正常重启的问题.然而,如果我再试一次,大约一个小时后,我将留下一个老主人,他们会在新的主人身边停下来阻止新主人开始.然后我被迫杀死了老主人,所以新主人可以开始.如果我重新启动VPS,它会修复它,但一小时后会再次启动问题.我在Rails 4,Ruby 2.0.0上.

unicorn.log

I, [2014-01-07T15:37:37.118523 #19797]  INFO -- : executing ["/srv/rails/current/bin/unicorn", "-c", "/srv/rails/current/config/unicorn.rb", {12=>#<Kgio::UNIXServer:fd 12>}] (in /srv/rails/releases/20140107091945)
I, [2014-01-07T15:37:37.118983 #19797]  INFO -- : forked child re-executing...
I, [2014-01-07T15:37:38.998632 #19797]  INFO -- : inherited addr=/srv/rails/shared/sockets/unicorn.sock fd=12
I, [2014-01-07T15:37:38.999038 #19797]  INFO -- : Refreshing Gem list
I, [2014-01-07T15:37:41.927794 #19967]  INFO -- : Refreshing Gem list
/srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:219:in `pid=': Already running on PID:19967 (or pid=/srv/rails/shared/pids/unicorn.pid is stale) (ArgumentError)
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:151:in `start'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
        from /srv/rails/current/bin/unicorn:16:in `load'
        from /srv/rails/current/bin/unicorn:16:in `<main>'
/srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:219:in `pid=': Already running on PID:21250 (or pid=/srv/rails/shared/pids/unicorn.pid is stale) (ArgumentError)
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:151:in `start'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
        from /srv/rails/current/bin/unicorn:16:in `load'
        from /srv/rails/current/bin/unicorn:16:in `<main>'
E, [2014-01-07T15:40:46.720131 #20878] ERROR -- : reaped #<Process::Status: pid 21075 exit 1> exec()-ed
E, [2014-01-07T15:40:46.720870 #20878] ERROR -- : master loop error: Already running on PID:21250 (or pid=/srv/rails/shared/pids/unicorn.pid is stale) (ArgumentError)
E, [2014-01-07T15:40:46.723525 #20878] ERROR -- : /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:219:in `pid='
E, [2014-01-07T15:40:46.723671 #20878] ERROR -- : /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:413:in `reap_all_workers'
E, [2014-01-07T15:40:46.723747 #20878] ERROR -- : /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:292:in `join'
E, [2014-01-07T15:40:46.723815 #20878] ERROR -- : /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
E, [2014-01-07T15:40:46.723880 #20878] ERROR -- : /srv/rails/current/bin/unicorn:16:in `load'
E, [2014-01-07T15:40:46.723930 #20878] ERROR -- : /srv/rails/current/bin/unicorn:16:in `<main>'
E, [2014-01-07T15:41:13.704700 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:13.704901 #21250] ERROR -- : retrying in 0.5 seconds (4 tries left)
E, [2014-01-07T15:41:14.205452 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:14.205597 #21250] ERROR -- : retrying in 0.5 seconds (3 tries left)
78.40.124.16, 173.245.49.122 - - [07/Jan/2014 15:41:14] "GET / HTTP/1.0" 200 28697 0.8345
E, [2014-01-07T15:41:14.706179 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:14.706335 #21250] ERROR -- : retrying in 0.5 seconds (2 tries left)
E, [2014-01-07T15:41:15.206834 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:15.206987 #21250] ERROR -- : retrying in 0.5 seconds (1 tries left)
E, [2014-01-07T15:41:15.707431 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:15.707563 #21250] ERROR -- : retrying in 0.5 seconds (0 tries left)
78.40.124.16, 149.154.158.74 - - [07/Jan/2014 15:41:15] "GET / HTTP/1.0" 200 32866 0.4528
E, [2014-01-07T15:41:16.208055 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
/srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/socket_helper.rb:158:in `initialize': Address already in use - "/srv/rails/shared/sockets/unicorn.sock" (Errno::EADDRINUSE)
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/socket_helper.rb:158:in `new'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/socket_helper.rb:158:in `bind_listen'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:255:in `listen'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:801:in `block in bind_new_listeners!'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:801:in `each'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:801:in `bind_new_listeners!'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:146:in `start'
        from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
        from /srv/rails/current/bin/unicorn:16:in `load'
        from /srv/rails/current/bin/unicorn:16:in `<main>'
Run Code Online (Sandbox Code Playgroud)

unicorn.rb

deploy_path = "/srv/rails"
RAILS_ENV = ENV['RAILS_ENV'] || "production"

working_directory "#{deploy_path}/current"
pid "#{deploy_path}/shared/pids/unicorn.pid"
stderr_path "#{deploy_path}/shared/log/unicorn.log"

# Listen on a UNIX data socket
listen "#{deploy_path}/shared/sockets/unicorn.sock"
worker_processes 4

# Preload application before forking worker processes
preload_app true

# Restart any workers that haven't responded in 30 seconds
timeout 30

before_fork do |server, worker|
  ##
  # When sent a USR2, Unicorn will suffix its pidfile with .oldbin and
  # immediately start loading up a new version of itself (loaded with a new
  # version of our app). When this new Unicorn is completely loaded
  # it will begin spawning workers. The first worker spawned will check to
  # see if an .oldbin pidfile exists. If so, this means we've just booted up
  # a new Unicorn and need to tell the old one that it can now die. To do so
  # we send it a QUIT.
  #
  # Using this method we get 0 downtime deploys.

  old_pid = "#{server.config[:pid]}.oldbin"

  if File.exists?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH => e
      log = File.open(Rails.root.join('log/unicorn.log'), "a")
      log.puts "Error encountered when killing process:\n"
      log.puts "#{e.message}"
      log.close
    end
  end

  # the following is recomended for Rails + "preload_app true"
  # as there's no need for the master process to hold a connection
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.connection.disconnect!
  end
end

after_fork do |server, worker|
  ##
  # Unicorn master loads the app then forks off workers - because of the way
  # Unix forking works, we need to make sure we aren't using any of the parent's
  # sockets, e.g. db connection

  ActiveRecord::Base.establish_connection
  # Redis and Memcached would go here but their connections are established
  # on demand, so the master never opens a socket

  ##
  # Unicorn master is started as root, which is fine, but let's
  # drop the workers to deployer
  begin
    uid, gid = Process.euid, Process.egid
    user, group = 'deployer', 'deployer'
    target_uid = Etc.getpwnam(user).uid
    target_gid = Etc.getgrnam(group).gid
    worker.tmp.chown(target_uid, target_gid)
    if uid != target_uid || gid != target_gid
      Process.initgroups(user, target_gid)
      Process::GID.change_privilege(target_gid)
      Process::UID.change_privilege(target_uid)
    end
  rescue => e
    if RAILS_ENV == 'development'
      STDERR.puts "couldn't change user, oh well"
    else
      raise e
    end
  end
end
Run Code Online (Sandbox Code Playgroud)

deploy.rb

require 'bundler/capistrano'    # runs a bundle install --deployment

# https://github.com/sstephenson/rbenv/issues/101
set :keep_releases, 10
set :shared_children, shared_children + %w(public/images public/uploads)

# Multistage extension
set :stages, ["production", "staging"]
set :default_stage, "staging"
require 'capistrano/ext/multistage'
require 'underglow/capistrano'

# Whenever crontab updates
set :whenever_environment, defer { stage }
set :whenever_command, "bin/whenever"
require 'whenever/capistrano'

set :application, "rails"
set :user, "deployer"

default_run_options[:pty] = true
default_run_options[:shell] = '/bin/zsh'
set :use_sudo, false

# repository
set :repository,      "XXXXXXXXXXXXXXXXX"
set :branch,          fetch(:branch, "master")  # can specify a branch from `cap -S branch="<branch_name>"`
set :scm,             :git
set :scm_verbose,     true

set :ssh_options, forward_agent: true

set :deploy_to,       "/srv/rails"
set :deploy_via,      :remote_cache

# We're using a rbenv user install, setup the PATH we need to access the rbenv shims
set :default_environment, {
  'PATH' => "$HOME/.rbenv/shims:$HOME/.rbenv/bin:$PATH"
}
Run Code Online (Sandbox Code Playgroud)

有没有人见过这个?

Mic*_*aff 3

这可能对您没有帮助,但这是我为“解决”问题所做的事情。

我从 Unicorn 4.7.0 版本开始遇到这个问题。在 4.7.0 中,pid 文件的写入方式发生了变化,并破坏了我的重启脚本。4.7.0 之前的旧行为是:将 pid 文件移动到 oldpid,写入新 pid,启动工作程序,关闭 master。当然,最后一步是在我的 unicorn.rb 文件中。新的行为是快速删除旧的 pid,并且仅在发生一些繁重的工作后才写入新的 pid。这破坏了我的脚本,因为它无法相信事情是否正确重新启动。这导致我的 sh 脚本尝试重新启动它,导致与现在新启动的 unicorn 进程混淆,并且 eh 脚本启动了“完全启动”进程。两人都以各种方式失败了,所以都退出了,留下一个老主人仍在服务请求。

正如有人已经提到的,我的 unicorn.rb 文件也有一个缺陷,它没有正确设置捆绑器。

升级到最近发布的 Unicorn 4.8.1 解决了这个问题,因为 pid 文件的写入方式与 4.7.0 之前的版本相同。