delayed_job stops running after a while in production

Joe*_*lio 12 ruby-on-rails delayed-job

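In production, our delayed_job process is dying for some reason. I'm not sure whether it is crashing or being killed by the operating system. I don't see any errors in the delayed_job.log file.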

What can I do to troubleshoot this? I'm thinking of installing monit to monitor it, but that will only tell me when it dies, not why it dies.

Is there a way to make it more verbose in the log file, so I can tell why it is dying?

Any other suggestions?

Luk*_*ick 12

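I've come across two causes of delayed_job failing silently. The first is an actual segfault when people use libxml in a forked process (this popped up on the mailing list a while back).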

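The second is an issue with version 1.1.0 of the daemons gem, which delayed_job relies on (https://github.com/collectiveidea/delayed_job/issues#issue/81). This is easily worked around by using 1.0.10, which is what my own Gemfile has in it.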
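
As a minimal sketch of that workaround, the daemons version can be pinned in the Gemfile. The exact-version constraint below is standard Bundler syntax; whether 1.0.10 is still the right version for your delayed_job release is an assumption you should check.

```ruby
# Gemfile fragment: pin daemons to the version mentioned above (1.0.10)
# so that Bundler does not resolve to 1.1.0.
gem 'daemons', '1.0.10'
```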

Logging

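There is logging in delayed_job, so if a worker dies without printing an error, it is usually because no exception was thrown (e.g. a segfault) or because something external is killing the process.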
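
To narrow down which of these cases applies, one option is to log the reason at process exit. The following is only a sketch using plain Ruby `at_exit`; `exit_reason` and `install_exit_logger` are hypothetical helper names, not part of delayed_job. It can report a normal exit, an unhandled exception, or a signal the process has not trapped itself. It will not fire on a segfault or `SIGKILL`, so an exit with no such log line points toward those causes.

```ruby
require 'logger'

# Hypothetical helper: turn the value of $! at exit time into a short message.
def exit_reason(error)
  case error
  when nil             then "clean exit"
  when SystemExit      then "exit called with status #{error.status}"
  when SignalException then "terminated by #{error.signm}"
  else "unhandled #{error.class}: #{error.message}"
  end
end

# Hypothetical helper: register an at_exit hook that logs why the process
# is ending. Hard kills (SIGKILL) and segfaults bypass at_exit entirely.
def install_exit_logger(logger = Logger.new($stderr))
  at_exit do
    logger.info("worker #{Process.pid} exiting: #{exit_reason($!)}")
  end
end
```

Calling `install_exit_logger` with a file-backed `Logger` from wherever the worker boots (for example an initializer) would be one way to wire it in; the log path is up to you.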

Monitoring

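I use bluepill to monitor my delayed_job instances, and so far it has been very successful at ensuring that the jobs remain running. The steps to get bluepill running for an application are quite simple.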

Add the bluepill gem to the Gemfile:

# Monitoring
gem 'i18n' # Not sure why but it complained I didn't have it
gem 'bluepill'

I created a bluepill config file:

app_home = "/home/mi/production"
workers = 5
Bluepill.application("mi_delayed_job", :log_file => "#{app_home}/shared/log/bluepill.log") do |app|
  (0...workers).each do |i|
    app.process("delayed_job.#{i}") do |process|
      process.working_dir = "#{app_home}/current"

      process.start_grace_time    = 10.seconds
      process.stop_grace_time     = 10.seconds
      process.restart_grace_time  = 10.seconds

      process.start_command = "cd #{app_home}/current && RAILS_ENV=production ruby script/delayed_job start -i #{i}"
      process.stop_command  = "cd #{app_home}/current && RAILS_ENV=production ruby script/delayed_job stop -i #{i}"

      process.pid_file = "#{app_home}/shared/pids/delayed_job.#{i}.pid"
      process.uid = "mi"
      process.gid = "mi"
    end
  end
end

Then in my capistrano deploy file I just added:

# Bluepill related tasks
after "deploy:update", "bluepill:quit", "bluepill:start"
namespace :bluepill do
  desc "Stop processes that bluepill is monitoring and quit bluepill"
  task :quit, :roles => [:app] do
    run "cd #{current_path} && bundle exec bluepill --no-privileged stop"
    run "cd #{current_path} && bundle exec bluepill --no-privileged quit"
  end

  desc "Load bluepill configuration and start it"
  task :start, :roles => [:app] do
    run "cd #{current_path} && bundle exec bluepill --no-privileged load /home/mi/production/current/config/delayed_job.bluepill"
  end

  desc "Prints bluepills monitored processes statuses"
  task :status, :roles => [:app] do
    run "cd #{current_path} && bundle exec bluepill --no-privileged status"
  end
end
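
For a quick manual check alongside the `status` task, a worker's pid file can be probed directly. This is a small sketch, not how bluepill itself is implemented; `worker_alive?` is a hypothetical helper, and the path in the usage line assumes the pid file layout from the config above.

```ruby
# Hypothetical helper: returns true if the pid recorded in pid_file
# belongs to a process that currently exists.
def worker_alive?(pid_file)
  pid = Integer(File.read(pid_file).strip)
  return false unless pid > 0
  Process.kill(0, pid) # signal 0 sends nothing; it only checks existence
  true
rescue Errno::EPERM
  true  # the process exists but belongs to another user
rescue Errno::ENOENT, Errno::ESRCH, ArgumentError
  false # no pid file, no such process, or unparsable contents
end

# Usage: check worker 0 from the bluepill config above.
puts worker_alive?("/home/mi/production/shared/pids/delayed_job.0.pid")
```

A stale pid file with no live process behind it means the worker died without running its normal shutdown path, which is consistent with a segfault or an external kill.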

Hope this helps.