Killing sneaky mongrels

Posted by val
on Thursday, August 16

We found that monit sometimes fails to restart all mongrel instances after a deployment, and some of them end up running with their pid file gone. Since there is no pid file, monit believes the instance is not running, so it tries to start a new one on the same port and, of course, fails. The result is stale mongrel instances running old code. We’re investigating a long-term solution, but in the meantime we have wrapped the mongrel_rails start script with a replacement that finds and kills any stale mongrel instances before starting a new one.

#!/usr/bin/env ruby

class MongrelController

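  # Kills any stale mongrel still tied to the requested pid file, then hands
  # the untouched arguments to the real mongrel_rails.
  def self.run_mongrel(args)
    pid = extract_pid(args)
    kill_stale_process(pid) if pid
    # Pass the arguments as a list so they are not re-parsed by a shell.
    system "/bin/mongrel_rails", *args
  end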

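  # Returns the value given to -P (the pid file) when the command is 'start';
  # otherwise returns nil or false.
  def self.extract_pid(args)
    (args[0] == 'start') && (i = args.index('-P')) && args[i + 1]
  end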

  # Sends SIGKILL to every mongrel that is still running against this pid file.
  def self.kill_stale_process(pid)
    mongrel_processes(pid).each { |p| process_running?(p) && Process.kill('KILL', p) }
  end

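  # Returns the PIDs of running mongrel_rails processes that were started with
  # the given pid file (the value passed to -P).
  def self.mongrel_processes(pid)
    # Escape the pid file name so dots in it only match literal dots.
    pattern = %r{/bin/mongrel_rails\s.*\s-P\s#{ Regexp.escape(pid) }\b}
    `ps axww -o 'pid command'`.split(/\n/).inject([]) do |mongrels, process|
      mongrels << process[/^\s*(\d+)/, 1].to_i if process.match(pattern)
      mongrels
    end
  end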

  # ps -p prints a header line plus one row when the process exists.
  def self.process_running?(pid)
    pid && (`ps -p #{ pid }`.split(/\n/).size == 2)
  end

end

MongrelController.run_mongrel(ARGV)
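
To see what the pattern in mongrel_processes actually picks up, here is a minimal sketch that runs the same kind of match against a canned string instead of live ps output. The helper name mongrel_pids, the SAMPLE_PS text, and the pid file paths and PIDs in it are all made up for illustration; the Regexp.escape call is an addition so that dots in the pid file name match literally.

```ruby
# Hypothetical helper: same matching idea as mongrel_processes, but it takes
# the ps output as a string so it can be tried without running ps.
def mongrel_pids(ps_output, pid_file)
  pattern = %r{/bin/mongrel_rails\s.*\s-P\s#{Regexp.escape(pid_file)}\b}
  ps_output.lines.each_with_object([]) do |line, pids|
    # The first run of digits on the line is the PID column.
    pids << line[/^\s*(\d+)/, 1].to_i if line =~ pattern
  end
end

# Made-up sample output; the paths and PIDs are illustrative only.
SAMPLE_PS = <<~PS
    PID COMMAND
   4321 ruby /bin/mongrel_rails start -d -p 8000 -P log/mongrel.8000.pid
   4322 ruby /bin/mongrel_rails start -d -p 8001 -P log/mongrel.8001.pid
    987 /usr/sbin/sshd -D
PS

p mongrel_pids(SAMPLE_PS, 'log/mongrel.8000.pid')  # => [4321]
```

Only the process whose -P argument matches the requested pid file is returned, which is why a mongrel on another port is left alone.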

Comments

  1. Joe Grossberg, August 29, 2007 @ 02:34 PM
    The link to monit is busted; you need the "http://". Any insight into *why* monit leaves some PID-less mongrels around? Does it have this problem with other programs it kills?