We have an application on Heroku that sometimes gets huge influxes of requests at unpredictable times. These are processed as background jobs, and we occasionally encounter huge backlogs in our queue, which results in requests taking many hours to be picked up and processed.

In the past we've had to handle these cases manually by monitoring the queue and increasing the number of background workers as we saw fit. This wasn't sustainable or convenient (especially if it happened in the middle of the night!), so we decided to write a script to automate the scaling of Heroku workers based on the current number of queued requests.

Heroku API

To use the Heroku API, you need an API key. You can find it on your Heroku account page.

I opted to use the heroku-api gem. To open a new connection:

require 'heroku-api'
HEROKU_API_KEY = "your_key_here"
APP = "your_app_name" # app name passed to the API calls in this post
heroku = Heroku::API.new(:api_key => HEROKU_API_KEY)

To get the current number of workers, start by listing the processes running on your app:

heroku.get_ps(APP).body

This returns an array of the web and worker dynos currently running on your app (your dyno configuration):

[{ "app_name"=>"your_app_name",
"pretty_state"=>"up for 14h",
"action"=>"up"}, ...]

There are two different kinds of dynos: web dynos, which handle incoming HTTP requests, and worker dynos, which run background jobs.

We were interested in scaling the number of worker dynos when their queue became too large. These dynos are identifiable by their process name (“worker.x”):

workers = heroku.get_ps(APP).body.select { |ps| ps["process"] =~ /worker/ }
workers_count = workers.size
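
As a rough sketch, the counting step can be pulled into a small helper that works on any array shaped like the get_ps response body. The helper name count_workers and the sample data are made up for illustration. Unlike the regex above, this version matches on the "worker." prefix so that a process whose name merely contains "worker" is not counted, which may or may not matter for your app.

```ruby
# Counts worker dynos in a process list shaped like the get_ps response body.
# Entries without a "process" key are ignored rather than raising.
def count_workers(processes)
  processes.count { |ps| ps["process"].to_s.start_with?("worker.") }
end

# Hypothetical sample data, not real API output.
sample = [
  { "process" => "web.1",    "pretty_state" => "up for 14h" },
  { "process" => "worker.1", "pretty_state" => "up for 14h" },
  { "process" => "worker.2", "pretty_state" => "up for 2h" }
]

puts count_workers(sample)
```

With a live connection this would be called as count_workers(heroku.get_ps(APP).body).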

Changing the number of workers to 'x' is easy:

heroku.post_ps_scale(APP, 'worker', x)

So, based on the current queue size, we scale automatically with a script along these lines:

q = queued_events_count.to_i
if q > 2000
  if workers_count < 20
    # Increase workers to 20 when the queue is very large
    heroku.post_ps_scale(APP, 'worker', 20)
    # Do other stuff, such as notify PagerDuty
  end
elsif q > 1000 # q <= 2000 is implied by the branch above
  if workers_count != 12
    # If there are fewer or more than 12 workers, scale them to 12
    heroku.post_ps_scale(APP, 'worker', 12)
  end
elsif ...
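
One way to keep the thresholds readable as more tiers are added is to move them into a table and keep the decision in a pure function. The sketch below is only an illustration of that idea: TIERS, desired_workers, and autoscale are hypothetical names, only the two tiers shown above are included, and queue sizes below them return nil (no change) because the remaining tiers are not spelled out here. The only Heroku call assumed is post_ps_scale, as used above.

```ruby
# Tiers from the script above, largest threshold first.
# :only_up mirrors the `workers_count < 20` check: never scale down to this size.
TIERS = [
  { above: 2000, workers: 20, only_up: true },
  { above: 1000, workers: 12, only_up: false }
].freeze

# Returns the worker count to scale to, or nil when nothing should change.
def desired_workers(queue_size, current_workers)
  tier = TIERS.find { |t| queue_size > t[:above] }
  return nil if tier.nil?                          # queue is below every listed tier
  return nil if current_workers == tier[:workers]  # already at the target
  return nil if tier[:only_up] && current_workers > tier[:workers]
  tier[:workers]
end

# Hypothetical wrapper: `client` is anything that responds to post_ps_scale.
def autoscale(client, app, queue_size, current_workers)
  target = desired_workers(queue_size, current_workers)
  client.post_ps_scale(app, 'worker', target) if target
  target
end
```

Keeping the decision separate from the API call means the thresholds can be checked with plain values, without touching a real app.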

Run this script periodically and the number of workers will scale automatically based on the current load. Hopefully, this will allow you to stay soundly asleep the next time your application gets hit with tons of requests at 3am =)
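
How you run it periodically is up to you: cron or the Heroku Scheduler add-on are common choices. If you would rather keep a long-running process, a minimal loop might look like the sketch below. The run_every helper and its parameters are hypothetical; the sleeper argument exists only so the loop can be exercised without actually waiting.

```ruby
# Calls the block, waits `interval` seconds, and repeats.
# `times` caps the number of runs (nil means run forever).
# Errors in one run are logged to stderr so they do not stop later runs.
def run_every(interval, times: nil, sleeper: ->(seconds) { sleep(seconds) })
  runs = 0
  loop do
    begin
      yield
    rescue StandardError => e
      warn "autoscale check failed: #{e.message}"
    end
    runs += 1
    break if times && runs >= times
    sleeper.call(interval)
  end
  runs
end
```

For example, run_every(300) { ... } with the scaling check inside the block would run it every five minutes until the process is stopped.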