A Token Bucket Implementation in PHP, using memcached

The “token bucket” algorithm is often used in packet switched computer networks and telecommunications networks to rate-limit or throttle traffic flows. The Wikipedia article in this link provides more information.

We had a need to implement a light-weight rate-limiting system on the login forms which protect our Registrar Console and other internal systems. We have a wide range of security measures in place to prevent brute-force attacks against accounts with weak passwords, such as two-factor authentication, auto-locking of accounts after too many failed logins, and so on, but we had no protection against brute-force password reuse attacks such as the one that recently hit GitHub.

After we received a report through our bug bounty program we decided that we needed to implement rate-limiting at the IP address level. The token bucket approach seemed like a good one, because we would only need to store two pieces of data about each IP address: an integer containing the number of tokens in its bucket, and a timestamp recording when the bucket was last “topped up”.

Rather than use a relational database to store this information we chose to use memcached, since it’s very fast, we already have a memcached system running at each of our operations centres, and the data doesn’t need to be persistent in the long term.

Since our registry system is written in PHP, we needed an implementation of the token bucket algorithm in that language. Some existing implementations already exist, but were not compatible with the (rather old) operating system and PHP version we currently have deployed in production, so we need to roll our own. This was then integrated into the general-purpose “CRC” foundation library which supports our registry software, and which is also used by our some of sister companies.

Here’s the core of the implementation, an object method which returns a boolean if the client has hit a rate limit. The key() method is a way of turning a client identifer such as an IP address into a key that is unique for the specific application. The cache object property is an instance of the memcache class (not to be confused with the memcached class, no, we don’t know why PHP has two memcached client classes either!).

/**
* check if the given client is rate-limited
* @param string $id client ID, such as user ID or IP address
* @param integer $limit
* @param integer $window
* @return boolean true if rate limit exceeded
*/
function check($id, $limit, $window=300) {

  // generate a unique key for this client
  $key = $this->key($id);

  // generate lock key
  $lock_key = "{$key}_lock";

  // get the current time in milliseconds since 1970-01-01T00:00:00.0Z
  // we use milliseconds to give us some granularity in the case where a client is making concurrent requests
  $now = floor(1000 * microtime(true));

  // we discard any buckets older than this
  $floor = floor(1000 * (microtime(true) - $window));

  // attempt to lock the cache, try ten times, sleeping 25ms between each attempt
  $have_lock = false;
  for ($i = 0 ; $i < 10 ; $i++) {
    if ($this->cache->add($lock_key, 1, NULL, 15)) {
      usleep(25000);

    } else {
      $have_lock = true;
      break;

    }
  }

  // unable to obtain lock so fail open
  if (!$have_lock) {
    trigger_error("Unable to obtain lock ob {$key}", E_USER_WARNING);
    return false;
  }

  // get the bucket for this client
  $bucket = $this->cache->get($key);

  // nothing in cache, so initialise
  if (!is_array($bucket) || $bucket['filled'] < $floor) $bucket = array(
    'tokens' => 1000 * $limit, // we use multiples of 1,000 so that floor() doesn't always add zero when there is a very small gap between requests
    'filled' => $now,
  );

  if ($bucket['filled'] < $now) {
    // we need to top the bucket up

    // how many milliseconds have elapsed since the bucket was last topped up?
    $dt = $now - $bucket['filled'];

    // work out the rate (per millisecond)
    $rate = $limit / (1000 * $window);

    // calculate how many tokens to add:
    $new = floor(1000 * $dt * $rate);

    // top up the bucket
    $bucket['tokens'] += $new;
    $bucket['filled']  = $now;
  }

  // decrement the token counter
  $bucket['tokens'] -= 1000;

  // store the updated bucket back in the cache
  $this->cache->set($key, $bucket);

  // free the lock
  $this->cache->delete($lock_key);

  // client is rate limited if they have no tokens left
  return ($bucket['tokens'] < 0);
}

This allows us to implemement a per-IP rate limit on login forms across our system, without having to maintain a lot of data in a relational database.