# Rate Limiter Pattern
The rate limiter pattern is a design technique used to control the rate at which incoming requests or actions are processed by a system. It's essentially a mechanism to manage the throughput of a service by limiting the number of requests it can handle within a specific timeframe.
Why is it important?
- Overload protection: Prevents system crashes due to excessive traffic.
- Fairness: Ensures equal access to resources for all users.
- API abuse prevention: Helps mitigate abusive clients and attacks such as DDoS.
- Resource management: Optimizes resource utilization.
# Algorithms
- Fixed Window Counter: Simplest approach; counts requests within a fixed time window and resets the counter when a new window starts.
- Sliding Window Log: Keeps a timestamped log of requests and counts only those that fall within the sliding window.
- Sliding Window Counter: Combines the fixed window counter with the sliding window log by weighting the previous window's count (covered below).
- Token Bucket: Tokens are generated into a bucket at a fixed rate and each request consumes one token, so short bursts of traffic are allowed.
- Leaky Bucket: Requests enter a fixed-size queue and are processed at a constant output rate, smoothing out bursts.
# Token bucket
A token bucket is a container with a pre-defined capacity. Tokens are added to the bucket periodically at a preset rate. Once the bucket is full, no more tokens are added.
For example, if the bucket capacity is 4 and the refiller puts 2 tokens into the bucket every second, any extra tokens overflow once the bucket is full.
Each request consumes one token. When a request arrives, we check if there are enough tokens in the bucket.
- If there are enough tokens, we take one token out for each request, and the request goes through.
- If there are not enough tokens, the request is dropped.
The token bucket algorithm takes two parameters:
- Bucket size: the maximum number of tokens allowed in the bucket
- Refill rate: number of tokens put into the bucket every second
How many buckets do we need? This varies, and it depends on the rate-limiting rules. Here are a few examples.
- It is usually necessary to have different buckets for different API endpoints. For instance, if a user is allowed to make 1 post per second, add 150 friends per day, and like 5 posts per second, 3 buckets are required for each user.
- If we need to throttle requests based on IP addresses, each IP address requires a bucket.
- If the system allows a maximum of 10,000 requests per second, it makes sense to have a global bucket shared by all requests.
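Below is a minimal, single-process sketch of the algorithm in TypeScript; the `TokenBucket` class and its method names are illustrative, not taken from any particular library:

```typescript
// In-memory token bucket: tokens refill lazily based on elapsed time,
// capped at the bucket capacity; each request consumes one token.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(
    private readonly capacity: number,     // bucket size: max tokens allowed
    private readonly refillPerSec: number, // refill rate: tokens added per second
  ) {
    this.tokens = capacity;
    this.lastRefillMs = Date.now();
  }

  private refill(): void {
    const now = Date.now();
    const elapsedSec = (now - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = now;
  }

  // Returns true if a token was consumed (request allowed), false if dropped.
  tryConsume(): boolean {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Example matching the earlier numbers: capacity 4, refilled at 2 tokens per second.
const tokenBucket = new TokenBucket(4, 2);
console.log(tokenBucket.tryConsume()); // true while tokens remain in the bucket
```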
Pros:
- The algorithm is easy to implement.
- Memory efficient.
- Token bucket allows a burst of traffic for short periods: a request can go through as long as there are tokens left.

Cons:
- The two parameters, bucket size and token refill rate, can be challenging to tune properly.
# Leaking bucket
Imagine a leaky bucket. Water (packets) is poured into the bucket at an arbitrary rate. However, there's a hole at the bottom, allowing water to leak out at a constant rate. If the water is poured in faster than it can leak out, the bucket will eventually overflow.
# How it Works
In the context of network traffic, the leaky bucket algorithm operates similarly:
- Bucket: Represents a buffer to store incoming packets.
- Input Rate: Packets arrive at the bucket at a variable rate.
- Output Rate: Packets are removed from the bucket and sent at a constant rate.
- Overflow: If the bucket fills up before packets can be removed, incoming packets are typically dropped.
# Key Characteristics
- Smooths traffic: It ensures a constant output rate regardless of the input burstiness.
- Simple implementation: Relatively easy to implement compared to other algorithms.
- Packet loss: Can lead to packet loss if the input rate exceeds the output rate for an extended period.
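A minimal in-memory sketch of the same idea in TypeScript, assuming a single process; the `LeakyBucket` class and its names are illustrative:

```typescript
// Leaky bucket: incoming items go into a fixed-capacity queue (the bucket)
// and are drained at a constant rate, regardless of how bursty the input is.
class LeakyBucket<T> {
  private queue: T[] = [];

  constructor(
    private readonly capacity: number,       // buffer size of the bucket
    private readonly leakIntervalMs: number, // one item is processed per interval
  ) {}

  // Returns false (item dropped) if the bucket is already full.
  offer(item: T): boolean {
    if (this.queue.length >= this.capacity) {
      return false;
    }
    this.queue.push(item);
    return true;
  }

  // Drain the queue at a constant output rate.
  start(process: (item: T) => void): ReturnType<typeof setInterval> {
    return setInterval(() => {
      const next = this.queue.shift();
      if (next !== undefined) {
        process(next);
      }
    }, this.leakIntervalMs);
  }
}

// Example: buffer up to 10 requests and process one every 200 ms (5 per second).
const leaky = new LeakyBucket<string>(10, 200);
leaky.offer("request-1");
const timer = leaky.start((req) => console.log("processing", req));
// clearInterval(timer) when shutting down.
```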
# Sliding window counter
The sliding window counter combines the current window's request count with a weighted count from the previous window. Its building block is an atomic per-key counter with an expiry; the Redis Lua script below increments that counter and starts the expiry timer on the first hit of a window:
```lua
-- Atomically increment the hit counter for the key and return the hit count
-- together with the remaining time (in milliseconds) until the key expires.
local totalHits = redis.call("INCR", KEYS[1])
local timeToExpire = redis.call("PTTL", KEYS[1])
if timeToExpire <= 0 then
  -- First hit of this window: start the expiry timer (ARGV[1] is the window length in ms).
  redis.call("PEXPIRE", KEYS[1], tonumber(ARGV[1]))
  timeToExpire = tonumber(ARGV[1])
end
return { totalHits, timeToExpire }
```
```typescript
// scriptSrc holds the Lua script shown above; `this.redis` is assumed to be an
// ioredis client, `key` identifies the caller/endpoint being limited, and
// `ttl` is the window length in seconds.
const scriptSrc = `...`;
const results: number[] = (await this.redis.call(
  'EVAL',
  scriptSrc,
  1,          // number of keys the script receives
  key,        // KEYS[1]: the rate-limit counter key
  ttl * 1000, // ARGV[1]: window length in milliseconds for PEXPIRE
)) as number[];
```
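A short sketch of how the returned pair might be used to accept or reject a request; `limit` here is an illustrative constant for the maximum number of hits allowed per window:

```typescript
// Illustrative follow-up to the call above: reject once the window's hit count
// exceeds the allowed limit, and tell the caller when the window resets.
const limit = 5; // assumed maximum number of requests per window
const [totalHits, timeToExpireMs] = results;

if (totalHits > limit) {
  throw new Error(`Rate limit exceeded, retry in ${timeToExpireMs} ms`);
}
// Otherwise the request proceeds.
```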
# Bulkhead vs Rate Limiter
- Bulkhead: limits the number of concurrent calls in flight at any one time. Example: allow 5 concurrent calls at a time.
- Rate Limiter: limits the total number of calls within a given period of time. Example: allow 5 calls every 2 seconds.
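To make the contrast concrete, here is a minimal bulkhead sketch in TypeScript: a small counting semaphore that caps how many calls run concurrently, rather than how many start per time window (the `Bulkhead` class is illustrative):

```typescript
// Bulkhead: at most `maxConcurrent` tasks run at once; extra callers wait
// in a queue until a running task finishes and frees a slot.
class Bulkhead {
  private permits: number;
  private waiting: Array<() => void> = [];

  constructor(maxConcurrent: number) {
    this.permits = maxConcurrent;
  }

  private async acquire(): Promise<void> {
    if (this.permits > 0) {
      this.permits -= 1;
      return;
    }
    // No free slot: wait until release() hands one over.
    await new Promise<void>((resolve) => this.waiting.push(resolve));
  }

  private release(): void {
    const next = this.waiting.shift();
    if (next) {
      next(); // hand the freed slot directly to the next waiter
    } else {
      this.permits += 1;
    }
  }

  // Run `task`, ensuring no more than maxConcurrent tasks execute at a time.
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}

// Example: allow 5 concurrent calls at a time.
const bulkhead = new Bulkhead(5);
// await bulkhead.run(() => callDownstreamService()); // callDownstreamService is hypothetical
```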