Ways to Rate Limit Requests in Go

development golang

Introduction

Layer 7 DDoS attacks are quite common these days. With a simple script, any one can send thousands of requests to services to exhaust server resources. They can easily take down services and cause disruptions to a business. It is wise to implement strategies to mitigate such attacks using techniques that limit the number of requests a person can send to the service. It is possible to rate limit the user requests by IP address or a user ID, if available, although the former is more popular.

Basically, a service keeps track of the number of requests that have been originating from a single IP address in a given time window. If it exceeds a limit, the subsequent requests will be rejected by the service with an HTTP status code of 429 Too Many Requests. But in order to implement a rate limit feature, there are different algorithms that can be used. In this article, we will look at the different algorithms that one can consider.

Fixed window algorithm

In fixed-window algorithm, limits are set per time frame, such as 1,000 requests per hour or 10 requests per minute and so on. It is easy to implement and does quite well at rate limiting requests, but as the available quota resets, they are subject to spikes at the edges of the window. For example, let’s consider a limit of 1,000 requests per hour. This algorithm allows for a spike of all 1000 requests in the very first minute of the hour and thus overwhelm the service that this algorithm is protecting.

In Go, there are multiple packages available that uses this algorithm:

Sliding window algorithm

In sliding window algorithm, along with the benefits of fixed-window algorithm, the rolling window of time smooths out bursts of requests to the service by adding a weighted count in previous window to the count in current window.

In Go, there are multiple packages available that uses this algorithm:

Token bucket algorithm

The token bucket algorithm maintains a fixed capacity bucket where a tokens are added at a fixed rate. The tokens in most cases is a single request or in some cases, it can be network packets. When the bucket is full, the service is considered to have reached its limit and responds with backpressure. This algorithm is used when there is a case where not all inputs correspond to 1:1 with requests. For instance, a request to a service can make multiple database calls and then return a result. Each database call may take one token.

In Go, there are multiple packages available that uses this algorithm:

Leaky-bucket algorithm

A leaky bucket is similar to a token bucket, but the rate is limited by the amount that can drip or leak out of the bucket. This technique recognises that the system has some degree of finite capacity to hold a request until the service can act on it; any extra simply spills over the edge and is discarded. It is normally used to rate limit requests to load balancers and disk I/O buffers. It is also used in packet-switched computer networks and telecommunications networks in both the traffic policing and traffic shaping of data transmissions.

In Go, there are multiple packages available that uses this algorithm:

I hope this was helpful and if you find any errors or issues in this post, please do let me know in the comments section below.

In case you would like to get notified about more articles like this, please subscribe to my substack.

Get new posts by email

Comments

comments powered by Disqus