If a system is required to handle short bursts of requests from the
same users, the strategy from the previous post will not work. One
solution in this case is what is called a Token Bucket: each user
starts with n tokens available, and each request consumes one token.
Once a user reaches 0 tokens, their requests are denied. Tokens are
replenished at a steady rate.
This post shows an implementation of rate limiting that allows bursts of up to 10 requests while averaging 1 request per second (tokens refill at a rate of 1 per second, and the bucket holds at most 10).
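For example, a user with a full bucket can fire 10 requests back to back and drain it immediately; the next request is denied until a token is refilled one second later, so the sustained rate is at most 1 request per second, and a full burst of 10 only becomes available again after 10 seconds of inactivity.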
create unlogged table rate_limit( -- 1
  id int not null,
  tokens smallint not null default 9 -- 2
    check(tokens >= 0 and tokens <= 10),
  primary key (id)
);

create index on rate_limit using btree(tokens) where tokens < 10; -- 3
1: Unlogged table for faster writes; please don't go changing all your tables to unlogged without understanding what you lose =);
2: A new row is inserted the first time a user makes a request;
tokens defaults to 9, allowing 10 requests in total (the current one
plus the 9 remaining);
3: Partial index: only rows that still need refilling are kept in the index.
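As a quick sanity check (a throwaway psql session, not part of the post's scripts), inserting a row by hand shows the default in action; the id 42 is arbitrary:

insert into rate_limit (id) values (42); -- first request for a hypothetical user 42
select id, tokens from rate_limit where id = 42; -- returns tokens = 9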
Some SQL-fu, run for each request:
\set id random(0, 10000) -- 1

with ins_or_upt as (
  insert into rate_limit as x (id)
  values (:id) -- 2
  on conflict(id) do -- 3
    update set tokens = x.tokens - 1
    where x.id = excluded.id and x.tokens > 0
  returning tokens -- 4
)
select tokens from ins_or_upt
union all -- 5
select 0
limit 1;
1: The above snippet is used with pgbench later, hence the use of the
\set command to pick a random user id;
2: Using the variable set at line 1;
3: If a row already exists for :id, update instead;
4: Returns the number of remaining tokens, useful for the user;
5: Always return exactly one row with the remaining tokens for :id;
when the bucket is empty the update matches no row, the CTE returns
nothing, and the fallback 0 is returned, meaning the request should be
denied.
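To see the statement in action outside pgbench, it can be run by hand with a fixed id in place of the \set variable (the id 1 below is arbitrary), assuming the calls happen back to back:

with ins_or_upt as (
  insert into rate_limit as x (id)
  values (1)
  on conflict(id) do
    update set tokens = x.tokens - 1
    where x.id = excluded.id and x.tokens > 0
  returning tokens
)
select tokens from ins_or_upt
union all
select 0
limit 1;

-- calls 1 through 10 return 9, 8, ..., 1, 0;
-- from the 11th call on the where clause rejects the update
-- and the fallback 0 signals a denied request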
We can consume tokens, but they are never refilled (yet). The system's backend could, of course, periodically issue a statement to refill tokens one at a time, but that's not this blog's theme. Enter pg_cron, an extension that does just that: it periodically executes a statement. After installing it, we just schedule the following:
select cron.schedule(
  '1 second',
  $$ update rate_limit set tokens = tokens + 1 where tokens < 10 $$
);
Done. This adds 1 token per second to every row with fewer than 10 tokens, capping at 10 (a full bucket after 10 seconds of inactivity). All of this with no backend code.
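If you want to confirm the job was registered (or clean it up later), pg_cron keeps its scheduled jobs in the cron.job metadata table; a quick check might look like this:

-- list scheduled jobs; jobid is needed to unschedule
select jobid, schedule, command from cron.job;

-- to remove the refill job later (replace <jobid> with the value from above):
-- select cron.unschedule(<jobid>);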
Just for fun, I ran the insert script via pgbench using a single connection, single client, single thread, simulating roughly 10k distinct users.
docker exec -i pg pgbench \
  --time=30 \
  --client=1 \
  --jobs=1 \
  --protocol=prepared \
  --no-vacuum \
  --file=insert.sql
You're right, that's not a proper benchmark. Still, achieving ~20k transactions per second (tps) with my (okayish) setup and stock PostgreSQL configuration makes me wonder why some (NoSQL) database systems have such widespread adoption.
If you are wondering how to run this yourself, download these files:
Place everything in the same directory, run: docker compose up
and then execute pgbench.