Scaling an API is not just about throwing more servers at the problem. The bottleneck is almost always the database. Getting the fundamentals right — connection pooling, indexed queries, and smart caching — takes you much further than horizontal scaling alone.
Connection Pooling
Each database connection is expensive. Opening a new connection per request kills performance at scale. Use a pool — pg-pool for raw Postgres or the built-in pooling in Prisma/Drizzle. A pool size of 10–20 connections per instance is a reasonable starting point.
import { Pool } from "pg"
const pool = new Pool({
connectionString: process.env.DATABASE_URL,
max: 20,
idleTimeoutMillis: 30_000,
})
export async function query(text: string, params?: unknown[]) {
const client = await pool.connect()
try {
return await client.query(text, params)
} finally {
client.release()
}
}Index the Right Columns
A query that scans a million rows is a query that will eventually bring your API to its knees. Every column that appears in a WHERE, JOIN ON, or ORDER BY clause is a candidate for an index. Use EXPLAIN ANALYZE in psql to find sequential scans that should be index scans.
Pagination Over OFFSET
OFFSET 10000 means Postgres reads and discards 10,000 rows before returning your page. Cursor-based pagination — storing the last seen ID or timestamp — is O(log n) instead and stays fast at any depth.
-- Cursor pagination (much faster than OFFSET)
SELECT * FROM posts
WHERE id > $lastSeenId
ORDER BY id ASC
LIMIT 20Response Caching with Redis
For read-heavy endpoints that rarely change, a cache dramatically reduces database load. Cache the serialized JSON, set a sensible TTL, and invalidate on writes.
async function getPost(id: string) {
const cached = await redis.get(`post:${id}`)
if (cached) return JSON.parse(cached)
const post = await db.query("SELECT * FROM posts WHERE id = $1", [id])
await redis.setex(`post:${id}`, 300, JSON.stringify(post))
return post
}Rate Limiting
Protect your API from abusive clients by rate limiting at the edge (Cloudflare, Nginx) or in-process with a sliding window counter backed by Redis. Returning a Retry-After header helps well-behaved clients back off gracefully.
Closing Thoughts
Measure before you optimize. Add structured logging and a metrics layer (Prometheus, Datadog, or even a simple performance.now() wrapper) so you know exactly where time is being spent before reaching for a new abstraction.