Sizing Connection Pools Per Tenant Tier

Picking one pool size for every tenant either starves your paying customers or buries the database under idle connections, so pool sizing has to follow the tiers you sell. This page sits under connection pooling in multi-tenant systems, and turns plan-level SLAs into concrete pool numbers instead of guesses.

Problem Framing

A connection pool is a fixed budget. PostgreSQL holds a few megabytes of backend memory per connection and the planner and lock manager degrade once active backends exceed the core count, so the server can sustain on the order of a few hundred connections before throughput falls rather than rises. Every tenant tier draws against that same ceiling. If you hand each tenant the same pool, a thousand free-tier accounts can exhaust the budget and leave your enterprise tenants waiting on a checkout queue while their SLA quietly burns.

The hard part is that demand is not uniform. A premium tenant running interactive dashboards needs low, predictable latency on every query; a free tenant doing occasional CRUD can tolerate a few hundred milliseconds of pool wait. Sizing the same way for both wastes connections on tenants who do not notice and rations them from tenants who pay for the difference. The decision matters because it is the point where your commercial tiers — the ones described in subscription and plan enforcement — become a physical resource allocation rather than a marketing table.

The flow below shows how a single per-tier concurrency target turns into a pool size, and where the two failure directions sit: too small produces checkout queueing, too large produces database overload.

Size each tier from its own concurrency target, then bound the result so the sum of all pools stays under the database connection ceiling.

Step-by-Step Guide

1. Measure arrival rate and service time per tier

Before any formula, you need two numbers for each tier: the peak query arrival rate (queries per second) and the mean query service time (seconds, including network round trip). Pull them from your metrics, grouped by tenant tier, at the busiest minute of the week — not the daily average.

-- Peak QPS and mean query time per tenant tier, from a metering table
SELECT t.tier,
       max(per_sec.qps)              AS peak_qps,
       avg(m.duration_ms) / 1000.0   AS mean_service_s
FROM query_metrics m
JOIN tenants t ON t.id = m.tenant_id
LEFT JOIN LATERAL (
  SELECT count(*) AS qps
  FROM query_metrics x
  WHERE x.tenant_id = m.tenant_id
  GROUP BY date_trunc('second', x.started_at)
) per_sec ON true
WHERE m.started_at > now() - interval '7 days'
GROUP BY t.tier;

2. Apply Little's Law to get the steady-state concurrency

Little's Law states that the average number of in-flight requests L equals the arrival rate lambda times the average time each request holds a connection W. That L is the minimum number of connections a tier needs to keep up at steady state — anything less and the queue grows without bound.

import math

def steady_state_connections(peak_qps: float, mean_service_s: float) -> float:
    # Little's Law: L = lambda * W
    return peak_qps * mean_service_s

# Premium tier: 120 QPS, 25 ms average query hold time
L = steady_state_connections(120, 0.025)
print(L)  # 3.0 connections needed to keep up at steady state

3. Add burst headroom on top of the steady-state number

Steady state is the floor, not the answer. Real traffic arrives in bursts, and at exactly L connections the pool wait time goes to infinity as utilization approaches 100%. Add a multiplier so the pool absorbs spikes; tighter SLAs get more headroom. The full size is ceil(L * burst_factor).

def pool_size(peak_qps: float, mean_service_s: float, burst_factor: float) -> int:
    L = peak_qps * mean_service_s
    return max(1, math.ceil(L * burst_factor))

# Premium gets a 2.0x burst factor; free tier gets 1.3x
premium = pool_size(120, 0.025, 2.0)   # ceil(3.0 * 2.0) = 6
free    = pool_size(40,  0.060, 1.3)   # ceil(2.4 * 1.3) = 4
print(premium, free)

4. Decide dedicated vs shared pools by tier

Premium tenants get a dedicated pool each, so one noisy premium tenant cannot starve another and latency stays isolated. Standard and free tenants share a pool per group, because the per-tenant concurrency is low and pooling many of them amortizes the connection budget. The split is the whole point: isolation where it is paid for, density everywhere else.

# Pool plan, keyed by tier. Premium = one pool per tenant; others share.
tiers:
  premium:
    mode: dedicated        # one pool per tenant
    peak_qps: 120
    mean_service_s: 0.025
    burst_factor: 2.0      # pool size 6 per tenant
  standard:
    mode: shared           # one pool for the whole group
    peak_qps: 300          # aggregate across all standard tenants
    mean_service_s: 0.040
    burst_factor: 1.5      # pool size 18 for the group
  free:
    mode: shared
    peak_qps: 200          # aggregate
    mean_service_s: 0.060
    burst_factor: 1.3      # pool size 16 for the group

5. Check the total against the database ceiling

The sum of every dedicated pool plus every shared pool must stay under what the database can serve. If 30 premium tenants at 6 connections each already consume 180 connections, there is little left for the shared tiers, and you either raise the ceiling, cap premium count, or front everything with a transaction-pooling proxy as described in PgBouncer transaction pooling for multi-tenant SaaS.

DB_MAX_CONNECTIONS = 200
RESERVED = 20  # superuser, replication, migrations, monitoring

premium_pool, premium_tenants = 6, 25
standard_pool, free_pool = 18, 16

total = premium_pool * premium_tenants + standard_pool + free_pool
budget = DB_MAX_CONNECTIONS - RESERVED
assert total <= budget, f"over budget: {total} > {budget}"
print(total, "of", budget)  # 184 of 180  -> fails, cut premium count or add a proxy

The worked table below shows how the same formula yields different sizes per tier and where the totals land against a 200-connection server.

Tier	Mode	Peak QPS	Service (ms)	L = lambda x W	Burst	Pool size	Pools	Connections
Premium	Dedicated	120	25	3.0	2.0x	6 / tenant	25	150
Standard	Shared	300	40	12.0	1.5x	18 / group	1	18
Free	Shared	200	60	12.0	1.3x	16 / group	1	16
Reserved	—	—	—	—	—	20	—	20
Total								204 / 200

That total is two over the ceiling, which is the signal to either drop premium to 20 dedicated tenants (120 connections) or move the shared tiers behind a transaction-pooling proxy so their 34 server connections shrink to a handful.

Verification

Assert that each tier's configured pool is at least its steady-state minimum and that the grand total fits the budget. Run this in CI so a config edit that breaks the math fails the build rather than production.

import math

def min_required(peak_qps, mean_service_s):
    return math.ceil(peak_qps * mean_service_s)

def test_each_tier_meets_steady_state(tier_config):
    for name, t in tier_config.items():
        floor = min_required(t["peak_qps"], t["mean_service_s"])
        assert t["pool_size"] >= floor, f"{name}: {t['pool_size']} < {floor}"

def test_total_under_db_ceiling(tier_config, db_max=200, reserved=20):
    total = sum(t["pool_size"] * t["pools"] for t in tier_config.values())
    assert total <= db_max - reserved, f"{total} exceeds {db_max - reserved}"

Under load, a correctly sized pool shows near-zero wait time at the connection acquire step. This HikariCP log line is what a healthy premium pool looks like — active count below the maximum and no pending threads:

HikariPool-premium - Pool stats (total=6, active=3, idle=3, waiting=0)

A waiting value that stays above zero on a tier with a latency SLA means the pool is undersized for its burst factor, not that the database is slow.

Failure Modes & Gotchas

One pool size for every tenant. Symptom: enterprise queries queue while free tenants hold connections. Root cause: a flat pool ignores tier SLAs. Fix: size each tier from its own arrival rate and give premium dedicated pools.
Sizing at exactly L. Symptom: latency spikes the moment traffic bursts even though average utilization looks fine. Root cause: at 100% utilization the queue time diverges. Fix: multiply by a burst factor (1.3x–2x), larger for tighter SLAs.
Sum of pools exceeds the server ceiling. Symptom: FATAL: sorry, too many clients already under peak load. Root cause: per-tier pools sized in isolation without checking the total. Fix: assert the sum against max_connections minus reserved, or front the low tiers with a transaction-pooling proxy.
Using average instead of peak. Symptom: pools that test fine but queue every weekday at 9am. Root cause: arrival rate measured as a daily mean smooths away the busy minute. Fix: size from the peak one-minute QPS, not the average.

FAQ

What pool size should a premium tenant get? Compute its steady-state concurrency with Little's Law — peak QPS times mean query hold time — then multiply by a burst factor of about 2x for a latency-sensitive tier and round up. A tenant at 120 QPS and 25 ms needs 120 x 0.025 = 3 connections at steady state, so a dedicated pool of 6 covers bursts.

Should free-tier tenants share one pool? Yes. Free and standard tenants have low individual concurrency, so a shared pool per tier group packs many tenants into a small connection budget. Reserve dedicated pools for premium tenants whose SLA requires isolation from noisy neighbors.

How much burst headroom is enough? It depends on how spiky the traffic is and how tight the SLA is. A factor of 1.3x suits free tiers that tolerate queueing; 2x or more suits premium tiers where any pool wait shows up as user-visible latency. Validate it by watching the pool's waiting count under real peak load.