
Rate limiting

Geeper Relay uses a token-bucket algorithm. Each user has three independent buckets:

Bucket               Config key                     Default
Requests per minute  defaults.requests_per_minute   60
Tokens per minute    defaults.tokens_per_minute     100 000
Tokens per day       defaults.tokens_per_day        1 000 000
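As a rough illustration of the token-bucket idea (not the relay's actual implementation), the sketch below models a single bucket that refills continuously up to its capacity. The class name TokenBucket, its fields, and the continuous-refill policy are assumptions made for this example; the daily bucket in particular may reset on a different schedule.

```python
import time


class TokenBucket:
    """Hypothetical bucket: holds up to `capacity` tokens, refilled evenly over `period_s` seconds."""

    def __init__(self, capacity, period_s, now=None):
        self.capacity = float(capacity)
        self.rate = self.capacity / period_s          # tokens added per second
        self.tokens = self.capacity                   # start full
        self.updated_at = time.monotonic() if now is None else now

    def try_consume(self, cost=1.0, now=None):
        """Refill for the elapsed time, then deduct `cost` if enough tokens remain."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if cost > self.tokens:
            return False
        self.tokens -= cost
        return True


# The three per-user defaults listed above, one bucket each.
user_buckets = {
    "requests_per_minute": TokenBucket(60, 60),
    "tokens_per_minute": TokenBucket(100_000, 60),
    "tokens_per_day": TokenBucket(1_000_000, 86_400),
}
```

The optional `now` argument exists only so the refill logic can be exercised with a fake clock.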

Each team can optionally have:

Bucket                  Set via
Team tokens per minute  POST /internal/teams, field tpm_limit
Team tokens per day     POST /internal/teams, field daily_token_limit

A request consumes from both the user bucket and the team bucket. Either one can reject the request.
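One way to read "either one can reject" is an all-or-nothing check: every applicable bucket is inspected first, and tokens are deducted only if all of them have room. The sketch below assumes that behaviour; whether the relay still charges the user bucket when the team bucket rejects is not stated here. Bucket state is simplified to a dict of remaining tokens, and consume_all is a hypothetical name.

```python
from typing import Dict, Optional


def consume_all(buckets: Dict[str, float], cost: float) -> Optional[str]:
    """Deduct `cost` from every bucket, or from none.

    Returns None on success, or the name of the first bucket that lacks capacity.
    """
    for name, remaining in buckets.items():
        if remaining < cost:
            return name  # reject without touching any bucket
    for name in buckets:
        buckets[name] -= cost
    return None


# Illustrative state: plenty of user budget, little team budget left.
remaining = {"user_tokens_per_minute": 100_000, "team_tokens_per_minute": 1_500}
first = consume_all(remaining, 1_200)   # fits in both buckets
second = consume_all(remaining, 1_200)  # team bucket has only 300 left
```

Checking before deducting keeps a rejected request from silently draining the bucket that still had capacity.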

rate_limiting:
  enabled: true
  backend: memory # or "redis"
  defaults:
    requests_per_minute: 60
    tokens_per_minute: 100000
    tokens_per_day: 1000000

With the memory backend, buckets are stored in-process. This is fast (no network round-trip), but buckets are:

  • Not shared across uvicorn workers: each worker is a separate process with its own buckets (a rare issue, only with --workers > 1)
  • Not shared across replicas — each pod enforces limits independently

Suitable for single-replica deployments and local development.
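A back-of-the-envelope estimate shows why this matters. If traffic is spread evenly and every worker in every replica keeps its own bucket, a user can get through roughly replicas × workers times the configured limit in the worst case. The helper below is purely illustrative arithmetic under that assumption, not part of the relay.

```python
def worst_case_limit(configured_limit: int, replicas: int = 1, workers_per_replica: int = 1) -> int:
    """Upper bound on what one user can consume when each worker enforces the limit on its own."""
    return configured_limit * replicas * workers_per_replica


single = worst_case_limit(60)                                     # one pod, one worker
scaled = worst_case_limit(60, replicas=3, workers_per_replica=2)  # six independent buckets
```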

rate_limiting:
  backend: redis

With the Redis backend, buckets are stored in Redis and updated by atomic Lua scripts, so they are shared across all workers and all replicas.
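To give a feel for what an atomic token-bucket script can look like, here is a minimal sketch using redis-py's register_script(). The Lua source, the hash layout (fields tokens and ts), the expiry policy, and the key name in the usage comment are all assumptions for illustration; the relay's real scripts are not shown in this document and may differ.

```python
import time

# Assumed script: refill based on elapsed time, then deduct if enough tokens remain.
# Redis runs the whole script as one atomic step, so concurrent workers cannot interleave.
TOKEN_BUCKET_LUA = """
local capacity = tonumber(ARGV[1])
local refill_per_sec = tonumber(ARGV[2])
local cost = tonumber(ARGV[3])
local now = tonumber(ARGV[4])

local state = redis.call('HMGET', KEYS[1], 'tokens', 'ts')
local tokens = tonumber(state[1]) or capacity
local ts = tonumber(state[2]) or now

tokens = math.min(capacity, tokens + math.max(0, now - ts) * refill_per_sec)
local allowed = 0
if tokens >= cost then
  tokens = tokens - cost
  allowed = 1
end
redis.call('HSET', KEYS[1], 'tokens', tokens, 'ts', now)
redis.call('EXPIRE', KEYS[1], math.ceil(capacity / refill_per_sec) * 2)
return allowed
"""


def try_consume(client, key, capacity, period_s, cost, now=None):
    """Run the script against `client` (expected to be a redis.Redis instance)."""
    script = client.register_script(TOKEN_BUCKET_LUA)
    # The timestamp comes from the caller, so clock skew between replicas
    # can distort refills slightly in this sketch.
    now = time.time() if now is None else now
    allowed = script(keys=[key], args=[capacity, capacity / period_s, cost, now])
    return allowed == 1


# Usage (needs the `redis` package and a reachable server; the key name is made up):
#   import redis
#   client = redis.Redis.from_url("redis://localhost:6379")
#   try_consume(client, "rl:user:alice:tpm", 100_000, 60, cost=1_200)
```

Doing the read, refill, and deduct in one script is what removes the race between two workers that would otherwise both see the same remaining balance.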

Connect to an external Redis:

RATE_LIMITING__REDIS_URL=redis://user:pass@redis.internal:6379

Override limits for a specific team via the admin API:

curl -X POST http://localhost:8000/internal/teams \
  -H "Authorization: Bearer $PROXY_MASTER_KEY" \
  -d '{
    "name": "data-science",
    "tpm_limit": 500000,
    "daily_token_limit": 10000000
  }'

The tpm_limit and daily_token_limit fields are optional. Omit them to use the global defaults for that team's users.
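The same call can be scripted. The sketch below builds the request with Python's standard library and leaves out any limit that was not supplied, mirroring the optional fields. The function name build_team_request is hypothetical, and the Content-Type header is an assumption: the curl example does not set it, so drop it if the API does not expect it.

```python
import json
import os
import urllib.request


def build_team_request(base_url, master_key, name, tpm_limit=None, daily_token_limit=None):
    """Build (but do not send) a POST /internal/teams request."""
    payload = {"name": name}
    # Optional fields are omitted entirely rather than sent as null.
    if tpm_limit is not None:
        payload["tpm_limit"] = tpm_limit
    if daily_token_limit is not None:
        payload["daily_token_limit"] = daily_token_limit
    return urllib.request.Request(
        url=f"{base_url}/internal/teams",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",  # assumption, see note above
        },
    )


req = build_team_request(
    "http://localhost:8000",
    os.environ.get("PROXY_MASTER_KEY", ""),
    "data-science",
    tpm_limit=500_000,
)
# Send it with: urllib.request.urlopen(req)
```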

When a bucket is exhausted the proxy returns:

HTTP/1.1 429 Too Many Requests
Retry-After: 47
Content-Type: application/json

{
  "error": {
    "type": "rate_limit_exceeded",
    "message": "Token rate limit exceeded. Retry after 47 seconds.",
    "code": 429
  }
}

Retry-After is the number of seconds until the bucket refills enough to allow the request.
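On the client side, a simple way to honour this header is to sleep for the advertised number of seconds before retrying. The sketch below is one possible approach, not an official client: call_with_retry is a hypothetical helper, `send` stands in for whatever function performs the HTTP call and returns a (status, headers) pair, and the one-second fallback is an arbitrary choice.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it returns a non-429 status or `max_attempts` is reached."""
    status, headers = send()
    for _ in range(max_attempts - 1):
        if status != 429:
            break
        try:
            delay = float(int(headers.get("Retry-After", "1")))
        except ValueError:
            delay = 1.0  # header present but not an integer number of seconds
        sleep(delay)
        status, headers = send()
    return status, headers
```

The lookup assumes the header key is spelled exactly "Retry-After"; most HTTP libraries expose a case-insensitive mapping, in which case this is not a concern.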

relay_rate_limit_hits_total{limit_type="requests_per_minute"} 3
relay_rate_limit_hits_total{limit_type="tokens_per_minute"} 12
relay_rate_limit_hits_total{limit_type="tokens_per_day"} 1