API overview & authentication
Geeper Relay exposes four groups of endpoints:
| Group | Path prefix | Auth |
|---|---|---|
| OpenAI-compatible inference | /v1/chat/completions, /v1/embeddings, /v1/models | API key |
| Anthropic Messages API | /v1/messages | API key |
| Admin | /internal/* | Master key |
| Health & metrics | /healthz, /readyz, /metrics | None |
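The table can be read as a simple path-to-credential lookup. The sketch below restates it in code; the function name `required_auth` and its return strings are illustrative assumptions, not part of Relay.

```python
# Hypothetical helper that mirrors the endpoint table above.
INFERENCE_PATHS = {
    "/v1/chat/completions",
    "/v1/embeddings",
    "/v1/models",
    "/v1/messages",
}
UNAUTHENTICATED_PATHS = {"/healthz", "/readyz", "/metrics"}


def required_auth(path: str) -> str:
    """Return which credential the table lists for a given path."""
    if path.startswith("/internal/"):
        return "master key"
    if path in UNAUTHENTICATED_PATHS:
        return "none"
    if path in INFERENCE_PATHS:
        return "api key"
    raise ValueError(f"path not listed in the endpoint table: {path}")
```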
Authentication
Relay supports two authentication modes simultaneously.
Relay-issued keys
Keys issued by Relay start with gr-. Pass them in the Authorization header:

    Authorization: Bearer gr-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Or, for Anthropic-format clients:

    x-api-key: gr-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Keys are issued via the admin API (POST /internal/api-keys) or via Google SSO. See First API key.
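As a minimal sketch of the two header styles for Relay-issued keys, the helper below picks the header by client format. The name `relay_auth_headers` and the `anthropic_format` flag are illustrative assumptions, not a Relay API.

```python
def relay_auth_headers(api_key: str, anthropic_format: bool = False) -> dict[str, str]:
    """Build the auth header for a Relay-issued key (hypothetical helper)."""
    if not api_key.startswith("gr-"):
        raise ValueError("Relay-issued keys start with 'gr-'")
    if anthropic_format:
        # Anthropic-format clients send the key in x-api-key.
        return {"x-api-key": api_key}
    # Other clients send the key as a bearer token.
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dictionary can be merged into the headers of whichever HTTP client is in use.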
Passthrough keys (bring your own)
When server.allow_passthrough_keys is true (default), any key that does not start with gr- is forwarded directly to the upstream provider. The request still goes through all Relay middleware (PII scrubbing, content policy, rate limiting).
This lets employees point their existing SDK at Relay without being issued a separate key:

    export ANTHROPIC_BASE_URL=https://relay.company.com
    # ANTHROPIC_API_KEY stays as their own key; no changes needed

This works with any provider (Anthropic, OpenAI, Azure, Gemini, etc.). The upstream provider authenticates the key; Relay never validates it.
To restrict access to Relay-issued keys only, set allow_passthrough_keys: false in config.
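The prefix rule described in this section can be sketched as a small decision function. This is not Relay's implementation: `KeyRoute`, `classify_key`, and the `REJECTED` label are assumptions made for illustration, and the source does not say which error Relay returns for a non-gr- key when passthrough is disabled.

```python
from enum import Enum


class KeyRoute(Enum):
    RELAY_ISSUED = "relay_issued"  # key starts with gr-
    PASSTHROUGH = "passthrough"    # forwarded to the upstream provider
    REJECTED = "rejected"          # assumed label: passthrough is disabled


def classify_key(api_key: str, allow_passthrough_keys: bool = True) -> KeyRoute:
    """Classify a key by the gr- prefix rule (hypothetical helper)."""
    if api_key.startswith("gr-"):
        return KeyRoute.RELAY_ISSUED
    if allow_passthrough_keys:
        return KeyRoute.PASSTHROUGH
    return KeyRoute.REJECTED
```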
Master key (admin endpoints)
The PROXY_MASTER_KEY grants full admin access. Use it only for automation and key provisioning; never distribute it to end users.

    Authorization: Bearer <PROXY_MASTER_KEY>

Request ID
Every response includes an x-request-id header with a UUID. Include this in bug reports and log queries.
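A client can capture the request ID at the point of failure so it is available for bug reports. The sketch below assumes response headers are available as a plain mapping; `extract_request_id` is a hypothetical name.

```python
import uuid
from typing import Mapping, Optional


def extract_request_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the x-request-id value if it parses as a UUID, else None."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("x-request-id")
    if value is None:
        return None
    try:
        # Round-trip through uuid.UUID to validate and canonicalise the value.
        return str(uuid.UUID(value))
    except ValueError:
        return None
```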
Error envelope
All error responses use a consistent JSON shape:

    {
      "error": {
        "type": "rate_limit_exceeded",
        "message": "Token rate limit exceeded. Retry after 47 seconds.",
        "code": 429
      }
    }

Common error types:
| type | HTTP status | Description |
|---|---|---|
| authentication_error | 401 | Invalid or missing API key |
| content_policy_violation | 400 | Blocked pattern or token limit exceeded |
| rate_limit_exceeded | 429 | Token-bucket limit hit |
| model_not_allowed | 400 | Model not in allowedModels |
| upstream_error | 502 | LLM provider returned an error |
| internal_error | 500 | Unexpected proxy error |
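Because every error shares one envelope, a client can parse it into a single exception type. The following is a minimal sketch; `RelayError` and `parse_error` are hypothetical names, and the code assumes only the three fields shown in the example envelope.

```python
import json


class RelayError(Exception):
    """Client-side view of the error envelope (hypothetical class)."""

    def __init__(self, type_: str, message: str, code: int) -> None:
        super().__init__(f"{type_} ({code}): {message}")
        self.type = type_
        self.message = message
        self.code = code


def parse_error(body: str) -> RelayError:
    """Parse a JSON error body of the shape {"error": {type, message, code}}."""
    err = json.loads(body)["error"]
    return RelayError(err["type"], err["message"], int(err["code"]))
```

Callers can then branch on `err.type` rather than matching on message text.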
Rate limit headers
On a 429 response:

    Retry-After: 47

The value is the number of seconds until the rate-limiting bucket refills enough to allow the request.
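One way a client might honor this header is to sleep for the indicated time and try again. The sketch below is an assumption-laden illustration: `send` stands in for any callable that performs the request and returns `(status, headers)`, and `retry_after_seconds`, `send_with_retry`, the one-second fallback, and the three-attempt cap are all hypothetical choices.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read Retry-After as seconds, falling back to `default` if absent or invalid."""
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    if raw is None:
        return default
    try:
        return max(0.0, float(raw))
    except ValueError:
        return default


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, waiting out 429 responses, up to `max_attempts` calls in total."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for _ in range(max_attempts - 1):
        if response[0] != 429:
            break
        sleep(retry_after_seconds(response[1]))
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the retry loop testable without real delays.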