Rate Limiting

Per-tenant rate limits for API and UI endpoints.

The Bizzlink API enforces per-tenant rate limits to protect backend resources and ensure fair usage across all tenants. Rate limits are applied at the Cloud Gateway level, based on your tenant identity (from your API key).

Rate Limit Tiers

Endpoint	Method	Limit	Burst
`/bizzlink/document/create-and-send/invoice`	POST	60/min	80
`/bizzlink/document/create-and-send/credit-note`	POST	60/min	80
`/bizzlink/document/create-and-send/ubl21v2`	POST	60/min	80
`/bizzlink/document/send/ubl21`	POST	60/min	80
`/bizzlink/document/send/zugferd/xml`	POST	60/min	80
`/bizzlink/document/send/zugferd/pdf`	POST	60/min	80
`/bizzlink/document/{id}/pdf`	GET	30/min	40
`/bizzlink/document/validate`	POST	60/min	80
`/bizzlink/convert/*`	POST	60/min	80
`/bizzlink/webhooks`	POST	10/min	15
`/bizzlink/webhooks/{id}/actions/test`	POST	5/min	5

No other endpoints are rate-limited at the application level.

Burst allows short spikes above the sustained rate. For example, a 60/min limit with burst 80 means you can send up to 80 requests in a quick burst, but over time the average must stay at or below 60 per minute.
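
One common way to get this sustained-rate-plus-burst behaviour is a token bucket. The sketch below assumes that model purely for illustration; this page does not say which algorithm the gateway uses, and the `TokenBucket` class and its parameter names are hypothetical. It shows a 60/min limit with burst 80 accepting 80 simultaneous requests, then refilling at one request per second.

```python
class TokenBucket:
    """Illustrative token bucket; an assumed model, not the gateway's code."""

    def __init__(self, rate_per_min: float, burst: float) -> None:
        self.rate_per_min = rate_per_min  # sustained limit, e.g. 60
        self.burst = burst                # burst capacity, e.g. 80
        self.tokens = burst               # the bucket starts full
        self.last = 0.0                   # time of the last update, in seconds

    def allow(self, now: float) -> bool:
        """Refill for the elapsed time, then try to spend one token."""
        elapsed = now - self.last
        refill = elapsed * self.rate_per_min / 60.0
        self.tokens = min(self.burst, self.tokens + refill)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_min=60, burst=80)

# 100 requests arriving at the same instant: only the first 80 are accepted.
accepted = sum(bucket.allow(now=0.0) for _ in range(100))

# One second later a 60/min rate has refilled one token, so one more passes.
first_after_wait = bucket.allow(now=1.0)
second_after_wait = bucket.allow(now=1.0)
```

Under this model a client that drains the burst has to fall back to the sustained rate until the bucket refills, which is why the long-run average cannot exceed 60 per minute.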

How Tenants Are Identified

Your tenant is identified by the tenant code embedded in your API key.

Each tenant has its own independent rate limit counters per endpoint tier.

Response When Rate Limited

When you exceed the rate limit, the API returns:

HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-RateLimit-Remaining: 0
X-RateLimit-Burst-Capacity: 80
X-RateLimit-Replenish-Rate: 60
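
A client can detect this response and read the wait time from `Retry-After`. The following is a minimal sketch: the function name `retry_delay_seconds` and the 60-second fallback are assumptions for illustration, and it handles only the integer-seconds form shown above.

```python
from typing import Mapping, Optional


def retry_delay_seconds(
    status: int, headers: Mapping[str, str], default: float = 60.0
) -> Optional[float]:
    """Return how long to wait before retrying, or None if not rate limited."""
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after", "").strip()
    # Fall back to the (assumed) default if the header is absent or malformed.
    return float(raw) if raw.isdigit() else default


delay = retry_delay_seconds(429, {"Retry-After": "60", "X-RateLimit-Remaining": "0"})
```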

Response Headers

Every response from a rate-limited endpoint includes these headers:

Header	Description
`X-RateLimit-Remaining`	Number of requests remaining in the current window
`X-RateLimit-Burst-Capacity`	Maximum burst capacity for this endpoint
`X-RateLimit-Replenish-Rate`	Requests per minute allowed for this endpoint
`Retry-After`	Seconds to wait before retrying (only on `429` responses)
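
As an illustration, these headers can be collected into a small structure after each call. This sketch takes the headers as a plain mapping, so it does not depend on a particular HTTP client; `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical names, not part of any Bizzlink SDK.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitInfo:
    remaining: Optional[int]       # X-RateLimit-Remaining
    burst_capacity: Optional[int]  # X-RateLimit-Burst-Capacity
    replenish_rate: Optional[int]  # X-RateLimit-Replenish-Rate


def _to_int(value: Optional[str]) -> Optional[int]:
    """Convert a header value to int; None if missing or not a number."""
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # Header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitInfo(
        remaining=_to_int(lowered.get("x-ratelimit-remaining")),
        burst_capacity=_to_int(lowered.get("x-ratelimit-burst-capacity")),
        replenish_rate=_to_int(lowered.get("x-ratelimit-replenish-rate")),
    )


info = parse_rate_limit_headers({
    "X-RateLimit-Remaining": "12",
    "X-RateLimit-Burst-Capacity": "80",
    "X-RateLimit-Replenish-Rate": "60",
})
```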

Best Practices

Respect `Retry-After`: When you receive a `429`, wait at least the indicated number of seconds before retrying.
Implement exponential backoff: For batch processing, add increasing delays between retries.
Spread requests over time: Instead of sending 60 documents at once, distribute them evenly across the minute.
Monitor rate limit headers: Check `X-RateLimit-Remaining` to proactively slow down before hitting the limit.
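
The practices above can be combined in a small retry helper. This is a sketch under stated assumptions, not a reference client: `send` is a hypothetical callable that performs one request and returns the status code and headers, and the base delay, cap, attempt count, and random jitter are illustrative choices rather than documented values.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]


def backoff_delay(
    attempt: int, retry_after: float, base: float = 1.0, cap: float = 300.0
) -> float:
    """Exponential delay for a 0-based attempt, never shorter than Retry-After."""
    exponential = min(cap, base * (2 ** attempt))
    return max(retry_after, exponential)


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it is not rate limited or the attempts run out."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers
        lowered = {name.lower(): value for name, value in headers.items()}
        raw = lowered.get("retry-after", "").strip()
        retry_after = float(raw) if raw.isdigit() else 0.0
        # Assumed extra: random jitter so parallel workers do not retry in lockstep.
        sleep(backoff_delay(attempt, retry_after) + random.uniform(0.0, 1.0))
    raise ValueError("max_attempts must be at least 1")


def pacing_interval(limit_per_min: int) -> float:
    """Seconds between requests that spread them evenly across a minute."""
    return 60.0 / limit_per_min
```

For a 60/min endpoint, `pacing_interval(60)` gives one second between requests, so a batch sent at that pace stays at the sustained limit and leaves the burst capacity free for genuine spikes.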