Rate limits

A rate limit defines the maximum number of requests an application can make to the API within a specified time period. This mechanism is crucial to maintain the overall health and reliability of the services.

Why we use rate limiting

API stability and performance

Prevents a single client from overwhelming servers with excessive requests.
Ensures consistent and predictable response times for all clients.
Protects backend services from unexpected traffic spikes.
Maintains a stable and controlled load pattern on the infrastructure.

Security

Strengthens defense against attack attempts, such as brute force.
Mitigates the risk of distributed denial-of-service (DDoS) attacks.
Reduces the impact caused by poorly implemented integrations.
Limits potential damage resulting from compromised API credentials.

Resource allocation

Ensures balanced access to API resources for all clients.
Prevents abusive usage by some clients from degrading the experience of others.
Allows prioritizing traffic according to business criteria and needs.
Encourages more efficient and sustainable API consumption practices.

How rate limiting works

Each API call counts toward the allowed requests-per-second limit. The default value is 10 RPS per tenant. When the limit is exceeded, the API returns HTTP 429 Too Many Requests.

Confirm with your Onboarding team

Exact limits are configured per tenant. Contact your Onboarding team to confirm the limits applied to your account before sizing for high-throughput integrations.

Handling 429 errors

Request a rate limit increase

If you expect your request volume to increase, temporarily or permanently, ask the Unico team to raise the rate limit for your environment. Make this request before the volume actually increases, so that your application does not become inoperative.

Review application behavior

Audit your code to identify inefficient API usage patterns.
Check for unintended loops or redundant API calls.
Distribute requests more evenly over time instead of sending them in large bursts.

Implement caching

Cache frequently accessed data that rarely changes.
Reuse the OAuth2 access token for its full 1-hour TTL; do not call POST /oauth2/token on every request.
Implement appropriate cache invalidation strategies.
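
The token reuse described above can be sketched as a small in-memory cache. This is a minimal illustration, not a reference implementation: `TokenCache`, `fetch_token`, and the 60-second safety margin are hypothetical names and values chosen for the example, and `fetch_token` stands in for whatever code in your integration performs POST /oauth2/token and returns the token with its TTL in seconds.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches an access token and refreshes it shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        safety_margin: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # fetch_token is a hypothetical callable that performs
        # POST /oauth2/token and returns (access_token, ttl_in_seconds).
        self._fetch_token = fetch_token
        # Refresh this many seconds early so a token never expires mid-request.
        self._safety_margin = safety_margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one only when needed."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._safety_margin:
            token, ttl = self._fetch_token()
            self._token = token
            self._expires_at = now + ttl
        return self._token
```

With a 1-hour TTL, every call to `get()` within roughly the first 59 minutes returns the same token without touching the token endpoint. The sketch is not thread-safe; a multi-threaded integration would need a lock around the refresh.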

Use webhooks instead of polling

Subscribe to Webhooks and Events for PROCESS_STATE_FINISHED rather than polling GET /client/v1/process/{id}. A polling loop can easily exhaust the GET-process budget under heavy traffic.

What happens when you hit the limit

The platform returns:

HTTP/1.1 429 Too Many Requests
Retry-After: 12

Header	Meaning
`Retry-After`	Number of seconds to wait before retrying. Honor it.

If Retry-After is missing, fall back to exponential backoff (1s, 2s, 4s, 8s, …) capped at 60s.
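
One way to express this retry rule is sketched below. The function names, the `send` callable, and the limit of 5 attempts are assumptions made for the example; `send` is a hypothetical stand-in for your own HTTP call and is expected to return the status code and the response headers.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status_code, headers)


def retry_delay(headers: Mapping[str, str], attempt: int, cap: float = 60.0) -> float:
    """Seconds to wait before the next try; `attempt` is 0 for the first retry."""
    # Assumes `headers` can be looked up by the exact name "Retry-After";
    # most HTTP client libraries expose a case-insensitive mapping.
    value = headers.get("Retry-After")
    if value is not None:
        try:
            return max(0.0, float(value))  # Honor the server's value as given.
        except ValueError:
            pass  # Not a number of seconds; fall back to backoff below.
    # Exponential backoff: 1s, 2s, 4s, 8s, ... capped at `cap` seconds.
    return min(cap, float(2 ** attempt))


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it returns something other than 429 or attempts run out."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            break
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, attempt))
    return status, headers
```

The sketch retries only on 429. Whether other status codes should be retried is a separate decision that this example does not try to make.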

Concurrency considerations

Rate limits cap request throughput, but the platform also has concurrency caps at the infrastructure level. Even if you have budget for 100 requests in a minute, firing all 100 in the same second is more likely to be rejected than spreading them across the minute.

For high-throughput integrations, target a steady RPS rather than bursts.
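
A steady rate can be approximated on the client side by spacing calls evenly. The `Pacer` class below is a hypothetical helper written for this example, not part of any SDK. It assumes a single-threaded sender and a target rate that you have confirmed for your tenant.

```python
import time
from typing import Callable, Optional


class Pacer:
    """Spaces calls at least 1/rps seconds apart (single-threaded use)."""

    def __init__(
        self,
        rps: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if rps <= 0:
            raise ValueError("rps must be positive")
        self._interval = 1.0 / rps
        self._clock = clock
        self._sleep = sleep
        self._next_slot: Optional[float] = None

    def wait(self) -> None:
        """Block until the next evenly spaced slot, then return."""
        now = self._clock()
        if self._next_slot is None or now >= self._next_slot:
            # First call, or the caller is already late: go now and do not
            # "catch up" with a burst later.
            self._next_slot = now + self._interval
            return
        self._sleep(self._next_slot - now)
        self._next_slot += self._interval
```

Calling `pacer.wait()` before each request with `Pacer(rps=8)` would keep a sender somewhat under a 10 RPS limit; the headroom value is illustrative.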

Magic Link webhook delivery

Trully has its own delivery quotas for Webhook V2. The webhook server is expected to respond within 1 minute; deliveries that take longer to acknowledge are dropped (the user's process is not affected). For consistent processing, acknowledge the webhook quickly and process it asynchronously.
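
The acknowledge-then-process pattern can be sketched with a queue and a background worker. `WebhookInbox`, the `process` callable, and the status codes returned are assumptions made for this example. The payload shape is also hypothetical, and how the platform reacts to a non-2xx response is not specified here.

```python
import queue
import threading
from typing import Callable


class WebhookInbox:
    """Accepts webhook events immediately and processes them on a worker thread."""

    def __init__(self, process: Callable[[dict], None], max_pending: int = 1000) -> None:
        self._process = process
        self._queue: "queue.Queue[dict]" = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def receive(self, event: dict) -> int:
        """Call from the HTTP handler; returns the status code to send back."""
        try:
            self._queue.put_nowait(event)  # Never blocks the HTTP response.
        except queue.Full:
            return 503  # Assumption: signal overload rather than accept and lose it.
        return 200

    def _run(self) -> None:
        while True:
            event = self._queue.get()
            try:
                self._process(event)  # Slow work happens off the request path.
            except Exception:
                pass  # A real handler would log the failure here.
            finally:
                self._queue.task_done()

    def drain(self) -> None:
        """Block until every queued event has been processed."""
        self._queue.join()
```

An in-memory queue loses pending events if the process crashes, so a durable queue may be a better fit where every event must be processed.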

What's next

Authentication — token caching strategy.
Webhooks and Events — replacing polling with push.
Error codes > Retry policy — when (and when not) to retry.
