Public Preview: Gravix Layer is currently in public preview. API endpoints, rate limits, and models are being updated frequently. The service is free to try.
Gravix Layer enforces API rate limits to ensure stability, fairness, and security for all users. This page explains how limits work, what happens if you exceed them, and how to design your integration for reliability.
What Are Rate Limits?
Rate limits control how many API requests or tokens you can use in a set time period. They:
- Keep the service stable
- Ensure fair access
- Prevent abuse
Types of Limits
| Abbreviation | What It Means |
|---|---|
| RPM | Requests per minute |
| RPD | Requests per day |
| TPM | Tokens per minute |
| TPD | Tokens per day |
Free Plan Rate Limits
| Limit Type | Description | Free Plan |
|---|---|---|
| RPM | Requests per Minute | 25 |
| RPD | Requests per Day | 1,000 |
| TPM | Tokens per Minute | 10,000 |
| TPD | Tokens per Day | 100,000 |
How Limits Work
- Limits are enforced at the organization level (not per user)
- All API keys in your org share the same pool
- Multiple users with different keys count toward the same limits
You hit a limit as soon as either your request count or your token count reaches its cap, whichever comes first. Limits reset at the start of each window (every minute for RPM and TPM, every day for RPD and TPD).
What Happens If You Exceed a Limit?
- The API returns HTTP 429 Too Many Requests
- The response includes a Retry-After header with the wait time
- You should implement automatic retry logic with backoff
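The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not an official Gravix Layer client: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, and the sketch assumes the Retry-After value is a number of seconds and that `headers` is a plain dict keyed by the exact name `Retry-After`. When the header is missing or unparseable, it falls back to capped exponential backoff.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Return the number of seconds to wait before the next attempt."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            # Assumption: the header carries a delay in seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Unparseable value (e.g. an HTTP date): fall back to backoff.
    # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`.
    return min(cap, base * 2 ** attempt)


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, attempt))
    # Still rate limited after the final attempt: return the last 429 response.
    return status, headers, body
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. In production you may also want to add a small random jitter to the delay so that many clients sharing one organization pool do not all retry at the same instant.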
Best Practices for Staying Within Limits
- Monitor both requests and tokens
- Use exponential backoff on 429 errors
- Batch multiple operations into single requests
- Cache responses to reduce duplicate calls
- Write concise prompts to minimize token usage
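One way to monitor both requests and tokens on the client side is a small per-minute budget tracker, sketched below. The class name `MinuteBudget` and its methods are hypothetical, the defaults are the Free Plan RPM and TPM values from the table above, and the token count passed in is your own estimate. Because the client cannot see exactly when the server's window starts, the local window is only an approximation; treat it as a way to avoid most 429 responses, not as a guarantee.

```python
import time
from typing import Callable


class MinuteBudget:
    """Client-side estimate of per-minute request and token usage."""

    def __init__(self, rpm: int = 25, tpm: int = 10_000,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock
        self.window_start = clock()
        self.requests = 0
        self.tokens = 0

    def _roll_window(self) -> None:
        # Start a fresh local window once 60 seconds have elapsed.
        now = self.clock()
        if now - self.window_start >= 60:
            self.window_start = now
            self.requests = 0
            self.tokens = 0

    def try_spend(self, tokens: int) -> bool:
        """Record one request of `tokens` tokens if it fits both caps."""
        self._roll_window()
        if self.requests + 1 > self.rpm or self.tokens + tokens > self.tpm:
            return False  # Sending now would exceed RPM or TPM.
        self.requests += 1
        self.tokens += tokens
        return True
```

A caller would check `budget.try_spend(estimated_tokens)` before each request and wait or queue the request when it returns `False`. Since all API keys in an organization share one pool, a tracker in a single process only sees its own traffic, so it is most useful when one service handles all of the organization's calls.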
Example: Hitting a Limit
Suppose your limits are:
| Limit Type | Value |
|---|---|
| RPM | 25 |
| TPM | 10,000 |
If you send 25 requests with 200 tokens each in a minute:
| Metric | Value | Limit | Status |
|---|---|---|---|
| Requests sent | 25 | 25 | Limit reached |
| Tokens used | 5,000 | 10,000 | OK |
Even though you are below the token limit, you still hit the request-per-minute limit and will receive a 429 error if you send more requests in that minute.
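The arithmetic in this example can be reproduced with a few lines of code. The helper `minute_usage` below is a hypothetical illustration that uses the RPM and TPM values from the example as defaults; it simply multiplies requests by tokens per request and compares each total with its cap.

```python
from typing import Dict, Tuple


def minute_usage(requests: int, tokens_per_request: int,
                 rpm: int = 25, tpm: int = 10_000) -> Dict[str, Tuple[int, int, str]]:
    """Return (used, limit, status) for requests and tokens in one minute."""
    tokens = requests * tokens_per_request

    def status(used: int, limit: int) -> str:
        return "Limit reached" if used >= limit else "OK"

    return {
        "requests": (requests, rpm, status(requests, rpm)),
        "tokens": (tokens, tpm, status(tokens, tpm)),
    }


usage = minute_usage(requests=25, tokens_per_request=200)
print(usage["requests"])  # (25, 25, 'Limit reached')
print(usage["tokens"])    # (5000, 10000, 'OK')
```

The same check shows the opposite case: 10 requests of 1,000 tokens each stay well under the request cap but reach the 10,000 token cap.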