The GravixLayer SDK provides dedicated deployment capabilities, allowing you to create isolated, scalable model instances with guaranteed capacity and enterprise-grade features.
Public Preview: GravixLayer is currently in public preview. Features are experimental and may change or break as API endpoints and models continue to be updated.
Deployments allow you to:
- Create Deployments: Launch dedicated model instances with guaranteed GPU resources
- Manage Deployments: List, retrieve status, and monitor your active deployments
- Scale Deployments: Configure replica counts and GPU allocation for your workloads
- Delete Deployments: Remove deployments to manage costs and resources
- Monitor Performance: Track deployment metrics and resource utilization
## Prerequisites
Before deploying models, you need to set up your API key:
API Key Required: You must export your GravixLayer API key in your terminal before creating deployments. All deployment operations are tied to your API key and account.
Set your API key:
Windows (CMD):

```shell
set GRAVIXLAYER_API_KEY=your_api_key_here
```

Windows (PowerShell):

```powershell
$env:GRAVIXLAYER_API_KEY="your_api_key_here"
```

Linux/macOS:

```shell
export GRAVIXLAYER_API_KEY=your_api_key_here
```
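Since every deployment operation is tied to this key, it helps to fail fast in code when the variable is missing. A minimal check using only the standard library:

```python
import os

def require_api_key(env_var="GRAVIXLAYER_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Export it in your shell before "
            "creating deployments."
        )
    return key
```

Calling this at startup produces a readable error instead of a failed API call later.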
## Supported Models

### Text Models
The following models are available for dedicated deployment:
| Model Name | Model ID | Provider | Parameters | Context Length |
|---|---|---|---|---|
| Qwen: Qwen2.5-VL-3B-Instruct | qwen2-5-vl-3b-instruct | Qwen | 3B | 32,768 |
| Qwen: Qwen3-4B-Instruct-2507 | qwen3-4b-instruct-2507 | Qwen | 4B | 262,144 |
| Qwen: Qwen3-4B-Thinking-2507 | qwen3-4b-thinking-2507 | Qwen | 4B | 262,144 |
| DeepSeek: DeepSeek-R1-Distill-Qwen-1.5B | deepseek-r1-distill-qwen-1.5b | DeepSeek | 1.5B | 32,768 |
| Qwen: Qwen3-4B | qwen3-4b | Qwen | 4B | 32,768 |
| Qwen: Qwen3-1.7B | qwen3-1.7b | Qwen | 1.7B | 32,768 |
| Qwen: Qwen3-0.6B | qwen3-0.6b | Qwen | 0.6B | 32,768 |
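With a model chosen, a create-deployment request combines a Model ID from the table above with a hardware configuration. The helper below is a sketch only: the field names (`deployment_name`, `model_name`, `hardware`, `gpu_count`, `replicas`) are assumptions for illustration, not the SDK's documented schema.

```python
# Hypothetical sketch: field names are assumptions, not the documented API.
def build_deployment_request(deployment_name, model_id,
                             accelerator="NVIDIA_T4_16GB",
                             gpu_count=1, replicas=1):
    """Assemble a create-deployment request body."""
    return {
        "deployment_name": deployment_name,
        "model_name": model_id,   # use the Model ID column, not the display name
        "hardware": accelerator,  # accelerator name from the hardware table
        "gpu_count": gpu_count,   # must be 1, 2, 4, or 8
        "replicas": replicas,
    }
```

Note that the request takes the Model ID (e.g. `qwen3-4b`), not the display name (`Qwen: Qwen3-4B`).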
## Available Hardware
Currently supported GPU configurations:
| Accelerator | GPU Model | Memory | Pricing |
|---|---|---|---|
| NVIDIA T4 | NVIDIA_T4_16GB | 16GB | $0.39/hour |
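Assuming the listed rate is per GPU-hour, the hourly cost of a deployment scales with both the GPUs per replica and the replica count:

```python
T4_HOURLY_USD = 0.39  # per-GPU rate from the pricing table above

def hourly_cost(gpu_count, replicas=1, rate=T4_HOURLY_USD):
    """Total hourly cost: GPUs per replica x replicas x per-GPU hourly rate."""
    return gpu_count * replicas * rate
```

For example, two replicas with two T4s each cost 2 x 2 x $0.39 = $1.56 per hour.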
## GPU Count Validation

The `--gpu_count` parameter only accepts the following values: 1, 2, 4, 8.
If you provide any other value, you’ll receive an error:

```text
❌ Error: GPU count must be one of: 1, 2, 4, 8. You provided: 3
```

Only these GPU counts are supported.
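Validating the count client-side before submitting saves a failed API round trip. A small helper that mirrors the error message above:

```python
ALLOWED_GPU_COUNTS = (1, 2, 4, 8)

def validate_gpu_count(gpu_count):
    """Reject GPU counts the platform does not support."""
    if gpu_count not in ALLOWED_GPU_COUNTS:
        raise ValueError(
            f"GPU count must be one of: 1, 2, 4, 8. You provided: {gpu_count}"
        )
    return gpu_count
```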