The GravixLayer SDK provides dedicated deployment capabilities, allowing you to create isolated, scalable model instances with guaranteed capacity and enterprise-grade features.
Public Preview: Gravix Layer is currently in public preview. Features are experimental and may have issues or break as API endpoints and models continue to be updated.
Deployments allow you to:
  • Create Deployments: Launch dedicated model instances with guaranteed GPU resources
  • Manage Deployments: List, retrieve status, and monitor your active deployments
  • Scale Deployments: Configure replica counts and GPU allocation for your workloads
  • Delete Deployments: Remove deployments to manage costs and resources
  • Monitor Performance: Track deployment metrics and resource utilization
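The operations above all revolve around a small set of deployment settings: which model to run, what hardware to run it on, and how many replicas to keep up. A minimal sketch of how such a request might be assembled — the field names here are illustrative assumptions, not the SDK's actual API, so consult the SDK reference for the real parameter names:

```python
# Sketch: collecting the core settings a dedicated deployment needs.
# Field names ("name", "model", "accelerator", ...) are hypothetical.

def build_deployment_request(name: str, model_id: str,
                             gpu_count: int = 1, replicas: int = 1) -> dict:
    """Bundle deployment settings into a single request payload."""
    return {
        "name": name,                     # a label for this deployment
        "model": model_id,                # e.g. "qwen3-4b" from the models table
        "accelerator": "NVIDIA_T4_16GB",  # currently the only listed GPU option
        "gpu_count": gpu_count,           # GPUs per replica
        "replicas": replicas,             # number of model instances
    }

req = build_deployment_request("my-qwen", "qwen3-4b", gpu_count=2, replicas=3)
```

Scaling a deployment then amounts to changing `gpu_count` or `replicas` in this payload; deleting it releases the reserved capacity.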

Prerequisites

Before deploying models, you need to set up your API key:
API Key Required: You must export your GravixLayer API key in your terminal before creating deployments. All deployment operations are tied to your API key and account.
Set your API key:
export GRAVIXLAYER_API_KEY=your_api_key_here
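Once exported, the key is available to any process started from that terminal and can be read from the environment at runtime. A small sketch — the variable name `GRAVIXLAYER_API_KEY` comes from the docs, while the fail-fast helper is illustrative:

```python
import os

def get_api_key() -> str:
    """Read the GravixLayer API key from the environment, failing fast if unset."""
    key = os.environ.get("GRAVIXLAYER_API_KEY")
    if not key:
        raise RuntimeError(
            "GRAVIXLAYER_API_KEY is not set. Export it in your terminal "
            "before creating deployments."
        )
    return key
```

Failing fast like this surfaces a missing key immediately, rather than as an authentication error mid-deployment.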

Supported Models

Text Models

The following models are available for dedicated deployment:
| Model Name | Model ID | Provider | Parameters | Context Length |
|---|---|---|---|---|
| Qwen: Qwen2.5-VL-3B-Instruct | qwen2-5-vl-3b-instruct | Qwen | 3B | 32,768 |
| Qwen: Qwen3-4B-Instruct-2507 | qwen3-4b-instruct-2507 | Qwen | 4B | 262,144 |
| Qwen: Qwen3-4B-Thinking-2507 | qwen3-4b-thinking-2507 | Qwen | 4B | 262,144 |
| DeepSeek: DeepSeek-R1-Distill-Qwen-1.5B | deepseek-r1-distill-qwen-1.5b | DeepSeek | 1.5B | 32,768 |
| Qwen: Qwen3-4B | qwen3-4b | Qwen | 4B | 32,768 |
| Qwen: Qwen3-1.7B | qwen3-1.7b | Qwen | 1.7B | 32,768 |
| Qwen: Qwen3-0.6B | qwen3-0.6b | Qwen | 0.6B | 32,768 |

Available Hardware

Currently supported GPU configurations:
| Accelerator | GPU Model | Memory | Pricing |
|---|---|---|---|
| NVIDIA T4 | NVIDIA_T4_16GB | 16GB | $0.39/hour |
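Since pricing is quoted per GPU per hour, the hourly cost of a deployment grows with both GPU count and replica count. A quick sanity-check helper — the $0.39/hour rate comes from the table above, but the per-GPU-per-replica billing formula is an assumption:

```python
T4_HOURLY_RATE = 0.39  # $/GPU/hour for NVIDIA T4, from the hardware table

def estimated_hourly_cost(gpu_count: int, replicas: int = 1) -> float:
    """Estimate hourly spend, assuming billing is per GPU across all replicas."""
    return T4_HOURLY_RATE * gpu_count * replicas

# 2 GPUs per replica across 3 replicas -> 6 GPUs billed per hour
cost = estimated_hourly_cost(2, 3)
```

This makes the cost of scaling explicit before you change replica counts.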

GPU Count Validation

The --gpu_count parameter accepts only the values 1, 2, 4, or 8. Any other value produces an error:
❌ Error: GPU count must be one of: 1, 2, 4, 8. You provided: 3
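This check can be mirrored client-side to catch invalid values before a request is ever submitted. A minimal sketch reproducing the documented error message (the helper itself is illustrative, not part of the SDK):

```python
VALID_GPU_COUNTS = (1, 2, 4, 8)

def validate_gpu_count(gpu_count: int) -> int:
    """Reject GPU counts the platform does not support, mirroring the server error."""
    if gpu_count not in VALID_GPU_COUNTS:
        raise ValueError(
            f"GPU count must be one of: 1, 2, 4, 8. You provided: {gpu_count}"
        )
    return gpu_count
```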