Public Preview: Gravix Layer is currently in public preview. Features are experimental and may change or break as API endpoints and models continue to receive updates.
Overview
Dedicated deployments provide:
- Guaranteed Capacity: Reserved compute resources for your workloads
- Consistent Performance: Dedicated GPUs only (shared GPU support coming soon)
- Custom Scaling: Configure replicas based on your needs
Prerequisites
Before deploying models, you need to set up your API key.

API Key Required: You must export your GravixLayer API key in your terminal before creating deployments. All deployment operations are tied to your API key and account. Export commands for each supported platform are listed below:
- Windows (CMD)
- Windows (PowerShell)
- Linux/macOS
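The commands below are a minimal sketch of exporting the key on each platform. The environment variable name GRAVIXLAYER_API_KEY and the placeholder value are assumptions; substitute the variable name and key from your Gravix Layer account.

```cmd
:: Windows (CMD) - GRAVIXLAYER_API_KEY is an assumed variable name
set GRAVIXLAYER_API_KEY=your_api_key_here
```

```powershell
# Windows (PowerShell) - GRAVIXLAYER_API_KEY is an assumed variable name
$env:GRAVIXLAYER_API_KEY = "your_api_key_here"
```

```bash
# Linux/macOS - GRAVIXLAYER_API_KEY is an assumed variable name
export GRAVIXLAYER_API_KEY="your_api_key_here"
```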
Supported Models
Text Models
The following models are available for dedicated deployment:

| Model Name | Model ID | Provider | Parameters | Context Length (tokens) |
|---|---|---|---|---|
| Qwen: Qwen2.5-VL-3B-Instruct | qwen2-5-vl-3b-instruct | Qwen | 3B | 32,768 |
| Qwen: Qwen3-4B-Instruct-2507 | qwen3-4b-instruct-2507 | Qwen | 4B | 262,144 |
| Qwen: Qwen3-4B-Thinking-2507 | qwen3-4b-thinking-2507 | Qwen | 4B | 262,144 |
| DeepSeek: DeepSeek-R1-Distill-Qwen-1.5B | deepseek-r1-distill-qwen-1.5b | DeepSeek | 1.5B | 32,768 |
| Qwen: Qwen3-4B | qwen3-4b | Qwen | 4B | 32,768 |
| Qwen: Qwen3-1.7B | qwen3-1.7b | Qwen | 1.7B | 32,768 |
| Qwen: Qwen3-0.6B | qwen3-0.6b | Qwen | 0.6B | 32,768 |
Available Hardware
Currently supported GPU configurations:

| Accelerator | GPU Model | Memory | Pricing |
|---|---|---|---|
| NVIDIA T4 | NVIDIA_T4_16GB | 16GB | $0.39/hour |
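As a rough estimate based on the table above, a deployment on a single NVIDIA T4 running continuously costs about $0.39 × 24 ≈ $9.36 per day; assuming billing is per GPU, a deployment created with --gpu_count 2 would be roughly double that.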
GPU Count Validation
The `--gpu_count` parameter only accepts the following values: 1, 2, 4, 8
If you provide any other value, the deployment request is rejected with a validation error.
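As a minimal sketch (the deployment command itself and the exact error text are not reproduced here), you can check the value client-side before submitting a deployment; the variable name below is hypothetical:

```bash
# Hypothetical pre-flight check: allow only the documented --gpu_count values.
GPU_COUNT=3   # example value; replace with the count you intend to request

case "$GPU_COUNT" in
  1|2|4|8)
    echo "gpu_count=$GPU_COUNT is valid"
    ;;
  *)
    echo "error: --gpu_count must be one of 1, 2, 4, 8 (got $GPU_COUNT)" >&2
    exit 1
    ;;
esac
```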

