Create Deployment

Deploy AI models on dedicated infrastructure with customizable hardware configurations for production-ready applications.

Check Available Hardware

First, list available GPU options:

CLI:

gravixlayer deployments gpu --list

Example Output:

Available GPUs (1 found):

GPU ID: 2d7c7178-aa1d-4b27-840d-ca8c0f35d5b1
Model: NVIDIA T4 16GB
GPU Model Code: NVIDIA_T4_16GB
Memory: 16GB
Link: pcie
Status: available
Pricing: $0.39/hour
Updated: 2025-09-08T01:35:50Z

Python SDK:

import os
from gravixlayer import GravixLayer

client = GravixLayer()

# List available hardware
hardware_options = client.deployments.list_hardware()
for hardware in hardware_options:
    print(f"GPU ID: {hardware.gpu_id}")
    print(f"Model: {hardware.gpu_model}")
    print(f"Memory: {hardware.gpu_memory}GB")
    print(f"Pricing: ${hardware.pricing}/hour")
    print(f"Status: {hardware.status}")
    print("---")

JavaScript SDK:

import { GravixLayer } from 'gravixlayer';

const client = new GravixLayer({
  apiKey: process.env.GRAVIXLAYER_API_KEY,
});

// List available hardware (using accelerators resource)
const hardwareOptions = await client.accelerators.list();
hardwareOptions.forEach(hardware => {
  console.log(`GPU ID: ${hardware.accelerator_id}`);
  console.log(`Model: ${hardware.gpu_model}`);
  console.log(`Memory: ${hardware.gpu_memory}GB`);
  console.log(`Pricing: $${hardware.pricing}/hour`);
  console.log(`Status: ${hardware.status}`);
  console.log("---");
});
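
Once the hardware list is in hand, choosing a GPU can be scripted. The following is a minimal sketch, not part of the SDK: `GpuOption` and `cheapest_available` are hypothetical names, the record fields mirror the values printed in the example output, and the `"unavailable"` status and second list entry are made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class GpuOption:
    # Hypothetical record mirroring the fields shown in the example output.
    gpu_model_code: str
    memory_gb: int
    price_per_hour: float
    status: str


def cheapest_available(options: Iterable[GpuOption],
                       min_memory_gb: int = 0) -> Optional[GpuOption]:
    """Return the lowest-priced available option with enough memory, or None."""
    candidates = [
        o for o in options
        if o.status == "available" and o.memory_gb >= min_memory_gb
    ]
    return min(candidates, key=lambda o: o.price_per_hour, default=None)


options = [
    GpuOption("NVIDIA_T4_16GB", 16, 0.39, "available"),
    # Made-up entry, present only to show the filter at work.
    GpuOption("EXAMPLE_GPU_80GB", 80, 2.50, "unavailable"),
]
choice = cheapest_available(options, min_memory_gb=16)
print(choice.gpu_model_code if choice else "no GPU matches")
```

The selected `gpu_model_code` is the value that would be passed as the GPU model when creating a deployment.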

Create a Deployment

Create a new dedicated deployment:

CLI:

Basic deployment:

gravixlayer deployments create --deployment_name "test_model" --model_name "qwen3-1.7b" --gpu_model "NVIDIA_T4_16GB" --gpu_count 1 --wait

With additional parameters:

gravixlayer deployments create --deployment_name "deepseek-r1-distill-qwen-1-5b-br9w3e" --gpu_model "NVIDIA_T4_16GB" --gpu_count 2 --min_replicas 1 --max_replicas 3 --model_name "deepseek-r1-distill-qwen-1.5b"

Example Output:

Creating deployment 'test_model' with model 'qwen3-1.7b'...
Deployment ID: 5865969c-a1dc-4509-9651-89758b27c87c
Deployment Name: test_model
Status: creating
Model: qwen3-1.7b
GPU Model: NVIDIA_T4_16GB
GPU Count: 1
Min Replicas: 1
Max Replicas: 1
Created: 2025-09-17T09:16:39.304602Z

⏳ Waiting for deployment 'test_model' to be ready...
   Press Ctrl+C to stop monitoring (deployment will continue in background)
   Status: creating
   Status: running

🚀 Deployment is now ready!

Python SDK:

import os
from gravixlayer import GravixLayer

client = GravixLayer()

# Create a basic deployment
deployment = client.deployments.create(
    deployment_name="test_model",
    model_name="qwen3-1.7b",
    gpu_model="NVIDIA_T4_16GB",
    gpu_count=1,
    hw_type="dedicated"
)

print(f"Created deployment: {deployment.deployment_id}")
print(f"Deployment name: {deployment.deployment_name}")
print(f"Status: {deployment.status}")

JavaScript SDK:

import { GravixLayer } from 'gravixlayer';

const client = new GravixLayer({
  apiKey: process.env.GRAVIXLAYER_API_KEY,
});

// Create a basic deployment
const deployment = await client.deployments.create({
  deployment_name: "test_model",
  model_name: "qwen3-1.7b",
  gpu_model: "NVIDIA_T4_16GB",
  gpu_count: 1,
  hw_type: "dedicated"
});

console.log(`Created deployment: ${deployment.deployment_id}`);
console.log(`Status: ${deployment.status}`);

Deployment Parameters

The flags used in the CLI commands above:

Parameter	Value	Description
`--deployment_name`	`"test_model"`	Unique name for your deployment
`--model_name`	`"qwen3-1.7b"`	Model to deploy
`--gpu_model`	`"NVIDIA_T4_16GB"`	GPU type
`--gpu_count`	`1`	Number of GPUs
`--min_replicas`	`1`	Minimum number of replicas
`--max_replicas`	`3`	Maximum number of replicas
`--wait`	(flag)	Wait for deployment to be ready
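
When deployments are created from scripts or CI jobs, it can help to assemble the CLI call programmatically. The sketch below is one possible approach: `build_create_command` is a hypothetical helper, not part of the Gravix Layer tooling, and it uses only the flags listed in the table above. Defaulting `gpu_count` to 1 is an assumption made for the example.

```python
import shlex
from typing import List, Optional


def build_create_command(
    deployment_name: str,
    model_name: str,
    gpu_model: str,
    gpu_count: int = 1,  # assumption: the docs do not state a default
    min_replicas: Optional[int] = None,
    max_replicas: Optional[int] = None,
    wait: bool = False,
) -> List[str]:
    """Build the argument list for `gravixlayer deployments create`."""
    args = [
        "gravixlayer", "deployments", "create",
        "--deployment_name", deployment_name,
        "--model_name", model_name,
        "--gpu_model", gpu_model,
        "--gpu_count", str(gpu_count),
    ]
    # Optional flags are added only when a value was supplied.
    if min_replicas is not None:
        args += ["--min_replicas", str(min_replicas)]
    if max_replicas is not None:
        args += ["--max_replicas", str(max_replicas)]
    if wait:
        args.append("--wait")
    return args


command = build_create_command(
    "test_model", "qwen3-1.7b", "NVIDIA_T4_16GB", wait=True
)
# shlex.join quotes arguments where needed, giving a copy-pasteable line.
print(shlex.join(command))
```

The returned list can be passed directly to `subprocess.run(command, check=True)`, which avoids shell quoting issues.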

Advanced Creation Options

CLI:

With Auto-Retry:

gravixlayer deployments create --deployment_name "my-model" --model_name "qwen3-1.7b" --gpu_model "NVIDIA_T4_16GB" --auto-retry

With Wait Flag (Recommended):

gravixlayer deployments create --deployment_name "production_model" --model_name "qwen3-4b-instruct-2507" --gpu_model "NVIDIA_T4_16GB" --gpu_count 2 --min_replicas 1 --wait

Python SDK:

import os
from gravixlayer import GravixLayer

client = GravixLayer()

# Create with auto-retry
deployment = client.deployments.create(
    deployment_name="my_model",
    model_name="qwen3-1.7b",
    gpu_model="NVIDIA_T4_16GB",
    gpu_count=2,
    auto_retry=True
)

print(f"Created deployment with unique name: {deployment.deployment_name}")

JavaScript SDK:

import { GravixLayer } from 'gravixlayer';

const client = new GravixLayer({
  apiKey: process.env.GRAVIXLAYER_API_KEY,
});

// Create with unique name (manual implementation)
const uniqueName = `my_model_${Date.now()}`;
const uniqueDeployment = await client.deployments.create({
  deployment_name: uniqueName,
  model_name: "qwen3-1.7b",
  gpu_model: "NVIDIA_T4_16GB",
  gpu_count: 2,
  min_replicas: 1,
  hw_type: "dedicated"
});

console.log(`Created deployment with unique name: ${uniqueDeployment.deployment_name}`);

Create Parameters Reference

Parameter	Type	Required	Description
`--deployment_name`	string	Yes	Unique name for the deployment
`--model_name`	string	Yes	Model name to deploy
`--gpu_model`	string	Yes	GPU model (e.g., NVIDIA_T4_16GB)
`--gpu_count`	int	No	Number of GPUs (supported: 1, 2, 4, 8)
`--min_replicas`	int	No	Minimum replicas (default: 1)
`--max_replicas`	int	No	Maximum replicas (default: 1)
`--hw_type`	string	No	Hardware type (default: dedicated)
`--auto-retry`	flag	No	Auto-retry with unique name if name exists
`--wait`	flag	No	Wait for deployment to be ready before exiting
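
The constraints in this table can be checked locally before a request is sent. The following is a rough sketch under stated assumptions: `validate_create_params` is a hypothetical helper, the supported GPU counts come from the table, and the replica rules (at least 1, and minimum not above maximum) are assumptions that the documentation does not spell out.

```python
SUPPORTED_GPU_COUNTS = {1, 2, 4, 8}  # taken from the reference table


def validate_create_params(deployment_name, model_name, gpu_model,
                           gpu_count=1, min_replicas=1, max_replicas=1):
    """Return a list of problems; an empty list means the values look usable."""
    problems = []
    # The three required string parameters must be non-empty.
    for label, value in (("deployment_name", deployment_name),
                         ("model_name", model_name),
                         ("gpu_model", gpu_model)):
        if not value or not value.strip():
            problems.append(f"{label} is required")
    if gpu_count not in SUPPORTED_GPU_COUNTS:
        problems.append(
            f"gpu_count must be one of {sorted(SUPPORTED_GPU_COUNTS)}"
        )
    # Assumption: replicas are positive and min does not exceed max.
    if min_replicas < 1:
        problems.append("min_replicas must be at least 1")
    if max_replicas < min_replicas:
        problems.append("max_replicas must be >= min_replicas")
    return problems


print(validate_create_params("test_model", "qwen3-1.7b", "NVIDIA_T4_16GB"))
print(validate_create_params("test_model", "qwen3-1.7b", "NVIDIA_T4_16GB",
                             gpu_count=3, min_replicas=2, max_replicas=1))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once.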

Auto-Retry Feature

Use the `--auto-retry` flag to automatically generate a unique deployment name if the specified name already exists:

gravixlayer deployments create --deployment_name "my-model" --model_name "qwen3-1.7b" --gpu_model "NVIDIA_T4_16GB" --auto-retry

This creates a deployment with a name such as `my-model-1234abcd` if `my-model` already exists.
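
For clients without a built-in equivalent, similar behavior can be approximated by hand. This is a sketch only: `unique_deployment_name` is a hypothetical helper, and the eight-character hexadecimal suffix is inferred from the shape of the example name above, not from a documented naming scheme.

```python
import uuid
from typing import Set


def unique_deployment_name(base_name: str, existing_names: Set[str]) -> str:
    """Return base_name, or base_name plus a random suffix if it is taken."""
    if base_name not in existing_names:
        return base_name
    while True:
        # Eight hex characters, matching the shape of "my-model-1234abcd".
        candidate = f"{base_name}-{uuid.uuid4().hex[:8]}"
        if candidate not in existing_names:
            return candidate


print(unique_deployment_name("my-model", set()))         # name is free
print(unique_deployment_name("my-model", {"my-model"}))  # name is taken
```

The set of existing names would have to come from the service; the check here is purely local and does not protect against two clients racing for the same name.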

Wait for Deployment

Use the `--wait` flag to monitor deployment status and wait until it’s ready:

gravixlayer deployments create --deployment_name "production-model" --model_name "qwen3-4b-instruct-2507" --gpu_model "NVIDIA_T4_16GB" --wait

This will show real-time status updates until the deployment is ready to use.
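
The same wait-until-ready pattern can be reproduced in a script with a polling loop. The sketch below is illustrative only: `wait_until_running` is a hypothetical helper, the status source is passed in as a callable so that no particular SDK method is assumed, and only the `creating` and `running` statuses shown in the example output are handled. Failure states are not covered because they are not listed here.

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,
    poll_interval_s: float = 5.0,
) -> str:
    """Poll get_status() until it returns "running" or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    last_status = None
    while True:
        status = get_status()
        # Print only when the status changes, to keep the log short.
        if status != last_status:
            print(f"Status: {status}")
            last_status = status
        if status == "running":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"deployment still '{status}' after {timeout_s}s"
            )
        time.sleep(poll_interval_s)


# Stand-in status source for demonstration; a real one would query the API.
fake_statuses = iter(["creating", "creating", "running"])
wait_until_running(lambda: next(fake_statuses),
                   timeout_s=5, poll_interval_s=0.01)
```

Using `time.monotonic()` for the deadline keeps the timeout correct even if the system clock is adjusted while the loop runs.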
