Deploy AI models on dedicated infrastructure with customizable hardware configurations for production-ready applications.

Check Available Hardware

First, list available GPU options:
The examples in this guide use the CLI; the same operations are also available through the Python and JavaScript SDKs.
gravixlayer deployments gpu --list
Example Output:
Available GPUs (1 found):

GPU ID: 2d7c7178-aa1d-4b27-840d-ca8c0f35d5b1
Model: NVIDIA T4 16GB
GPU Model Code: NVIDIA_T4_16GB
Memory: 16GB
Link: pcie
Status: available
Pricing: $0.39/hour
Updated: 2025-09-08T01:35:50Z
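
If you want to script this step, the sketch below (a non-authoritative example, assuming the gravixlayer CLI above is installed and authenticated) shells out to the same command and pulls out the GPU Model Code values, since that is what --gpu_model expects later:

import subprocess

# Minimal sketch: wrap the documented CLI call that lists GPU options.
# Assumes the gravixlayer CLI is installed and configured with an API key.
result = subprocess.run(
    ["gravixlayer", "deployments", "gpu", "--list"],
    capture_output=True,
    text=True,
    check=True,
)

# The output is plain text; collect the "GPU Model Code" lines, since that
# value is what --gpu_model expects when creating a deployment.
for line in result.stdout.splitlines():
    if line.strip().startswith("GPU Model Code:"):
        print(line.split(":", 1)[1].strip())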

Create a Deployment

Create a new dedicated deployment:
Basic deployment:
gravixlayer deployments create --deployment_name "test_model" --model_name "qwen3-1.7b" --gpu_model "NVIDIA_T4_16GB" --gpu_count 1 --wait
With additional parameters:
gravixlayer deployments create --deployment_name "deepseek-r1-distill-qwen-1-5b-br9w3e" --gpu_model "NVIDIA_T4_16GB" --gpu_count 2 --min_replicas 1 --max_replicas 3 --model_name "deepseek-r1-distill-qwen-1.5b"
Example Output:
Creating deployment 'test_model' with model 'qwen3-1.7b'...
Deployment ID: 5865969c-a1dc-4509-9651-89758b27c87c
Deployment Name: test_model
Status: creating
Model: qwen3-1.7b
GPU Model: NVIDIA_T4_16GB
GPU Count: 1
Min Replicas: 1
Max Replicas: 1
Created: 2025-09-17T09:16:39.304602Z

⏳ Waiting for deployment 'test_model' to be ready...
   Press Ctrl+C to stop monitoring (deployment will continue in background)
   Status: creating
   Status: running

🚀 Deployment is now ready!

Deployment Parameters

A breakdown of the parameters used in the example above:

Parameter            Value                Description
--deployment_name    "test_model"         Unique name for your deployment
--model_name         "qwen3-1.7b"         Model to deploy
--gpu_model          "NVIDIA_T4_16GB"     GPU type
--gpu_count          1                    Number of GPUs
--min_replicas       1                    Minimum number of replicas
--max_replicas       3                    Maximum number of replicas
--wait               (flag)               Wait for the deployment to be ready
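
If you create deployments from a script, the same call can be assembled from these parameters. The sketch below is a hedged example that simply wraps the documented CLI command; the values mirror the example above and are not a required configuration:

import subprocess

# Sketch only: build the documented `deployments create` command from the
# parameters listed above. The values are placeholders from the example.
params = {
    "--deployment_name": "test_model",
    "--model_name": "qwen3-1.7b",
    "--gpu_model": "NVIDIA_T4_16GB",
    "--gpu_count": "1",
    "--min_replicas": "1",
    "--max_replicas": "3",
}

cmd = ["gravixlayer", "deployments", "create"]
for flag, value in params.items():
    cmd.extend([flag, value])
cmd.append("--wait")  # block until the deployment reports it is ready

subprocess.run(cmd, check=True)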

Advanced Creation Options

With Auto-Retry:
gravixlayer deployments create --deployment_name "my-model" --model_name "qwen3-1.7b" --gpu_model "NVIDIA_T4_16GB" --auto-retry
With Wait Flag (Recommended):
gravixlayer deployments create --deployment_name "production_model" --model_name "qwen3-4b-instruct-2507" --gpu_model "NVIDIA_T4_16GB" --gpu_count 2 --min_replicas 1 --wait

Create Parameters Reference

Parameter            Type     Required   Description
--deployment_name    string   Yes        Unique name for the deployment
--model_name         string   Yes        Model name to deploy
--gpu_model          string   Yes        GPU model (e.g., NVIDIA_T4_16GB)
--gpu_count          int      No         Number of GPUs (supported: 1, 2, 4, 8)
--min_replicas       int      No         Minimum replicas (default: 1)
--max_replicas       int      No         Maximum replicas (default: 1)
--hw_type            string   No         Hardware type (default: dedicated)
--auto-retry         flag     No         Automatically retry with a unique name if the name already exists
--wait               flag     No         Wait for the deployment to be ready before exiting
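
Before calling create from a script, it can help to validate arguments against the constraints in this table. The sketch below checks only what the table documents, plus one assumption (that max_replicas should not be lower than min_replicas), which is not stated above:

# Sketch: client-side validation mirroring the reference table above.
SUPPORTED_GPU_COUNTS = {1, 2, 4, 8}

def validate_create_args(deployment_name: str, model_name: str, gpu_model: str,
                         gpu_count: int = 1, min_replicas: int = 1,
                         max_replicas: int = 1) -> None:
    if not (deployment_name and model_name and gpu_model):
        raise ValueError("deployment_name, model_name, and gpu_model are required")
    if gpu_count not in SUPPORTED_GPU_COUNTS:
        raise ValueError(f"gpu_count must be one of {sorted(SUPPORTED_GPU_COUNTS)}")
    # Assumption (not documented above): the replica range should be ordered.
    if min_replicas < 1 or max_replicas < min_replicas:
        raise ValueError("expected 1 <= min_replicas <= max_replicas")

validate_create_args("production_model", "qwen3-4b-instruct-2507",
                     "NVIDIA_T4_16GB", gpu_count=2, min_replicas=1, max_replicas=3)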

Auto-Retry Feature

Use the --auto-retry flag to automatically generate a unique deployment name if the specified name already exists:
gravixlayer deployments create --deployment_name "my-model" --model_name "qwen3-1.7b" --gpu_model "NVIDIA_T4_16GB" --auto-retry
This will create a deployment with a name like my-model-1234abcd if my-model already exists.
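The suffix is generated by the CLI. Purely as an illustration of the naming pattern in the example above (my-model-1234abcd), a script could build a similar fallback name itself; the exact format the CLI uses may differ:

import secrets

# Illustrative sketch only: mimic the documented "my-model-1234abcd" pattern
# with an 8-character hex suffix. The CLI handles this for you with --auto-retry.
def unique_name(base: str) -> str:
    return f"{base}-{secrets.token_hex(4)}"

print(unique_name("my-model"))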

Wait for Deployment

Use the --wait flag to monitor deployment status and wait until it’s ready:
gravixlayer deployments create --deployment_name "production-model" --model_name "qwen3-4b-instruct-2507" --gpu_model "NVIDIA_T4_16GB" --wait
This will show real-time status updates until the deployment is ready to use.
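When driving the CLI from a script, those status updates can be relayed as they are printed. A minimal sketch (assuming the same documented command and flags) that streams the output line by line:

import subprocess

# Sketch: run the documented create command with --wait and stream its status
# lines (e.g. "Status: creating", "Status: running") as they appear.
cmd = [
    "gravixlayer", "deployments", "create",
    "--deployment_name", "production-model",
    "--model_name", "qwen3-4b-instruct-2507",
    "--gpu_model", "NVIDIA_T4_16GB",
    "--wait",
]

with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
    for line in proc.stdout:
        print(line, end="")  # relay real-time status updates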