POST /deployments
Create Deployment
curl --request POST \
  --url https://api.gravixlayer.com/v1/deployments \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "deployment_name": "my-model-deployment",
  "model_name": "meta-llama/llama-3.1-8b-instruct",
  "gpu_model": "NVIDIA_T4_16GB",
  "gpu_count": 1,
  "min_replicas": 1,
  "max_replicas": 5,
  "hw_type": "nvidia-t4-16gb-pcie_1"
}'
{
  "id": "deploy-123456",
  "deployment_name": "my-model-deployment",
  "model_name": "meta-llama/llama-3.1-8b-instruct",
  "gpu_model": "NVIDIA_T4_16GB",
  "gpu_count": 1,
  "min_replicas": 1,
  "max_replicas": 5,
  "hw_type": "nvidia-t4-16gb-pcie_1",
  "status": "creating",
  "created_at": "2025-10-16T18:00:00Z"
}

Authorizations

Authorization
string
header
required

API key authentication. Send your API key as a Bearer token in the Authorization header (Authorization: Bearer <token>). Get your API key from the Gravix Layer Dashboard.
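
As a minimal sketch of the header described above, the helper below builds the two headers the curl example sends. The function name `auth_headers` and the environment variable `GRAVIXLAYER_API_KEY` are assumptions made for illustration; neither is defined by the API.

```python
import os


def auth_headers(api_key=None):
    """Return the headers used by the curl example for an authenticated JSON request."""
    # GRAVIXLAYER_API_KEY is an assumed variable name chosen for this sketch.
    key = api_key or os.environ.get("GRAVIXLAYER_API_KEY")
    if not key:
        raise ValueError("no API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source files; passing it explicitly is convenient in tests.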

Body

application/json
deployment_name
string
required

User-facing name for the deployment

model_name
string
required

Model identifier to deploy

gpu_model
string
required

GPU model (e.g., NVIDIA_T4_16GB)

gpu_count
integer
required

Number of GPUs to allocate

min_replicas
integer

Minimum number of replicas

max_replicas
integer

Maximum number of replicas

hw_type
string

Hardware type identifier (e.g., nvidia-t4-16gb-pcie_1)

env
object

Optional environment variables for the deployment
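
The body fields above can be assembled and checked on the client before sending. The sketch below is one possible approach using only the Python standard library: it verifies the four required fields and builds, without sending, a POST request equivalent to the curl example. The helper name `build_create_request` is hypothetical, and the check that `min_replicas` does not exceed `max_replicas` is an assumed sanity check, not a rule stated in this reference.

```python
import json
import urllib.request

API_URL = "https://api.gravixlayer.com/v1/deployments"

# Fields marked "required" in the Body section.
REQUIRED = ("deployment_name", "model_name", "gpu_model", "gpu_count")


def build_create_request(token, **fields):
    """Build (but do not send) the Create Deployment request."""
    missing = [name for name in REQUIRED if name not in fields]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))

    # Assumed client-side check; the reference does not document this constraint.
    lo, hi = fields.get("min_replicas"), fields.get("max_replicas")
    if lo is not None and hi is not None and lo > hi:
        raise ValueError("min_replicas must not exceed max_replicas")

    return urllib.request.Request(
        API_URL,
        data=json.dumps(fields).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Optional fields such as `hw_type` or `env` are included only when supplied as keyword arguments.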

Response

Deployment created successfully

id
string
deployment_name
string
model_name
string
gpu_model
string
gpu_count
integer
min_replicas
integer
max_replicas
integer
hw_type
string
status
string
created_at
string<date-time>
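
To work with the response in typed form, the fields listed above can be mapped onto a small record. This is a sketch under assumptions: the `Deployment` class and `parse_deployment` function are hypothetical names, and the code assumes every listed field is present, as in the example response.

```python
import json
from dataclasses import dataclass, fields
from datetime import datetime


@dataclass
class Deployment:
    id: str
    deployment_name: str
    model_name: str
    gpu_model: str
    gpu_count: int
    min_replicas: int
    max_replicas: int
    hw_type: str
    status: str
    created_at: datetime


def parse_deployment(body):
    """Parse a Create Deployment response body (a JSON string) into a Deployment."""
    data = json.loads(body)
    # Swap the trailing "Z" for an explicit offset so fromisoformat
    # also accepts the timestamp on Python versions before 3.11.
    created = datetime.fromisoformat(data["created_at"].replace("Z", "+00:00"))
    values = {f.name: data[f.name] for f in fields(Deployment) if f.name != "created_at"}
    return Deployment(created_at=created, **values)
```

Parsing `created_at` into a timezone-aware `datetime` makes it straightforward to compare against the current time, for example while polling until `status` changes from "creating".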