Create a dedicated model deployment with specified hardware and scaling settings.
API key authentication. Get your API key from the Gravix Layer Dashboard.
Unique deployment name
Model identifier to deploy
Available options: llama3.2-1b-instruct, qwen2-5-vl-3b-instruct, qwen3-4b-instruct-2507, qwen3-4b-thinking-2507, deepseek-r1-distill-qwen-1.5b, qwen3-4b, qwen3-1.7b, qwen3-0.6b
GPU hardware model
Available options: NVIDIA_T4_16GB
Number of GPUs (1, 2, 4, or 8)
Available options: 1, 2, 4, 8
Minimum replicas for autoscaling
Maximum replicas for autoscaling
Hardware type: "dedicated" or "shared"
Available options: dedicated
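
The parameters above can be assembled into a request body roughly as follows. This is a minimal sketch, not the documented client: the field names (deployment_name, model_name, gpu_model, gpu_count, min_replicas, max_replicas, hardware_type), the placeholder URL, the Bearer-style Authorization header, and the min/max replica check are all assumptions made for illustration, since this page does not state them. Only the allowed values are taken from the list above.

```python
import json
import urllib.request

# Allowed values copied from the parameter descriptions above.
ALLOWED_MODELS = {
    "llama3.2-1b-instruct", "qwen2-5-vl-3b-instruct", "qwen3-4b-instruct-2507",
    "qwen3-4b-thinking-2507", "deepseek-r1-distill-qwen-1.5b",
    "qwen3-4b", "qwen3-1.7b", "qwen3-0.6b",
}
ALLOWED_GPU_MODELS = {"NVIDIA_T4_16GB"}
ALLOWED_GPU_COUNTS = {1, 2, 4, 8}
# The description names both values; the options list shows only "dedicated".
ALLOWED_HARDWARE_TYPES = {"dedicated", "shared"}

# Hypothetical placeholder; substitute the real endpoint from the API reference.
PLACEHOLDER_URL = "https://api.example.com/v1/deployments"


def build_deployment_payload(name, model, gpu_model="NVIDIA_T4_16GB", gpu_count=1,
                             min_replicas=1, max_replicas=1,
                             hardware_type="dedicated"):
    """Validate the arguments locally and return a dict for the request body.

    All keys in the returned dict are assumed names, not confirmed by this page.
    """
    if not name:
        raise ValueError("deployment name must be non-empty")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unsupported model: {model}")
    if gpu_model not in ALLOWED_GPU_MODELS:
        raise ValueError(f"unsupported GPU model: {gpu_model}")
    if gpu_count not in ALLOWED_GPU_COUNTS:
        raise ValueError("GPU count must be 1, 2, 4, or 8")
    if hardware_type not in ALLOWED_HARDWARE_TYPES:
        raise ValueError(f"unsupported hardware type: {hardware_type}")
    # Assumed sanity check: the page does not state this constraint.
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")
    return {
        "deployment_name": name,
        "model_name": model,
        "gpu_model": gpu_model,
        "gpu_count": gpu_count,
        "min_replicas": min_replicas,
        "max_replicas": max_replicas,
        "hardware_type": hardware_type,
    }


def build_request(api_key, payload, url=PLACEHOLDER_URL):
    """Build (but do not send) a POST request carrying the JSON payload."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed header scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the result of build_request to urllib.request.urlopen once the real endpoint and header scheme are confirmed. Validating locally first means an unsupported GPU count or model name fails before any network call is made.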