Test Your Deployment
Use your deployment for inference:- CLI
- Python SDK
- JavaScript SDK
Chat Completions
Once your deployment is running, you can use it like any other model by referencing the deployment name:- CLI
- Python SDK
- JavaScript SDK
Basic Chat:Streaming Chat:With System Message:Text Completion Mode:Streaming Completion:
Advanced Usage Examples
Batch Processing
- Python SDK
- JavaScript SDK
Performance Monitoring
- Python SDK
- JavaScript SDK
Troubleshooting
Common Issues
Deployment Stuck in “Creating” Status:- Wait 5-10 minutes for initialization
- Check hardware availability with
gravixlayer deployments gpu --list - Verify model name is correct
- Ensure deployment status is “running” before making requests
- Verify deployment name matches exactly
- Check API key configuration
- Monitor deployment status and resource usage
- Consider scaling up replicas for higher throughput
- Check if model size matches your use case

