Access GravixLayer's AI capabilities directly from your terminal with the `gravixlayer` command-line interface and its comprehensive set of options.

CLI structure: all inference operations now go through the `chat` subcommand: `gravixlayer chat [options]`.
Basic Commands
```bash
# Chat completion
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Hello!"

# Text completion
gravixlayer chat --mode completions --model "meta-llama/llama-3.1-8b-instruct" --prompt "The future of AI is"

# Chat with system message
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --system "You are a helpful assistant" --user "Explain AI"

# Streaming chat
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Tell a story" --stream

# Streaming completion
gravixlayer chat --mode completions --model "meta-llama/llama-3.1-8b-instruct" --prompt "Write a poem" --stream

# Generate embeddings
gravixlayer chat --mode embeddings --model "text-embedding-ada-002" --text "Hello world"
```
CLI Options

Main commands:

- `chat`: Chat, completion, and embedding operations
- `deployments`: Deployment management
- `files`: File management
- `vectors`: Vector database operations

`chat` command options:

- `--mode`: Operation mode (`chat`, `completions`, or `embeddings`)
- `--model`: Model to use
- `--user`: User message (chat mode)
- `--prompt`: Prompt (completions mode)
- `--text`: Input text (embeddings mode)
- `--system`: System message (chat mode)
- `--stream`: Stream output as it is generated
- `--max-tokens`: Maximum number of tokens to generate
- `--temperature`: Sampling temperature (0.0-2.0)
- `--top-p`: Nucleus sampling parameter
- `--frequency-penalty`: Frequency penalty (-2.0 to 2.0)
- `--presence-penalty`: Presence penalty (-2.0 to 2.0)
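Of these, `--top-p` is the least self-explanatory: nucleus sampling keeps only the smallest set of candidate tokens whose cumulative probability reaches `p`, and samples from that set. A minimal Python sketch of the filtering step (illustrative of the general technique only, not GravixLayer's internals):

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability >= top_p."""
    # Rank token indices from most to least probable.
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for idx, p in ranked:
        kept.append(idx)
        cumulative += p
        if cumulative >= top_p:  # nucleus reached; drop the long tail
            break
    return kept

# With top_p=0.9, only the two most likely tokens (0.5 + 0.4) survive.
print(nucleus_filter([0.5, 0.4, 0.07, 0.03], 0.9))  # [0, 1]
```

Lower `--top-p` values shrink the nucleus, making output more focused; values near 1.0 leave almost the full distribution available.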
Advanced Parameters
Temperature Control
```bash
# More creative responses (higher temperature)
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Write a creative story" --temperature 1.2

# More focused responses (lower temperature)
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Explain quantum physics" --temperature 0.3
```
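Under the hood, temperature rescales the model's logits before they are turned into probabilities: low values sharpen the distribution toward the top token, high values flatten it. A quick Python sketch of the mechanism (illustrative only, not the service's actual code):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities, sharpened or flattened by temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, 0.3)  # low temperature: near-deterministic
flat = softmax_with_temperature(logits, 1.2)   # high temperature: closer to uniform
print(max(sharp) > max(flat))  # True: low temperature concentrates probability
```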
Token Limits
```bash
# Limit response length
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Summarize AI" --max-tokens 50

# Longer responses
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Write a detailed essay" --max-tokens 500
```
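Note that `--max-tokens` is a hard cap, not a target: generation halts as soon as the limit is hit, even mid-sentence. A toy Python loop showing the cutoff behavior (the `generate_next` callback is a hypothetical stand-in for the model):

```python
def generate(generate_next, max_tokens, eos="<eos>"):
    """Emit tokens until the model stops naturally or the cap is reached."""
    out = []
    while len(out) < max_tokens:
        token = generate_next(out)
        if token == eos:  # model finished on its own
            break
        out.append(token)
    return out

# A fake "model" that would happily ramble on; the cap cuts it off.
tokens = iter(["AI", "is", "a", "broad", "field", "of", "study"])
print(generate(lambda ctx: next(tokens), max_tokens=3))  # ['AI', 'is', 'a']
```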
Penalty Parameters
```bash
# Reduce repetition
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Tell me about space" --frequency-penalty 0.5

# Encourage diverse topics
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Discuss technology" --presence-penalty 0.3
```
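Both penalties work by subtracting from the logits of tokens that have already appeared: the frequency penalty scales with how often a token was used, while the presence penalty applies once per token regardless of count. A sketch of the standard formulation used by OpenAI-style APIs (GravixLayer's exact internals aren't documented here):

```python
from collections import Counter

def apply_penalties(logits, generated, freq_penalty, pres_penalty):
    """Penalize used tokens: logit -= count * freq + (pres if seen else 0)."""
    counts = Counter(generated)
    return {
        tok: logit
        - counts[tok] * freq_penalty
        - (pres_penalty if counts[tok] > 0 else 0.0)
        for tok, logit in logits.items()
    }

logits = {"space": 3.0, "stars": 2.0, "moon": 1.0}
adjusted = apply_penalties(logits, ["space", "space", "stars"], 0.5, 0.3)
# "space" appeared twice: 3.0 - 2*0.5 - 0.3 = 1.7
# "stars" appeared once:  2.0 - 1*0.5 - 0.3 = 1.2
# "moon" never appeared, so its logit is untouched.
print(adjusted)
```

Positive values discourage repetition; the API's negative range can be used to encourage it.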