Public Preview: GravixLayer is currently in public preview. Features are experimental and may change or break as API endpoints and models continue to be updated.
Prerequisites
Before starting, make sure you have:
- Python 3.7+ or Node.js 14+ installed
- A GravixLayer API key
- The GravixLayer SDK installed
Setup
Python SDK

First, install the GravixLayer Python SDK:

pip install gravixlayer

Then set your API key as an environment variable:

export GRAVIXLAYER_API_KEY="your_api_key_here"
JavaScript SDK

First, install the GravixLayer JavaScript SDK:

npm install gravixlayer

Then set your API key as an environment variable:

# Windows (PowerShell)
$env:GRAVIXLAYER_API_KEY="your_api_key_here"

# Windows (Command Prompt)
set GRAVIXLAYER_API_KEY=your_api_key_here

# macOS/Linux
export GRAVIXLAYER_API_KEY="your_api_key_here"
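If you want to confirm the key is visible to your programs before making a request, you can read it back from the environment. A quick, optional check using only the Python standard library:

import os

# Optional sanity check: confirm the API key is visible to this process.
key = os.environ.get("GRAVIXLAYER_API_KEY")
print("GRAVIXLAYER_API_KEY is set:", bool(key))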
Your First Request
Python SDK

Create a new file called main.py:

from gravixlayer import GravixLayer

client = GravixLayer()

completion = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Hello, world!"}]
)
print(completion.choices[0].message.content)

Run the script:

python main.py
JavaScript SDK

Create a new file called main.mjs:

import { GravixLayer } from 'gravixlayer';

const client = new GravixLayer({
  apiKey: process.env.GRAVIXLAYER_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "meta-llama/llama-3.1-8b-instruct",
  messages: [{ role: "user", content: "Hello, world!" }]
});
console.log(completion.choices[0].message.content);

Run the script:

node main.mjs
Chat Completions
Simple conversation with the AI:

Python SDK

from gravixlayer import GravixLayer

client = GravixLayer()

response = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is Python?"}
    ]
)
print(response.choices[0].message.content)
JavaScript SDK

import { GravixLayer } from 'gravixlayer';

const client = new GravixLayer({
  apiKey: process.env.GRAVIXLAYER_API_KEY,
});

const response = await client.chat.completions.create({
  model: "meta-llama/llama-3.1-8b-instruct",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is JavaScript?" }
  ]
});
console.log(response.choices[0].message.content);
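The same endpoint accepts optional generation controls such as temperature and max_tokens, both of which appear again in the chatbot example later in this tutorial. A short Python sketch:

from gravixlayer import GravixLayer

client = GravixLayer()

# Lower temperature makes output more deterministic; max_tokens caps reply length.
response = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Summarize Python in one sentence."}],
    temperature=0.2,
    max_tokens=60
)
print(response.choices[0].message.content)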
Completions
Simple text completion from a prompt:

Python SDK

from gravixlayer import GravixLayer

client = GravixLayer()

completion = client.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    prompt="The future of artificial intelligence is",
    max_tokens=50
)
print(completion.choices[0].text.strip())
JavaScript SDK

import { GravixLayer } from 'gravixlayer';

const client = new GravixLayer({
  apiKey: process.env.GRAVIXLAYER_API_KEY,
});

const completion = await client.completions.create({
  model: "meta-llama/llama-3.1-8b-instruct",
  prompt: "The future of artificial intelligence is",
  maxTokens: 50
});
console.log(completion.choices[0].text.trim());
Streaming Responses
Get responses in real time:

Python SDK

from gravixlayer import GravixLayer

client = GravixLayer()

# Chat streaming
stream = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Tell me a short story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # New line

# Completions streaming
stream = client.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    prompt="Write a poem about",
    max_tokens=100,
    stream=True
)
for chunk in stream:
    if chunk.choices[0].text is not None:
        print(chunk.choices[0].text, end="", flush=True)
print()  # New line
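If you also need the complete text once streaming finishes (for logging or storage), collect the deltas as they arrive. A minimal Python sketch reusing the same chunk shape as above:

from gravixlayer import GravixLayer

client = GravixLayer()

stream = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Tell me a short story"}],
    stream=True
)

# Print each delta as it arrives while also accumulating the full response.
parts = []
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta is not None:
        parts.append(delta)
        print(delta, end="", flush=True)
print()

full_text = "".join(parts)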
JavaScript SDK

import { GravixLayer } from 'gravixlayer';

const client = new GravixLayer({
  apiKey: process.env.GRAVIXLAYER_API_KEY,
});

// Chat streaming
const stream = await client.chat.completions.create({
  model: "meta-llama/llama-3.1-8b-instruct",
  messages: [{ role: "user", content: "Tell me a short story" }],
  stream: true
});
for await (const chunk of stream) {
  if (chunk.choices[0].delta.content !== null) {
    process.stdout.write(chunk.choices[0].delta.content);
  }
}
console.log(); // New line

// Completions streaming
const completionStream = await client.completions.create({
  model: "meta-llama/llama-3.1-8b-instruct",
  prompt: "Write a poem about",
  maxTokens: 100,
  stream: true
});
for await (const chunk of completionStream) {
  if (chunk.choices[0].text !== null) {
    process.stdout.write(chunk.choices[0].text);
  }
}
console.log(); // New line
Embeddings
Generate text embeddings:

Python SDK

import os
import json
from gravixlayer import GravixLayer

client = GravixLayer(
    api_key=os.environ.get("GRAVIXLAYER_API_KEY"),
)

embedding = client.embeddings.create(
    model="baai/bge-large-en-v1.5",
    input="Why is the sky blue?",
)
print(json.dumps(embedding.model_dump(), indent=2))
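Embeddings are most often compared with cosine similarity. The Python sketch below builds on the call above; it assumes an OpenAI-style response shape (vectors at embedding.data[i].embedding) and that the endpoint accepts a list of inputs, so verify both against your SDK version:

import math
from gravixlayer import GravixLayer

client = GravixLayer()

resp = client.embeddings.create(
    model="baai/bge-large-en-v1.5",
    input=["Why is the sky blue?", "What causes a blue sky?"],  # assumes list input is accepted
)

# Assumed OpenAI-style shape: resp.data[i].embedding is a list of floats.
a = resp.data[0].embedding
b = resp.data[1].embedding

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

print(f"Cosine similarity: {cosine(a, b):.4f}")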
JavaScript SDK

import { GravixLayer } from 'gravixlayer';

const client = new GravixLayer({
  apiKey: process.env.GRAVIXLAYER_API_KEY,
});

const embedding = await client.embeddings.create({
  model: "baai/bge-large-en-v1.5",
  input: "Why is the sky blue?",
});
console.log(JSON.stringify(embedding, null, 2));
Function Calling
Let AI call your functions:

Python SDK

import json
import requests
from gravixlayer import GravixLayer

# Define a simple function
def get_weather(latitude, longitude):
    """Get current temperature for coordinates."""
    url = f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m"
    response = requests.get(url)
    data = response.json()
    return data['current']['temperature_2m']

client = GravixLayer()

# Describe the function to the model
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current temperature for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "latitude": {"type": "number"},
                "longitude": {"type": "number"}
            },
            "required": ["latitude", "longitude"]
        }
    }
}]

# Ask the model to call the function
messages = [{"role": "user", "content": "What's the weather in Paris?"}]
completion = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)

response_message = completion.choices[0].message
messages.append(response_message)

# Check whether the model wants to call the function
if response_message.tool_calls:
    tool_call = response_message.tool_calls[0]

    # Call the function
    args = json.loads(tool_call.function.arguments)
    result = get_weather(args["latitude"], args["longitude"])

    # Send the result back to the model
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "name": "get_weather",
        "content": str(result)
    })

    # Get the final response
    final_completion = client.chat.completions.create(
        model="meta-llama/llama-3.1-8b-instruct",
        messages=messages,
        tools=tools
    )
    print(final_completion.choices[0].message.content)
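The example above handles only the first tool call, but a model may return several in a single turn. Here is a sketch of a more general dispatch loop over the same setup; the available_functions registry is illustrative, with one entry per tool you define:

# Dispatch every tool call the model returned, not just the first.
available_functions = {"get_weather": get_weather}  # illustrative registry

if response_message.tool_calls:
    for tool_call in response_message.tool_calls:
        func = available_functions[tool_call.function.name]
        args = json.loads(tool_call.function.arguments)
        result = func(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "name": tool_call.function.name,
            "content": str(result)
        })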
Copy
// Filename: weather_bot.js
// To run: node weather_bot.js
// Prerequisites: npm install node-fetch gravixlayer
// Ensure API key is set: export GRAVIXLAYER_API_KEY='your-api-key'
import fetch from 'node-fetch';
import { GravixLayer } from 'gravixlayer';
async function get_weather(latitude, longitude) {
console.log(`--- Calling get_weather function for Lat: ${latitude}, Lon: ${longitude} ---`);
const url = `https://api.open-meteo.com/v1/forecast?latitude=${latitude}&longitude=${longitude}¤t=temperature_2m`;
const response = await fetch(url);
if (!response.ok) {
throw new Error(`HTTP error! status: ${response.status}`);
}
const data = await response.json();
return data.current.temperature_2m;
}
async function runConversation() {
const apiKey = process.env.GRAVIXLAYER_API_KEY;
if (!apiKey) {
throw new Error("GRAVIXLAYER_API_KEY environment variable not set.");
}
const client = new GravixLayer({
apiKey: apiKey
});
console.log(`User: What's the weather like in Paris, France today?\n`);
console.log("--- Making first API call to model ---");
try {
// First, try to get the model to understand we need weather data
const initialMessages = [
{
role: "system",
content: "You are a helpful assistant. When asked about weather, you should ask for the latitude and longitude coordinates, or if you know the coordinates for a city, you can provide them. For Paris, France, the coordinates are approximately 48.8566° N, 2.3522° E."
},
{
role: "user",
content: "What's the weather like in Paris, France today? I need the current temperature."
}
];
const completion = await client.chat.completions.create({
model: "meta-llama/llama-3.1-8b-instruct",
messages: initialMessages,
max_tokens: 200
});
const responseMessage = completion.choices[0].message;
console.log(`Assistant: ${responseMessage.content}\n`);
// Extract coordinates (Paris coordinates)
const parisLat = 48.8566;
const parisLon = 2.3522;
console.log("--- Getting weather data ---");
const temperature = await get_weather(parisLat, parisLon);
console.log(`Tool Result (Temperature): ${temperature}°C\n`);
console.log("--- Making second API call with weather data ---");
// Now provide the weather data to get a natural response
const finalMessages = [
{
role: "system",
content: "You are a helpful weather assistant. Provide a natural, conversational response about the weather."
},
{
role: "user",
content: "What's the weather like in Paris, France today?"
},
{
role: "assistant",
content: `I can help you with that! Let me check the current weather in Paris.`
},
{
role: "user",
content: `The current temperature in Paris, France is ${temperature}°C. Please give me a natural response about this weather.`
}
];
const finalCompletion = await client.chat.completions.create({
model: "meta-llama/llama-3.1-8b-instruct",
messages: finalMessages,
max_tokens: 150
});
console.log(`\nFinal Assistant Response: ${finalCompletion.choices[0].message.content}`);
} catch (error) {
console.error('❌ Error in conversation:', error.message);
// Fallback: Simple weather check without complex function calling
console.log('\n--- Trying fallback approach ---');
try {
const temperature = await get_weather(48.8566, 2.3522); // Paris coordinates
console.log(`\nFallback Result: The current temperature in Paris is ${temperature}°C`);
const simpleResponse = await client.chat.completions.create({
model: "meta-llama/llama-3.1-8b-instruct",
messages: [
{
role: "user",
content: `The current temperature in Paris, France is ${temperature}°C. Please provide a brief, natural comment about this weather.`
}
],
max_tokens: 100
});
console.log(`Simple Response: ${simpleResponse.choices[0].message.content}`);
} catch (fallbackError) {
console.error('❌ Fallback also failed:', fallbackError.message);
}
}
}
runConversation().catch(console.error);
Building a Simple Chatbot
Create an interactive chatbot with conversation memory:

Python SDK

from gravixlayer import GravixLayer

class SimpleChatbot:
    def __init__(self):
        self.client = GravixLayer()
        self.conversation = []

    def chat(self, message):
        # Add user message to conversation
        self.conversation.append({"role": "user", "content": message})

        # Create messages with system prompt
        messages = [
            {"role": "system", "content": "You are a helpful assistant."}
        ] + self.conversation

        # Get response from the model
        completion = self.client.chat.completions.create(
            model="meta-llama/llama-3.1-8b-instruct",
            messages=messages,
            temperature=0.7,
            max_tokens=150
        )

        # Add assistant response to conversation
        assistant_message = completion.choices[0].message.content
        self.conversation.append({"role": "assistant", "content": assistant_message})
        return assistant_message

# Use the chatbot
bot = SimpleChatbot()
print(bot.chat("What is Python programming?"))
print(bot.chat("Can you give me an example?"))
JavaScript SDK

import { GravixLayer } from 'gravixlayer';

class SimpleChatbot {
  constructor() {
    this.client = new GravixLayer({
      apiKey: process.env.GRAVIXLAYER_API_KEY,
    });
    this.conversation = [];
  }

  async chat(message) {
    // Add user message to conversation
    this.conversation.push({ role: "user", content: message });

    // Create messages with system prompt
    const messages = [
      { role: "system", content: "You are a helpful assistant." },
      ...this.conversation
    ];

    // Get response from the model
    const completion = await this.client.chat.completions.create({
      model: "meta-llama/llama-3.1-8b-instruct",
      messages: messages,
      temperature: 0.7,
      maxTokens: 150
    });

    // Add assistant response to conversation
    const assistantMessage = completion.choices[0].message.content;
    this.conversation.push({ role: "assistant", content: assistantMessage });
    return assistantMessage;
  }
}

// Use the chatbot
const bot = new SimpleChatbot();
console.log(await bot.chat("What is JavaScript programming?"));
console.log(await bot.chat("Can you give me an example?"));
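To chat with the bot from a terminal, wrap it in a read-eval loop. A small sketch using the Python SimpleChatbot class above:

# Interactive terminal loop for the SimpleChatbot defined above.
bot = SimpleChatbot()
print("Type 'quit' to exit.")
while True:
    user_input = input("You: ").strip()
    if user_input.lower() in {"quit", "exit"}:
        break
    print("Bot:", bot.chat(user_input))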
Async Operations
Python SDK

import asyncio
from gravixlayer import AsyncGravixLayer

class AsyncChatbot:
    def __init__(self):
        self.client = AsyncGravixLayer()

    async def chat(self, message):
        completion = await self.client.chat.completions.create(
            model="meta-llama/llama-3.1-8b-instruct",
            messages=[{"role": "user", "content": message}],
            temperature=0.7
        )
        return completion.choices[0].message.content

    async def chat_multiple(self, messages):
        """Handle multiple messages concurrently."""
        tasks = [self.chat(msg) for msg in messages]
        return await asyncio.gather(*tasks)

async def main():
    bot = AsyncChatbot()

    # Single request
    response = await bot.chat("What is Python?")
    print(f"Single: {response}")

    # Multiple concurrent requests
    messages = [
        "What is machine learning?",
        "Explain neural networks",
        "What is deep learning?"
    ]
    responses = await bot.chat_multiple(messages)
    for i, response in enumerate(responses):
        print(f"Response {i+1}: {response[:100]}...")

asyncio.run(main())
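Unbounded asyncio.gather fires every request at once, which can run into rate limits. A common refinement is to cap in-flight requests with a semaphore; a sketch on top of the AsyncChatbot above:

import asyncio

async def chat_bounded(bot, messages, limit=3):
    # Allow at most `limit` concurrent requests; the rest wait their turn.
    sem = asyncio.Semaphore(limit)

    async def one(msg):
        async with sem:
            return await bot.chat(msg)

    return await asyncio.gather(*(one(m) for m in messages))

# Usage (inside an async function): responses = await chat_bounded(bot, messages, limit=3)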
JavaScript SDK

import { GravixLayer } from 'gravixlayer';

class AsyncChatbot {
  constructor() {
    this.client = new GravixLayer({
      apiKey: process.env.GRAVIXLAYER_API_KEY,
    });
  }

  async chat(message) {
    const completion = await this.client.chat.completions.create({
      model: "meta-llama/llama-3.1-8b-instruct",
      messages: [{ role: "user", content: message }],
      temperature: 0.7
    });
    return completion.choices[0].message.content;
  }

  async chatMultiple(messages) {
    // Handle multiple messages concurrently
    const tasks = messages.map(msg => this.chat(msg));
    return await Promise.all(tasks);
  }
}

async function main() {
  const bot = new AsyncChatbot();

  // Single request
  const response = await bot.chat("What is JavaScript?");
  console.log(`Single: ${response}`);

  // Multiple concurrent requests
  const messages = [
    "What is machine learning?",
    "Explain neural networks",
    "What is deep learning?"
  ];
  const responses = await bot.chatMultiple(messages);
  responses.forEach((response, i) => {
    console.log(`Response ${i+1}: ${response.substring(0, 100)}...`);
  });
}

main().catch(console.error);
CLI Usage
Use the command-line interface:

CLI Structure: All inference operations now use the chat subcommand: gravixlayer chat [options]

Python SDK

# Install globally
pip install gravixlayer

# Chat completion
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Hello!"

# Text completion
gravixlayer chat --mode completions --model "meta-llama/llama-3.1-8b-instruct" --prompt "The future of AI is"

# Chat with system message
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --system "You are a helpful assistant" --user "Explain AI"

# Streaming chat
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Tell a story" --stream

# Streaming completion
gravixlayer chat --mode completions --model "meta-llama/llama-3.1-8b-instruct" --prompt "Write a poem" --stream
JavaScript SDK

# Install globally
npm install -g gravixlayer

# Chat completion
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Hello!"

# Text completion
gravixlayer chat --mode completions --model "meta-llama/llama-3.1-8b-instruct" --prompt "The future of AI is"

# Chat with system message
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --system "You are a helpful assistant" --user "Explain AI"

# Streaming chat
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Tell a story" --stream

# Streaming completion
gravixlayer chat --mode completions --model "meta-llama/llama-3.1-8b-instruct" --prompt "Write a poem" --stream

# Memory management (Mem0-compatible)
gravixlayer memory add user-123 --message "I prefer dark mode and TypeScript"
gravixlayer memory search user-123 --query "programming preferences" --limit 5
gravixlayer memory list user-123 --limit 10
Deployments
Manage dedicated model deployments:

Python SDK

import os
from gravixlayer import GravixLayer

# Initialize the client
client = GravixLayer(api_key=os.environ.get("GRAVIXLAYER_API_KEY"))

# Create a deployment
deployment = client.deployments.create(
    deployment_name="custom_model",
    model_name="qwen3-1.7b",
    gpu_model="NVIDIA_T4_16GB",
    gpu_count=1,
    min_replicas=1,
    hw_type="dedicated"
)
print(f"Created deployment: {deployment.id}")

# List all deployments
deployments = client.deployments.list()
for deployment in deployments:
    print(f"Deployment: {deployment.name} - Status: {deployment.status}")

# Delete a deployment
client.deployments.delete(deployment_id="your_deployment_id")

# List available hardware
hardware_options = client.deployments.list_hardware()
for hardware in hardware_options:
    print(f"Hardware: {hardware.name} - Memory: {hardware.memory}")

# Get hardware as JSON
hardware_json = client.deployments.list_hardware(format="json")
print(hardware_json)
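New deployments are not ready instantly. If you need to block until one is serving, you can poll the list shown above. A sketch, with the caveat that the ready-status string ("running" below) is an assumption to verify against your dashboard:

import time

def wait_for_deployment(client, deployment_id, timeout=600, interval=10):
    # Poll until the target deployment reports a ready status, or time out.
    # "running" is an assumed status value; check the real values for your account.
    deadline = time.time() + timeout
    while time.time() < deadline:
        for d in client.deployments.list():
            if getattr(d, "id", None) == deployment_id and d.status == "running":
                return d
        time.sleep(interval)
    raise TimeoutError(f"Deployment {deployment_id} was not ready after {timeout}s")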
JavaScript SDK

import { GravixLayer } from 'gravixlayer';

// Initialize the client
const client = new GravixLayer({
  apiKey: process.env.GRAVIXLAYER_API_KEY,
});

// Create a deployment with a unique name
const uniqueName = `test-deployment-${Date.now()}`;
const deployment = await client.deployments.create({
  deployment_name: uniqueName,
  model_name: "qwen3-1.7b",
  gpu_model: "NVIDIA_T4_16GB",
  gpu_count: 1,
  min_replicas: 1,
  max_replicas: 1,
  hw_type: "dedicated"
});
console.log(`Created deployment: ${deployment.deployment_id || deployment.id}`);
console.log(`Deployment status: ${deployment.status}`);

// List all deployments
const deployments = await client.deployments.list();
console.log(`Found ${deployments.length} deployments:`);
deployments.forEach(deployment => {
  console.log(`- Deployment: ${deployment.deployment_name} - Status: ${deployment.status}`);
});

// Delete a deployment (only if we have a valid deployment ID)
if (deployment.deployment_id) {
  try {
    await client.deployments.delete(deployment.deployment_id);
    console.log(`Deleted deployment: ${deployment.deployment_id}`);
  } catch (error) {
    console.log(`Note: Could not delete deployment (this is normal for testing): ${error.message}`);
  }
}

// List available hardware (using the accelerators resource)
const hardwareOptions = await client.accelerators.list();
hardwareOptions.forEach(hardware => {
  console.log(`Hardware: ${hardware.name || hardware.gpu_type} - Memory: ${hardware.memory || 'N/A'}`);
});
Deployment Benefits
Dedicated deployments provide:
- Guaranteed capacity with no cold starts
- Consistent performance and low latency
- Isolated resources for enterprise workloads
- Custom scaling policies and configurations
Conclusion
This tutorial covered the GravixLayer SDK in both Python and JavaScript:
- Chat Completions - Conversational AI
- Text Completions - Prompt-based text generation
- Embeddings - Text similarity and search
- Streaming - Real-time responses
- Function Calling - AI tool integration
- Chatbot - Interactive conversation with memory
- Async Support - High-performance operations
- Deployment Management - Dedicated model instances
- CLI Interface - Command-line usage
- Memory Management - Mem0-compatible memory system (JavaScript only)

