This tutorial walks you through the GravixLayer SDKs for Python and JavaScript. You'll learn how to perform chat completions, text completions, embeddings, streaming, function calling, and more.
Public Preview: GravixLayer is currently in public preview. Features are experimental and may change or break as API endpoints and models continue to evolve.
Prerequisites
Before starting, make sure you have:
- Python 3.7+ or Node.js 14+ installed
- A GravixLayer API key
- The GravixLayer SDK installed
Setup
Python SDK
First, install the GravixLayer Python SDK:

```bash
pip install gravixlayer
```

Then set your API key as an environment variable:

```bash
export GRAVIXLAYER_API_KEY="your_api_key_here"
```
Your First Request
Python SDK
Create a new file called main.py:

```python
from gravixlayer import GravixLayer

client = GravixLayer()

completion = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Hello, world!"}]
)

print(completion.choices[0].message.content)
```
Run the script:

```bash
python main.py
```

You should see a response from the model!
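If you get an authentication error instead, make sure the API key is visible to the process. A quick sanity check you can drop at the top of the script (a minimal sketch, not part of the SDK):

```python
import os

# Fail fast with a clear message if the key is missing from the environment.
if not os.environ.get("GRAVIXLAYER_API_KEY"):
    raise RuntimeError("Set GRAVIXLAYER_API_KEY before running this script.")
```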
Chat Completions
Simple conversation with the AI:
Python SDK
```python
from gravixlayer import GravixLayer

client = GravixLayer()

response = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is Python?"}
    ]
)

print(response.choices[0].message.content)
```
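Responses from OpenAI-compatible APIs typically also carry token usage metadata. Assuming GravixLayer follows that convention (the `usage` field names below are an assumption, so verify them against the API reference), you can inspect it like this:

```python
# Assumption: the response exposes an OpenAI-style `usage` object.
usage = getattr(response, "usage", None)
if usage:
    print("Prompt tokens:", usage.prompt_tokens)
    print("Completion tokens:", usage.completion_tokens)
    print("Total tokens:", usage.total_tokens)
```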
Completions
Simple text completion from a prompt:
Python SDK
```python
from gravixlayer import GravixLayer

client = GravixLayer()

completion = client.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    prompt="The future of artificial intelligence is",
    max_tokens=50
)

print(completion.choices[0].text.strip())
```
Streaming Responses
Get responses in real-time:
Python SDK
```python
from gravixlayer import GravixLayer

client = GravixLayer()

# Chat streaming
stream = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Tell me a short story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # New line

# Completions streaming
stream = client.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    prompt="Write a poem about",
    max_tokens=100,
    stream=True
)

for chunk in stream:
    if chunk.choices[0].text is not None:
        print(chunk.choices[0].text, end="", flush=True)
print()  # New line
```
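If you also need the complete text once the stream finishes (to log it or append it to a conversation history), accumulate the deltas as they arrive; a minimal sketch:

```python
# Print chunks live while collecting them into one final string.
parts = []
stream = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Explain streaming in one sentence"}],
    stream=True
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta is not None:
        parts.append(delta)
        print(delta, end="", flush=True)
print()

full_text = "".join(parts)
```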
Embeddings
Generate text embeddings:
Python SDK
```python
import os
import json
from gravixlayer import GravixLayer

client = GravixLayer(
    api_key=os.environ.get("GRAVIXLAYER_API_KEY"),
)

embedding = client.embeddings.create(
    model="meta-llama/llama-3.1-8b-instruct",
    input="Why is the sky blue?",
)

print(json.dumps(embedding.model_dump(), indent=2))
```
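Embeddings are most useful for similarity and search. Continuing from the client above, and assuming the response follows the OpenAI-style shape (a `data` list whose items carry an `embedding` vector, which you can confirm from the JSON dump above), you can compare two texts with cosine similarity:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Assumption: list inputs and the data[i].embedding shape follow
# OpenAI-compatible conventions; verify against the dump above.
resp = client.embeddings.create(
    model="meta-llama/llama-3.1-8b-instruct",
    input=["Why is the sky blue?", "What causes the sky's color?"],
)
v1, v2 = resp.data[0].embedding, resp.data[1].embedding
print(f"Similarity: {cosine_similarity(v1, v2):.3f}")
```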
Function Calling
Let AI call your functions:
Python SDK
```python
import json
import requests
from gravixlayer import GravixLayer

# Define a simple function
def get_weather(latitude, longitude):
    """Get the current temperature for coordinates."""
    url = (
        "https://api.open-meteo.com/v1/forecast"
        f"?latitude={latitude}&longitude={longitude}&current=temperature_2m"
    )
    response = requests.get(url)
    data = response.json()
    return data["current"]["temperature_2m"]

client = GravixLayer()

# Describe the function to the model
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current temperature for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "latitude": {"type": "number"},
                "longitude": {"type": "number"}
            },
            "required": ["latitude", "longitude"]
        }
    }
}]

# Ask the model a question it can answer with the tool
messages = [{"role": "user", "content": "What's the weather in Paris?"}]

completion = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)

response_message = completion.choices[0].message
messages.append(response_message)

# Check whether the model wants to call the function
if response_message.tool_calls:
    tool_call = response_message.tool_calls[0]

    # Call the function with the arguments the model supplied
    args = json.loads(tool_call.function.arguments)
    result = get_weather(args["latitude"], args["longitude"])

    # Send the result back to the model
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "name": "get_weather",
        "content": str(result)
    })

    # Get the final response
    final_completion = client.chat.completions.create(
        model="meta-llama/llama-3.1-8b-instruct",
        messages=messages,
        tools=tools
    )
    print(final_completion.choices[0].message.content)
else:
    print(response_message.content)
```
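A model may return several tool calls in one response, and real applications usually expose more than one tool. Here is a sketch of a more general handler using a dispatch table keyed by function name (the loop structure is a generalization of the example above, not GravixLayer-specific behavior):

```python
# Map tool names to local Python callables.
available_tools = {"get_weather": get_weather}

if response_message.tool_calls:
    for tool_call in response_message.tool_calls:
        func = available_tools.get(tool_call.function.name)
        if func is None:
            continue  # Unknown tool name; skip it (or raise, per your needs)
        args = json.loads(tool_call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "name": tool_call.function.name,
            "content": str(func(**args)),
        })
```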
Building a Simple Chatbot
Create an interactive chatbot with conversation memory:
Python SDK
```python
from gravixlayer import GravixLayer

class SimpleChatbot:
    def __init__(self):
        self.client = GravixLayer()
        self.conversation = []

    def chat(self, message):
        # Add the user message to the conversation
        self.conversation.append({"role": "user", "content": message})

        # Prepend the system prompt
        messages = [
            {"role": "system", "content": "You are a helpful assistant."}
        ] + self.conversation

        # Get a response from the model
        completion = self.client.chat.completions.create(
            model="meta-llama/llama-3.1-8b-instruct",
            messages=messages,
            temperature=0.7,
            max_tokens=150
        )

        # Add the assistant response to the conversation
        assistant_message = completion.choices[0].message.content
        self.conversation.append({"role": "assistant", "content": assistant_message})
        return assistant_message

# Use the chatbot
bot = SimpleChatbot()
print(bot.chat("What is Python programming?"))
print(bot.chat("Can you give me an example?"))
```
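To make the chatbot truly interactive, wrap it in a simple read-eval-print loop (a minimal sketch; the quit commands are arbitrary choices):

```python
# Interactive session; type "quit" or "exit" to stop.
bot = SimpleChatbot()
while True:
    user_input = input("You: ").strip()
    if user_input.lower() in ("quit", "exit"):
        break
    if not user_input:
        continue
    print("Bot:", bot.chat(user_input))
```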
Async Operations
Use the async client to run requests concurrently:
Python SDK
```python
import asyncio
from gravixlayer import AsyncGravixLayer

class AsyncChatbot:
    def __init__(self):
        self.client = AsyncGravixLayer()

    async def chat(self, message):
        completion = await self.client.chat.completions.create(
            model="meta-llama/llama-3.1-8b-instruct",
            messages=[{"role": "user", "content": message}],
            temperature=0.7
        )
        return completion.choices[0].message.content

    async def chat_multiple(self, messages):
        """Handle multiple messages concurrently."""
        tasks = [self.chat(msg) for msg in messages]
        return await asyncio.gather(*tasks)

async def main():
    bot = AsyncChatbot()

    # Single request
    response = await bot.chat("What is Python?")
    print(f"Single: {response}")

    # Multiple concurrent requests
    messages = [
        "What is machine learning?",
        "Explain neural networks",
        "What is deep learning?"
    ]
    responses = await bot.chat_multiple(messages)
    for i, response in enumerate(responses):
        print(f"Response {i+1}: {response[:100]}...")

asyncio.run(main())
```
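When sending many requests at once, it is often wise to cap concurrency to stay within rate limits; a sketch using asyncio.Semaphore (the limit of 5 is an arbitrary example value):

```python
import asyncio

async def chat_limited(bot, messages, max_concurrent=5):
    """Run many chats concurrently, but at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def one(msg):
        async with semaphore:
            return await bot.chat(msg)

    return await asyncio.gather(*(one(m) for m in messages))
```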
CLI Usage
Use the command line interface:
Python SDK
CLI Structure: All inference operations now use the chat subcommand: gravixlayer chat [options]
```bash
# Install globally
pip install gravixlayer

# Chat completion
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Hello!"

# Text completion
gravixlayer chat --mode completions --model "meta-llama/llama-3.1-8b-instruct" --prompt "The future of AI is"

# Chat with a system message
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --system "You are a helpful assistant" --user "Explain AI"

# Streaming chat
gravixlayer chat --model "meta-llama/llama-3.1-8b-instruct" --user "Tell a story" --stream

# Streaming completion
gravixlayer chat --mode completions --model "meta-llama/llama-3.1-8b-instruct" --prompt "Write a poem" --stream
```
Deployments
Manage dedicated model deployments:
Python SDK
```python
import os
from gravixlayer import GravixLayer

# Initialize the client
client = GravixLayer(api_key=os.environ.get("GRAVIXLAYER_API_KEY"))

# Create a deployment
deployment = client.deployments.create(
    deployment_name="custom_model",
    model_name="qwen3-1.7b",
    gpu_model="NVIDIA_T4_16GB",
    gpu_count=1,
    min_replicas=1,
    hw_type="dedicated"
)
print(f"Created deployment: {deployment.id}")

# List all deployments
for dep in client.deployments.list():
    print(f"Deployment: {dep.name} - Status: {dep.status}")

# Delete a deployment
client.deployments.delete(deployment_id="your_deployment_id")

# List available hardware
hardware_options = client.deployments.list_hardware()
for hardware in hardware_options:
    print(f"Hardware: {hardware.name} - Memory: {hardware.memory}")

# Get hardware as JSON
hardware_json = client.deployments.list_hardware(format="json")
print(hardware_json)
```
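Deployments take time to provision, so you will usually want to wait until one is ready before routing traffic to it. A minimal polling sketch built on the list() call above (the "running" status string is an assumption; check the actual status values in the dashboard or docs):

```python
import time

def wait_for_deployment(client, deployment_id, timeout=600, interval=10):
    """Poll until the deployment reports a ready status or the timeout elapses."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        for dep in client.deployments.list():
            # Assumption: "running" marks a ready deployment.
            if dep.id == deployment_id and dep.status == "running":
                return dep
        time.sleep(interval)
    raise TimeoutError(f"Deployment {deployment_id} not ready after {timeout}s")
```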
Deployment Benefits
Dedicated deployments provide:
- Guaranteed capacity with no cold starts
- Consistent performance and low latency
- Isolated resources for enterprise workloads
- Custom scaling policies and configurations
Conclusion
This tutorial covered the GravixLayer SDK in both Python and JavaScript:
- Chat Completions - Conversational AI
- Text Completions - Prompt-based text generation
- Embeddings - Text similarity and search
- Streaming - Real-time responses
- Function Calling - AI tool integration
- Chatbot - Interactive conversation with memory
- Async Support - High-performance operations
- Deployment Management - Dedicated model instances
- CLI Interface - Command-line usage
- Memory Management - Mem0-compatible memory system (JavaScript only)
Each example is intentionally simple and can serve as a starting point for your own applications. Both SDKs provide the same functionality with language-specific optimizations. For more advanced features, see the Dedicated Deployments documentation.