This guide walks you through building a Retrieval-Augmented Generation (RAG) system using GravixLayer. It covers every step, explains the code, and highlights practical use cases.

What is RAG?

Retrieval-Augmented Generation (RAG) combines information retrieval (searching for relevant documents) with generative AI (LLMs), so the model answers questions from both stored knowledge and its generative capabilities. Because answers are grounded in retrieved text, RAG reduces hallucinations and improves factual accuracy.
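
To make the flow concrete, here is a tiny, self-contained toy: it retrieves the best-matching document by keyword overlap and builds the augmented prompt. A real RAG system (including the one in this guide) replaces the keyword match with vector similarity search and sends the prompt to an LLM; none of the names below come from the GravixLayer SDK.
# Toy retrieve-then-generate flow; no external services involved.
docs = [
	"Paris is the capital of France.",
	"The Eiffel Tower is in Paris.",
]

def retrieve(question, documents):
	# Score each document by how many words it shares with the question.
	words = set(question.lower().split())
	return max(documents, key=lambda d: len(words & set(d.lower().split())))

question = "What is the capital of France?"
context = retrieve(question, docs)
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # in a real RAG system, this prompt is sent to the LLM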

Use Cases

  • Enterprise Knowledge Search: Answer employee questions using company documents.
  • Customer Support: Retrieve relevant help articles and generate responses.
  • Research Assistant: Summarize and answer questions from scientific papers.
  • Programming Help: Search code documentation and generate code explanations.
  • Education: Provide context-rich answers from textbooks and notes.

Step-by-Step Guide

1. Setup and Installation

Install required packages:
pip install gravixlayer requests python-dotenv
This installs the GravixLayer SDK, the requests HTTP library, and python-dotenv for loading environment variables.

2. API Key and Client Initialization

Load your API key from environment variables and initialize the GravixLayer client:
from gravixlayer import GravixLayer
from dotenv import load_dotenv
import os

load_dotenv()

API_KEY = os.getenv('GRAVIXLAYER_API_KEY')
if not API_KEY:
	raise RuntimeError("GRAVIXLAYER_API_KEY is not set")

client = GravixLayer()
This verifies your API key is set, fails fast if it is missing, and prepares the SDK for use.

3. Configuration

Set up your RAG system parameters:
  • Embedding model (for vectorization)
  • LLM model (for generation)
  • Vector dimension
  • Similarity metric
  • Index name
  • Top K results
  • API base URL
CONFIG = {
	'embedding_model': 'baai/bge-large-en-v1.5',
	'llm_model': 'meta-llama/llama-3.1-8b-instruct',
	'vector_dimension': 1024,
	'similarity_metric': 'cosine',
	'index_name': 'rag-knowledge-base',
	'top_k_results': 3,
	'base_url': 'https://api.gravixlayer.com/v1/vectors'
}

4. Vector Database Setup

The VectorDatabase class manages all vector DB operations:
  • Creating indexes
  • Listing indexes
  • Upserting (adding/updating) vectors
  • Searching vectors
Each method is documented in the code for clarity; a minimal sketch of the class appears below.
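
The full class ships with the tutorial code; the version below is a minimal sketch built on the requests library. The endpoint paths (/indexes, /indexes/<name>/upsert, /indexes/<name>/search) and payload field names are assumptions for illustration, so check the GravixLayer vector API reference for the exact routes.
import requests

class VectorDatabase:
	"""Thin wrapper over the vector REST API (endpoint paths are assumptions)."""

	def __init__(self, api_key, config):
		self.config = config
		self.base_url = config['base_url']
		self.index_name = config['index_name']
		self.headers = {
			"Authorization": f"Bearer {api_key}",
			"Content-Type": "application/json",
		}

	def create_index(self):
		# Assumed endpoint: POST {base_url}/indexes
		payload = {
			"name": self.index_name,
			"dimension": self.config['vector_dimension'],
			"metric": self.config['similarity_metric'],
		}
		resp = requests.post(f"{self.base_url}/indexes", json=payload, headers=self.headers)
		resp.raise_for_status()
		return resp.json()

	def list_indexes(self):
		# Assumed endpoint: GET {base_url}/indexes
		resp = requests.get(f"{self.base_url}/indexes", headers=self.headers)
		resp.raise_for_status()
		return resp.json()

	def upsert_text_vectors(self, documents):
		# Assumed endpoint: POST {base_url}/indexes/{name}/upsert
		# The service is assumed to embed the raw text server-side using the
		# model named in each document.
		resp = requests.post(
			f"{self.base_url}/indexes/{self.index_name}/upsert",
			json={"vectors": documents},
			headers=self.headers,
		)
		resp.raise_for_status()
		return resp.json()

	def search_text(self, query, top_k, filters=None):
		# Assumed endpoint: POST {base_url}/indexes/{name}/search
		payload = {
			"query": query,
			"model": self.config['embedding_model'],
			"top_k": top_k,
			"filter": filters or {},
		}
		resp = requests.post(
			f"{self.base_url}/indexes/{self.index_name}/search",
			json=payload,
			headers=self.headers,
		)
		resp.raise_for_status()
		return resp.json()
With the class in place, vector_db = VectorDatabase(API_KEY, CONFIG) creates the object used in the remaining steps.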

5. Knowledge Base Ingestion

Prepare your documents as a list of dictionaries, each with an ID, text, model, and metadata. Ingest them into the vector DB using upsert_text_vectors. Example:
knowledge_base = [
	{
		"id": "doc_ai_overview",
		"text": "Artificial Intelligence (AI)...",
		"model": CONFIG['embedding_model'],
		"metadata": {
			"category": "AI",
			"topic": "overview",
			"difficulty": "beginner"
		}
	},
	# ... more documents ...
]

vector_db.upsert_text_vectors(knowledge_base)

6. RAG System Implementation

The SimpleRAG class orchestrates retrieval and generation:
  • retrieve_context: Searches for relevant documents and formats context
  • generate_response: Uses the LLM to answer using the context
  • query: Main method to run a RAG query and print results
All functions have detailed docstrings and comments; a minimal sketch of the class appears below.
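
The sketch below is not the full tutorial implementation. It assumes the GravixLayer client exposes an OpenAI-style chat.completions.create interface and reuses the VectorDatabase sketch from step 4; the response field names ('matches', 'text') are likewise assumptions.
class SimpleRAG:
	"""Minimal retrieve-then-generate pipeline (sketch, not the full tutorial code)."""

	def __init__(self, client, vector_db, config):
		self.client = client
		self.vector_db = vector_db
		self.config = config

	def retrieve_context(self, question, filters=None):
		# Search the vector DB and join the matched texts into one context block.
		results = self.vector_db.search_text(
			question,
			top_k=self.config['top_k_results'],
			filters=filters,
		)
		# 'matches' and 'text' are assumed response field names.
		return "\n\n".join(match["text"] for match in results.get("matches", []))

	def generate_response(self, question, context):
		# Assumes an OpenAI-style chat completions interface on the client.
		completion = self.client.chat.completions.create(
			model=self.config['llm_model'],
			messages=[
				{"role": "system",
				 "content": "Answer using only the provided context. Say so if the context is insufficient."},
				{"role": "user",
				 "content": f"Context:\n{context}\n\nQuestion: {question}"},
			],
		)
		return completion.choices[0].message.content

	def query(self, question, filters=None):
		context = self.retrieve_context(question, filters)
		answer = self.generate_response(question, context)
		print(f"Q: {question}\nA: {answer}")
		return answer
Calling rag = SimpleRAG(client, vector_db, CONFIG) followed by rag.query("What is machine learning?") runs a full retrieve-and-generate round trip.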

7. Advanced Features

  • Filtering: Search with metadata filters (category, difficulty)
  • Interactive Chat: Ask questions in a loop (filtering and chat are combined in the example after this list)
  • System Stats: View index and configuration details
  • Adding Documents: Dynamically add new knowledge
  • Performance Evaluation: Measure response time and length
  • Cleanup: Remove all resources safely
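
As an illustration of the first two items, the snippet below combines a metadata filter with a simple chat loop. It builds on the SimpleRAG sketch from step 6, and the filter keys mirror the metadata fields used during ingestion.
# Interactive chat restricted to beginner-level AI documents.
rag = SimpleRAG(client, vector_db, CONFIG)
beginner_filter = {"category": "AI", "difficulty": "beginner"}

while True:
	question = input("Ask a question (or 'quit' to exit): ").strip()
	if question.lower() == "quit":
		break
	rag.query(question, filters=beginner_filter)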

Practical Tips

  • Always check your API key and environment setup
  • Tune top_k and similarity metric for your use case
  • Use metadata for precise filtering
  • Expand your knowledge base for better answers
  • Monitor performance for scaling needs

Happy building with GravixLayer!