Implementing Simple RAG

This guide walks you through building a Retrieval-Augmented Generation (RAG) system using GravixLayer. It covers every step, explains the code, and highlights practical use cases.

What is RAG?

Retrieval-Augmented Generation (RAG) combines information retrieval (searching for relevant documents) with a generative large language model (LLM): the system first retrieves passages related to the question, then passes them to the LLM as context for the answer. Grounding the answer in retrieved text helps reduce hallucinations and improve factual accuracy.

Use Cases

Enterprise Knowledge Search: Answer employee questions using company documents.
Customer Support: Retrieve relevant help articles and generate responses.
Research Assistant: Summarize and answer questions from scientific papers.
Programming Help: Search code documentation and generate code explanations.
Education: Provide context-rich answers from textbooks and notes.

Step-by-Step Guide

1. Setup and Installation

Install required packages:

pip install gravixlayer requests python-dotenv

This installs the GravixLayer SDK, the requests library for HTTP calls, and python-dotenv for loading environment variables from a .env file.

2. API Key and Client Initialization

Load your API key from environment variables and initialize the GravixLayer client:

from gravixlayer import GravixLayer
from dotenv import load_dotenv
import os

load_dotenv()

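API_KEY = os.getenv('GRAVIXLAYER_API_KEY')
if not API_KEY:
	# Fail early with a clear message instead of an authentication error later
	raise RuntimeError('GRAVIXLAYER_API_KEY is not set; add it to your environment or .env file')
client = GravixLayer()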

This loads your API key from the environment (or a local .env file) and creates the SDK client used in the steps that follow.

3. Configuration

Set up your RAG system parameters:

Embedding model (for vectorization)
LLM model (for generation)
Vector dimension
Similarity metric
Index name
Top K results
API base URL

CONFIG = {
	'embedding_model': 'baai/bge-large-en-v1.5',
	'llm_model': 'meta-llama/llama-3.1-8b-instruct',
	'vector_dimension': 1024,
	'similarity_metric': 'cosine',
	'index_name': 'rag-knowledge-base',
	'top_k_results': 3,
	'base_url': 'https://api.gravixlayer.com/v1/vectors'
}
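To make the similarity_metric and top_k_results settings concrete, here is a minimal pure-Python sketch of cosine-similarity ranking. The function names cosine_similarity and top_k are illustrative and not part of the GravixLayer SDK, and the toy 3-dimensional vectors stand in for real 1024-dimensional embeddings. In practice the hosted vector index does this ranking for you.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen in this sketch for zero vectors
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    vectors: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors (hypothetical ids) standing in for real embeddings.
toy_index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [1.0, 1.0, 0.0],
}
results = top_k([1.0, 0.1, 0.0], toy_index, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

A larger top_k gives the LLM more context at the cost of a longer prompt; cosine similarity ignores vector length and compares direction only, which is a common choice for text embeddings.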

4. Vector Database Setup

The VectorDatabase class manages all vector DB operations:

Creating indexes
Listing indexes
Upserting (adding/updating) vectors
Searching vectors

Each method is documented in the code for clarity.
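The guide does not list the class itself, so the following is only a sketch of what a VectorDatabase wrapper with these four operations might look like. Everything specific here is an assumption: the endpoint paths (/indexes, /text/upsert, /search/text), the payload field names, the "id" field in the create response, the bearer-token header, and the search_text method name. Check the GravixLayer API reference before relying on any of them. The HTTP call is injected as a transport function so the class can be exercised without network access.

```python
from typing import Any, Callable, Dict, List, Optional

# A transport sends one HTTP request and returns the decoded JSON body.
# Signature: (method, url, json_payload) -> dict
Transport = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def requests_transport(api_key: str) -> Transport:
    """Build a transport on top of the requests library.

    ASSUMPTION: bearer-token auth; the real scheme may differ.
    """
    import requests  # imported lazily so the class below can be tested offline

    headers = {"Authorization": f"Bearer {api_key}"}

    def send(method: str, url: str, payload: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        response = requests.request(method, url, headers=headers, json=payload, timeout=30)
        response.raise_for_status()
        return response.json()

    return send


class VectorDatabase:
    """Thin wrapper around a vector index HTTP API (all paths are hypothetical)."""

    def __init__(self, base_url: str, transport: Transport) -> None:
        self.base_url = base_url.rstrip("/")
        self.transport = transport
        self.index_id: Optional[str] = None

    def _require_index(self) -> str:
        if self.index_id is None:
            raise RuntimeError("call create_index() or set index_id first")
        return self.index_id

    def create_index(self, name: str, dimension: int, metric: str) -> Dict[str, Any]:
        payload = {"name": name, "dimension": dimension, "metric": metric}
        result = self.transport("POST", f"{self.base_url}/indexes", payload)
        self.index_id = result.get("id")  # ASSUMPTION: response carries an "id" field
        return result

    def list_indexes(self) -> Dict[str, Any]:
        return self.transport("GET", f"{self.base_url}/indexes", None)

    def upsert_text_vectors(self, documents: List[Dict[str, Any]]) -> Dict[str, Any]:
        url = f"{self.base_url}/{self._require_index()}/text/upsert"
        return self.transport("POST", url, {"vectors": documents})

    def search_text(self, query: str, model: str, top_k: int = 3) -> Dict[str, Any]:
        url = f"{self.base_url}/{self._require_index()}/search/text"
        payload = {"query": query, "model": model, "top_k": top_k}
        return self.transport("POST", url, payload)
```

Under these assumptions, an instance could be created with VectorDatabase(CONFIG['base_url'], requests_transport(API_KEY)), followed by create_index(CONFIG['index_name'], CONFIG['vector_dimension'], CONFIG['similarity_metric']).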

5. Knowledge Base Ingestion

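Prepare your documents as a list of dictionaries, each with an ID, the text to embed, the embedding model name, and metadata. Ingest them into the vector DB using upsert_text_vectors, called on vector_db, an instance of the VectorDatabase class from step 4. Example: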

knowledge_base = [
	{
		"id": "doc_ai_overview",
		"text": "Artificial Intelligence (AI)...",
		"model": CONFIG['embedding_model'],
		"metadata": {
			"category": "AI",
			"topic": "overview",
			"difficulty": "beginner"
		}
	},
	# ... more documents ...
]

vector_db.upsert_text_vectors(knowledge_base)
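Before sending a batch, it can help to check that every document has the four fields described above. The helper below is a hypothetical sketch (validate_documents is not part of the SDK or the guide's code); it only inspects the dictionaries locally and makes no API calls.

```python
from typing import Any, Dict, List

REQUIRED_KEYS = ("id", "text", "model", "metadata")


def validate_documents(documents: List[Dict[str, Any]]) -> List[str]:
    """Return a list of problems found; an empty list means the batch looks OK."""
    problems: List[str] = []
    seen_ids = set()
    for position, doc in enumerate(documents):
        missing = [key for key in REQUIRED_KEYS if key not in doc]
        if missing:
            problems.append(f"document {position}: missing {', '.join(missing)}")
            continue  # remaining checks need all keys present
        if not isinstance(doc["text"], str) or not doc["text"].strip():
            problems.append(f"document {position}: text is empty")
        if not isinstance(doc["metadata"], dict):
            problems.append(f"document {position}: metadata must be a dictionary")
        if doc["id"] in seen_ids:
            problems.append(f"document {position}: duplicate id {doc['id']!r}")
        seen_ids.add(doc["id"])
    return problems


sample_batch = [
    {"id": "doc_ai_overview", "text": "Artificial Intelligence (AI)...",
     "model": "baai/bge-large-en-v1.5", "metadata": {"category": "AI"}},
    {"id": "doc_ai_overview", "text": "   ",
     "model": "baai/bge-large-en-v1.5", "metadata": {"category": "AI"}},
]
for problem in validate_documents(sample_batch):
    print(problem)
```

Running the check first means a malformed document is reported with its position in the list rather than surfacing as a rejected request.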

6. RAG System Implementation

The SimpleRAG class orchestrates retrieval and generation:

retrieve_context: Searches for relevant documents and formats context
generate_response: Uses the LLM to answer using the context
query: Main method to run a RAG query and print results

All functions have detailed docstrings and comments.
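The three methods above can be sketched as follows. This is a minimal illustration, not the guide's actual class: the search and LLM calls are passed in as plain functions (search_fn and llm_fn are hypothetical names), the hit format with a "text" key is an assumption, and the prompt wording is only one reasonable choice. Stub functions stand in for the vector search and the LLM so the example runs offline.

```python
from typing import Any, Callable, Dict, List

SearchFn = Callable[[str, int], List[Dict[str, Any]]]  # (question, top_k) -> hits
LlmFn = Callable[[str], str]  # prompt -> answer text


class SimpleRAG:
    """Retrieve context for a question, then ask the LLM to answer from it."""

    def __init__(self, search_fn: SearchFn, llm_fn: LlmFn, top_k: int = 3) -> None:
        self.search_fn = search_fn
        self.llm_fn = llm_fn
        self.top_k = top_k

    def retrieve_context(self, question: str) -> str:
        """Search for relevant documents and format them as numbered passages."""
        hits = self.search_fn(question, self.top_k)
        # ASSUMPTION: each hit is a dict with the document text under "text".
        return "\n\n".join(f"[{i}] {hit['text']}" for i, hit in enumerate(hits, start=1))

    def generate_response(self, question: str, context: str) -> str:
        """Build a grounded prompt and pass it to the LLM."""
        prompt = (
            "Answer the question using only the context below. "
            "If the context is insufficient, say so.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
        )
        return self.llm_fn(prompt)

    def query(self, question: str) -> Dict[str, str]:
        """Run retrieval and generation, print the answer, and return the details."""
        context = self.retrieve_context(question)
        answer = self.generate_response(question, context)
        print(f"Q: {question}\nA: {answer}")
        return {"question": question, "context": context, "answer": answer}


# Stubs standing in for the vector search and the LLM call.
def fake_search(question: str, top_k: int) -> List[Dict[str, Any]]:
    docs = [
        {"text": "RAG combines retrieval with generation."},
        {"text": "Embeddings map text to vectors."},
    ]
    return docs[:top_k]


def fake_llm(prompt: str) -> str:
    return "stub answer"


rag = SimpleRAG(fake_search, fake_llm, top_k=1)
result = rag.query("What is RAG?")
```

Keeping retrieval and generation as separate methods makes it possible to inspect the retrieved context on its own when an answer looks wrong.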

7. Advanced Features

Filtering: Search with metadata filters (category, difficulty)
Interactive Chat: Ask questions in a loop
System Stats: View index and configuration details
Adding Documents: Dynamically add new knowledge
Performance Evaluation: Measure response time and length
Cleanup: Remove all resources safely
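As an illustration of metadata filtering, the sketch below applies an exact-match filter to a list of search hits on the client side. The guide does not show the server-side filter syntax, so this is a stand-in rather than the API's own mechanism; the function names and the second and third sample documents are hypothetical.

```python
from typing import Any, Dict, List


def matches_filter(metadata: Dict[str, Any], metadata_filter: Dict[str, Any]) -> bool:
    """True when every filter key is present in metadata with an equal value."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in metadata_filter.items()
    )


def filter_hits(hits: List[Dict[str, Any]], metadata_filter: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Keep only the hits whose metadata satisfies the filter, preserving order."""
    return [hit for hit in hits if matches_filter(hit.get("metadata", {}), metadata_filter)]


hits = [
    {"id": "doc_ai_overview", "metadata": {"category": "AI", "difficulty": "beginner"}},
    {"id": "doc_ml_math", "metadata": {"category": "ML", "difficulty": "advanced"}},
    {"id": "doc_ai_ethics", "metadata": {"category": "AI", "difficulty": "advanced"}},
]
beginner_ai = filter_hits(hits, {"category": "AI", "difficulty": "beginner"})
print([hit["id"] for hit in beginner_ai])
```

Filtering on the server, where the API supports it, is usually preferable because the top K results are then chosen from matching documents only, instead of being trimmed after the fact.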

Practical Tips

Always check your API key and environment setup
Tune top_k and similarity metric for your use case
Use metadata for precise filtering
Expand your knowledge base for better answers
Monitor performance for scaling needs
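For the monitoring tip, a simple way to measure response time and answer length is to time each query with a wall-clock timer. The evaluate helper below is a hypothetical sketch; answer_fn stands for whatever function returns an answer string for a question, and a trivial stub is used here so the example runs without an API call.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def evaluate(answer_fn: Callable[[str], str], questions: List[str]) -> Dict[str, float]:
    """Time each call to answer_fn and record the answer length in characters."""
    if not questions:
        raise ValueError("questions must not be empty")
    latencies: List[float] = []
    lengths: List[int] = []
    for question in questions:
        start = time.perf_counter()
        answer = answer_fn(question)
        latencies.append(time.perf_counter() - start)
        lengths.append(len(answer))
    return {
        "queries": float(len(questions)),
        "avg_latency_s": mean(latencies),
        "max_latency_s": max(latencies),
        "avg_answer_chars": float(mean(lengths)),
    }


# Stub answer function; replace with a real RAG call when measuring a live system.
stats = evaluate(lambda question: question.upper(), ["What is RAG?", "What is an embedding?"])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Tracking the maximum alongside the average helps surface occasional slow queries that an average alone would hide.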

Happy building with GravixLayer!
