This guide provides a compact, production-minded example of an Agentic Retrieval-Augmented Generation (RAG) pipeline focused on PDF documents. It follows a clear three-agent separation:
  • IngestionAgent: Extracts text from PDFs, creates overlapping chunks, and stores both the raw chunks in a local knowledge_base and their embeddings in a vector index.
  • RetrievalAgent: Performs semantic search against the vector index and assembles human-readable context from the best hits (with a robust fallback to the local knowledge_base).
  • GenerationAgent: Builds a prompt from the retrieved context and calls an LLM to produce concise, sourced answers.

Quick Start

  1. Set GRAVIXLAYER_API_KEY in your environment or a .env file (the guide assumes you load .env).
  2. Update PDF_PATH to point to your PDF file or use sample data for testing.
  3. Follow the sections top-to-bottom: configuration → vector DB init → ingestion (or sample load) → query examples.
  4. Ask questions with pdf_rag_system.query('Your question here'); set show_context=True to see the retrieved snippets.

What’s Included

  • VectorDatabase: Minimal wrapper for GravixLayer vector index operations (create/list/find, upsert, text search).
  • PDFIngestionAgent: PDF extraction, sentence-aware chunking with overlap, and preparation of payloads for the vector DB and local knowledge_base.
  • RetrievalAgent: Runs text search and assembles readable context; falls back to local knowledge_base if the vector search is unavailable or returns limited metadata.
  • GenerationAgent: Simple LLM prompt wrapper that calls client.chat.completions.create and returns a concise answer with source citations when possible.
Figure: PDF RAG flowchart (IngestionAgent → RetrievalAgent → GenerationAgent).

Setup and Configuration

Install required packages:
pip install gravixlayer requests python-dotenv PyPDF2 -q
Initialize the environment and client:
import os
import json
import time
import re
import requests
import PyPDF2
from typing import List, Dict, Any, Optional
from gravixlayer import GravixLayer
from dotenv import load_dotenv
from IPython.display import display, Markdown  # Optional for Jupyter
# Load environment variables
load_dotenv()
# Set up API key
API_KEY = os.getenv('GRAVIXLAYER_API_KEY')
if not API_KEY:
	raise ValueError("Please set GRAVIXLAYER_API_KEY environment variable")
# Initialize GravixLayer client
client = GravixLayer()
print("GravixLayer client initialized successfully")
# Configuration
CONFIG = {
	'embedding_model': 'baai/bge-large-en-v1.5',
	'llm_model': 'meta-llama/llama-3.1-8b-instruct',
	'vector_dimension': 1024,
	'similarity_metric': 'cosine',
	'index_name': 'pdf-rag',
	'top_k_results': 3,
	'base_url': 'https://api.gravixlayer.com/v1/vectors'
}
This sets up the GravixLayer client and defines configuration for embeddings, LLM, and vector index.
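Before creating the index, it can be worth confirming that the configured embedding model actually returns vectors of the expected dimension. The check below assumes the SDK exposes an OpenAI-style client.embeddings.create call; that interface is an assumption of this sketch, not something shown in the notebook:
# Optional sanity check (assumes an OpenAI-style embeddings endpoint on the SDK)
emb = client.embeddings.create(model=CONFIG['embedding_model'], input="dimension check")
dim = len(emb.data[0].embedding)
if dim != CONFIG['vector_dimension']:
	print(f"Warning: embedding dimension {dim} does not match CONFIG ({CONFIG['vector_dimension']})")
else:
	print(f"Embedding dimension verified: {dim}")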

Vector Database

Define a VectorDatabase class to handle index operations:
class VectorDatabase:
	def __init__(self, api_key: str, base_url: str):
		self.api_key = api_key
		self.base_url = base_url.rstrip('/')
		self.headers = {
			'Authorization': f'Bearer {api_key}',
			'Content-Type': 'application/json'
		}
		self.index_id = None
	# Methods: create_index, list_indexes, find_or_create_index, upsert_text_vectors, search_text
	# Full implementations are intentionally omitted here for brevity; see the notebook for details.
Initialize:
vector_db = VectorDatabase(API_KEY, CONFIG['base_url'])
print("Vector database client initialized")
This class manages creating/finding indexes, upserting vectors, and searching text.
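For reference, the omitted methods are thin wrappers around the vector REST API. The sketch below shows one plausible shape for find_or_create_index and search_text; the endpoint paths and payload fields are assumptions made for illustration, not the documented API, so treat the notebook as the source of truth:
# Illustrative sketch only: the endpoint paths and payload fields are assumptions,
# not the documented GravixLayer API. See the notebook for the real implementations.
def find_or_create_index(self, name: str, dimension: int, metric: str) -> str:
	"""Return the id of an index with this name, creating it if needed."""
	resp = requests.get(f"{self.base_url}/indexes", headers=self.headers)  # hypothetical path
	resp.raise_for_status()
	for index in resp.json().get("indexes", []):
		if index.get("name") == name:
			self.index_id = index["id"]
			return self.index_id
	resp = requests.post(
		f"{self.base_url}/indexes",  # hypothetical path
		headers=self.headers,
		json={"name": name, "dimension": dimension, "metric": metric},
	)
	resp.raise_for_status()
	self.index_id = resp.json()["id"]
	return self.index_id
def search_text(self, query: str, top_k: int = 3) -> list:
	"""Semantic text search against the current index."""
	resp = requests.post(
		f"{self.base_url}/indexes/{self.index_id}/search/text",  # hypothetical path
		headers=self.headers,
		json={"query": query, "top_k": top_k},
	)
	resp.raise_for_status()
	return resp.json().get("matches", [])
# Attach the sketches to the class so later cells can be experimented with end-to-end.
VectorDatabase.find_or_create_index = find_or_create_index
VectorDatabase.search_text = search_text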

Knowledge Base

Initialize a global knowledge base to store PDF chunks:
knowledge_base = []  # List to store PDF chunks for retrieval
print("Knowledge base initialized")

PDF Processing Agents

PDF Ingestion Agent

Handles PDF extraction and chunking:
class PDFIngestionAgent:
	def __init__(self, vector_db: VectorDatabase, config: Dict):
		self.vector_db = vector_db
		self.config = config
	def ingest_pdf(self, pdf_path: str, custom_name: Optional[str] = None) -> bool:
		"""Extract text, create chunks, prepare vectors, upsert to DB, add to knowledge_base."""
		# (Full implementation omitted for brevity)
		return True
	def _extract_pdf_text(self, pdf_path: str) -> str:
		# Placeholder for PDF text extraction using PyPDF2 or similar
		return ""
	def _create_chunks(self, text: str, chunk_size: int = 1000, overlap: int = 200) -> List[Dict[str, Any]]:
		# Placeholder for chunking logic
		return []
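The two helpers carry most of the ingestion logic described above. Here is a minimal sketch of PyPDF2-based extraction and sentence-aware chunking with character overlap; the sentence splitter and the chunk metadata (chunk_id, text) are illustrative choices, and the notebook's exact implementation may differ:
# Minimal sketch: the sentence splitter and chunk metadata are illustrative choices.
def _extract_pdf_text(self, pdf_path: str) -> str:
	"""Concatenate the text of every page in the PDF."""
	text_parts = []
	with open(pdf_path, "rb") as f:
		reader = PyPDF2.PdfReader(f)
		for page in reader.pages:
			text_parts.append(page.extract_text() or "")
	return "\n".join(text_parts)
def _create_chunks(self, text: str, chunk_size: int = 1000, overlap: int = 200) -> List[Dict[str, Any]]:
	"""Split on sentence boundaries into ~chunk_size-character chunks with overlap."""
	sentences = re.split(r"(?<=[.!?])\s+", text)
	chunks, current = [], ""
	for sentence in sentences:
		if current and len(current) + len(sentence) > chunk_size:
			chunks.append({"chunk_id": len(chunks), "text": current.strip()})
			current = current[-overlap:]  # carry the tail forward as overlap
		current += " " + sentence
	if current.strip():
		chunks.append({"chunk_id": len(chunks), "text": current.strip()})
	return chunks
PDFIngestionAgent._extract_pdf_text = _extract_pdf_text
PDFIngestionAgent._create_chunks = _create_chunks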

Retrieval Agent

Handles semantic search and context assembly:
class RetrievalAgent:
	def __init__(self, vector_db, knowledge_base, config):
		self.vector_db = vector_db
		self.knowledge_base = knowledge_base
		self.config = config
	def retrieve(self, query: str, top_k: Optional[int] = None, filter_dict: Optional[dict] = None):
		"""Run search, assemble context from hits with fallback to knowledge_base."""
		# (Full implementation omitted)
		return []
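One way to fill in retrieve is sketched below: run the text search, keep hits that carry readable text metadata, and fall back to a simple keyword scan over knowledge_base when the search fails or returns nothing usable. The match and metadata field names are assumptions of this sketch, and filter_dict is accepted but unused here:
# Sketch of retrieve(): result/metadata field names are assumptions, not the real schema.
def retrieve(self, query: str, top_k: Optional[int] = None, filter_dict: Optional[dict] = None) -> List[Dict[str, Any]]:
	top_k = top_k or self.config['top_k_results']
	try:
		matches = self.vector_db.search_text(query, top_k=top_k)
		results = []
		for m in matches:
			metadata = m.get("metadata", {}) if isinstance(m, dict) else {}
			if metadata.get("text"):
				results.append({"text": metadata["text"], "source": metadata.get("source", "unknown"), "score": m.get("score")})
		if results:
			return results
	except Exception as exc:
		print(f"Vector search unavailable, falling back to knowledge_base: {exc}")
	# Fallback: rank local chunks by naive keyword overlap with the query.
	query_terms = set(query.lower().split())
	scored = sorted(self.knowledge_base, key=lambda chunk: len(query_terms & set(chunk.get("text", "").lower().split())), reverse=True)
	return scored[:top_k]
RetrievalAgent.retrieve = retrieve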

Generation Agent

Handles LLM-based response generation:
class GenerationAgent:
	def __init__(self, llm_client, config: dict):
		self.llm_client = llm_client
		self.config = config
	def generate(self, query: str, context: str) -> str:
		"""Build prompt and call LLM to produce a concise, sourced answer."""
		# (Full implementation omitted)
		return ""

PDF RAG System

Orchestrates the agents:
class PDFRAGSystem:
	def __init__(self, ingestion_agent: PDFIngestionAgent, retrieval_agent: RetrievalAgent, generation_agent: GenerationAgent, config: Dict):
		self.ingestion_agent = ingestion_agent
		self.retrieval_agent = retrieval_agent
		self.generation_agent = generation_agent
		self.config = config
	def query(self, question: str, show_context: bool = True, top_k: Optional[int] = None, filter_dict: Optional[dict] = None) -> Dict[str, Any]:
		"""Retrieve context, generate response, and return structured answer."""
		# (Full implementation omitted)
		return {"answer": ""}

System Initialization

Create/find index and initialize agents/system:
index_id = vector_db.find_or_create_index(
	name=CONFIG['index_name'],
	dimension=CONFIG['vector_dimension'],
	metric=CONFIG['similarity_metric']
)

pdf_ingestion = PDFIngestionAgent(vector_db, CONFIG)
retrieval_agent = RetrievalAgent(vector_db, knowledge_base, CONFIG)
generation_agent = GenerationAgent(client, CONFIG)

pdf_rag_system = PDFRAGSystem(pdf_ingestion, retrieval_agent, generation_agent, CONFIG)

PDF Ingestion

Ingest a PDF:
PDF_PATH = "/path/to/your/pdf.pdf"  # Update this
success = pdf_ingestion.ingest_pdf(PDF_PATH)

# Or use sample data if PDF not found (see original notebook)
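If no PDF is available, you can seed knowledge_base with a couple of hand-written chunks so the retrieval fallback has something to return; these entries are an illustrative stand-in for the notebook's sample data, not the original dataset:
# Illustrative stand-in for the notebook's sample data (not the original dataset).
if not success:
	sample_chunks = [
		{"chunk_id": 0, "text": "Retrieval-Augmented Generation combines semantic search with an LLM.", "source": "sample_data"},
		{"chunk_id": 1, "text": "Agentic RAG splits ingestion, retrieval, and generation into separate agents.", "source": "sample_data"},
	]
	knowledge_base.extend(sample_chunks)
	print(f"Loaded {len(sample_chunks)} sample chunks into the knowledge base")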

Query Your PDF

Example queries:
pdf_rag_system.query("What is the main topic of the PDF document?")
pdf_rag_system.query("What AI applications are mentioned?")

Notebook

Access the full notebook version of this guide on GitHub: Agentic RAG Notebook