This guide provides a compact, production-minded example of an Agentic Retrieval-Augmented Generation (RAG) pipeline focused on PDF documents. It follows a clear three-agent separation:
  • IngestionAgent: Extracts text from PDFs, creates overlapping chunks, and stores both the raw chunks in a local knowledge_base and their embeddings in a vector index.
  • RetrievalAgent: Performs semantic search against the vector index and assembles human-readable context from the best hits (with a robust fallback to the local knowledge_base).
  • GenerationAgent: Builds a prompt from the retrieved context and calls an LLM to produce concise, sourced answers.

Quick Start

  1. Set GRAVIXLAYER_API_KEY in your environment or a .env file (the guide assumes you load .env).
  2. Update PDF_PATH to point to your PDF file or use sample data for testing.
  3. Follow the sections top-to-bottom: configuration → vector DB init → ingestion (or sample load) → query examples.
  4. Ask questions with pdf_rag_system.query('Your question here'); set show_context=True to see the retrieved snippets.

What’s Included

  • VectorDatabase: Minimal wrapper for GravixLayer vector index operations (create/list/find, upsert, text search).
  • PDFIngestionAgent: PDF extraction, sentence-aware chunking with overlap, and preparation of payloads for the vector DB and local knowledge_base.
  • RetrievalAgent: Runs text search and assembles readable context; falls back to local knowledge_base if the vector search is unavailable or returns limited metadata.
  • GenerationAgent: Simple LLM prompt wrapper that calls client.chat.completions.create and returns a concise answer with source citations when possible.
Figure: PDF RAG flowchart (IngestionAgent → RetrievalAgent → GenerationAgent).

Setup and Configuration

Install required packages:
pip install gravixlayer requests python-dotenv PyPDF2 -q
Initialize the environment and client:
import os
import json
import time
import re
import requests
import PyPDF2
from typing import List, Dict, Any, Optional
from gravixlayer import GravixLayer
from dotenv import load_dotenv
from IPython.display import display, Markdown  # Optional for Jupyter
# Load environment variables
load_dotenv()
# Set up API key
API_KEY = os.getenv('GRAVIXLAYER_API_KEY')
if not API_KEY:
	raise ValueError("Please set GRAVIXLAYER_API_KEY environment variable")
# Initialize GravixLayer client
client = GravixLayer()
print("GravixLayer client initialized successfully")
# Configuration
CONFIG = {
	'embedding_model': 'baai/bge-large-en-v1.5',
	'llm_model': 'meta-llama/llama-3.1-8b-instruct',
	'vector_dimension': 1024,
	'similarity_metric': 'cosine',
	'index_name': 'pdf-rag',
	'top_k_results': 3,
	'base_url': 'https://api.gravixlayer.com/v1/vectors'
}
This sets up the GravixLayer client and defines configuration for embeddings, LLM, and vector index.
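Before creating the index, it can be worth confirming that the configured embedding model actually returns vectors of the expected dimension. The check below assumes the SDK exposes an OpenAI-style client.embeddings.create call; that interface is an assumption of this sketch, not something shown in the notebook:
# Optional sanity check (assumes an OpenAI-style embeddings endpoint on the SDK)
emb = client.embeddings.create(model=CONFIG['embedding_model'], input="dimension check")
dim = len(emb.data[0].embedding)
if dim != CONFIG['vector_dimension']:
	print(f"Warning: embedding dimension {dim} does not match CONFIG ({CONFIG['vector_dimension']})")
else:
	print(f"Embedding dimension verified: {dim}")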

Vector Database

Define a VectorDatabase class to handle index operations:
class VectorDatabase:
	def __init__(self, api_key: str, base_url: str):
		self.api_key = api_key
		self.base_url = base_url.rstrip('/')
		self.headers = {
			'Authorization': f'Bearer {api_key}',
			'Content-Type': 'application/json'
		}
		self.index_id = None
	# Methods: create_index, list_indexes, find_or_create_index, upsert_text_vectors, search_text
	# Full implementations are intentionally omitted here for brevity; see the notebook for details.
Initialize:
vector_db = VectorDatabase(API_KEY, CONFIG['base_url'])
print("Vector database client initialized")
This class manages creating/finding indexes, upserting vectors, and searching text.
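For reference, the omitted methods are thin wrappers around the vector REST API. The sketch below shows one plausible shape for find_or_create_index and search_text; the endpoint paths and payload fields are assumptions made for illustration, not the documented API, so treat the notebook as the source of truth:
# Illustrative sketch only: the endpoint paths and payload fields are assumptions,
# not the documented GravixLayer API. See the notebook for the real implementations.
def find_or_create_index(self, name: str, dimension: int, metric: str) -> str:
	"""Return the id of an index with this name, creating it if needed."""
	resp = requests.get(f"{self.base_url}/indexes", headers=self.headers)  # hypothetical path
	resp.raise_for_status()
	for index in resp.json().get("indexes", []):
		if index.get("name") == name:
			self.index_id = index["id"]
			return self.index_id
	resp = requests.post(
		f"{self.base_url}/indexes",  # hypothetical path
		headers=self.headers,
		json={"name": name, "dimension": dimension, "metric": metric},
	)
	resp.raise_for_status()
	self.index_id = resp.json()["id"]
	return self.index_id
def search_text(self, query: str, top_k: int = 3) -> list:
	"""Semantic text search against the current index."""
	resp = requests.post(
		f"{self.base_url}/indexes/{self.index_id}/search/text",  # hypothetical path
		headers=self.headers,
		json={"query": query, "top_k": top_k},
	)
	resp.raise_for_status()
	return resp.json().get("matches", [])
# Attach the sketches to the class so later cells can be experimented with end-to-end.
VectorDatabase.find_or_create_index = find_or_create_index
VectorDatabase.search_text = search_text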

Knowledge Base

Initialize a global knowledge base to store PDF chunks:
knowledge_base = []  # List to store PDF chunks for retrieval
print("Knowledge base initialized")

PDF Processing Agents

PDF Ingestion Agent

Handles PDF extraction and chunking:
class PDFIngestionAgent:
	def __init__(self, vector_db: VectorDatabase, config: Dict):
		self.vector_db = vector_db
		self.config = config
	def ingest_pdf(self, pdf_path: str, custom_name: Optional[str] = None) -> bool:
		"""Extract text, create chunks, prepare vectors, upsert to DB, add to knowledge_base."""
		# (Full implementation omitted for brevity)
		return True
	def _extract_pdf_text(self, pdf_path: str) -> str:
		# Placeholder for PDF text extraction using PyPDF2 or similar
		return ""
	def _create_chunks(self, text: str, chunk_size: int = 1000, overlap: int = 200) -> List[Dict[str, Any]]:
		# Placeholder for chunking logic
		return []
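The two helpers carry most of the ingestion logic described above. Here is a minimal sketch of PyPDF2-based extraction and sentence-aware chunking with character overlap; the sentence splitter and the chunk metadata (chunk_id, text) are illustrative choices, and the notebook's exact implementation may differ:
# Minimal sketch: the sentence splitter and chunk metadata are illustrative choices.
def _extract_pdf_text(self, pdf_path: str) -> str:
	"""Concatenate the text of every page in the PDF."""
	text_parts = []
	with open(pdf_path, "rb") as f:
		reader = PyPDF2.PdfReader(f)
		for page in reader.pages:
			text_parts.append(page.extract_text() or "")
	return "\n".join(text_parts)
def _create_chunks(self, text: str, chunk_size: int = 1000, overlap: int = 200) -> List[Dict[str, Any]]:
	"""Split on sentence boundaries into ~chunk_size-character chunks with overlap."""
	sentences = re.split(r"(?<=[.!?])\s+", text)
	chunks, current = [], ""
	for sentence in sentences:
		if current and len(current) + len(sentence) > chunk_size:
			chunks.append({"chunk_id": len(chunks), "text": current.strip()})
			current = current[-overlap:]  # carry the tail forward as overlap
		current += " " + sentence
	if current.strip():
		chunks.append({"chunk_id": len(chunks), "text": current.strip()})
	return chunks
PDFIngestionAgent._extract_pdf_text = _extract_pdf_text
PDFIngestionAgent._create_chunks = _create_chunks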

Retrieval Agent

Handles semantic search and context assembly:
class RetrievalAgent:
	def __init__(self, vector_db, knowledge_base, config):
		self.vector_db = vector_db
		self.knowledge_base = knowledge_base
		self.config = config
	def retrieve(self, query: str, top_k: Optional[int] = None, filter_dict: Optional[dict] = None):
		"""Run search, assemble context from hits with fallback to knowledge_base."""
		# (Full implementation omitted)
		return []
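One way to fill in retrieve is sketched below: run the text search, keep hits that carry readable text metadata, and fall back to a simple keyword scan over knowledge_base when the search fails or returns nothing usable. The match and metadata field names are assumptions of this sketch, and filter_dict is accepted but unused here:
# Sketch of retrieve(): result/metadata field names are assumptions, not the real schema.
def retrieve(self, query: str, top_k: Optional[int] = None, filter_dict: Optional[dict] = None) -> List[Dict[str, Any]]:
	top_k = top_k or self.config['top_k_results']
	try:
		matches = self.vector_db.search_text(query, top_k=top_k)
		results = []
		for m in matches:
			metadata = m.get("metadata", {}) if isinstance(m, dict) else {}
			if metadata.get("text"):
				results.append({"text": metadata["text"], "source": metadata.get("source", "unknown"), "score": m.get("score")})
		if results:
			return results
	except Exception as exc:
		print(f"Vector search unavailable, falling back to knowledge_base: {exc}")
	# Fallback: rank local chunks by naive keyword overlap with the query.
	query_terms = set(query.lower().split())
	scored = sorted(self.knowledge_base, key=lambda chunk: len(query_terms & set(chunk.get("text", "").lower().split())), reverse=True)
	return scored[:top_k]
RetrievalAgent.retrieve = retrieve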

Generation Agent

Handles LLM-based response generation:
class GenerationAgent:
	def __init__(self, llm_client, config: dict):
		self.llm_client = llm_client
		self.config = config
	def generate(self, query: str, context: str) -> str:
		"""Build prompt and call LLM to produce a concise, sourced answer."""
		# (Full implementation omitted)
		return ""

PDF RAG System

Orchestrates the agents:
class PDFRAGSystem:
	def __init__(self, ingestion_agent: PDFIngestionAgent, retrieval_agent: RetrievalAgent, generation_agent: GenerationAgent, config: Dict):
		self.ingestion_agent = ingestion_agent
		self.retrieval_agent = retrieval_agent
		self.generation_agent = generation_agent
		self.config = config
	def query(self, question: str, show_context: bool = True, top_k: Optional[int] = None, filter_dict: Optional[dict] = None) -> Dict[str, Any]:
		"""Retrieve context, generate response, and return structured answer."""
		# (Full implementation omitted)
		return {"answer": ""}

System Initialization

Create/find index and initialize agents/system:
index_id = vector_db.find_or_create_index(
	name=CONFIG['index_name'],
	dimension=CONFIG['vector_dimension'],
	metric=CONFIG['similarity_metric']
)

pdf_ingestion = PDFIngestionAgent(vector_db, CONFIG)
retrieval_agent = RetrievalAgent(vector_db, knowledge_base, CONFIG)
generation_agent = GenerationAgent(client, CONFIG)

pdf_rag_system = PDFRAGSystem(pdf_ingestion, retrieval_agent, generation_agent, CONFIG)

PDF Ingestion

Ingest a PDF:
PDF_PATH = "/path/to/your/pdf.pdf"  # Update this
success = pdf_ingestion.ingest_pdf(PDF_PATH)

# Or use sample data if PDF not found (see original notebook)
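If no PDF is available, you can seed knowledge_base with a couple of hand-written chunks so the retrieval fallback has something to return; these entries are an illustrative stand-in for the notebook's sample data, not the original dataset:
# Illustrative stand-in for the notebook's sample data (not the original dataset).
if not success:
	sample_chunks = [
		{"chunk_id": 0, "text": "Retrieval-Augmented Generation combines semantic search with an LLM.", "source": "sample_data"},
		{"chunk_id": 1, "text": "Agentic RAG splits ingestion, retrieval, and generation into separate agents.", "source": "sample_data"},
	]
	knowledge_base.extend(sample_chunks)
	print(f"Loaded {len(sample_chunks)} sample chunks into the knowledge base")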

Query Your PDF

Example queries:
pdf_rag_system.query("What is the main topic of the PDF document?")
pdf_rag_system.query("What AI applications are mentioned?")

Notebook

Access the full notebook version of this guide on GitHub: Agentic RAG Notebook