- IngestionAgent: Extracts text from PDFs, creates overlapping chunks, and stores both the raw chunks in a local `knowledge_base` and their embeddings in a vector index.
- RetrievalAgent: Performs semantic search against the vector index and assembles human-readable context from the best hits (with a robust fallback to the local `knowledge_base`).
- GenerationAgent: Builds a prompt from the retrieved context and calls an LLM to produce concise, sourced answers.
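The overlapping chunking that the IngestionAgent performs can be sketched as follows. This is a character-based simplification (the real implementation is sentence-aware, per the component list below), and the `chunk_size`/`overlap` defaults here are illustrative, not the guide's actual values:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into chunks of roughly chunk_size characters, where each
    chunk shares `overlap` characters with the previous one so that context
    spanning a chunk boundary is not lost."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # how far the window advances each iteration
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means the tail of each chunk repeats as the head of the next, which helps retrieval when the answer straddles a boundary.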
Quick Start
- Set `GRAVIXLAYER_API_KEY` in your environment or a `.env` file (the guide assumes you load `.env`).
- Update `PDF_PATH` to point to your PDF file, or use sample data for testing.
- Follow the sections top-to-bottom: configuration → vector DB init → ingestion (or sample load) → query examples.
- Ask questions with `pdf_rag_system.query('Your question here')`; set `show_context=True` to see retrieved snippets.
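If you'd rather not add a dependency for the `.env` loading step, a minimal stdlib stand-in (a simplified version of what python-dotenv does; it skips quoting edge cases and multiline values) might look like:

```python
import os

def load_env(path: str = ".env") -> None:
    """Naively parse KEY=VALUE lines from a .env file into os.environ.
    Variables already present in the environment are not overwritten."""
    if not os.path.exists(path):
        return
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))

# Usage: call load_env() at startup, then read os.environ["GRAVIXLAYER_API_KEY"]
```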
What’s Included
- `VectorDatabase`: Minimal wrapper for GravixLayer vector index operations (create/list/find, upsert, text search).
- `PDFIngestionAgent`: PDF extraction, sentence-aware chunking with overlap, and preparation of payloads for the vector DB and local `knowledge_base`.
- `RetrievalAgent`: Runs text search and assembles readable context; falls back to the local `knowledge_base` if the vector search is unavailable or returns limited metadata.
- `GenerationAgent`: Simple LLM prompt wrapper that calls `client.chat.completions.create` and returns a concise answer with source citations when possible.
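To make the division of labor concrete, here is a toy sketch of how the retrieval and generation agents compose into a `query()` call. Everything here is an illustrative stand-in, not the guide's implementation: the word-overlap scorer stands in for semantic search, and the generator stub stands in for the real LLM call.

```python
class RetrievalAgent:
    def __init__(self, knowledge_base: dict):
        self.knowledge_base = knowledge_base  # chunk_id -> chunk text

    def retrieve(self, question: str, top_k: int = 2) -> list:
        # Toy relevance score: number of words shared with the question.
        words = set(question.lower().split())
        scored = sorted(
            self.knowledge_base.items(),
            key=lambda kv: len(words & set(kv[1].lower().split())),
            reverse=True,
        )
        return [text for _, text in scored[:top_k]]

class GenerationAgent:
    def answer(self, question: str, context: list) -> str:
        # Stand-in for the LLM call: report what it would have cited.
        return f"Answer to {question!r} based on {len(context)} snippet(s)."

class PDFRAGSystem:
    def __init__(self, retriever, generator):
        self.retriever = retriever
        self.generator = generator

    def query(self, question: str, show_context: bool = False) -> str:
        context = self.retriever.retrieve(question)
        if show_context:
            for snippet in context:
                print("CONTEXT:", snippet)
        return self.generator.answer(question, context)
```

The point of the shape is that each agent is swappable: the retriever can be backed by the vector index or the local `knowledge_base` fallback without the generator changing.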

Setup and Configuration
Install required packages:

Vector Database
Define a `VectorDatabase` class to handle index operations:
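The actual GravixLayer-backed class isn't shown in this excerpt, so here is an in-memory stand-in with the same interface shape (create an index, upsert vectors with payloads, search by similarity). The method names and cosine-similarity search are assumptions for illustration; in your own code the internals would be replaced with real GravixLayer SDK calls:

```python
import math

class VectorDatabase:
    """In-memory stand-in for a vector index: stores (id, vector, payload)
    triples per named index and searches by cosine similarity."""

    def __init__(self):
        self.indexes = {}  # index name -> {item_id: (vector, payload)}

    def create_index(self, name: str) -> None:
        self.indexes.setdefault(name, {})

    def upsert(self, index: str, item_id: str, vector: list, payload: dict) -> None:
        # Insert or overwrite: same id replaces the previous entry.
        self.indexes[index][item_id] = (vector, payload)

    @staticmethod
    def _cosine(a: list, b: list) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def search(self, index: str, query_vector: list, top_k: int = 3) -> list:
        # Return (score, item_id, payload) tuples, best match first.
        scored = [
            (self._cosine(query_vector, vec), item_id, payload)
            for item_id, (vec, payload) in self.indexes[index].items()
        ]
        scored.sort(key=lambda t: t[0], reverse=True)
        return scored[:top_k]
```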

