MultiModal RAG

  • Tech Stack: Unstructured, LangChain, Google Gemini API, HuggingFace, ChromaDB, Gradio, LangSmith
  • GitHub URL: Project Link

The Multimodal Retrieval-Augmented Generation (RAG) system helps users query complex, multi-format PDFs containing text, images, and tables.

The pipeline extracts this information and stores it in a database, from which it can later be retrieved as context to answer user prompts about the PDF.
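
The store-then-retrieve flow described above can be illustrated with a minimal, dependency-free sketch. It is not the project's actual code: the names Chunk, InMemoryStore, embed, and build_prompt are hypothetical, a bag-of-words count stands in for a HuggingFace embedding model, and a plain Python list stands in for ChromaDB. PDF extraction (Unstructured) and answer generation (Gemini) are left out, so the chunks are assumed to be already extracted, with tables and images represented by text summaries.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    kind: str     # "text", "table", or "image"
    content: str  # raw text, or a text summary for tables and images


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase bag-of-words counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    # Hypothetical stand-in for a vector database such as ChromaDB.
    def __init__(self):
        self._items = []

    def add(self, chunk: Chunk) -> None:
        # Store each chunk alongside its (toy) embedding.
        self._items.append((embed(chunk.content), chunk))

    def query(self, question: str, k: int = 2) -> list:
        # Return the k chunks most similar to the question.
        q = embed(question)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


def build_prompt(question: str, context: list) -> str:
    # Assemble retrieved chunks into the context section of an LLM prompt.
    lines = [f"[{c.kind}] {c.content}" for c in context]
    return "Context:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


# Illustrative data only; a real pipeline would produce these from the PDF.
store = InMemoryStore()
store.add(Chunk("text", "The report describes the company's hiring plans for engineers."))
store.add(Chunk("table", "Table summary: revenue by year, 2022 revenue 10M, 2023 revenue 14M."))
store.add(Chunk("image", "Image summary: bar chart of office locations by country."))

question = "What was the revenue in 2023?"
print(build_prompt(question, store.query(question, k=2)))
```

In a real system the prompt built here would be sent to the language model, and the similarity search would run over dense embeddings rather than word counts; the overall shape (index once, retrieve top-k per question, pass the results as context) is the part this sketch is meant to show.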