Welcome to the End-to-End QA GenAI Project! This project implements a Retrieval-Augmented Generation (RAG) system with a Streamlit UI for uploading and processing PDF documents. It uses the Google Gemini API (with the Gemini Pro model) for natural language understanding and generation, and the LlamaIndex framework for document indexing and retrieval, so that answers are grounded in the content of the uploaded documents.
With this system, you can upload a PDF, ask questions about it, and get responses generated from its content.
- Streamlit UI for seamless PDF uploads and interactions.
- Google Gemini API integration for powerful language generation using the Gemini Pro model.
- LlamaIndex framework for fast and scalable document indexing and retrieval.
- Modular Code Structure for easy maintainability and modification.
- High-performance QA generation based on uploaded documents and queries.
The system is built with a modular design for scalability and easy maintenance. Below are the core components that make everything work:
- Streamlit UI: Upload PDF files and interact with the system seamlessly.
- Document Preprocessing: Extracts content from PDFs for indexing and retrieval.
- LlamaIndex Integration: Indexes document content for fast search and retrieval.
- Google Gemini API: Processes queries and generates responses using the Gemini Pro model.
- Modular Code: Cleanly separated components for easy updates and improvements.
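One way to picture the separation of components listed above is a small pipeline object that receives each stage as a plain function. This is a minimal sketch, not the project's actual code: `QAPipeline` and the three `fake_*` stand-ins are hypothetical names, and the real system would plug in PDF text extraction, LlamaIndex retrieval, and a Gemini API call in their place.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QAPipeline:
    """Hypothetical wiring of the three stages; each stage is swappable."""
    extract_text: Callable[[bytes], str]        # uploaded file bytes -> text
    retrieve: Callable[[str, str], List[str]]   # (text, query) -> passages
    generate: Callable[[str, List[str]], str]   # (query, passages) -> answer

    def answer(self, file_bytes: bytes, query: str) -> str:
        text = self.extract_text(file_bytes)
        passages = self.retrieve(text, query)
        return self.generate(query, passages)


# Stand-in components, for illustration only.
def fake_extract(file_bytes: bytes) -> str:
    # A real implementation would parse a PDF; here the bytes are plain text.
    return file_bytes.decode("utf-8")


def fake_retrieve(text: str, query: str) -> List[str]:
    # Keep lines that contain any query word (case-insensitive).
    words = query.lower().split()
    return [line for line in text.splitlines()
            if any(word in line.lower() for word in words)]


def fake_generate(query: str, passages: List[str]) -> str:
    # A real implementation would call a language model here.
    return f"Q: {query} | context: {' / '.join(passages)}"


pipeline = QAPipeline(fake_extract, fake_retrieve, fake_generate)
```

Because each stage is passed in as a function, any one of them can be replaced or tested on its own without touching the others.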
To set up your local environment, make sure you have the following prerequisites, then follow the installation steps:
- Python 3.9 or higher.
- Google Gemini API credentials.
- LlamaIndex library.
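If you want to confirm the Python requirement before installing anything, a short check like the following can help. It is only a sketch: the helper name `python_is_supported` is hypothetical and not part of the repository.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the prerequisites


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        found = sys.version.split()[0]
        sys.exit(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, found {found}")
    print("Python version OK")
```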
- Clone the repository:
    git clone https://github.com/shaheennabi/End-to-End-QA-GenAI-Project.git
    cd End-to-End-QA-GenAI-Project
- Create and activate a Python virtual environment:
    python3.9 -m venv venv
    source venv/bin/activate # On Windows, use `venv\Scripts\activate`
- Install the required Python packages:
    pip install -r requirements.txt
- Set up your Google Gemini API credentials by following Google's official documentation.
- Install LlamaIndex:
    pip install llama-index
Now that you're all set up, let's get the app running:
- Start the Streamlit app:
    streamlit run app.py
- Open your browser and visit http://localhost:8501 to interact with the app.
- Upload a PDF file and query it for intelligent responses generated using Gemini Pro!
Here's how the magic happens:
- Upload PDF: The user uploads a PDF file through the Streamlit UI.
- Text Extraction: The system extracts text content from the uploaded PDF.
- Document Indexing: The text is indexed using LlamaIndex for quick retrieval.
- Query Generation: The user submits a query, which is processed by the Gemini Pro model.
- Response Generation: The system retrieves the relevant information and generates a natural language response using the Gemini API.
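The indexing, retrieval, and prompt-building steps above can be illustrated with a toy example. This is a sketch under loud assumptions: it uses only the standard library and ranks chunks by simple keyword overlap, which is a stand-in for LlamaIndex's indexing and not how LlamaIndex works internally. The function names (`chunk_text`, `top_chunks`, `build_prompt`) and the prompt wording are hypothetical, and the final call to the Gemini API is left out.

```python
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split extracted text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_chunks(chunks, query, k=2):
    """Return up to k chunks ranked by how often they contain query terms."""
    query_terms = set(tokenize(query))
    scored = []
    for chunk in chunks:
        counts = Counter(tokenize(chunk))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, chunk))
    # Highest score first; the sort is stable, so ties keep document order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def build_prompt(query, context_chunks):
    """Combine retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

In the real system, the resulting prompt (or an equivalent built by LlamaIndex) is what would be sent to the Gemini Pro model to produce the final answer.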
This project utilizes the Google Gemini API (with the Gemini Pro model) for natural language generation. To interact with the API, you must set up your Google Gemini API credentials in the config.py file.
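The contents of config.py are not shown here, so the following is only a sketch of one common approach: reading the key from an environment variable rather than hard-coding it. The variable name `GOOGLE_API_KEY` and the helper `load_api_key` are assumptions for illustration, not names confirmed by the repository.

```python
import os

# Hypothetical environment variable name; adjust to match your setup.
API_KEY_ENV_VAR = "GOOGLE_API_KEY"


def load_api_key(env=os.environ):
    """Return the API key from the environment, or raise if it is missing."""
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {API_KEY_ENV_VAR} environment variable before starting the app."
        )
    return key
```

Keeping the key out of source files reduces the chance of committing it to version control by accident.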
- Multilingual Support: Extend the system to support multiple languages for text generation.
- Document Summarization: Automatically generate summaries for long documents.
- Advanced Search Features: Add advanced search and filtering capabilities for document retrieval.
We welcome contributions to improve this project! To contribute:
- Fork the repository.
- Create a new branch for your changes.
- Make your changes and commit them.
- Open a pull request for review.
This project is licensed under the MIT License. See the LICENSE file for more details.
A special thank you to the following technologies and resources that made this project possible:
- Google Gemini API: For providing powerful AI capabilities.
- LlamaIndex: For efficient document indexing and retrieval.
- Streamlit: For creating beautiful and user-friendly web interfaces.
- Python 3.9: The language powering this entire project.
- Contributors: For making this project even better!
If you love this project, don't forget to star it on GitHub! It helps us keep the project alive and motivates us to keep improving it.
Ready to jump in? Clone the repository, install the dependencies, and start exploring this next-gen AI-powered QA system!