This personal project uses a Retrieval-Augmented Generation (RAG) Large Language Model (LLM) implementation to derive insights from my graduate school notes. I enjoyed my graduate school experience, but I sometimes wish I had explicitly summarized the most important takeaways from each course. I want to be able to refresh my memory and generate insights across the 22 courses I took without rereading 1,000+ pages of text. This project solves that problem by using Claude 2 and Claude 3 Sonnet via Amazon Bedrock, Amazon Titan Embeddings, and Meta's Facebook AI Similarity Search (FAISS) vector store to generate question-and-answer insights from my notes.
Disclaimer: This work was done in a personal capacity and is unrelated to my employer. This package is for illustrative purposes and is not designed for end-to-end productionization as-is.
You will need your own Amazon Web Services (AWS) account with Amazon Bedrock model access for Claude and Titan. Your Python environment will also require:
- langchain>=0.1.11
- langchain-community
- faiss-cpu==1.8.0
This package will demonstrate how to (a condensed sketch follows the list):
- Import libraries
- Instantiate the LLM and embeddings models
- Load PDFs of notes as documents
- Split documents into chunks
- Confirm embeddings functionality
- Create vector store
- Define Claude 3 function
- Embed question and return relevant chunks
- Create prompt template
- Produce RAG outputs with Claude 2
- Produce outputs with Claude 3 Sonnet for comparison
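The end-to-end flow might look roughly like the minimal sketch below. This is illustrative rather than the package's exact code: the model IDs, the `notes/` directory, the chunk sizes, the `k=4` retrieval depth, and the prompt wording are all assumptions, and loading PDFs with `PyPDFDirectoryLoader` additionally assumes `pypdf` is installed.

```python
# Minimal RAG sketch, assuming Bedrock access in us-east-1 and notes stored
# as PDFs under ./notes/. Model IDs, paths, chunk sizes, and prompt wording
# are illustrative assumptions, not this package's exact values.
import json

import boto3
from langchain.prompts import PromptTemplate
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_community.embeddings import BedrockEmbeddings
from langchain_community.llms import Bedrock
from langchain_community.vectorstores import FAISS

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Instantiate Claude 2 (text-completion API) and Titan Embeddings via Bedrock.
llm = Bedrock(model_id="anthropic.claude-v2", client=bedrock_runtime)
embeddings = BedrockEmbeddings(
    model_id="amazon.titan-embed-text-v1", client=bedrock_runtime
)

# Load the PDF notes and split them into overlapping chunks.
documents = PyPDFDirectoryLoader("notes/").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# Confirm the embeddings work, then build the FAISS vector store.
assert len(embeddings.embed_query("test sentence")) > 0
vector_store = FAISS.from_documents(chunks, embeddings)

# Embed the question and retrieve the most relevant chunks.
question = "What are the most important concepts in Behavioral Economics?"
relevant_chunks = vector_store.similarity_search(question, k=4)
context = "\n\n".join(chunk.page_content for chunk in relevant_chunks)

# Prompt template that grounds the answer in the retrieved notes.
prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=(
        "\n\nHuman: Use the following excerpts from my graduate school notes "
        "to answer the question.\n\n{context}\n\nQuestion: {question}"
        "\n\nAssistant:"
    ),
)

# RAG output from Claude 2 via the LangChain wrapper.
print(llm.invoke(prompt.format(context=context, question=question)))

# Claude 3 models use the Bedrock Messages API, so Sonnet is invoked directly.
def ask_claude_3(prompt_text: str) -> str:
    body = json.dumps(
        {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": prompt_text}],
        }
    )
    response = bedrock_runtime.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0", body=body
    )
    return json.loads(response["body"].read())["content"][0]["text"]

# Same retrieved context, answered by Claude 3 Sonnet for comparison.
print(ask_claude_3(prompt.format(context=context, question=question)))
```

Claude 2 is called through LangChain's `Bedrock` text-completion wrapper, while Claude 3 Sonnet requires the Bedrock Messages API, which is why the comparison step uses a separate helper function.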
When Claude 2 was provided with the vector store of user reviews and prompted "What are the most important concepts in Behavioral Economics?", it returned:
When Claude 2 was provided with the vector store of user reviews and prompted "If I manage a small team that directly competes with larger, better-resourced teams, how can I more effectively compete against these teams?", it returned:
The input data for this project are my personal graduate school notes for 22 courses, which reflect thousands of pages of text. These word processing documents were converted to PDFs for easier loading into the vector store.
Next steps for this project include:
- Incorporating evaluation methodologies to assess the quality of the outputs beyond the current heuristic assessment
- Inspecting the methodological decisions at finer granularity (e.g., the chunk size used when splitting documents)
- Applying this approach to additional use cases
This project is licensed under the MIT License.