How to effectively embed PDFs with images for a RAG LLM? #4031
Replies: 2 comments
-
I have added examples here: https://github.com/andysingal/llm-course
-
I recently found a great webinar: https://lu.ma/b1yqug98
-
I’m working on a project where I need to embed PDF documents that contain images (which may or may not be relevant to the response) to build a vector database for later retrieval in a Retrieval-Augmented Generation (RAG) pipeline with an LLM. Currently, I’m using Unstructured + FAISS, but I’m not getting satisfactory results for the images in the PDFs.
Here are some details about my approach:
I’m using the Unstructured library to parse the PDFs.
FAISS is being used to create and manage the vector database.
Text embeddings are working fine, but image embeddings are not yielding good results.
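The setup above can be sketched roughly as follows. This is a simplified illustration, not the actual pipeline: `Chunk`, `toy_embed`, and `FlatIndex` are hypothetical stand-ins written for this example (they are not Unstructured or FAISS APIs). In the real setup the chunks would come from Unstructured's PDF parsing, the vectors from an embedding model, and the search from a FAISS index. The sketch also assumes each image is represented by a caption or OCR string and embedded into the same space as the text, which is only one possible way to handle images.

```python
import hashlib
import math
from dataclasses import dataclass

DIM = 64  # toy dimensionality, chosen arbitrarily for this sketch


@dataclass
class Chunk:
    text: str      # passage text, or a caption/OCR string standing in for an image
    modality: str  # "text" or "image"
    page: int


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: hashed bag of words, L2-normalised."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class FlatIndex:
    """Brute-force inner-product search, standing in for a flat vector index."""

    def __init__(self):
        self.vectors = []
        self.chunks = []

    def add(self, chunk):
        self.vectors.append(toy_embed(chunk.text))
        self.chunks.append(chunk)

    def search(self, query, k=3):
        q = toy_embed(query)
        scored = [
            (sum(a * b for a, b in zip(q, v)), c)
            for v, c in zip(self.vectors, self.chunks)
        ]
        # Highest similarity first; ties keep insertion order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


# Text and image-derived chunks share one index; the modality tag is kept as metadata.
index = FlatIndex()
index.add(Chunk("bar chart of quarterly revenue by region", "image", 3))
index.add(Chunk("The company was founded in 1998 in Oslo.", "text", 1))

for score, chunk in index.search("quarterly revenue chart", k=2):
    print(f"{score:.2f} {chunk.modality} p{chunk.page}: {chunk.text}")
```

Keeping the modality and page on each chunk makes it possible to check whether a retrieved hit came from an image or from body text, which is where the results are currently weak.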
Questions:
What are the best practices for embedding PDFs that contain both text and images?
Are there any specific techniques or libraries that handle image embeddings within PDFs more effectively?
How can I improve the integration of image embeddings with text embeddings in my current setup?
Any advice or suggestions would be greatly appreciated!