Thanks for writing this up! A few questions
Below is our current thinking on the adapter. Beyond standard retrieval, implementations of this adapter inject the document embedding into the retrieved document's metadata. Additionally, the query-based methods return the embedded query vector. This is done to support server-side embedding.
Feature request
Replace the `GraphVectorStore` interface and its implementations with retrievers that traverse relationships stored in metadata. These retrievers allow graph traversal using the vector + metadata search functionality available in many vector stores. Additionally, contribute implementations of LazyGraphRAG and other techniques based on these retrievers.
Motivation
We have released the following implementations under the `GraphVectorStore` interface:
- `langchain-community.graph_vectorstores.Cassandra`
- `langchain-astradb.AstraDBGraphVectorStore`
These implementations rely on the primary interfaces `GraphVectorStore`, `Link`, and `Node`, located at `langchain-community.graph_vectorstores.base`.
However, we have been informed that further contributions to LangChain using the `GraphVectorStore` interface will not be accepted. This makes our ChromaDB and OpenSearch implementations non-viable for PRs.
It was suggested to transition to the `DocumentIndex` interface. After evaluation, we believe the `BaseRetriever` interface is a better fit because it enables graph traversal as a lightweight layer on top of existing vector stores via metadata.
Example:
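
A minimal sketch of the idea, assuming that a shared metadata value acts as an edge between documents. The documents, the `keywords` key, and the `neighbors` helper are all hypothetical; a real vector store would answer the same question with a metadata filter query rather than a Python loop.

```python
# Hypothetical documents: plain dicts standing in for vector store entries.
docs = [
    {"id": "a", "text": "Intro to graphs", "metadata": {"keywords": ["graph", "intro"]}},
    {"id": "b", "text": "Graph traversal", "metadata": {"keywords": ["graph", "bfs"]}},
    {"id": "c", "text": "Cooking pasta", "metadata": {"keywords": ["food"]}},
]


def neighbors(doc, corpus, key):
    """Return ids of documents sharing at least one value under metadata[key]."""
    values = set(doc["metadata"].get(key, []))
    return [
        other["id"]
        for other in corpus
        if other["id"] != doc["id"]
        and values & set(other["metadata"].get(key, []))
    ]


print(neighbors(docs[0], docs, "keywords"))  # documents linked to "a" via keywords
```

No `Link` or `Node` objects are involved: the relationship exists only because two documents carry the same value in an ordinary metadata field.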
Proposal (If applicable)
We propose the following changes:
Introduce Metadata-Based Graph Traversal Retrievers
Create retrievers that implement the `BaseRetriever` interface and traverse metadata relationships. These retrievers eliminate the need for `Link` and `Node` objects and focus directly on `Document` metadata for traversal.
We will start by replicating the current functionality of `GraphVectorStore` with two retrievers:
- `GraphTraversalRetriever`
- `GraphMMRTraversalRetriever`
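
To make the traversal concrete, here is a rough sketch of a depth-limited, breadth-first walk over metadata edges. It is not the proposed `GraphTraversalRetriever`: the `Doc`, `InMemoryStore`, and `SketchTraversalRetriever` names, the `(source_key, target_key)` edge format, and the `max_depth` parameter are all assumptions made for illustration, and the starting documents are passed in directly where a real retriever would obtain them from a similarity search.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (hypothetical, for this sketch only)."""
    id: str
    page_content: str
    metadata: dict = field(default_factory=dict)


class InMemoryStore:
    """Toy store exposing only a metadata-filter search."""

    def __init__(self, docs):
        self.docs = list(docs)

    def search_by_metadata(self, key, value):
        hits = []
        for doc in self.docs:
            if key not in doc.metadata:
                continue
            stored = doc.metadata[key]
            # Match scalar fields by equality and list-like fields by membership.
            if stored == value or (
                isinstance(stored, (list, tuple, set)) and value in stored
            ):
                hits.append(doc)
        return hits


class SketchTraversalRetriever:
    """Breadth-first traversal over (source_key, target_key) metadata edges."""

    def __init__(self, store, edges, max_depth=1):
        self.store = store
        self.edges = edges
        self.max_depth = max_depth

    def traverse(self, start_docs):
        seen = {doc.id: doc for doc in start_docs}
        queue = deque((doc, 0) for doc in start_docs)
        while queue:
            doc, depth = queue.popleft()
            if depth >= self.max_depth:
                continue  # do not expand beyond the configured depth
            for source_key, target_key in self.edges:
                values = doc.metadata.get(source_key, [])
                if not isinstance(values, (list, tuple, set)):
                    values = [values]
                for value in values:
                    for hit in self.store.search_by_metadata(target_key, value):
                        if hit.id not in seen:
                            seen[hit.id] = hit
                            queue.append((hit, depth + 1))
        return list(seen.values())


store = InMemoryStore([
    Doc("1", "LangChain overview", {"mentions": ["retriever"], "topic": "langchain"}),
    Doc("2", "Retriever guide", {"name": "retriever", "topic": "guides"}),
    Doc("3", "LangChain changelog", {"name": "other", "topic": "langchain"}),
])
start = [store.docs[0]]
by_mention = SketchTraversalRetriever(store, edges=[("mentions", "name")])
by_topic = SketchTraversalRetriever(store, edges=[("topic", "topic")])
print([d.id for d in by_mention.traverse(start)])
print([d.id for d in by_topic.traverse(start)])
```

The two retrievers at the end read the same store but follow different edges, so the same starting document reaches a different neighbor in each case.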
Introduce VectorStoreAdapter Interfaces
To address gaps in the current `VectorStore` interface, we propose using `VectorStoreAdapter` interfaces. These will define the additional methods necessary for graph traversal.
Why adapters?
Adapters provide a flexible way to integrate graph traversal functionality without requiring immediate changes to the `VectorStore` interface.
Note that these adapters don't provide new functionality. They merely expose existing capabilities of most vector stores through a standard interface. This is also why they are easy to write, and why we consider them just "adapters" rather than full-on implementations.
In the future, if the `VectorStore` interface adds support for richer metadata filtering (as discussed in LangChain PR #24206) and vector embedding retrieval, the adapters can be phased out.
Adapter Implementation
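
One possible shape for such an adapter is sketched below. The method names, the `metadata_filter` parameter, and the `__embedding` metadata key are assumptions, not the proposed interface. The sketch covers the two gaps named above, metadata filtering and embedding retrieval: each returned document carries its embedding in metadata, and the query-based method also returns the query embedding.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (hypothetical, for this sketch only)."""
    id: str
    page_content: str
    metadata: dict = field(default_factory=dict)


class SketchAdapter(ABC):
    """Hypothetical adapter surface: the extra calls a traversal retriever needs."""

    @abstractmethod
    def similarity_search_with_embedding(self, query, k=4, metadata_filter=None):
        """Return (query_embedding, docs); each doc carries its own embedding."""

    @abstractmethod
    def get(self, ids):
        """Fetch documents by id, with embeddings attached."""


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class InMemoryAdapter(SketchAdapter):
    def __init__(self, embed, docs):
        self.embed = embed  # callable: str -> list[float]
        self.docs = {doc.id: doc for doc in docs}
        self.vectors = {doc.id: embed(doc.page_content) for doc in docs}

    def _with_embedding(self, doc):
        # Return a copy so the stored document's metadata is left untouched.
        metadata = {**doc.metadata, "__embedding": self.vectors[doc.id]}
        return Doc(doc.id, doc.page_content, metadata)

    def similarity_search_with_embedding(self, query, k=4, metadata_filter=None):
        query_vec = self.embed(query)
        wanted = metadata_filter or {}
        candidates = [
            doc for doc in self.docs.values()
            if all(doc.metadata.get(key) == val for key, val in wanted.items())
        ]
        # Highest dot product first.
        candidates.sort(key=lambda doc: -dot(query_vec, self.vectors[doc.id]))
        return query_vec, [self._with_embedding(doc) for doc in candidates[:k]]

    def get(self, ids):
        return [self._with_embedding(self.docs[i]) for i in ids if i in self.docs]


def toy_embed(text):
    """Toy two-dimensional embedding: counts of 'graph' and 'store'."""
    text = text.lower()
    return [float(text.count("graph")), float(text.count("store"))]


adapter = InMemoryAdapter(toy_embed, [
    Doc("1", "graph graph traversal", {"lang": "en"}),
    Doc("2", "vector store", {"lang": "en"}),
])
query_vec, hits = adapter.similarity_search_with_embedding("graph", k=1)
print(query_vec, hits[0].id, hits[0].metadata["__embedding"])
```

Because the adapter hands back both embeddings, a traversal retriever built on it would not need its own embedding client, which matters when embedding happens on the server side.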
Refactor LinkExtractors into DocumentTransformers
The existing `LinkExtractors` will be converted into `DocumentTransformers` that add metadata to `Document` objects. This approach supports graph traversal while also enabling broader use cases, such as metadata-based filtering.
Example API:
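
A hedged sketch of what such a transformer could look like, assuming a hyperlink extractor that writes outgoing URLs into a plain metadata field. The `HyperlinkMetadataTransformer` class, the `hrefs` key, and the `Doc` stand-in are hypothetical; only the `transform_documents` method name is borrowed from LangChain's document transformer convention.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (hypothetical, for this sketch only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class HyperlinkMetadataTransformer:
    """Hypothetical transformer: records outgoing URLs under a metadata key."""

    URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")

    def __init__(self, metadata_key="hrefs"):
        self.metadata_key = metadata_key

    def transform_documents(self, documents):
        transformed = []
        for doc in documents:
            # Deduplicate and sort so the stored list is deterministic.
            urls = sorted(set(self.URL_PATTERN.findall(doc.page_content)))
            metadata = {**doc.metadata, self.metadata_key: urls}
            transformed.append(Doc(doc.page_content, metadata))
        return transformed


docs = [
    Doc(
        "See https://example.com/a and https://example.com/b for details",
        {"url": "https://example.com/home"},
    )
]
out = HyperlinkMetadataTransformer().transform_documents(docs)
print(out[0].metadata["hrefs"])
```

The output is ordinary metadata, so the same field can serve a traversal edge (for example, from `hrefs` to another document's `url`) or a plain metadata filter.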
Benefits:
Declarative Traversal
Traversal edges are defined at retriever initialization using metadata keys, allowing a single vector store to support multiple traversal strategies. This enables fast ingestion and defers detailed graph creation, as seen in Microsoft's LazyGraphRAG.
Narrower Abstraction
The `BaseRetriever` interface is read-only, emphasizing that any vector store can be traversed as a graph without requiring special ingestion processes.
Broad Compatibility
Adapters enable graph retrieval on any vector store, maximizing flexibility and extensibility.
Metadata Efficiency
Avoids embedding `Link` objects in metadata, reducing the risk of hitting document size limits. This approach is also more idiomatic: existing metadata fields are used naturally, rather than requiring special `Link` fields.
Discussion Points:
Why BaseRetriever, Not DocumentIndex?
Graph traversal is a property of querying data, not writing or configuring it. Using `BaseRetriever` keeps the focus on reading and leaves ingestion to the vector store.
Shared or Individual Adapter Definitions?
Should there be a shared Adapter interface for all Graph Traversal Retrievers or individual definitions for each? Where should the Adapter code live?
Location for Graph Algorithms?
Where should graph algorithms like LazyGraphRAG, which can be implemented as a chain on top of a retriever, live within the codebase?