Vector + Index quantization (#1400)
* Add KG tests (#1351)

* cli tests

* add sdk tests

* typo fix

* change workflow ordering

* add collection integration tests (#1352)

* bump pkg

* remove workflows

* fix sdk test port

* fix delete collection return check

* Fix document info serialization (#1353)

* Update integration-test-workflow-debian.yml

* pre-commit

* slightly modify

* up

* up

* smaller file

* up

* typo, change order

* up

* up

* change order

---------

Co-authored-by: emrgnt-cmplxty <68796651+emrgnt-cmplxty@users.noreply.github.com>
Co-authored-by: emrgnt-cmplxty <owen@algofi.org>
Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>

* add graphrag docs (#1362)

* add documentation

* up

* Update js/sdk/src/models.tsx

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* pre-commit

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Concurrent index creation, allow -1 for paginated entries (#1363)

* update webdev-template for current next.js and r2r-js sdk (#1218)

Co-authored-by: Simeon <simeon@theobald.nz>

* Feature/extend integration tests rebased (#1361)

* cleanups

* add back overzealous edits

* extend workflows

* fix full setup

* simplify cli

* add ymls

* rename to light

* try again

* start light

* add cli tests

* fix

* fix

* testing..

* trying complete matrix testflow

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* up

* up

* up

* All actions

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* try official pgvector formula

* sudo make

* sudo make

* push and pray

* push and pray

* add new actions

* add new actions

* docker push & pray

* inspect manifests during launch

* inspect manifests during launch

* inspect manifests during launch

* inspect manifests during launch

* setup docker

* setup docker

* fix default

* fix default

* Feature/rebase to r2r vars (#1364)

* cleanups

* add back overzealous edits

* extend workflows

* fix full setup

* simplify cli

* add ymls

* rename to light

* try again

* start light

* add cli tests

* fix

* fix

* testing..

* trying complete matrix testflow

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* up

* up

* up

* All actions

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* try official pgvector formula

* sudo make

* sudo make

* push and pray

* push and pray

* add new actions

* add new actions

* docker push & pray

* inspect manifests during launch

* inspect manifests during launch

* inspect manifests during launch

* inspect manifests during launch

* setup docker

* setup docker

* fix default

* fix default

* make changes

* update the windows workflow

* update the windows workflow

* remove extra workflows for now

* bump pkg

* push and pray

* revive full workflow

* revive full workflow

* revive full workflow

* revive full workflow

* revive full workflow

* revive full workflow

* revive full workflow

* revive full workflow

* revive tests

* revive tests

* revive tests

* revive tests

* update tests

* fix typos (#1366)

* update tests

* up

* up

* up

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* Add ingestion concurrency limit (#1367)

* up

* up

* up

---------

Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>

* tweaks and fixes

* Fix Ollama Tool Calling (#1372)

* Update graphrag.mdx

* Fix Ollama tool calling

---------

Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>

* Clean up Docker Compose (#1368)

* Fix hatchet, dockerfile

* Update compose

* point to correct docker image

* Fix bug in deletion, better validation error handling (#1374)

* Update graphrag.mdx

* Fix bug in deletion, better validation error handling

---------

Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>

* vec index creation endpoint (#1373)

* Update graphrag.mdx

* upload files

* create vector index endpoint

* add to fastapi background task

* pre-commit

* move logging

* add api spec, support for all vecs

* pre-commit

* add workflow

* Modify KG Endpoints and update API spec (#1369)

* Update graphrag.mdx

* modify API endpoints and update documentation

* Update ingestion_router.py

* try different docker setup (#1371)

* try different docker setup

* action

* add login

* add full

* update action

* cleanup upload script

* cleanup upload script

* tweak action

* tweak action

* tweak action

* tweak action

* tweak action

* tweak action

* Nolan/ingest chunks js (#1375)

* Update graphrag.mdx

* Clean up ingest chunks, add to JS SDK

* Update JS docs

---------

Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>

* up (#1376)

* Bump JS package (#1378)

* Fix Create Graph (#1379)

* up

* up

* modify assertion

* up

* up

* increase entity limit

* changing aristotle back to v2

* pre-commit

* typos

* add test_ingest_sample_file_2_sdk

* Update server.py

* add docs and refine code

* add python SDK documentation

* up

* add logs

* merge changes

* mc

* more

* add index + vector quantization

* pre-commits

* change default back to FP32

* kg vector and index test

* rm duplicate import

* Update r2r.toml

---------

Co-authored-by: emrgnt-cmplxty <68796651+emrgnt-cmplxty@users.noreply.github.com>
Co-authored-by: emrgnt-cmplxty <owen@algofi.org>
Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Co-authored-by: FutureProofTechOps <operations@theobald.nz>
Co-authored-by: Simeon <simeon@theobald.nz>
Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>
8 people authored Oct 15, 2024
1 parent 7331d3a commit f58032d
Showing 19 changed files with 188 additions and 82 deletions.
1 change: 1 addition & 0 deletions py/core/agent/base.py
@@ -10,6 +10,7 @@
Message,
syncable,
)

from core.base.agent import Agent, Conversation

logger = logging.getLogger(__name__)
5 changes: 4 additions & 1 deletion py/core/base/providers/database.py
@@ -5,6 +5,7 @@
from pydantic import BaseModel

from .base import Provider, ProviderConfig
from shared.abstractions.vector import VectorQuantizationType

logger = logging.getLogger(__name__)

@@ -69,7 +70,9 @@ def supported_providers(self) -> list[str]:

class VectorDBProvider(Provider, ABC):
@abstractmethod
def _initialize_vector_db(self, dimension: int) -> None:
def _initialize_vector_db(
self, dimension: int, quantization_type: VectorQuantizationType
) -> None:
pass


5 changes: 5 additions & 0 deletions py/core/base/providers/embedding.py
@@ -11,6 +11,8 @@
default_embedding_prefixes,
)

from shared.abstractions.vector import VectorQuantizationSettings

from .base import Provider, ProviderConfig

logger = logging.getLogger(__name__)
@@ -30,6 +32,9 @@ class EmbeddingConfig(ProviderConfig):
max_retries: int = 8
initial_backoff: float = 1
max_backoff: float = 64.0
quantization_settings: VectorQuantizationSettings = (
VectorQuantizationSettings()
)

def validate_config(self) -> None:
if self.provider not in self.supported_providers:
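For context, the new `quantization_settings` field on `EmbeddingConfig` can be sketched as below. `VectorQuantizationSettings` and `VectorQuantizationType` live in `shared.abstractions.vector`; their exact field sets are an assumption here — only `quantization_type` (defaulting to FP32, per the commit) is exercised by this diff, and the real classes are pydantic models rather than dataclasses.

```python
from dataclasses import dataclass, field
from enum import Enum


class VectorQuantizationType(str, Enum):
    # FP32 is the default named in the commit ("change default back to FP32");
    # FP16 is illustrative of a reduced-precision option.
    FP32 = "FP32"
    FP16 = "FP16"


@dataclass
class VectorQuantizationSettings:
    quantization_type: VectorQuantizationType = VectorQuantizationType.FP32


@dataclass
class EmbeddingConfig:
    base_dimension: int = 512
    quantization_settings: VectorQuantizationSettings = field(
        default_factory=VectorQuantizationSettings
    )


# The factory then reads the type off the config, as in create_database_provider:
config = EmbeddingConfig()
quantization_type = config.quantization_settings.quantization_type
print(quantization_type.value)  # FP32 unless overridden in r2r.toml
```

This mirrors how `factory.py` below plumbs `config.embedding.quantization_settings.quantization_type` into the Postgres provider.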
2 changes: 1 addition & 1 deletion py/core/base/providers/kg.py
@@ -215,7 +215,7 @@ async def delete_node_via_document_id(
) -> None:
"""Abstract method to delete the node via document id."""
pass

@abstractmethod
async def get_creation_estimate(self, *args: Any, **kwargs: Any) -> Any:
"""Abstract method to get the creation estimate."""
3 changes: 1 addition & 2 deletions py/core/main/api/management_router.py
@@ -692,7 +692,6 @@ async def remove_document_from_collection_app(
default=None,
description="Settings for the graph enrichment process.",
),

) -> WrappedDeleteResponse:
collection_uuid = UUID(collection_id)
document_uuid = UUID(document_id)
@@ -734,7 +733,7 @@ async def remove_document_from_collection_app(
self.orchestration_provider.run_workflow(
"enrich-graph", {"request": workflow_input}, {}
)

return None # type: ignore

@self.router.get("/document_collections/{document_id}")
8 changes: 7 additions & 1 deletion py/core/main/assembly/factory.py
@@ -148,11 +148,17 @@ async def create_database_provider(
)

vector_db_dimension = self.config.embedding.base_dimension
quantization_type = (
self.config.embedding.quantization_settings.quantization_type
)
if db_config.provider == "postgres":
from core.providers import PostgresDBProvider

database_provider = PostgresDBProvider(
db_config, vector_db_dimension, crypto_provider=crypto_provider
db_config,
vector_db_dimension,
crypto_provider=crypto_provider,
quantization_type=quantization_type,
)
await database_provider.initialize()
return database_provider
1 change: 1 addition & 0 deletions py/core/main/orchestration/hatchet/kg_workflow.py
@@ -10,6 +10,7 @@
from core import GenerationConfig
from core.base import OrchestrationProvider
from shared.abstractions.document import KGExtractionStatus

from ...services import KgService

from shared.utils import create_hatchet_logger
3 changes: 1 addition & 2 deletions py/core/main/services/kg_service.py
@@ -293,8 +293,7 @@ async def delete_node_via_document_id(
**kwargs,
):
return await self.providers.kg.delete_node_via_document_id(
collection_id,
document_id
collection_id, document_id
)

@telemetry_event("get_creation_estimate")
4 changes: 3 additions & 1 deletion py/core/main/services/management_service.py
@@ -394,7 +394,9 @@ async def remove_document_from_collection(
self.providers.database.vector.remove_document_from_collection(
document_id, collection_id
)
enrichment = await self.providers.kg.delete_node_via_document_id(document_id, collection_id)
enrichment = await self.providers.kg.delete_node_via_document_id(
document_id, collection_id
)
return enrichment

@telemetry_event("DocumentCollections")
4 changes: 4 additions & 0 deletions py/core/providers/database/postgres.py
@@ -5,6 +5,7 @@
import warnings
from typing import Any, Optional

from shared.abstractions.vector import VectorQuantizationType
from core.base import (
CryptoProvider,
DatabaseConfig,
@@ -48,6 +49,7 @@ def __init__(
self,
config: DatabaseConfig,
dimension: int,
quantization_type: VectorQuantizationType,
crypto_provider: CryptoProvider,
*args,
**kwargs,
@@ -95,6 +97,7 @@ def __init__(
logger.info("Connecting to Postgres via TCP/IP")

self.vector_db_dimension = dimension
self.vector_db_quantization_type = quantization_type
self.conn = None
self.config: DatabaseConfig = config
self.crypto_provider = crypto_provider
@@ -119,6 +122,7 @@ def _initialize_vector_db(self) -> VectorDBProvider:
connection_string=self.connection_string,
project_name=self.project_name,
dimension=self.vector_db_dimension,
quantization_type=self.vector_db_quantization_type,
)

async def _initialize_relational_db(self) -> RelationalDBProvider:
4 changes: 3 additions & 1 deletion py/core/providers/database/vecs/__init__.py
@@ -1,6 +1,8 @@
from . import exc
from .client import Client
from .collection import Collection
from .collection import (
Collection,
)

__project__ = "vecs"
__version__ = "0.4.2"
3 changes: 3 additions & 0 deletions py/core/providers/database/vecs/client.py
@@ -17,6 +17,7 @@
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import QueuePool

from shared.abstractions.vector import VectorQuantizationType
from .adapter import Adapter
from .exc import CollectionNotFound

@@ -156,6 +157,7 @@ def get_or_create_vector_table(
*,
dimension: Optional[int] = None,
adapter: Optional[Adapter] = None,
quantization_type: Optional[VectorQuantizationType] = None,
) -> Collection:
"""
Get a vector collection by name, or create it if no collection with
@@ -181,6 +183,7 @@
collection = Collection(
name=name,
dimension=dimension or adapter_dimension, # type: ignore
quantization_type=quantization_type,
client=self,
adapter=adapter,
)
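The client change above is pure plumbing: the optional `quantization_type` flows from `get_or_create_vector_table` into the `Collection` constructor. A minimal stand-in showing just that forwarding (the real classes wrap SQLAlchemy and a live Postgres connection, omitted here):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class VectorQuantizationType(str, Enum):
    FP32 = "FP32"
    FP16 = "FP16"


@dataclass
class Collection:
    name: str
    dimension: int
    quantization_type: Optional[VectorQuantizationType]


class Client:
    """Stand-in for vecs.Client; only the new-parameter forwarding is shown."""

    def get_or_create_vector_table(
        self,
        name: str,
        *,
        dimension: int,
        quantization_type: Optional[VectorQuantizationType] = None,
    ) -> Collection:
        # The real method first looks for an existing table and raises
        # CollectionNotFound / dimension mismatches; skipped in this sketch.
        return Collection(
            name=name,
            dimension=dimension,
            quantization_type=quantization_type,
        )


table = Client().get_or_create_vector_table(
    "chunks", dimension=512, quantization_type=VectorQuantizationType.FP16
)
```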
57 changes: 48 additions & 9 deletions py/core/providers/database/vecs/collection.py
@@ -36,13 +36,13 @@
from core.base import VectorSearchResult
from core.base.abstractions import VectorSearchSettings
from shared.abstractions.vector import (
INDEX_MEASURE_TO_OPS,
INDEX_MEASURE_TO_SQLA_ACC,
IndexArgsHNSW,
IndexArgsIVFFlat,
IndexMeasure,
IndexMethod,
VectorTableName,
VectorQuantizationType,
)

from .adapter import Adapter, AdapterContext, NoOp, Record
@@ -54,19 +54,41 @@
MismatchedDimension,
)

from shared.utils import _decorate_vector_type

if TYPE_CHECKING:
from vecs.client import Client


def index_measure_to_ops(
measure: IndexMeasure, quantization_type: VectorQuantizationType
):
return _decorate_vector_type(measure.ops, quantization_type)


class Vector(UserDefinedType):
cache_ok = True

def __init__(self, dim=None):
def __init__(
self,
dim=None,
quantization_type: Optional[
VectorQuantizationType
] = VectorQuantizationType.FP32,
):
super(UserDefinedType, self).__init__()
self.dim = dim
self.quantization_type = quantization_type

def get_col_spec(self, **kw):
return "VECTOR" if self.dim is None else f"VECTOR({self.dim})"
col_spec = ""
if self.dim is None:
col_spec = _decorate_vector_type("", self.quantization_type)
else:
col_spec = _decorate_vector_type(
f"({self.dim})", self.quantization_type
)
return col_spec

def bind_processor(self, dialect):
def process(value):
@@ -132,6 +154,7 @@ def __init__(
self,
name: str,
dimension: int,
quantization_type: VectorQuantizationType,
client: Client,
adapter: Optional[Adapter] = None,
):
@@ -151,8 +174,13 @@ def __init__(
self.client = client
self.name = name
self.dimension = dimension
self.quantization_type = quantization_type
self.table = _build_table(
client.project_name, name, client.meta, dimension
client.project_name,
name,
client.meta,
dimension,
quantization_type,
)
self._index: Optional[str] = None
self.adapter = adapter or Adapter(steps=[NoOp(dimension=dimension)])
@@ -863,7 +891,7 @@ def is_indexed_for_measure(self, measure: IndexMeasure):
if index_name is None:
return False

ops = INDEX_MEASURE_TO_OPS.get(measure)
ops = index_measure_to_ops(measure, self.quantization_type)
if ops is None:
return False

@@ -892,6 +920,7 @@ def create_index(
] = None,
replace: bool = True,
concurrently: bool = True,
quantization_type: VectorQuantizationType = VectorQuantizationType.FP32,
) -> None:
"""
Creates an index for the collection.
@@ -979,14 +1008,17 @@
"HNSW Unavailable. Upgrade your pgvector installation to > 0.5.0 to enable HNSW support"
)

ops = INDEX_MEASURE_TO_OPS.get(measure)
ops = index_measure_to_ops(
measure, quantization_type=self.quantization_type
)

if ops is None:
raise ArgError("Unknown index measure")

concurrently_sql = "CONCURRENTLY" if concurrently else ""

# Drop existing index if needed (must be outside of transaction)
# Doesn't drop
if self.index is not None and replace:
drop_index_sql = f'DROP INDEX {concurrently_sql} IF EXISTS {self.client.project_name}."{self.index}";'
try:
@@ -1028,9 +1060,12 @@ def create_index(


def _build_table(
project_name: str, name: str, meta: MetaData, dimension: int
project_name: str,
name: str,
meta: MetaData,
dimension: int,
quantization_type: VectorQuantizationType = VectorQuantizationType.FP32,
) -> Table:

table = Table(
name,
meta,
Expand All @@ -1042,7 +1077,11 @@ def _build_table(
postgresql.ARRAY(postgresql.UUID),
server_default="{}",
),
Column("vec", Vector(dimension), nullable=False),
Column(
"vec",
Vector(dimension, quantization_type=quantization_type),
nullable=False,
),
Column("text", postgresql.TEXT, nullable=True),
Column(
"fts",