Index quantization (#1393)
* Add KG tests (#1351)

* cli tests

* add sdk tests

* typo fix

* change workflow ordering

* add collection integration tests (#1352)

* bump pkg

* remove workflows

* fix sdk test port

* fix delete collection return check

* Fix document info serialization (#1353)

* Update integration-test-workflow-debian.yml

* pre-commit

* slightly modify

* up

* up

* smaller file

* up

* typo, change order

* up

* up

* change order

---------

Co-authored-by: emrgnt-cmplxty <68796651+emrgnt-cmplxty@users.noreply.github.com>
Co-authored-by: emrgnt-cmplxty <owen@algofi.org>
Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>

* add graphrag docs (#1362)

* add documentation

* up

* Update js/sdk/src/models.tsx

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* pre-commit

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Concurrent index creation, allow -1 for paginated entries (#1363)

* update webdev-template for current next.js and r2r-js sdk (#1218)

Co-authored-by: Simeon <simeon@theobald.nz>

* Feature/extend integration tests rebased (#1361)

* cleanups

* add back overzealous edits

* extend workflows

* fix full setup

* simplify cli

* add ymls

* rename to light

* try again

* start light

* add cli tests

* fix

* fix

* testing..

* trying complete matrix testflow

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* up

* up

* up

* All actions

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* try official pgvector formula

* sudo make

* sudo make

* push and pray

* push and pray

* add new actions

* add new actions

* docker push & pray

* inspect manifests during launch

* inspect manifests during launch

* inspect manifests during launch

* inspect manifests during launch

* setup docker

* setup docker

* fix default

* fix default

* Feature/rebase to r2r vars (#1364)

* cleanups

* add back overzealous edits

* extend workflows

* fix full setup

* simplify cli

* add ymls

* rename to light

* try again

* start light

* add cli tests

* fix

* fix

* testing..

* trying complete matrix testflow

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* cleanup matrix logic

* up

* up

* up

* All actions

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* rename to runner

* try official pgvector formula

* sudo make

* sudo make

* push and pray

* push and pray

* add new actions

* add new actions

* docker push & pray

* inspect manifests during launch

* inspect manifests during launch

* inspect manifests during launch

* inspect manifests during launch

* setup docker

* setup docker

* fix default

* fix default

* make changes

* update the windows workflow

* update the windows workflow

* remove extra workflows for now

* bump pkg

* push and pray

* revive full workflow

* revive full workflow

* revive full workflow

* revive full workflow

* revive full workflow

* revive full workflow

* revive full workflow

* revive full workflow

* revive tests

* revive tests

* revive tests

* revive tests

* update tests

* fix typos (#1366)

* update tests

* up

* up

* up

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* bump max connections

* Add ingestion concurrency limit (#1367)

* up

* up

* up

---------

Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>

* tweaks and fixes

* Fix Ollama Tool Calling (#1372)

* Update graphrag.mdx

* Fix Ollama tool calling

---------

Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>

* Clean up Docker Compose (#1368)

* Fix hatchet, dockerfile

* Update compose

* point to correct docker image

* Fix bug in deletion, better validation error handling (#1374)

* Update graphrag.mdx

* Fix bug in deletion, better validation error handling

---------

Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>

* vec index creation endpoint (#1373)

* Update graphrag.mdx

* upload files

* create vector index endpoint

* add to fastapi background task

* pre-commit

* move logging

* add api spec, support for all vecs

* pre-commit

* add workflow

* Modify KG Endpoints and update API spec (#1369)

* Update graphrag.mdx

* modify API endpoints and update documentation

* Update ingestion_router.py

* try different docker setup (#1371)

* try different docker setup

* action

* add login

* add full

* update action

* cleanup upload script

* cleanup upload script

* tweak action

* tweak action

* tweak action

* tweak action

* tweak action

* tweak action

* Nolan/ingest chunks js (#1375)

* Update graphrag.mdx

* Clean up ingest chunks, add to JS SDK

* Update JS docs

---------

Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>

* up (#1376)

* Bump JS package (#1378)

* Fix Create Graph (#1379)

* up

* up

* modify assertion

* up

* up

* increase entity limit

* changing aristotle back to v2

* pre-commit

* typos

* add test_ingest_sample_file_2_sdk

* Update server.py

* add docs and refine code

* add python SDK documentation

* up

* add logs

* merge changes

* mc

* more

---------

Co-authored-by: emrgnt-cmplxty <68796651+emrgnt-cmplxty@users.noreply.github.com>
Co-authored-by: emrgnt-cmplxty <owen@algofi.org>
Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Co-authored-by: FutureProofTechOps <operations@theobald.nz>
Co-authored-by: Simeon <simeon@theobald.nz>
Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>
8 people authored Oct 14, 2024
1 parent 89089af commit 3475145
Showing 15 changed files with 116 additions and 12 deletions.
1 change: 1 addition & 0 deletions py/core/agent/base.py
@@ -10,6 +10,7 @@
Message,
syncable,
)

from core.base.agent import Agent, Conversation

logger = logging.getLogger(__name__)
4 changes: 4 additions & 0 deletions py/core/base/providers/embedding.py
@@ -10,6 +10,9 @@
VectorSearchResult,
default_embedding_prefixes,
)

from shared.abstractions.vector import VectorQuantizationSettings

from .base import Provider, ProviderConfig

logger = logging.getLogger(__name__)
@@ -29,6 +32,7 @@ class EmbeddingConfig(ProviderConfig):
max_retries: int = 8
initial_backoff: float = 1
max_backoff: float = 64.0
quantization_settings: Optional[VectorQuantizationSettings] = None

def validate_config(self) -> None:
if self.provider not in self.supported_providers:
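The new quantization_settings field rides along on EmbeddingConfig. A minimal sketch of populating it — the provider, model, and dimension values here are illustrative assumptions, not taken from this diff:

from core.base.providers.embedding import EmbeddingConfig
from shared.abstractions.vector import (
    VectorQuantizationSettings,
    VectorQuantizationType,
)

# Hypothetical provider/model/dimension values, for illustration only.
config = EmbeddingConfig(
    provider="litellm",
    base_model="openai/text-embedding-3-small",
    base_dimension=512,
    quantization_settings=VectorQuantizationSettings(
        quantization_type=VectorQuantizationType.FP16,  # halfvec storage
    ),
)
config.validate_config()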
17 changes: 17 additions & 0 deletions py/core/main/api/ingestion_router.py
@@ -10,6 +10,7 @@
from pydantic import Json

from core.base import R2RException, RawChunk, generate_document_id

from core.base.api.models import (
CreateVectorIndexResponse,
WrappedCreateVectorIndexResponse,
@@ -28,6 +29,15 @@
from ..services.ingestion_service import IngestionService
from .base_router import BaseRouter, RunType

from shared.abstractions.vector import (
IndexMethod,
IndexArgsIVFFlat,
IndexArgsHNSW,
VectorTableName,
IndexMeasure,
VectorIndexQuantizationConfig,
)

logger = logging.getLogger(__name__)


@@ -361,6 +371,12 @@ async def create_vector_index_app(
default=True,
description="Whether to create the index concurrently.",
),
quantization_config: Optional[
VectorIndexQuantizationConfig
] = Body(
default=None,
description="The quantization configuration for the index.",
),
auth_user=Depends(self.service.providers.auth.auth_wrapper),
) -> WrappedCreateVectorIndexResponse:

@@ -378,6 +394,7 @@
"index_arguments": index_arguments,
"replace": replace,
"concurrently": concurrently,
"quantization_config": quantization_config,
},
},
options={
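For a sense of how the new endpoint is exercised, a hedged sketch over plain HTTP. The base URL, route path, and auth header are assumptions (the router's mount point is not visible in this diff); the payload keys mirror the router's Body(...) parameters and imports above, and the quantization string follows the r2r.toml convention added later in this commit:

import requests

payload = {
    "table_name": "vectors",  # assumed VectorTableName value
    "index_method": "hnsw",
    "index_measure": "cosine_distance",
    "replace": True,
    "concurrently": True,
    "quantization_config": {"quantization_type": "fp16"},
}

# Hypothetical deployment URL and route; adjust to your server.
resp = requests.post(
    "http://localhost:7272/v2/create_vector_index",
    json=payload,
    headers={"Authorization": "Bearer <token>"},
)
resp.raise_for_status()
print(resp.json())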
1 change: 0 additions & 1 deletion py/core/main/api/kg_router.py
@@ -18,7 +18,6 @@
from core.utils import generate_default_user_collection_id
from shared.abstractions.kg import KGRunType
from shared.utils.base_utils import update_settings_from_dict

from ..services.kg_service import KgService
from .base_router import BaseRouter

2 changes: 0 additions & 2 deletions py/core/main/orchestration/hatchet/kg_workflow.py
@@ -8,7 +8,6 @@

from core import GenerationConfig
from core.base import OrchestrationProvider

from ...services import KgService

logger = logging.getLogger(__name__)
@@ -70,7 +69,6 @@ async def kg_extract(self, context: Context) -> dict:

await self.kg_service.kg_triples_extraction(
document_id=uuid.UUID(document_id),
logger=context.log,
**input_data["kg_creation_settings"],
)

1 change: 1 addition & 0 deletions py/core/main/orchestration/simple/kg_workflow.py
@@ -40,6 +40,7 @@ async def create_graph(input_data):

for _, document_id in enumerate(document_ids):
# Extract triples from the document

await service.kg_triples_extraction(
document_id=document_id,
**input_data["kg_creation_settings"],
5 changes: 5 additions & 0 deletions py/core/main/services/ingestion_service.py
@@ -26,6 +26,11 @@
)

from ..abstractions import R2RAgents, R2RPipelines, R2RPipes, R2RProviders
from shared.abstractions.vector import (
IndexMethod,
IndexMeasure,
VectorTableName,
)
from ..config import R2RConfig
from .base import Service

2 changes: 2 additions & 0 deletions py/core/main/services/kg_service.py
@@ -15,6 +15,8 @@
from ..config import R2RConfig
from .base import Service

from time import strftime

logger = logging.getLogger(__name__)


1 change: 1 addition & 0 deletions py/core/pipes/kg/triples_extraction.py
@@ -230,6 +230,7 @@ async def _run_logic(  # type: ignore
max_knowledge_triples = input.message["max_knowledge_triples"]
entity_types = input.message["entity_types"]
relation_types = input.message["relation_types"]

extractions = [
DocumentExtraction(
id=extraction["extraction_id"],
4 changes: 3 additions & 1 deletion py/core/providers/database/vecs/__init__.py
@@ -1,6 +1,8 @@
from . import exc
from .client import Client
from .collection import Collection

__project__ = "vecs"
__version__ = "0.4.2"
53 changes: 48 additions & 5 deletions py/core/providers/database/vecs/collection.py
@@ -15,6 +15,7 @@
from typing import TYPE_CHECKING, Any, Iterable, Optional, Union
from uuid import UUID, uuid4

import time
from flupy import flu
from sqlalchemy import (
Column,
@@ -44,6 +45,17 @@
VectorTableName,
)

from shared.abstractions.vector import (
VectorTableName,
IndexMeasure,
IndexMethod,
IndexArgsIVFFlat,
IndexArgsHNSW,
INDEX_MEASURE_TO_OPS,
INDEX_MEASURE_TO_SQLA_ACC,
VectorQuantizationType,
)

from .adapter import Adapter, AdapterContext, NoOp, Record
from .exc import (
ArgError,
@@ -64,8 +76,17 @@ def __init__(self, dim=None):
super(UserDefinedType, self).__init__()
self.dim = dim

def get_col_spec(self, **kw):
return "VECTOR" if self.dim is None else f"VECTOR({self.dim})"
    def get_col_spec(
        self, quantization_type: Optional[VectorQuantizationType] = None, **kw
    ):
        # Emit the pgvector column spec, e.g. "vector(512)" or "halfvec(512)".
        if self.dim is None:
            return self._decorate_vector_type("", quantization_type)
        return self._decorate_vector_type(f"({self.dim})", quantization_type)

def bind_processor(self, dialect):
def process(value):
@@ -199,7 +220,20 @@ def __len__(self) -> int:
stmt = select(func.count()).select_from(self.table)
return sess.execute(stmt).scalar() or 0

def _create_if_not_exists(self):
    def _decorate_vector_type(
        self,
        input_str: str,
        quantization_type: Optional[VectorQuantizationType],
    ) -> str:
        # Default to full precision; the enum values are pgvector type names,
        # so this yields e.g. "vector(512)" or "halfvec(512)".
        if quantization_type is None:
            quantization_type = VectorQuantizationType.FP32
        return f"{quantization_type.value}{input_str}"

def _create_if_not_exists(
self, quantization_type: Optional[VectorQuantizationType]
):
"""
PRIVATE
@@ -1027,8 +1061,18 @@ def create_index(


def _build_table(
project_name: str, name: str, meta: MetaData, dimension: int
    project_name: str,
    name: str,
    meta: MetaData,
    dimension: int,
    quantization_type: Optional[VectorQuantizationType] = None,
) -> Table:
    # The quantization type is consumed when the vector column's spec is
    # emitted (see Vector.get_col_spec above); the column is declared once.
    vector_column = Column("vec", Vector(dimension), nullable=False)

table = Table(
name,
meta,
@@ -1040,7 +1084,6 @@ def _build_table(
postgresql.ARRAY(postgresql.UUID),
server_default="{}",
),
Column("vec", Vector(dimension), nullable=False),
Column("text", postgresql.TEXT, nullable=True),
Column(
"fts",
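The net effect of get_col_spec and _decorate_vector_type: the configured quantization type selects the pgvector column type, and the dimension becomes its suffix. A self-contained sketch of that mapping, assuming the enum values added in py/shared/abstractions/vector.py further down this diff:

from enum import Enum
from typing import Optional


class VectorQuantizationType(str, Enum):
    FP32 = "vector"
    FP16 = "halfvec"
    INT1 = "bit"
    SPARSE = "sparsevec"


def column_spec(
    dim: Optional[int],
    quantization_type: Optional[VectorQuantizationType] = None,
) -> str:
    # Mirror of Vector.get_col_spec: default to full precision, then append
    # the "(dim)" suffix to the pgvector type name.
    qt = quantization_type or VectorQuantizationType.FP32
    suffix = "" if dim is None else f"({dim})"
    return f"{qt.value}{suffix}"


assert column_spec(512) == "vector(512)"
assert column_spec(512, VectorQuantizationType.FP16) == "halfvec(512)"
assert column_spec(None, VectorQuantizationType.INT1) == "bit"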
13 changes: 10 additions & 3 deletions py/core/providers/database/vector.py
@@ -3,6 +3,7 @@
import logging
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Optional, Union

from sqlalchemy import text
@@ -15,15 +16,21 @@
VectorSearchResult,
)
from core.base.abstractions import VectorSearchSettings

from shared.abstractions.vector import (
    IndexArgsHNSW,
    IndexArgsIVFFlat,
    IndexMeasure,
    IndexMethod,
    VectorTableName,
)

from .vecs import Client, Collection, create_client

logger = logging.getLogger(__name__)

1 change: 1 addition & 0 deletions py/r2r.toml
@@ -44,6 +44,7 @@ batch_size = 128
add_title_as_prefix = false
rerank_model = "None"
concurrent_request_limit = 256
quantization_config = { quantization_type = "fp32" }

[file]
provider = "postgres"
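The new key defaults to full precision. A small sketch of reading it back out of the config file — the enclosing [embedding] table is an assumption inferred from the neighboring keys (batch_size, rerank_model, concurrent_request_limit):

import tomllib  # Python 3.11+

with open("r2r.toml", "rb") as f:
    cfg = tomllib.load(f)

# Assumed section name; the diff shows only the keys inside it.
qcfg = cfg["embedding"]["quantization_config"]
assert qcfg == {"quantization_type": "fp32"}  # the default set above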
22 changes: 22 additions & 0 deletions py/shared/abstractions/vector.py
@@ -117,6 +117,28 @@ def __str__(self) -> str:
return self.value


class VectorQuantizationType(str, Enum):
    FP32 = "vector"  # full-precision pgvector column type
    FP16 = "halfvec"  # half-precision (16-bit) vectors
    INT1 = "bit"  # binary (1-bit) quantization
    SPARSE = "sparsevec"  # sparse vectors

def __str__(self) -> str:
return self.value


class VectorQuantizationSettings(R2RSerializable):
quantization_type: Optional[VectorQuantizationType] = None


class Vector(R2RSerializable):
"""A vector with the option to fix the number of elements."""

1 change: 1 addition & 0 deletions py/shared/utils/base_utils.py
@@ -5,6 +5,7 @@
from datetime import datetime
from typing import TYPE_CHECKING, Any, AsyncGenerator, Iterable
from uuid import NAMESPACE_DNS, UUID, uuid4, uuid5
from copy import deepcopy

from ..abstractions import R2RSerializable
from ..abstractions.graph import EntityType, RelationshipType
