Weaviate vector store #708

Merged
merged 9 commits into from
Apr 14, 2023
31 changes: 31 additions & 0 deletions docs/docs/modules/indexes/vector_stores/integrations/weaviate.mdx
@@ -0,0 +1,31 @@
---
hide_table_of_contents: true
---

import CodeBlock from "@theme/CodeBlock";

# Weaviate

Weaviate is an open-source vector database that stores both objects and vectors, so you can combine vector search with structured filtering. LangChain connects to Weaviate via the `weaviate-ts-client` package, the official TypeScript client for Weaviate.

LangChain inserts vectors directly into Weaviate and queries Weaviate for the nearest neighbors of a given vector, so you can use any of the LangChain Embeddings integrations with Weaviate.
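
To make "nearest neighbors" concrete, here is a minimal sketch of the idea as a brute-force scan using cosine similarity. It is an illustration only: `StoredVector`, `cosineSimilarity` and `nearestNeighbors` are names made up for this example, cosine similarity is an assumed metric, and Weaviate itself answers these queries with an approximate index rather than a linear scan.

```typescript
// Illustrative only: a linear scan over in-memory vectors.
// A vector database replaces this scan with an index.
type StoredVector = { id: string; vector: number[] };

// Cosine similarity: 1 for the same direction, -1 for opposite directions.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k stored vectors most similar to the query, best match first.
function nearestNeighbors(
  query: number[],
  stored: StoredVector[],
  k: number
): Array<[StoredVector, number]> {
  return stored
    .map((item): [StoredVector, number] => [
      item,
      cosineSimilarity(query, item.vector),
    ])
    .sort((left, right) => right[1] - left[1])
    .slice(0, k);
}
```

In the real integration the query vector comes from the configured Embeddings class and the search runs inside Weaviate.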

## Setup

```bash npm2yarn
npm install weaviate-ts-client graphql
```

You'll need to run Weaviate either locally or on a server; see [the Weaviate documentation](https://weaviate.io/developers/weaviate/installation) for more information.

## Usage, insert documents

import InsertExample from "@examples/indexes/vector_stores/weaviate_fromTexts.ts";

<CodeBlock language="typescript">{InsertExample}</CodeBlock>

## Usage, query documents

import QueryExample from "@examples/indexes/vector_stores/weaviate_search.ts";

<CodeBlock language="typescript">{QueryExample}</CodeBlock>
3 changes: 3 additions & 0 deletions examples/.env.example
Expand Up @@ -11,3 +11,6 @@ SERPAPI_API_KEY=ADD_YOURS_HERE # https://serpapi.com/manage-api-key
SERPER_API_KEY=ADD_YOURS_HERE # https://serper.dev/api-key
SUPABASE_PRIVATE_KEY=ADD_YOURS_HERE # https://app.supabase.com/project/YOUR_PROJECT_ID/settings/api
SUPABASE_URL=ADD_YOURS_HERE # # https://app.supabase.com/project/YOUR_PROJECT_ID/settings/api
WEAVIATE_HOST=ADD_YOURS_HERE
WEAVIATE_SCHEME=ADD_YOURS_HERE
WEAVIATE_API_KEY=ADD_YOURS_HERE
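
The three `WEAVIATE_*` variables are read with fallbacks in the example files of this PR. As a rough sketch of that lookup, the helper below resolves them in one place. `readWeaviateEnv` and `WeaviateEnv` are hypothetical names, not part of the PR, and the scheme check is an added assumption rather than something the examples do.

```typescript
// Hypothetical helper, not part of the PR: resolves the three WEAVIATE_*
// variables with the same fallbacks the example files use.
type WeaviateEnv = { scheme: string; host: string; apiKey: string };

function readWeaviateEnv(
  env: Record<string, string | undefined>
): WeaviateEnv {
  const scheme = env.WEAVIATE_SCHEME || "https";
  // Assumption: only http and https make sense as a scheme.
  if (scheme !== "http" && scheme !== "https") {
    throw new Error(`Unsupported WEAVIATE_SCHEME: ${scheme}`);
  }
  return {
    scheme,
    host: env.WEAVIATE_HOST || "localhost",
    apiKey: env.WEAVIATE_API_KEY || "default",
  };
}
```

In a Node script this would be called as `readWeaviateEnv(process.env)`. A placeholder value such as `ADD_YOURS_HERE` left in `WEAVIATE_SCHEME` would then fail fast instead of reaching the client.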
2 changes: 2 additions & 0 deletions examples/package.json
Expand Up @@ -30,13 +30,15 @@
"@zilliz/milvus2-sdk-node": "^2.2.0",
"axios": "^0.26.0",
"chromadb": "^1.3.0",
"graphql": "^16.6.0",
"js-yaml": "^4.1.0",
"langchain": "workspace:*",
"ml-distance": "^4.0.0",
"mongodb": "^5.2.0",
"prisma": "^4.11.0",
"sqlite3": "^5.1.4",
"typeorm": "^0.3.12",
"weaviate-ts-client": "^1.0.0",
"zod": "^3.21.4"
},
"devDependencies": {
Expand Down
28 changes: 28 additions & 0 deletions examples/src/indexes/vector_stores/weaviate_fromTexts.ts
@@ -0,0 +1,28 @@
/* eslint-disable @typescript-eslint/no-explicit-any */
import weaviate from "weaviate-ts-client";
import { WeaviateStore } from "langchain/vectorstores/weaviate";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

export async function run() {
  // Something is wrong with the weaviate-ts-client types, so we cast to `any`
  // (hence the no-explicit-any rule is disabled at the top of this file)
  const client = (weaviate as any).client({
    scheme: process.env.WEAVIATE_SCHEME || "https",
    host: process.env.WEAVIATE_HOST || "localhost",
    apiKey: new (weaviate as any).ApiKey(
      process.env.WEAVIATE_API_KEY || "default"
    ),
  });

  // Create a store and fill it with some texts + metadata
  await WeaviateStore.fromTexts(
    ["hello world", "hi there", "how are you", "bye now"],
    [{ foo: "bar" }, { foo: "baz" }, { foo: "qux" }, { foo: "bar" }],
    new OpenAIEmbeddings(),
    {
      client,
      indexName: "Test",
      textKey: "text",
      metadataKeys: ["foo"],
    }
  );
}
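
`fromTexts` takes a list of texts and a parallel list of metadata objects. A minimal sketch of how the two lists pair up, modelled on the `texts.map(...)` logic visible in `OpenSearchVectorStore.fromTexts` in this same PR. Whether `WeaviateStore` pairs them in exactly this way is an assumption, since its source is not shown on this page. `Doc` and `toDocuments` are hypothetical names used only here.

```typescript
// Stand-in for LangChain's Document class so the sketch is self-contained.
type Metadata = Record<string, unknown>;

class Doc {
  pageContent: string;

  metadata: Metadata;

  constructor(fields: { pageContent: string; metadata: Metadata }) {
    this.pageContent = fields.pageContent;
    this.metadata = fields.metadata;
  }
}

// Pair each text with the metadata at the same index. A single metadata
// object (rather than an array) is shared by every document.
function toDocuments(texts: string[], metadatas: Metadata[] | Metadata): Doc[] {
  return texts.map((text, idx) => {
    const metadata = Array.isArray(metadatas) ? metadatas[idx] : metadatas;
    return new Doc({ pageContent: text, metadata });
  });
}
```

With the inputs from the example above, the second text `"hi there"` ends up with `{ foo: "baz" }`, which is why the filtered search in the query example can find it.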
44 changes: 44 additions & 0 deletions examples/src/indexes/vector_stores/weaviate_search.ts
@@ -0,0 +1,44 @@
/* eslint-disable @typescript-eslint/no-explicit-any */
import weaviate from "weaviate-ts-client";
import { WeaviateStore } from "langchain/vectorstores/weaviate";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

export async function run() {
  // Something is wrong with the weaviate-ts-client types, so we cast to `any`
  // (hence the no-explicit-any rule is disabled at the top of this file)
  const client = (weaviate as any).client({
    scheme: process.env.WEAVIATE_SCHEME || "https",
    host: process.env.WEAVIATE_HOST || "localhost",
    apiKey: new (weaviate as any).ApiKey(
      process.env.WEAVIATE_API_KEY || "default"
    ),
  });

  // Create a store for an existing index
  const store = await WeaviateStore.fromExistingIndex(new OpenAIEmbeddings(), {
    client,
    indexName: "Test",
    metadataKeys: ["foo"],
  });

  // Search the index without any filters
  const results = await store.similaritySearch("hello world", 1);
  console.log(results);
  /*
  [ Document { pageContent: 'hello world', metadata: { foo: 'bar' } } ]
  */

  // Search the index with a filter. Here, only return results where the
  // "foo" metadata key is equal to "baz". See the Weaviate docs for more:
  // https://weaviate.io/developers/weaviate/api/graphql/filters
  const results2 = await store.similaritySearch("hello world", 1, {
    where: {
      operator: "Equal",
      path: ["foo"],
      valueText: "baz",
    },
  });
  console.log(results2);
  /*
  [ Document { pageContent: 'hi there', metadata: { foo: 'baz' } } ]
  */
}
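
The `where` object in the filtered search is a plain Weaviate filter. As a sketch, small typed helpers can keep such filters consistent. The types and the `equalsText` / `allOf` helpers below are hypothetical and cover only the `Equal` shape used above plus the `And` operator with `operands` from Weaviate's filter language; they are not part of the PR or of `weaviate-ts-client`.

```typescript
// Hypothetical types covering only the filter shapes used here; Weaviate's
// filter language supports more operators and value types than this.
type EqualTextFilter = {
  operator: "Equal";
  path: string[];
  valueText: string;
};
type AndFilter = { operator: "And"; operands: WhereFilter[] };
type WhereFilter = EqualTextFilter | AndFilter;

// Build the same filter object the example passes as `where`.
function equalsText(key: string, value: string): EqualTextFilter {
  return { operator: "Equal", path: [key], valueText: value };
}

// Combine several filters so that all of them must match.
function allOf(...operands: WhereFilter[]): AndFilter {
  return { operator: "And", operands };
}
```

Under these assumptions, the filtered call above could be written as `store.similaritySearch("hello world", 1, { where: equalsText("foo", "baz") })`.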
3 changes: 3 additions & 0 deletions langchain/.env.example
Expand Up @@ -15,3 +15,6 @@ ANTHROPIC_API_KEY=ADD_YOURS_HERE
REPLICATE_API_KEY=ADD_YOURS_HERE
MONGO_URI=ADD_YOURS_HERE
MILVUS_URL=ADD_YOURS_HERE
WEAVIATE_HOST=ADD_YOURS_HERE
WEAVIATE_SCHEME=ADD_YOURS_HERE
WEAVIATE_API_KEY=ADD_YOURS_HERE
3 changes: 3 additions & 0 deletions langchain/.gitignore
Expand Up @@ -82,6 +82,9 @@ vectorstores/chroma.d.ts
vectorstores/hnswlib.cjs
vectorstores/hnswlib.js
vectorstores/hnswlib.d.ts
vectorstores/weaviate.cjs
vectorstores/weaviate.js
vectorstores/weaviate.d.ts
vectorstores/mongo.cjs
vectorstores/mongo.js
vectorstores/mongo.d.ts
Expand Down
18 changes: 16 additions & 2 deletions langchain/package.json
Expand Up @@ -94,6 +94,9 @@
"vectorstores/hnswlib.cjs",
"vectorstores/hnswlib.js",
"vectorstores/hnswlib.d.ts",
"vectorstores/weaviate.cjs",
"vectorstores/weaviate.js",
"vectorstores/weaviate.d.ts",
"vectorstores/mongo.cjs",
"vectorstores/mongo.js",
"vectorstores/mongo.d.ts",
Expand Down Expand Up @@ -293,6 +296,7 @@
"eslint-plugin-no-instanceof": "^1.0.1",
"eslint-plugin-prettier": "^4.2.1",
"eslint-plugin-tree-shaking": "^1.10.0",
"graphql": "^16.6.0",
"hnswlib-node": "^1.4.2",
"html-to-text": "^9.0.5",
"jest": "^29.5.0",
Expand All @@ -310,7 +314,8 @@
"srt-parser-2": "^1.2.2",
"ts-jest": "^29.0.5",
"typeorm": "^0.3.12",
"typescript": "^4.9.5"
"typescript": "^4.9.5",
"weaviate-ts-client": "^1.0.0"
},
"peerDependencies": {
"@aws-sdk/client-lambda": "^3.310.0",
Expand All @@ -337,7 +342,8 @@
"redis": "^4.6.4",
"replicate": "^0.9.0",
"srt-parser-2": "^1.2.2",
"typeorm": "^0.3.12"
"typeorm": "^0.3.12",
"weaviate-ts-client": "^1.0.0"
},
"peerDependenciesMeta": {
"@aws-sdk/client-lambda": {
Expand Down Expand Up @@ -414,6 +420,9 @@
},
"typeorm": {
"optional": true
},
"weaviate-ts-client": {
"optional": true
}
},
"dependencies": {
Expand Down Expand Up @@ -602,6 +611,11 @@
"import": "./vectorstores/hnswlib.js",
"require": "./vectorstores/hnswlib.cjs"
},
"./vectorstores/weaviate": {
"types": "./vectorstores/weaviate.d.ts",
"import": "./vectorstores/weaviate.js",
"require": "./vectorstores/weaviate.cjs"
},
"./vectorstores/mongo": {
"types": "./vectorstores/mongo.d.ts",
"import": "./vectorstores/mongo.js",
Expand Down
2 changes: 2 additions & 0 deletions langchain/scripts/create-entrypoints.js
Expand Up @@ -43,6 +43,7 @@ const entrypoints = {
"vectorstores/memory": "vectorstores/memory",
"vectorstores/chroma": "vectorstores/chroma",
"vectorstores/hnswlib": "vectorstores/hnswlib",
"vectorstores/weaviate": "vectorstores/weaviate",
"vectorstores/mongo": "vectorstores/mongo",
"vectorstores/pinecone": "vectorstores/pinecone",
"vectorstores/supabase": "vectorstores/supabase",
Expand Down Expand Up @@ -132,6 +133,7 @@ const requiresOptionalDependency = [
"prompts/load",
"vectorstores/chroma",
"vectorstores/hnswlib",
"vectorstores/weaviate",
"vectorstores/mongo",
"vectorstores/pinecone",
"vectorstores/supabase",
Expand Down
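
Each entrypoint key lines up with one block in the `exports` map of `langchain/package.json` shown earlier in this PR. A minimal sketch of that mapping follows, assuming the three-file pattern (`.d.ts`, `.js`, `.cjs`) visible there. `buildExports` is a hypothetical function and is not claimed to match the real script.

```typescript
type ExportEntry = { types: string; import: string; require: string };

// Hypothetical: derive one conditional-exports entry per entrypoint key,
// following the pattern of the "./vectorstores/weaviate" entry.
function buildExports(
  entrypoints: Record<string, string>
): Record<string, ExportEntry> {
  const result: Record<string, ExportEntry> = {};
  for (const key of Object.keys(entrypoints)) {
    result[`./${key}`] = {
      types: `./${key}.d.ts`,
      import: `./${key}.js`,
      require: `./${key}.cjs`,
    };
  }
  return result;
}
```

The same three file names per entrypoint are what the new `.gitignore` and `files` entries in this PR list.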
2 changes: 1 addition & 1 deletion langchain/src/vectorstores/base.ts
Expand Up @@ -58,7 +58,7 @@ export abstract class VectorStore {
query: string,
k = 4,
filter: object | undefined = undefined
- ): Promise<[object, number][]> {
+ ): Promise<[Document, number][]> {
return this.similaritySearchVectorWithScore(
await this.embeddings.embedQuery(query),
k,
Expand Down
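
Narrowing the return type from `[object, number][]` to `[Document, number][]` lets callers read document fields without a cast. A small sketch of the difference, using a stand-in `ExampleDocument` class and a hypothetical `pageContents` helper (neither is part of the PR):

```typescript
// Stand-in for LangChain's Document class so the sketch is self-contained.
class ExampleDocument {
  pageContent: string;

  metadata: Record<string, unknown>;

  constructor(pageContent: string, metadata: Record<string, unknown> = {}) {
    this.pageContent = pageContent;
    this.metadata = metadata;
  }
}

// With the tuple typed as [ExampleDocument, number], `doc.pageContent`
// type-checks directly. With [object, number] this line would not compile
// without casting `doc` first.
function pageContents(results: [ExampleDocument, number][]): string[] {
  return results.map(([doc]) => doc.pageContent);
}
```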
16 changes: 13 additions & 3 deletions langchain/src/vectorstores/opensearch.ts
@@ -1,4 +1,3 @@
- /* eslint-disable no-instanceof/no-instanceof */
import { Client, RequestParams, errors } from "@opensearch-project/opensearch";
import { v4 as uuid } from "uuid";
import { Embeddings } from "../embeddings/base.js";
Expand Down Expand Up @@ -111,6 +110,7 @@ export class OpenSearchVectorStore extends VectorStore {

const { body } = await this.client.search(search);

// eslint-disable-next-line @typescript-eslint/no-explicit-any
return body.hits.hits.map((hit: any) => [
new Document({
pageContent: hit._source.text,
Expand All @@ -125,7 +125,7 @@ export class OpenSearchVectorStore extends VectorStore {
metadatas: object[] | object,
embeddings: Embeddings,
args: OpenSearchClientArgs
- ): Promise<VectorStore> {
+ ): Promise<OpenSearchVectorStore> {
const documents = texts.map((text, idx) => {
const metadata = Array.isArray(metadatas) ? metadatas[idx] : metadatas;
return new Document({ pageContent: text, metadata });
Expand All @@ -138,12 +138,21 @@ export class OpenSearchVectorStore extends VectorStore {
docs: Document[],
embeddings: Embeddings,
dbConfig: OpenSearchClientArgs
- ): Promise<VectorStore> {
+ ): Promise<OpenSearchVectorStore> {
const store = new OpenSearchVectorStore(embeddings, dbConfig);
await store.addDocuments(docs).then(() => store);
return store;
}

static async fromExistingIndex(
embeddings: Embeddings,
dbConfig: OpenSearchClientArgs
): Promise<OpenSearchVectorStore> {
const store = new OpenSearchVectorStore(embeddings, dbConfig);
await store.client.cat.indices({ index: store.indexName });
return store;
}

private async ensureIndexExists(
dimension: number,
engine = "nmslib",
Expand Down Expand Up @@ -210,6 +219,7 @@ export class OpenSearchVectorStore extends VectorStore {
await this.client.cat.indices({ index: this.indexName });
return true;
} catch (err: unknown) {
// eslint-disable-next-line no-instanceof/no-instanceof
if (err instanceof errors.ResponseError && err.statusCode === 404) {
return false;
}
Expand Down
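
The visible part of this hunk maps a 404 response to `false`. A generic sketch of that pattern is below, using a structural check on `statusCode` instead of `instanceof`. `hasStatusCode` and `existsUnless404` are hypothetical names, and rethrowing every other error is an assumption, since the rest of the method is collapsed on this page.

```typescript
// Type guard: does the thrown value carry a numeric statusCode?
function hasStatusCode(err: unknown): err is { statusCode: number } {
  return (
    typeof err === "object" &&
    err !== null &&
    typeof (err as { statusCode?: unknown }).statusCode === "number"
  );
}

// Run a lookup that throws when the resource is missing.
// A 404 means "does not exist"; any other failure is rethrown (assumption).
async function existsUnless404(
  lookup: () => Promise<unknown>
): Promise<boolean> {
  try {
    await lookup();
    return true;
  } catch (err: unknown) {
    if (hasStatusCode(err) && err.statusCode === 404) {
      return false;
    }
    throw err;
  }
}
```

Under these assumptions, a call such as `existsUnless404(() => client.cat.indices({ index }))` would cover the case shown in the hunk.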
2 changes: 1 addition & 1 deletion langchain/src/vectorstores/tests/opensearch.int.test.ts
Expand Up @@ -5,7 +5,7 @@ import { OpenAIEmbeddings } from "../../embeddings/openai.js";
import { OpenSearchVectorStore } from "../opensearch.js";
import { Document } from "../../document.js";

- test("OpenSearchVectorStore integration", async () => {
+ test.skip("OpenSearchVectorStore integration", async () => {
if (!process.env.OPENSEARCH_URL) {
throw new Error("OPENSEARCH_URL not set");
}
Expand Down
46 changes: 46 additions & 0 deletions langchain/src/vectorstores/tests/weaviate.int.test.ts
@@ -0,0 +1,46 @@
/* eslint-disable no-process-env */
import { test, expect } from "@jest/globals";
import weaviate from "weaviate-ts-client";
import { WeaviateStore } from "../weaviate.js";
import { OpenAIEmbeddings } from "../../embeddings/openai.js";
import { Document } from "../../document.js";

test.skip("WeaviateStore", async () => {
  // Something is wrong with the weaviate-ts-client types, so we cast to `any`
  // and disable the no-explicit-any rule for those lines
  // eslint-disable-next-line @typescript-eslint/no-explicit-any
  const client = (weaviate as any).client({
    scheme: process.env.WEAVIATE_SCHEME || "https",
    host: process.env.WEAVIATE_HOST || "localhost",
    // eslint-disable-next-line @typescript-eslint/no-explicit-any
    apiKey: new (weaviate as any).ApiKey(
      process.env.WEAVIATE_API_KEY || "default"
    ),
  });
  const store = await WeaviateStore.fromTexts(
    ["hello world", "hi there", "how are you", "bye now"],
    [{ foo: "bar" }, { foo: "baz" }, { foo: "qux" }, { foo: "bar" }],
    new OpenAIEmbeddings(),
    {
      client,
      indexName: "Test",
      textKey: "text",
      metadataKeys: ["foo"],
    }
  );

  // Unfiltered search: the closest match to the query is the identical text
  const results = await store.similaritySearch("hello world", 1);
  expect(results).toEqual([
    new Document({ pageContent: "hello world", metadata: { foo: "bar" } }),
  ]);

  // Filtered search: only documents whose "foo" metadata equals "baz"
  const results2 = await store.similaritySearch("hello world", 1, {
    where: {
      operator: "Equal",
      path: ["foo"],
      valueText: "baz",
    },
  });
  expect(results2).toEqual([
    new Document({ pageContent: "hi there", metadata: { foo: "baz" } }),
  ]);
});