Contributing to pg_vectorize

If you encounter any issues or have questions, feel free to join the Tembo Community Slack for support.

Prerequisites

Rust - Toolchain including rustc, cargo, and rustfmt
PGRX - Rust-based PostgreSQL extension development framework
Docker Engine - For running local containers
psql - Terminal-based front-end to PostgreSQL
pgmq - PostgreSQL extension for message queues
pg_cron - PostgreSQL extension for cron-based job scheduling
pgvector - PostgreSQL extension for vector similarity search

Building from source

This process is more involved, but can easily be distilled down into a handful of steps.

1. Set up pgrx

cargo pgrx init

2. Set up Docker container

docker run -d -p 3000:3000 quay.io/tembo/vector-serve:latest

Confirm a successful set up by running the following:

docker ps

3. Clone and compile `pg_vectorize` and extension dependencies

3.1. Clone and enter directory

git clone https://github.com/tembo-io/pg_vectorize.git

cd pg_vectorize/extension

3.2. Install dependencies

From within the pg_vectorize/extension directory, run the following, which will install pg_cron, pgmq, and pgvector:

make setup

3.3. Compile and run `pg_vectorize`

make run

4. Confirm successful build

4.1. Check extension presence

Once the above command is run, you will be brought into Postgres via psql.

Run the following command inside the psql console to enable the extensions:

create extension vectorize cascade

To list out the enabled extensions, run:

\dx

                                      List of installed extensions
    Name    | Version |   Schema   |                             Description
------------+---------+------------+---------------------------------------------------------------------
 pg_cron    | 1.6     | pg_catalog | Job scheduler for PostgreSQL
 pgmq       | 1.1.1   | pgmq       | A lightweight message queue. Like AWS SQS and RSMQ but on Postgres.
 plpgsql    | 1.0     | pg_catalog | PL/pgSQL procedural language
 vector     | 0.6.0   | public     | vector data type and ivfflat and hnsw access methods
 vectorize  | 0.19.0  | vectorize  | The simplest way to do vector search on Postgres
(6 rows)

4.2 Confirm embedding service url is set to localhost

Run the following SHOW command to confirm that the url is set to localhost:

SHOW vectorize.embedding_service_url;

   vectorize.embedding_service_url
-------------------------------------
 http://localhost:3000/v1
(1 row)

4.3. Load example data

The following can be found within the this project's README, under Vector Search Example.

Begin by creating a products table with the dataset that comes included with pg_vectorize.

CREATE TABLE products (LIKE vectorize.example_products INCLUDING ALL);
INSERT INTO products SELECT * FROM vectorize.example_products;

You can then confirm everything is correct by running the following:

SELECT * FROM products limit 2;

 product_id | product_name |                      description                       |        last_updated_at        
------------+--------------+--------------------------------------------------------+-------------------------------
          1 | Pencil       | Utensil used for writing and often works best on paper | 2023-07-26 17:20:43.639351-05
          2 | Laptop Stand | Elevated platform for laptops, enhancing ergonomics    | 2023-07-26 17:20:43.639351-05

4.4. Sample queries

SELECT vectorize.table(
job_name => 'product_search_hf',
"table" => 'products',
primary_key => 'product_id',
columns => ARRAY['product_name', 'description'],
transformer => 'sentence-transformers/multi-qa-MiniLM-L6-dot-v1'
);

                    table
---------------------------------------------
 Successfully created job: product_search_hf
(1 row)

SELECT * FROM vectorize.search(
job_name => 'product_search_hf',
query => 'accessories for mobile devices',
return_columns => ARRAY['product_id', 'product_name'],
num_results => 3
);

                                       search_results
---------------------------------------------------------------------------------------------
 {"product_id": 13, "product_name": "Phone Charger", "similarity_score": 0.8147812194590133}
 {"product_id": 6, "product_name": "Backpack", "similarity_score": 0.774306211384604}
 {"product_id": 11, "product_name": "Stylus Pen", "similarity_score": 0.7709903789778251}
(3 rows)

5. Local URL

Once all of the following is complete, you should be able to access Swagger UI for Tembo-Embedding-Service at http://localhost:3000/docs and explore. This is a platform that allows, for example, the input of different sentence-transformers models from Hugging Face.

TroubleShooting

To check pgrx logs for debugging:

cat ~/.pgrx/17.log

Releases

pg_vectorize releases are automated through a Github workflow. The compiled binaries are publish to and hosted at pgt.dev. To create a release, create a new tag follow a valid semver, then create a release with the same name. Auto-generate the release notes and/or add more relevant details as needed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CONTRIBUTING.md

CONTRIBUTING.md

Contributing to pg_vectorize

Prerequisites

Building from source

1. Set up pgrx

2. Set up Docker container

3. Clone and compile `pg_vectorize` and extension dependencies

3.1. Clone and enter directory

3.2. Install dependencies

3.3. Compile and run `pg_vectorize`

4. Confirm successful build

4.1. Check extension presence

4.2 Confirm embedding service url is set to localhost

4.3. Load example data

4.4. Sample queries

5. Local URL

TroubleShooting

Releases

Files

CONTRIBUTING.md

Latest commit

History

CONTRIBUTING.md

File metadata and controls

Contributing to pg_vectorize

Prerequisites

Building from source

1. Set up pgrx

2. Set up Docker container

3. Clone and compile pg_vectorize and extension dependencies

3.1. Clone and enter directory

3.2. Install dependencies

3.3. Compile and run pg_vectorize

4. Confirm successful build

4.1. Check extension presence

4.2 Confirm embedding service url is set to localhost

4.3. Load example data

4.4. Sample queries

5. Local URL

TroubleShooting

Releases

3. Clone and compile `pg_vectorize` and extension dependencies

3.3. Compile and run `pg_vectorize`