If you encounter any issues or have questions, feel free to join the Tembo Community Slack for support.
- Rust - Toolchain including rustc, cargo, and rustfmt
- PGRX - Rust-based PostgreSQL extension development framework
- Docker Engine - For running local containers
- psql - Terminal-based front-end to PostgreSQL
- pgmq - PostgreSQL extension for message queues
- pg_cron - PostgreSQL extension for cron-based job scheduling
- pgvector - PostgreSQL extension for vector similarity search
This process is more involved, but can easily be distilled down into a handful of steps.
cargo pgrx init
docker run -d -p 3000:3000 quay.io/tembo/vector-serve:latest
Confirm a successful setup by running the following:
docker ps
git clone https://github.com/tembo-io/pg_vectorize.git
cd pg_vectorize/extension
From within the pg_vectorize/extension directory, run the following, which will install pg_cron, pgmq, and pgvector:
make setup
make run
Once the above command is run, you will be brought into Postgres via psql.
Run the following command inside the psql console to enable the extensions:
CREATE EXTENSION vectorize CASCADE;
To list out the enabled extensions, run:
\dx
                                      List of installed extensions
    Name    | Version |   Schema   |                             Description
------------+---------+------------+---------------------------------------------------------------------
 pg_cron    | 1.6     | pg_catalog | Job scheduler for PostgreSQL
 pgmq       | 1.1.1   | pgmq       | A lightweight message queue. Like AWS SQS and RSMQ but on Postgres.
 plpgsql    | 1.0     | pg_catalog | PL/pgSQL procedural language
 vector     | 0.6.0   | public     | vector data type and ivfflat and hnsw access methods
 vectorize  | 0.19.0  | vectorize  | The simplest way to do vector search on Postgres
(6 rows)
Run the following SHOW command to confirm that the embedding service URL is set to localhost:
SHOW vectorize.embedding_service_url;
vectorize.embedding_service_url
-------------------------------------
http://localhost:3000/v1
(1 row)
The following can be found within this project's README, under Vector Search Example.
Begin by creating a products table with the dataset that comes included with pg_vectorize.
CREATE TABLE products (LIKE vectorize.example_products INCLUDING ALL);
INSERT INTO products SELECT * FROM vectorize.example_products;
You can then confirm everything is correct by running the following:
SELECT * FROM products limit 2;
 product_id | product_name |                      description                       |        last_updated_at
------------+--------------+--------------------------------------------------------+-------------------------------
          1 | Pencil       | Utensil used for writing and often works best on paper | 2023-07-26 17:20:43.639351-05
          2 | Laptop Stand | Elevated platform for laptops, enhancing ergonomics    | 2023-07-26 17:20:43.639351-05
SELECT vectorize.table(
job_name => 'product_search_hf',
"table" => 'products',
primary_key => 'product_id',
columns => ARRAY['product_name', 'description'],
transformer => 'sentence-transformers/multi-qa-MiniLM-L6-dot-v1'
);
table
---------------------------------------------
Successfully created job: product_search_hf
(1 row)
SELECT * FROM vectorize.search(
job_name => 'product_search_hf',
query => 'accessories for mobile devices',
return_columns => ARRAY['product_id', 'product_name'],
num_results => 3
);
search_results
---------------------------------------------------------------------------------------------
{"product_id": 13, "product_name": "Phone Charger", "similarity_score": 0.8147812194590133}
{"product_id": 6, "product_name": "Backpack", "similarity_score": 0.774306211384604}
{"product_id": 11, "product_name": "Stylus Pen", "similarity_score": 0.7709903789778251}
(3 rows)
Once all of the above is complete, you should be able to access the Swagger UI for the Tembo-Embedding-Service at http://localhost:3000/docs and explore.
This service lets you, for example, supply different sentence-transformers models from Hugging Face.
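Before opening the browser, it can help to confirm from the shell that the embedding service container is answering requests. The following is a minimal sketch, assuming curl is installed. The function name wait_for_service, the retry count, and the delay are illustrative choices and not part of pg_vectorize.

```shell
# Poll a URL until it answers with a successful HTTP status, or give up.
# wait_for_service is a hypothetical helper, not part of pg_vectorize.
# Usage: wait_for_service URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_service() {
    url="$1"
    attempts="${2:-10}"
    delay="${3:-2}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f: fail on HTTP errors, -s: no progress output, -o: discard the body
        if curl -fs -o /dev/null --max-time 5 "$url"; then
            echo "service is up: $url"
            return 0
        fi
        # Only wait when another attempt is still to come
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    echo "service did not respond: $url" >&2
    return 1
}
```

For the container started earlier, the call would be wait_for_service http://localhost:3000/docs. A non-zero exit status suggests checking docker ps and the container logs before continuing.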
To check pgrx logs for debugging:
cat ~/.pgrx/17.log
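The full log can get long, so a small wrapper that prints only the most recent lines may be more convenient. This sketch assumes that the number in the file name is the Postgres major version and that logs live under PGRX_HOME when that variable is set, otherwise under ~/.pgrx. The helper name pgrx_log is hypothetical.

```shell
# Print the last lines of the pgrx-managed Postgres log for one major version.
# pgrx_log is a hypothetical helper; it assumes logs are stored as
# <pgrx home>/<major>.log, with the pgrx home defaulting to ~/.pgrx.
# Usage: pgrx_log [PG_MAJOR] [LINES]
pgrx_log() {
    pg_major="${1:-17}"
    lines="${2:-50}"
    log_file="${PGRX_HOME:-$HOME/.pgrx}/${pg_major}.log"
    if [ ! -f "$log_file" ]; then
        echo "no pgrx log found at $log_file" >&2
        return 1
    fi
    tail -n "$lines" "$log_file"
}
```

Running pgrx_log with no arguments prints the last 50 lines of the version 17 log. pgrx_log 16 200 would print the last 200 lines for Postgres 16, if that version was initialized.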
pg_vectorize releases are automated through a GitHub workflow.
The compiled binaries are published to and hosted at pgt.dev.
To create a release, create a new tag that follows valid semver, then create a release with the same name.
Auto-generate the release notes and/or add more relevant details as needed.
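Since the workflow depends on the tag being valid semver, a quick local check before tagging can catch typos. This is a rough sketch: is_semver_tag is a hypothetical helper, the optional leading "v" is an assumption about the tag style, and the pattern covers only a simplified subset of the semver grammar (no build metadata).

```shell
# Succeed when the argument looks like MAJOR.MINOR.PATCH, optionally prefixed
# with "v" and optionally followed by a pre-release suffix such as -rc.1.
# is_semver_tag is a hypothetical helper; the pattern is a simplified check,
# not a full implementation of the semver specification.
is_semver_tag() {
    printf '%s\n' "$1" |
        grep -Eq '^v?(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)(-[0-9A-Za-z.-]+)?$'
}

# Example (not run here), with $tag set to the version being released:
#   is_semver_tag "$tag" && git tag "$tag" && git push origin "$tag"
```

The check rejects tags with a missing component (such as 0.19) or leading zeros (such as v01.2.3), which are common slips when tagging by hand.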