feat: add pg vector index #12338
Please link an existing issue or create one in the description. :)
done~
# DONE: create index https://github.com/pgvector/pgvector?tab=readme-ov-file#indexing
# PG hnsw index only support 2000 dimension or less
- # DONE: create index https://github.com/pgvector/pgvector?tab=readme-ov-file#indexing
- # PG hnsw index only support 2000 dimension or less
+ # pgvector's hnsw index support 2000 or less dimensions
+ # ref: https://github.com/pgvector/pgvector?tab=readme-ov-file#indexing
SQL_CREATE_INDEX = """
CREATE INDEX IF NOT EXISTS embedding_cosine_v1_idx ON {table_name} USING hnsw (embedding vector_cosine_ops);
"""
Avoid global variables to reduce memory consumption. Consider using an inline variable inside the function instead.
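One way to read this suggestion, as a minimal sketch: the DDL string and the dimension limit become locals of a helper instead of module-level names. The helper name `build_create_index_sql` and its signature are hypothetical and not part of the PR; the index name, operator class, and the 2000-dimension limit are taken from the diff above.

```python
from typing import Optional


def build_create_index_sql(table_name: str, dimension: int) -> Optional[str]:
    """Return the HNSW index DDL, or None if the dimension is too large.

    Hypothetical helper for illustration; the PR itself formats a
    module-level SQL_CREATE_INDEX template instead.
    """
    # Limit stated in the PR's code comment: the hnsw index supports
    # vectors of at most 2000 dimensions.
    hnsw_max_dimensions = 2000
    if dimension > hnsw_max_dimensions:
        return None

    # The statement lives inside the function, as the review suggests.
    # Assumes table_name is a trusted identifier, since it is
    # interpolated directly into the SQL text.
    sql_create_index = (
        "CREATE INDEX IF NOT EXISTS embedding_cosine_v1_idx "
        f"ON {table_name} USING hnsw (embedding vector_cosine_ops);"
    )
    return sql_create_index
```

The caller would then run `cur.execute(sql)` only when the helper returns a statement, which keeps the dimension check and the DDL in one place.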
# DONE: create index https://github.com/pgvector/pgvector?tab=readme-ov-file#indexing
# PG hnsw index only support 2000 dimension or less
if dimension <= 2000:
    cur.execute(SQL_CREATE_INDEX.format(table_name=self.table_name))
Consider using sqlalchemy's text (aka sql_text) method to wrap and template the statement instead. Or simply use format strings.
And how about explicitly setting the options (m and ef_construction) as well, to make the index DDL more readable and helpful?
Thanks for this suggestion, I have added it~
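For readers following the thread, here is a rough sketch of what an index statement with explicit options could look like, using plain format strings as the reviewer offers. The function name `hnsw_index_ddl` is hypothetical, and the code actually merged in the PR may differ. The `WITH (m = ..., ef_construction = ...)` clause and the default values 16 and 64 follow the pgvector README's indexing section.

```python
def hnsw_index_ddl(table_name: str, m: int = 16, ef_construction: int = 64) -> str:
    """Build a pgvector HNSW index statement with its options spelled out.

    Hypothetical helper for illustration only. The defaults mirror
    pgvector's documented defaults (m = 16, ef_construction = 64).
    """
    # Assumes table_name is a trusted identifier; it is interpolated
    # directly and is not safe for untrusted input.
    return (
        f"CREATE INDEX IF NOT EXISTS embedding_cosine_v1_idx ON {table_name} "
        "USING hnsw (embedding vector_cosine_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction});"
    )


if __name__ == "__main__":
    print(hnsw_index_ddl("embedding_items"))
```

Spelling out the options keeps the DDL self-documenting even when the values equal the defaults.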
Summary
Close #12341
Screenshots
Checklist
Important
Please review the checklist below before submitting your pull request.
dev/reformat (backend) and cd web && npx lint-staged (frontend) to appease the lint gods