-
Notifications
You must be signed in to change notification settings - Fork 332
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve kg throughput #1342
Improve kg throughput #1342
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
1 Skipped Deployment
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to a4a7a20 in 25 seconds
More details
- Looked at
183
lines of code in9
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. py/core/base/providers/embedding.py:28
- Draft comment:
Increasingconcurrent_request_limit
to 256 may lead to resource exhaustion if the system is not equipped to handle it. Ensure system resources can support this change. - Reason this comment was not posted:
Comment did not seem useful.
2. py/core/base/providers/embedding.py:29
- Draft comment:
Increasingmax_retries
to 8 can lead to longer wait times and increased load. Ensure the retry logic is efficient and the system can handle the increased load. - Reason this comment was not posted:
Comment did not seem useful.
Workflow ID: wflow_YWTOROeVwkB9vI5a
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 5231ea4 in 15 seconds
More details
- Looked at
114
lines of code in2
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. py/core/providers/orchestration/hatchet.py:39
- Draft comment:
Duplicatedconcurrency
method. Remove the duplicate to avoid redundancy. - Reason this comment was not posted:
Comment looked like it was already resolved.
Workflow ID: wflow_nKSFA7PclgaKxu9W
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 0cc4671 in 9 seconds
More details
- Looked at
13
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. py/r2r.toml:24
- Draft comment:
The model name change in the PR description does not match the actual change in the code. The description states changing fromopenai/gpt-4o
toazure/gpt-4o
, but the code changes fromazure/gpt-4o
toopenai/gpt-4o
. Please verify the intended change. - Reason this comment was not posted:
Comment did not seem useful.
Workflow ID: wflow_cUWzEj7IfQkV7JRB
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on df6bd31 in 10 seconds
More details
- Looked at
24
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. py/core/providers/kg/postgres.py:161
- Draft comment:
The change fromdocument_ids
todocument_id
in theentity_embedding
table's UNIQUE constraint might affect existing data integrity if there are multiple document IDs associated with a single name. Ensure that this change aligns with the intended data model and that any necessary data migrations are performed. - Reason this comment was not posted:
Comment did not seem useful.
Workflow ID: wflow_8uqCqfDP3izitaTz
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on a6d9630 in 18 seconds
More details
- Looked at
53
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. py/core/providers/kg/postgres.py:339
- Draft comment:
TheGROUP BY
clause still includese.document_ids
, which should be updated toe.document_id
. - Reason this comment was not posted:
Comment looked like it was already resolved.
Workflow ID: wflow_GD5B5JLCigyZ4Tvs
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on e0175fe in 8 seconds
More details
- Looked at
22
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. py/core/providers/kg/postgres.py:339
- Draft comment:
The removal ofe.entity_id
from theGROUP BY
clause is correct sinceentity_id
is not part of the selected columns in the query. Ensure that this change aligns with the intended logic of the query. - Reason this comment was not posted:
Confidence changes required:0%
The change in the GROUP BY clause is correct as 'entity_id' is not selected in the query, so it should not be in the GROUP BY clause.
Workflow ID: wflow_h2lliYbOoGKzMfky
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
* Feature/revive integration tests (#1343) * add postgres to integration * add postgres to integration * up * rename * hardcode * add back postgres * add back postgres * add pgvector * add pgvector * add pgvector * add pgvector * add pgvector * add pgvector * add pgvector * tweak config docs * fix integration suite * fix integration suite * fix integration suite * up * up * up * up * up * up * up * up * up * update integration test * final user tests * final user tests * Fix validation error on collection creation responses, remove unnecessary error on deletion (#1344) * Don't throw an error when deleting a collection with no documents, fix return on create collection, more js tests * Revert "Don't throw an error when deleting a collection with no documents, fix return on create collection, more js tests" This reverts commit f9f6ead. * Don't throw an error when deleting a collection with no documents, fix return on create collection, more js tests * more * Improve kg throughput (#1342) * try * up * up * space * add it in ingestion * rm ingestion * init * add semaphore * test * rm duplicates * kg_creation_settings * rm semaphores * increase conns * change it back * clean * up * up * up * up --------- Co-authored-by: --global=Shreyas Pimpalgaonkar <--global=shreyas.gp.7@gmail.com> Co-authored-by: emrgnt-cmplxty <68796651+emrgnt-cmplxty@users.noreply.github.com> --------- Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com> Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com> Co-authored-by: --global=Shreyas Pimpalgaonkar <--global=shreyas.gp.7@gmail.com>
* Feature/revive integration tests (SciPhi-AI#1343) * add postgres to integration * add postgres to integration * up * rename * hardcode * add back postgres * add back postgres * add pgvector * add pgvector * add pgvector * add pgvector * add pgvector * add pgvector * add pgvector * tweak config docs * fix integration suite * fix integration suite * fix integration suite * up * up * up * up * up * up * up * up * up * update integration test * final user tests * final user tests * Fix validation error on collection creation responses, remove unnecessary error on deletion (SciPhi-AI#1344) * Don't throw an error when deleting a collection with no documents, fix return on create collection, more js tests * Revert "Don't throw an error when deleting a collection with no documents, fix return on create collection, more js tests" This reverts commit f9f6ead. * Don't throw an error when deleting a collection with no documents, fix return on create collection, more js tests * more * Improve kg throughput (SciPhi-AI#1342) * try * up * up * space * add it in ingestion * rm ingestion * init * add semaphore * test * rm duplicates * kg_creation_settings * rm semaphores * increase conns * change it back * clean * up * up * up * up --------- Co-authored-by: --global=Shreyas Pimpalgaonkar <--global=shreyas.gp.7@gmail.com> Co-authored-by: emrgnt-cmplxty <68796651+emrgnt-cmplxty@users.noreply.github.com> --------- Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com> Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com> Co-authored-by: --global=Shreyas Pimpalgaonkar <--global=shreyas.gp.7@gmail.com>
Important
Increase concurrency limits and retries to improve throughput, with updates in
embedding.py
,llm.py
, andr2r.toml
.concurrent_request_limit
to 256 inembedding.py
,llm.py
, andr2r.toml
.max_retries
to 8 inembedding.py
andllm.py
.max_backoff
to 64.0 inembedding.py
andllm.py
.asyncio.Semaphore
for connection management inrelational.py
.get_connection()
inrelational.py
.r2r.toml
fromopenai/gpt-4o
toazure/gpt-4o
.embedding.py
.kg_workflow.py
.This description was created by for e0175fe. It will automatically update as commits are pushed.