-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new: Support for too many requests #914
base: dev
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughThe changes involve modifications to error handling and the introduction of new classes and messages across various files. The ✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
✅ Deploy Preview for poetic-froyo-8baba7 ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🔭 Outside diff range comments (1)
qdrant_client/http/api_client.py (1)
188-196
:⚠️ Potential issueAdd rate limit handling for async client as well
The rate limit handling is implemented for the synchronous client but missing for the asynchronous client, which creates an inconsistency in behavior.
Add similar rate limit handling to the async client's
send
method:async def send(self, request: Request, type_: Type[T]) -> T: response = await self.middleware(request, self.send_inner) + + if response.status_code == 429: + retry_after_s = response.headers.get("Retry-After", None) + message = None + try: + response_json = response.json() + if response_json.get("status") and response_json["status"].get("error"): + message = response_json["status"]["error"] + except Exception: + # If we can't parse JSON, just use a default message + pass + raise ResourceExhaustedResponse(message, retry_after_s) + if response.status_code in [200, 201, 202]: try: return parse_as_type(response.json(), type_)🧰 Tools
🪛 Ruff (0.8.2)
194-194: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
🧹 Nitpick comments (1)
qdrant_client/uploader/grpc_uploader.py (1)
39-40
: Consider using exponential backoff for retriesCurrently, the code uses a fixed delay specified by the server. Implementing an exponential backoff strategy would be more robust for handling rate limiting scenarios, especially if the server is under heavy load.
Implement exponential backoff with a base delay from the server:
for attempt in range(max_retries): try: points_client.Upsert( grpc.UpsertPoints( collection_name=collection_name, points=points, wait=wait, shard_key_selector=RestToGrpc.convert_shard_key_selector(shard_key_selector) if shard_key_selector is not None else None, ), timeout=timeout, ) break except ResourceExhaustedResponse as ex: + # Get base delay from server or use default + base_delay = ex.retry_after_s if isinstance(ex.retry_after_s, (int, float)) and ex.retry_after_s > 0 else 1 + # Apply exponential backoff with jitter + import random + backoff_delay = min(30, base_delay * (2 ** attempt) * (0.5 + random.random())) show_warning( - message=f"Batch upload failed due to rate limit. Waiting for {ex.retry_after_s} seconds before retrying...", + message=f"Batch upload failed due to rate limit. Waiting for {backoff_delay:.2f} seconds before retrying...", category=UserWarning, stacklevel=8, ) - sleep(ex.retry_after_s) + sleep(backoff_delay) attempt -= 1Also applies to: 53-61
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/connection.py
(3 hunks)qdrant_client/http/api_client.py
(2 hunks)qdrant_client/http/models/models.py
(14 hunks)qdrant_client/proto/points.proto
(2 hunks)qdrant_client/uploader/grpc_uploader.py
(2 hunks)qdrant_client/uploader/rest_uploader.py
(2 hunks)tools/generate_rest_client.sh
(1 hunks)
🧰 Additional context used
🪛 GitHub Actions: type-checkers
qdrant_client/common/client_exceptions.py
[error] 6-6: mypy: X | Y syntax for unions requires Python 3.10
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Redirect rules - poetic-froyo-8baba7
- GitHub Check: Header rules - poetic-froyo-8baba7
- GitHub Check: Pages changed - poetic-froyo-8baba7
🔇 Additional comments (28)
qdrant_client/proto/points.proto (5)
575-578
: Well-structured Formula messageThe addition of the Formula message provides a clean way to encapsulate mathematical expressions with default values for variables. This supports the new score boosting feature mentioned in the PR objectives.
580-597
: Expression hierarchy looks comprehensiveThe Expression message creates a flexible structure for mathematical operations through its variant field. The inclusion of various mathematical functions (mult, sum, div, neg, abs, sqrt, pow, exp, log10, ln) provides robust support for complex formula-based queries.
599-602
: GeoDistance message aligns with existing geo functionalityThe GeoDistance message properly leverages the existing GeoPoint type and maintains consistency with other geographic functionality in the codebase.
604-621
: Mathematical expression messages appear well-designedThe MultExpression, SumExpression, DivExpression, and PowExpression messages provide clear separation of concerns for different mathematical operations. The inclusion of a
by_zero_default
field in DivExpression shows good error handling design.
632-632
: Formula integration in Query message enhances scoring capabilitiesAdding the Formula field to the Query variant extends the search capabilities with score boosting, aligning with the PR objective to support formula-based queries.
qdrant_client/http/api_client.py (1)
8-8
: Good addition of ResourceExhaustedResponse importAdding this import supports the new rate limiting handling functionality, consistent with the PR's objective to handle rate limiting responses.
tools/generate_rest_client.sh (1)
32-32
: Added ROOT_PACKAGE_NAME parameter improves code generationAdding the ROOT_PACKAGE_NAME parameter helps ensure that generated code correctly references the qdrant_client package, which is particularly important when dealing with types and imports across the codebase.
qdrant_client/uploader/grpc_uploader.py (1)
2-2
: Good addition of necessary importsThe added imports support the rate limiting functionality. The
sleep
function is required for implementing the retry delay, andResourceExhaustedResponse
is needed for the exception handling.Also applies to: 7-7
qdrant_client/uploader/rest_uploader.py (3)
2-2
: Added import required for rate limiting support.The
sleep
function is now imported from thetime
module to implement the wait period during rate limiting.
8-8
: New exception handling for rate limiting.Added import for the
ResourceExhaustedResponse
exception which will be raised when the server responds with a rate limit error.
45-52
: Implemented proper rate limiting handling for REST uploads.This new exception handling block adds support for handling rate limit responses from the Qdrant server. When a rate limit is hit:
- It displays a warning to the user
- Waits for the specified retry period
- Decrements the attempt counter to avoid counting rate-limited attempts against the retry limit
This is a well-implemented approach that aligns with the PR objectives for supporting "too many requests" responses.
qdrant_client/connection.py (3)
7-8
: Added import for ResourceExhaustedResponse exception.The import is necessary for the new rate limiting support in gRPC connections.
135-144
: Implemented gRPC rate limiting detection and handling.This new function effectively identifies "resource exhausted" responses from the gRPC server and properly extracts the retry-after duration from metadata.
Key points:
- Checks for
grpc.StatusCode.RESOURCE_EXHAUSTED
response code- Extracts the reason phrase from response details
- Retrieves the "retry-after" value from trailing metadata
- Raises a
ResourceExhaustedResponse
with the appropriate wait timeThis implementation allows gRPC clients to handle rate limiting gracefully.
175-175
: Updated return value to include response processing.Changed the
intercept_call
function to return the newprocess_response
function as the postprocessor in the intercept chain. This ensures rate limit responses are properly processed.qdrant_client/common/client_exceptions.py (2)
1-3
: Created base exception class for client exceptions.Good practice to define a base class for all client-specific exceptions, providing a common foundation for exception handling in the client.
10-13
: Implemented comprehensive error message formatting.The
__str__
method provides a clean, informative error message:
- Uses the provided message or defaults to "Too Many Requests"
- Creates a formatted status string
- Returns a clear "Resource Exhausted Response" message
This makes the exception readable and useful for debugging.
qdrant_client/http/models/models.py (12)
30-32
: Added new absolute value expression model.This model will allow clients to apply absolute value operations in vector search formulas.
639-641
: Added new division expression model.The division operation can be used in mathematical expressions when querying or scoring.
643-647
: Implemented division parameters with by-zero handling.Good implementation with:
- Left and right operands
- Optional
by_zero_default
parameter to handle division by zero casesThis helps prevent runtime errors during query execution when zero values are encountered.
685-687
: Added exponential function and formula query models.These models enable:
- Exponential expressions via
ExpExpression
- Formula-based queries via
FormulaQuery
with:
- Formula expressions
- Default values for variables
This significantly enhances the querying capabilities with mathematical operations.
Also applies to: 760-763
796-803
: Added geographic distance calculation support.The
GeoDistance
andGeoDistanceParams
models enable calculating distances between geographic points in formulas, which is useful for location-based scoring and filtering.
1259-1261
: Added logarithmic expression models.Support for both natural logarithm (
LnExpression
) and base-10 logarithm (Log10Expression
) operations in formulas.Also applies to: 1284-1286
1392-1394
: Added multiplication expression model.The
MultExpression
accepts a list of expressions to multiply together, allowing for flexible scoring formulas.
1426-1428
: Added negation expression model.The
NegExpression
allows negating the value of another expression, useful for inverse scoring.
1692-1699
: Added power expression with parameters.The implementation includes:
PowExpression
to apply power operationsPowParams
with base and exponent fieldsThis enables more complex mathematical operations in scoring.
2692-2694
: Added square root expression model.This enables applying square root operations to expressions in formulas.
2769-2771
: Added summation expression model.The
SumExpression
accepts a list of expressions to add together, allowing for composite scoring formulas.
3339-3354
: Updated Expression union type with all new expression types.The
Expression
union type has been comprehensively updated to include all the newly added expression classes, enabling proper type checking and serialization/deserialization of the expanded formula capabilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (7)
qdrant_client/proto/points.proto (5)
575-578
: New Formula Message Definition
The newFormula
message correctly encapsulates anExpression
and adefaults
map for parameter fallback values. Consider adding inline documentation to clarify the purpose and expected contents of thedefaults
map for future maintainers.
593-596
: GeoDistance Message Clarity
TheGeoDistance
message is defined with anorigin
field and ato
field. While functional, the field nameto
is a bit ambiguous. Consider renaming it todestination
to more clearly indicate its purpose.
598-600
: MultExpression Message Suggestion
TheMultExpression
message uses a field namedmult
for the list of expressions to multiply. For improved clarity, consider renaming this field to something likeoperands
orfactors
to better represent the multiplicative components.
602-604
: SumExpression Message Suggestion
TheSumExpression
message defines a repeated fieldsum
to hold summation operands. Renaming this field tooperands
might improve clarity by explicitly indicating that it represents the values to be summed.
606-610
: DivExpression Message Details
TheDivExpression
message introduces an optional fieldby_zero_default
to handle division by zero cases. Consider renaming this field todiv_by_zero_default
to explicitly convey that it pertains to division operations.qdrant_client/embed/_inspection_cache.py (2)
3118-3171
: Check references forDivParams
.Allowing
left
/right
to reference conditions and expressions seems comprehensive but could become complex quickly. Make sure the downstream logic accounts for non-numeric references (e.g.,Filter
) to avoid unexpected runtime errors.
3311-3321
:EXCLUDED_RECURSIVE_REFS
mirrorsRECURSIVE_REFS
.Having them listed in both might be intentional (e.g., to skip certain auto-generated behaviors). Double-check that you truly want them in both.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
qdrant_client/embed/_inspection_cache.py
(16 hunks)qdrant_client/http/models/models.py
(8 hunks)qdrant_client/proto/points.proto
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- qdrant_client/http/models/models.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Redirect rules - poetic-froyo-8baba7
- GitHub Check: Header rules - poetic-froyo-8baba7
- GitHub Check: Pages changed - poetic-froyo-8baba7
- GitHub Check: Python 3.13.x on ubuntu-latest test
- GitHub Check: Python 3.12.x on ubuntu-latest test
- GitHub Check: Python 3.13 test
- GitHub Check: Python 3.11.x on ubuntu-latest test
- GitHub Check: Python 3.10.x on ubuntu-latest test
- GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (10)
qdrant_client/proto/points.proto (2)
580-591
: Expression Message Enhancements
TheExpression
message now supports multiple arithmetic operations through a oneof variant. The structure is logical and sets a solid foundation for complex formula-based scoring. Verify that all downstream components parsing anExpression
are updated to handle these new variants appropriately.
621-621
: Query Message Update for Score Boosting
The addition of the fieldFormula formula = 8;
in theQuery
message enables score boosting via an arbitrary formula. Ensure that all client implementations and API consumers are updated to handle this new field. It is important to verify backward compatibility, so that queries lacking this field continue to operate as expected.qdrant_client/embed/_inspection_cache.py (8)
3111-3117
: Looks solid forDivExpression
.This definition follows the established pattern (e.g., single property referencing an associated params object and marking it as required).
3172-3178
:GeoDistance
definition is consistent.The structure matches your typical wrapper object pattern (property referencing the params object). The naming is clear.
3179-3192
:GeoDistanceParams
appears correct.Using an origin geo point plus a destination field path is a sensible approach. Ensure validation for the
to
property value in higher-level logic.
3193-3206
:HasVectorCondition
is straightforward.This new condition is well-aligned with other Filter sub-conditions. No issues found.
3207-3237
:MultExpression
is flexible but verify usage.Providing an array of sub-expressions can be powerful. Confirm that your parser or evaluator properly handles deeply nested arrays to avoid performance bottlenecks.
3238-3265
:NegExpression
is well-structured.This mirrors the pattern of single-operation expressions. No syntactic or logical issues spotted.
3266-3296
:SumExpression
aligns with other arithmetic expressions.Similar comments apply regarding deeply nested references. Everything else looks fine.
3298-3309
:RECURSIVE_REFS
inclusions are consistent.Adding these new expression types to recursion handling is correct for advanced expression structures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does not work for me in grpc
from qdrant_client import QdrantClient, models
if __name__ == "__main__":
cl = QdrantClient(prefer_grpc=True)
if cl.collection_exists("test"):
cl.delete_collection("test")
cl.create_collection(
"test", vectors_config=models.VectorParams(size=2, distance=models.Distance.COSINE),
strict_mode_config=models.StrictModeConfig(
enabled=True,
read_rate_limit=2,
write_rate_limit=3
)
)
cl.upload_collection(
'test',
ids=[1, 2, 3, 4],
vectors=[
[0.2, 0.3],
[0.1, 0.1],
[0.4, 0.5],
[0.7, 0.9]
],
batch_size=1
)
Traceback:
/Users/...qdrant_client/venv/bin/python /Users/...qdrant_client/check_strict_mode.py
/Users/...qdrant_client/check_strict_mode.py:15: UserWarning: Batch upload failed 1 times. Retrying...
cl.upload_collection(
/Users/...qdrant_client/check_strict_mode.py:15: UserWarning: Batch upload failed 2 times. Retrying...
cl.upload_collection(
/Users/...qdrant_client/check_strict_mode.py:15: UserWarning: Batch upload failed 3 times. Retrying...
cl.upload_collection(
Traceback (most recent call last):
File "/Users/...qdrant_client/check_strict_mode.py", line 15, in <module>
cl.upload_collection(
File "/Users/...qdrant_client/qdrant_client/qdrant_client.py", line 2685, in upload_collection
return self._client.upload_collection(
File "/Users/...qdrant_client/qdrant_client/qdrant_remote.py", line 2996, in upload_collection
self._upload_collection(
File "/Users/...qdrant_client/qdrant_client/qdrant_remote.py", line 2915, in _upload_collection
for _ in updater.process(batches_iterator):
File "/Users/...qdrant_client/qdrant_client/uploader/grpc_uploader.py", line 128, in process
yield from self.process_upload(items)
File "/Users/...qdrant_client/qdrant_client/uploader/grpc_uploader.py", line 117, in process_upload
yield upload_batch_grpc(
File "/Users/...qdrant_client/qdrant_client/uploader/grpc_uploader.py", line 69, in upload_batch_grpc
raise e
File "/Users/...qdrant_client/qdrant_client/uploader/grpc_uploader.py", line 41, in upload_batch_grpc
points_client.Upsert(
File "/Users/...qdrant_client/venv/lib/python3.10/site-packages/grpc/_interceptor.py", line 277, in __call__
response, ignored_call = self._with_call(
File "/Users/...qdrant_client/venv/lib/python3.10/site-packages/grpc/_interceptor.py", line 332, in _with_call
return call.result(), call
File "/Users/...qdrant_client/venv/lib/python3.10/site-packages/grpc/_channel.py", line 440, in result
raise self
File "/Users/...qdrant_client/venv/lib/python3.10/site-packages/grpc/_interceptor.py", line 315, in continuation
response, call = self._thunk(new_method).with_call(
File "/Users/...qdrant_client/venv/lib/python3.10/site-packages/grpc/_channel.py", line 1198, in with_call
return _end_unary_response_blocking(state, call, True, None)
File "/Users/...qdrant_client/venv/lib/python3.10/site-packages/grpc/_channel.py", line 1006, in _end_unary_response_blocking
raise _InactiveRpcError(state) # pytype: disable=not-instantiable
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.RESOURCE_EXHAUSTED
details = "Rate limiting exceeded: Write rate limit exceeded: Operation requires 1 tokens but only 0.00022186875000012636 were available. Retry after 20s"
debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Rate limiting exceeded: Write rate limit exceeded: Operation requires 1 tokens but only 0.00022186875000012636 were available. Retry after 20s", grpc_status:8, created_time:"2025-03-07T16:24:49.573029+01:00"}"
>
Process finished with exit code 1
I had an old version of qdrant dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
async client code should also be changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
qdrant_client/common/client_exceptions.py (1)
13-15
: Simplify string formatting.The string formatting in
__str__
contains a nested f-string that's unnecessary.- def __str__(self) -> str: - reason_phrase_str = f"{self.message}" if self.message else "Resource Exhausted Response" - return f"{reason_phrase_str}".strip() + def __str__(self) -> str: + reason_phrase_str = self.message if self.message else "Resource Exhausted Response" + return reason_phrase_str.strip()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/connection.py
(3 hunks)qdrant_client/http/api_client.py
(3 hunks)qdrant_client/uploader/grpc_uploader.py
(2 hunks)qdrant_client/uploader/rest_uploader.py
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- qdrant_client/uploader/grpc_uploader.py
- qdrant_client/http/api_client.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Redirect rules - poetic-froyo-8baba7
- GitHub Check: Header rules - poetic-froyo-8baba7
- GitHub Check: Pages changed - poetic-froyo-8baba7
- GitHub Check: Python 3.13.x on ubuntu-latest test
- GitHub Check: Python 3.12.x on ubuntu-latest test
- GitHub Check: Python 3.11.x on ubuntu-latest test
- GitHub Check: Python 3.10.x on ubuntu-latest test
- GitHub Check: Python 3.9.x on ubuntu-latest test
- GitHub Check: Python 3.13 test
🔇 Additional comments (5)
qdrant_client/uploader/rest_uploader.py (1)
2-2
: Properly implemented rate limit handling.The code correctly handles rate limiting responses by adding a specific exception handler for
ResourceExhaustedResponse
. The implementation follows best practices by:
- Properly reporting the wait duration to users through a warning message
- Using
sleep()
to wait for the specified retry period- Correctly re-raising the exception after max retries
Also applies to: 8-8, 45-54
qdrant_client/connection.py (2)
7-7
: Effective implementation of gRPC rate limit detection.The code properly intercepts gRPC responses with
RESOURCE_EXHAUSTED
status code and extracts the retry-after duration from trailing metadata. The implementation follows best practices by:
- Processing the response only when rate limiting is detected
- Using a loop with a break statement to efficiently find the retry-after value
- Handling potential conversion errors when parsing the retry-after value
- Raising an appropriate exception with all necessary information
Also applies to: 135-148
180-180
: Good integration of response processing.The function now correctly returns the
process_response
function as part of its tuple, enabling proper handling of rate-limited responses. This update ensures the new rate limit detection is hooked into the existing interceptor pattern.qdrant_client/common/client_exceptions.py (2)
1-3
: Good use of class hierarchy for exceptions.Creating a base
QdrantException
class is good practice and allows for consistent exception handling throughout the codebase.
5-11
: Good implementation of ResourceExhaustedResponse exception.The class properly captures both the error message and retry delay, with appropriate error handling for the retry_after_s conversion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
qdrant_client/uploader/rest_uploader.py (1)
46-53
: 🛠️ Refactor suggestionAvoid subtracting attempts for rate-limited scenarios to prevent infinite retries.
By doing
attempt -= 1
upon encounteringResourceExhaustedResponse
, you negate the increment at line 64, causing rate-limit responses not to consume a retry. If the server continuously enforces a rate limit, this leads to an infinite loop. Confirm that this is intentional. If you want rate limits to count against the max retries (which is typical), remove the decrement:-except ResourceExhaustedResponse as ex: - show_warning( - message=f"Batch upload failed due to rate limit. Waiting for {ex.retry_after_s} seconds before retrying...", - category=UserWarning, - stacklevel=7, - ) - sleep(ex.retry_after_s) - attempt -= 1 +except ResourceExhaustedResponse as ex: + show_warning( + message=f"Batch upload failed due to rate limit. Waiting for {ex.retry_after_s} seconds before retrying...", + category=UserWarning, + stacklevel=7, + ) + sleep(ex.retry_after_s)
🧹 Nitpick comments (2)
qdrant_client/uploader/rest_uploader.py (1)
37-38
: While-loop retry logic is valid, but double-check your escape conditions.Switching from a simple for-loop to a
while attempt < max_retries
approach allows flexible handling of rate-limited scenarios. However, be mindful that you must ensure a proper escape condition—otherwise, the loop risks running indefinitely if you adjust theattempt
counter inside exception blocks in potentially conflicting ways.qdrant_client/uploader/grpc_uploader.py (1)
39-40
: While-loop for advanced retry control is fine, but ensure termination conditions are valid.The new
while attempt < max_retries:
construct offers more flexibility. Confirm that all paths (including rate limit delays) eventually exit the loop if needed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
qdrant_client/uploader/grpc_uploader.py
(4 hunks)qdrant_client/uploader/rest_uploader.py
(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Redirect rules - poetic-froyo-8baba7
- GitHub Check: Header rules - poetic-froyo-8baba7
- GitHub Check: Pages changed - poetic-froyo-8baba7
- GitHub Check: Python 3.13.x on ubuntu-latest test
- GitHub Check: Python 3.12.x on ubuntu-latest test
- GitHub Check: Python 3.11.x on ubuntu-latest test
- GitHub Check: Python 3.10.x on ubuntu-latest test
- GitHub Check: Python 3.13 test
- GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (4)
qdrant_client/uploader/rest_uploader.py (2)
2-8
: Imports are appropriate and improve clarity.The addition of
sleep
andResourceExhaustedResponse
is necessary for implementing the new rate-limiting flow.
64-64
:✅ Verification successful
Request confirmation of final retry count logic.
As written,
attempt += 1
happens unconditionally at the end of each loop iteration, but subtracting the attempt for rate-limited failures negates it, leading to potentially infinite retries. Please confirm that the desired behavior is to allow unlimited retried uploads in rate-limited scenarios.
🏁 Script executed:
#!/bin/bash # Attempting to find references to clarifying comments or docstrings across the codebase. # Searching for 'attempt -= 1' usage or discussion in docstrings or comments beyond rest_uploader.py rg -A 4 "attempt -= 1"Length of output: 755
Confirmation of Unlimited Retry Behavior for Rate-Limited Uploads
I've verified that the code uses the same logic in both the REST and gRPC uploaders (see
qdrant_client/uploader/rest_uploader.py
andqdrant_client/uploader/grpc_uploader.py
). The unconditionalattempt += 1
combined with the correspondingattempt -= 1
for rate-limited failures effectively preserves the attempt count during such failures, thereby allowing unlimited retries as intended.qdrant_client/uploader/grpc_uploader.py (2)
2-7
: New imports for sleeping and custom exception handling look good.Importing
sleep
andResourceExhaustedResponse
aligns the gRPC uploader with the same rate-limiting mechanism employed in the REST uploader.
72-72
:❓ Verification inconclusive
Confirm finalization of each attempt.
Unconditionally incrementing
attempt
after the try/catch blocks is normally correct. However, pairing this with the precedingattempt -= 1
action (in rate-limited cases) might conflict with the contributor’s stated desire to avoid infinite loops (“it should eat up max_retries”).
🏁 Script executed:
#!/bin/bash # Searching for leftover references featuring the user commentary regarding rate-limit attempts. # This helps confirm we have no contradictory code or doc references left. rg -A 2 "we don't want to reduce the number of attempts"Length of output: 61
I'm not getting a clear picture yet. It would be helpful to verify if the decrement action (i.e.
attempt -= 1
) indeed appears in rate-limited branches as mentioned in the comment. Could you please run the following script to check for any occurrence of an attempt decrement in the file?This will help confirm whether there’s a rate-limit branch that decrements the attempt variable, ensuring that pairing it with the unconditional
attempt += 1
(line 72) behaves as intended. Once we have that output, we can finalize the review comment.
🏁 Script executed:
#!/bin/bash # Searching for the corresponding 'attempt -= 1' statements and adjacent context in grpc_uploader.py rg -n -C 2 "attempt\s*-=\s*1" qdrant_client/uploader/grpc_uploader.pyLength of output: 315
Clarify Attempt Consumption on Rate-Limited Errors
- In qdrant_client/uploader/grpc_uploader.py:
- Line 61: Within the rate-limited exception,
attempt
is decremented.- Line 72: After the try/except block,
attempt
is unconditionally incremented.- The net effect is that when a rate-limited error occurs, the decrement is canceled by the subsequent increment, meaning that such errors do not count toward the
max_retries
threshold. This behavior seems to contrast with the stated desire that each attempt, including those encountering rate limits, should be finalized (i.e. "eat up max_retries") to prevent infinite loops.Please confirm whether this behavior is intentional or if rate-limited attempts should count towards the retry limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
qdrant_client/common/client_exceptions.py (1)
5-17
: Enhance exception handling when parsing retry-after value.The
ResourceExhaustedResponse
class is well-structured, but the exception handling could be improved.try: self.retry_after_s = int(retry_after_s) - except Exception: + except (ValueError, TypeError) as err: raise QdrantException( f"Retry-After header value is not a valid integer: {retry_after_s}" - ) + ) from errAlso, since
retry_after_s
is already typed asint
, you might not need to convert it again unless it can actually come in as a different type:def __init__(self, message: str, retry_after_s: int) -> None: self.message = message if message else "Resource Exhausted Response" - try: - self.retry_after_s = int(retry_after_s) - except Exception: - raise QdrantException( - f"Retry-After header value is not a valid integer: {retry_after_s}" - ) + self.retry_after_s = retry_after_sChoose which approach makes more sense based on how
retry_after_s
is actually received.🧰 Tools
🪛 Ruff (0.8.2)
11-13: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/connection.py
(3 hunks)qdrant_client/http/api_client.py
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- qdrant_client/http/api_client.py
🧰 Additional context used
🪛 Ruff (0.8.2)
qdrant_client/common/client_exceptions.py
11-13: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Redirect rules - poetic-froyo-8baba7
- GitHub Check: Header rules - poetic-froyo-8baba7
- GitHub Check: Pages changed - poetic-froyo-8baba7
- GitHub Check: Python 3.13.x on ubuntu-latest test
- GitHub Check: Python 3.12.x on ubuntu-latest test
- GitHub Check: Python 3.11.x on ubuntu-latest test
- GitHub Check: Python 3.10.x on ubuntu-latest test
- GitHub Check: Python 3.9.x on ubuntu-latest test
- GitHub Check: Python 3.13 test
🔇 Additional comments (5)
qdrant_client/connection.py (3)
7-7
: Added necessary imports for new exception handling.Good addition of the imports to support the new resource exhaustion handling feature.
135-150
: Process gRPC rate limit responses appropriately.The implementation correctly handles resource exhaustion responses from the gRPC server, extracting the "retry-after" value from trailing metadata and raising appropriate exceptions.
I notice you're using a loop with a break statement to find the "retry-after" key, which is a good optimization already implemented based on previous feedback.
182-182
: Properly wired up response processing.The response processing function is correctly returned as the third element in the interceptor tuple, enabling the rate limit handling logic.
qdrant_client/common/client_exceptions.py (2)
1-3
: Base exception class added correctly.Good creation of a base exception class that follows the naming convention suggested in previous feedback (
QdrantException
).
19-24
: ResourceQuotaExceeded class implementation looks good.The implementation for handling quota exceeded scenarios is clean and straightforward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
tests/congruence_tests/test_rate_limit.py (4)
47-56
: Fix typo in error messageThere's a duplicate word "than" in the error message: "larger than than rate limiter capacity".
- match="Write rate limit exceeded, request larger than than rate limiter capacity, please try to split your request", + match="Write rate limit exceeded, request larger than rate limiter capacity, please try to split your request",
86-121
: Refactor duplicate client testing codeThe code for testing rate limits is duplicated between remote_client and grpc_client. Consider refactoring into a helper function to reduce duplication and improve maintainability.
Also, note that on line 107-118, you're reusing
time_to_sleep
from the remote_client test, but not updating it with the value from the grpc_client exception. This could lead to inconsistent test behavior if the two clients return different retry times.Example refactoring:
def test_client_with_rate_limit(client, collection_name, points_batch): flag = False time_to_sleep = 0 try: for _ in range(10): client.upsert(collection_name, points_batch) except ResourceExhaustedResponse as ex: print(ex.message) assert ex.retry_after_s > 0, f"Unexpected retry_after_s value: {ex.retry_after_s}" flag = True time_to_sleep = int(ex.retry_after_s) if flag: # verify next response after sleep succeeds sleep(time_to_sleep) client.upsert(collection_name, points_batch) return True else: raise AssertionError(f"No ResourceExhaustedResponse exception was raised for {client}.")
132-132
: Remove unnecessary tuple creationThe parentheses and comma in
(ids.append(point.id),)
create a tuple unnecessarily. The append method returns None, so this doesn't accomplish anything useful.- (ids.append(point.id),) + ids.append(point.id)
198-242
: Refactor duplicate query test codeSimilar to the upsert test, this code for testing read rate limits is duplicated between remote_client and grpc_client. Consider refactoring into a helper function.
Also, setting
time_to_sleep = 0
on line 223 is unnecessary since you're resetting it with the exception's value anyway.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/congruence_tests/test_rate_limit.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Redirect rules - poetic-froyo-8baba7
- GitHub Check: Header rules - poetic-froyo-8baba7
- GitHub Check: Pages changed - poetic-froyo-8baba7
- GitHub Check: Python 3.13.x on ubuntu-latest test
- GitHub Check: Python 3.12.x on ubuntu-latest test
- GitHub Check: Python 3.13 test
- GitHub Check: Python 3.11.x on ubuntu-latest test
- GitHub Check: Python 3.10.x on ubuntu-latest test
- GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (1)
tests/congruence_tests/test_rate_limit.py (1)
1-243
: Good test coverage for rate limiting functionalityThe tests successfully cover both REST and gRPC rate limiting scenarios, including:
- Large batch requests that exceed rate limiter capacity
- Multiple requests that hit rate limits over time
- Upload operations with retries
- Read operations with rate limits
This comprehensive test suite will effectively validate the new rate limiting support functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🔭 Outside diff range comments (2)
qdrant_client/common/client_exceptions.py (1)
14-16
:⚠️ Potential issueRemove duplicate return statement and simplify string formatting
There's a duplicate return statement on line 16 that should be removed. Additionally, the f-string usage is unnecessary in both cases.
def __str__(self) -> str: - reason_phrase_str = f"{self.message}" if self.message else "Resource Exhausted Response" - return f"{reason_phrase_str}".strip() - return self.message.strip() + reason_phrase_str = self.message if self.message else "Resource Exhausted Response" + return reason_phrase_str.strip()tests/congruence_tests/test_rate_limit.py (1)
243-249
:⚠️ Potential issueRemove duplicate code block
Lines 243-249 are a duplicate of lines 237-242 and should be removed.
- # verify next response after sleep succeeds - sleep(time_to_sleep) - grpc_client.query_batch_points( - collection_name=COLLECTION_NAME, requests=dense_vector_query_batch_text - ) - else: - raise AssertionError("No ResourceExhaustedResponse exception was raised for grpc_client.")
🧹 Nitpick comments (2)
tests/congruence_tests/test_rate_limit.py (2)
132-132
: Remove unnecessary tuple creationThe expression
(ids.append(point.id),)
creates a tuple containing the result ofids.append(point.id)
, which is alwaysNone
. This doesn't serve any purpose.-(ids.append(point.id),) +ids.append(point.id)
91-95
: Consider removing debug print statementsThere are several
- print(ex.message)
Also applies to: 111-113, 206-209, 230-233
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (21)
qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/connection.py
(3 hunks)qdrant_client/http/api_client.py
(2 hunks)qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/connection.py
(1 hunks)qdrant_client/http/api_client.py
(1 hunks)qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/http/api_client.py
(1 hunks)qdrant_client/connection.py
(1 hunks)qdrant_client/http/api_client.py
(1 hunks)qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/connection.py
(1 hunks)qdrant_client/http/api_client.py
(2 hunks)qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/connection.py
(2 hunks)qdrant_client/http/api_client.py
(3 hunks)tests/congruence_tests/test_rate_limit.py
(1 hunks)qdrant_client/common/client_exceptions.py
(0 hunks)qdrant_client/connection.py
(1 hunks)qdrant_client/http/api_client.py
(1 hunks)tests/congruence_tests/test_rate_limit.py
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (14)
- qdrant_client/connection.py
- qdrant_client/connection.py
- qdrant_client/connection.py
- qdrant_client/http/api_client.py
- qdrant_client/connection.py
- qdrant_client/http/api_client.py
- qdrant_client/http/api_client.py
- qdrant_client/connection.py
- qdrant_client/http/api_client.py
- qdrant_client/common/client_exceptions.py
- qdrant_client/http/api_client.py
- qdrant_client/connection.py
- qdrant_client/common/client_exceptions.py
- tests/congruence_tests/test_rate_limit.py
🧰 Additional context used
🪛 Ruff (0.8.2)
qdrant_client/common/client_exceptions.py
11-13: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
11-13: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
⏰ Context from checks skipped due to timeout of 90000ms (8)
- GitHub Check: Redirect rules - poetic-froyo-8baba7
- GitHub Check: Header rules - poetic-froyo-8baba7
- GitHub Check: Pages changed - poetic-froyo-8baba7
- GitHub Check: Python 3.13.x on ubuntu-latest test
- GitHub Check: Python 3.12.x on ubuntu-latest test
- GitHub Check: Python 3.11.x on ubuntu-latest test
- GitHub Check: Python 3.10.x on ubuntu-latest test
- GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (5)
qdrant_client/http/api_client.py (1)
111-118
: Improve JSON parsing efficiency to avoid multiple calls to response.json()The current implementation calls
response.json()
multiple times, which parses the JSON response each time. This is inefficient and could lead to inconsistent results if the parsing fails differently between calls.try: + response_json = response.json() message = ( - response.json()["status"]["error"] - if response.json()["status"] and response.json()["status"]["error"] + response_json["status"]["error"] + if response_json.get("status") and response_json["status"].get("error") else None ) except Exception: message = Noneqdrant_client/common/client_exceptions.py (2)
1-2
: Base exception class name change looks goodThe change from
CommonException
toQdrantException
provides a more descriptive name that aligns better with the library's naming conventions.
5-6
: Parameter type changes in constructorThe parameters in the constructor are now non-optional, which is type-safe. Based on past review comments, this aligns with the agreed requirements between reviewers.
tests/congruence_tests/test_rate_limit.py (2)
6-6
:✅ Verification successful
Fix import: ResourceQuotaExceeded has been removed
The
ResourceQuotaExceeded
class has been removed from the codebase according to the AI summary, but it's still being imported here. This will cause import errors when running the tests.-from qdrant_client.common.client_exceptions import ResourceQuotaExceeded, ResourceExhaustedResponse +from qdrant_client.common.client_exceptions import ResourceExhaustedResponse
🏁 Script executed:
#!/bin/bash # Check if ResourceQuotaExceeded class still exists in the codebase grep -r "class ResourceQuotaExceeded" qdrant_client/Length of output: 54
I'll run a new verification shell script using
rg
to double-check for any instance of theResourceQuotaExceeded
class in the entire repository:
🏁 Script executed:
#!/bin/bash # Double-check using rg to search for any occurrence of ResourceQuotaExceeded in the repository rg "class ResourceQuotaExceeded" .Length of output: 36
Resolved: Import Removal Verified
The
ResourceQuotaExceeded
class has been confirmed as removed from the codebase (no occurrences found via bothgrep
andrg
searches), so the import should indeed be updated. Please update the import intests/congruence_tests/test_rate_limit.py
at line 6 as follows:
Before:
-from qdrant_client.common.client_exceptions import ResourceQuotaExceeded, ResourceExhaustedResponse
After:
+from qdrant_client.common.client_exceptions import ResourceExhaustedResponse
This change will prevent import errors when running the tests.
47-57
:⚠️ Potential issueUpdate test assertions to match new exception type
This test expects
ResourceQuotaExceeded
but based on the changes, you should be expectingResourceExhaustedResponse
instead.with pytest.raises( - ResourceQuotaExceeded, + ResourceExhaustedResponse, match="Write rate limit exceeded, request larger than than rate limiter capacity, please try to split your request", ): remote_client.upsert(COLLECTION_NAME, points_batch) with pytest.raises( - ResourceQuotaExceeded, + ResourceExhaustedResponse, match="Write rate limit exceeded, request larger than than rate limiter capacity, please try to split your request", ): grpc_client.upsert(COLLECTION_NAME, points_batch)Likely an incorrect or invalid review comment.
It feels like this PR is going to conflict with #915 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
qdrant_client/connection.py (2)
202-221
: Graceful handling ofRESOURCE_EXHAUSTED
errors.
Catching theAioRpcError
and raising aResourceExhaustedResponse
is a neat solution for rate limiting. Consider whether you want to provide a fallback ifretry-after
is missing, e.g., a default wait or a log message to alert that the header was absent.
247-247
: Clarify naming ofprocess_response
.
Usingprocess_response
for the async operation return might be slightly confusing to readers, as it’s both a postprocess callback and a wrapper for awaiting the call. Consider renaming to reflect its dual role, e.g.,postprocess_call
.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/connection.py
(3 hunks)qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/connection.py
(1 hunks)qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/connection.py
(1 hunks)qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/connection.py
(1 hunks)qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/connection.py
(2 hunks)qdrant_client/common/client_exceptions.py
(0 hunks)qdrant_client/connection.py
(1 hunks)qdrant_client/common/client_exceptions.py
(1 hunks)qdrant_client/connection.py
(7 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
- qdrant_client/connection.py
- qdrant_client/connection.py
- qdrant_client/connection.py
- qdrant_client/connection.py
- qdrant_client/common/client_exceptions.py
- qdrant_client/connection.py
- qdrant_client/common/client_exceptions.py
- qdrant_client/common/client_exceptions.py
- qdrant_client/common/client_exceptions.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Redirect rules - poetic-froyo-8baba7
- GitHub Check: Header rules - poetic-froyo-8baba7
- GitHub Check: Pages changed - poetic-froyo-8baba7
- GitHub Check: Python 3.13.x on ubuntu-latest test
- GitHub Check: Python 3.12.x on ubuntu-latest test
- GitHub Check: Python 3.11.x on ubuntu-latest test
- GitHub Check: Python 3.13 test
- GitHub Check: Python 3.10.x on ubuntu-latest test
- GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (5)
qdrant_client/connection.py (5)
3-3
: Import statement looks good.
Theiscoroutinefunction
import is necessary for distinguishing between synchronous and asynchronouspostprocess
callbacks.
76-79
: Synchronous vs. asynchronous postprocessing.
The logic to awaitpostprocess(response)
if it’s a coroutine function, or call it directly otherwise, is solid.
88-91
: Consistent async postprocessing.
Repeating this approach ensures uniform behavior for stream responses. This reflects a well-thought-out design.
100-103
: Adhering to the same pattern is good.
Maintaining consistency for intercepting and postprocessing logic fosters readability.
112-115
: Data flow consistency across all gRPC intercepts.
Continuing the same pattern is beneficial. This ensures all interception paths handle async postprocessing equally.
Add support for following responses from Qdrant:
Rate limiter hit for REST API: qdrant/qdrant#5917
Rate limiter hit for gRPC: qdrant/qdrant#6072
Ensure rest and grpc uploaders continue execution after a pause if the above received.
All Submissions:
dev
branch. Did you create your branch fromdev
?New Feature Submissions:
pre-commit
withpip3 install pre-commit
and set up hooks withpre-commit install
?