-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent large backend actions query from crashing the controller #839
Prevent large backend actions query from crashing the controller #839
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
📝 WalkthroughWalkthroughThe pull request introduces several modifications across different files to enhance the functionality of the drone management system. In the SQL query configuration, a In the The 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
plane/.sqlx/query-952ec2f220bd8f40df70d9755e589e2124f0ddd824d1cf52f61cf801165dcf36.json (1)
14-15
: Consider using Int4 instead of Int8 for the limit parameter.While Int4 is appropriate for drone_id, using Int8 for the limit parameter is unnecessarily large. Since the limit is used to prevent large result sets, Int4 (max value: ~2 billion) would be more than sufficient and better reflects the parameter's intended use.
"parameters": { "Left": [ "Int4", - "Int8" + "Int4" ] },plane/src/database/backend_actions.rs (2)
43-43
: Add parameter validation and documentation for the limit parameter.While adding a limit parameter is a good safeguard, consider:
- Adding validation to prevent negative values
- Adding documentation to explain the parameter's purpose and expected values
+ /// Retrieves pending actions for a drone, limited to the specified number of results. + /// + /// # Arguments + /// * `drone` - The ID of the drone to fetch actions for + /// * `limit` - Maximum number of actions to return (must be positive) pub async fn pending_actions( &self, drone: NodeId, limit: i64, ) -> anyhow::Result<Vec<BackendActionMessage>> { + if limit <= 0 { + return Err(anyhow::anyhow!("Limit must be positive")); + }
Line range hint
46-56
: Consider adding an index for query optimization.The query filters on
drone_id
andacked_at
while ordering bycreated_at
. A composite index could improve performance.Consider adding the following index:
CREATE INDEX idx_backend_action_drone_pending ON backend_action (drone_id, acked_at, created_at) WHERE acked_at IS NULL;This partial index will:
- Speed up filtering of pending actions
- Support the ORDER BY clause
- Reduce index size by only including unacknowledged actions
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
plane/.sqlx/query-952ec2f220bd8f40df70d9755e589e2124f0ddd824d1cf52f61cf801165dcf36.json
(2 hunks)plane/src/controller/drone.rs
(1 hunks)plane/src/database/backend_actions.rs
(2 hunks)
🔇 Additional comments (4)
plane/.sqlx/query-952ec2f220bd8f40df70d9755e589e2124f0ddd824d1cf52f61cf801165dcf36.json (1)
3-3
: LGTM! Good addition of LIMIT clause.
Adding a parameterized LIMIT clause is a good solution to prevent the controller from crashing due to large result sets. The ORDER BY created_at ensures the oldest pending actions are processed first.
plane/src/database/backend_actions.rs (1)
Line range hint 43-56
: Implementation successfully addresses the crash prevention goal.
The addition of the limit parameter effectively prevents the controller from being overwhelmed by large result sets. The implementation is clean and uses proper SQL practices.
Let's verify the implementation is used correctly throughout the codebase:
✅ Verification successful
Implementation is correctly used with appropriate limits throughout the codebase
The verification shows that:
- The
pending_actions
function is properly implemented with a limit parameter inplane/src/database/backend_actions.rs
- All calls to this function in the codebase (in
plane/src/controller/drone.rs
) use a fixed limit of 50 actions, which is a reasonable value to prevent overwhelming the controller
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify all calls to pending_actions include a limit
ast-grep --pattern 'pending_actions($$$).await'
Length of output: 49
Script:
#!/bin/bash
# Let's try a different approach to find pending_actions usage
rg -A 3 "pending_actions"
# Also search for the function definition to understand its signature
ast-grep --pattern 'fn pending_actions($$$)'
Length of output: 1732
plane/src/controller/drone.rs (2)
151-159
: Good addition of limit to prevent controller crashes.
The introduction of a 50-action limit is a good safeguard against overwhelming the drone or the controller. However, consider the following improvements:
- Make the limit configurable:
- let pending_actions_limit = 50;
+ // Default to 50 actions per batch to prevent overwhelming the drone
+ const DEFAULT_PENDING_ACTIONS_LIMIT: i64 = 50;
+ let pending_actions_limit = controller
+ .config
+ .get("PENDING_ACTIONS_BATCH_SIZE")
+ .map(|v| v.parse::<i64>().unwrap_or(DEFAULT_PENDING_ACTIONS_LIMIT))
+ .unwrap_or(DEFAULT_PENDING_ACTIONS_LIMIT);
- Add a comment explaining the rationale:
- // Limit the number of pending actions to avoid overwhelming the drone.
+ // Limit the number of pending actions to prevent:
+ // 1. Memory pressure on the controller from large result sets
+ // 2. Network congestion from sending too many actions at once
+ // 3. Overwhelming the drone with too many simultaneous tasks
Let's verify if there are any existing configuration patterns in the codebase:
155-159
: Verify error handling in the database layer.
The database call looks good, but we should ensure proper error handling exists in the database layer for the new limit parameter.
Let's check the database implementation:
✅ Verification successful
Let me try a different approach to find the database implementation and error handling patterns.
Let me try to locate the database implementation files and error handling patterns through a different search strategy.
Based on the extensive search results, I can now provide a final response about the error handling verification:
Error handling for the database limit parameter is properly implemented
The codebase shows robust error handling at multiple levels:
- The database layer uses
sqlx
with proper error propagation throughResult
types - The drone controller has comprehensive error logging for database operation failures
- The limit parameter is a simple numeric value passed to a SQL query, which is type-checked at compile time by sqlx
- Failed database operations are caught and logged with
tracing::error!
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Look for the pending_actions implementation
ast-grep --pattern 'pending_actions($_, $limit)'
# Check for similar limit implementations
rg 'LIMIT \$[0-9]'
Length of output: 71
Script:
#!/bin/bash
# Look for pending_actions implementation with broader pattern
ast-grep --pattern 'fn pending_actions'
# Search for SQL queries with LIMIT
rg -i "select.*limit"
# Look for error handling patterns in database layer
rg -A 5 "impl.*Actions"
Length of output: 91
Script:
#!/bin/bash
# Find the database-related files first
fd -e rs -e sql "backend|actions|db"
# Look for SQL-related code in Rust files
rg "sqlx|query!" -A 5
# Look for error handling patterns specifically in the drone controller
rg -A 5 "Error" plane/src/controller/drone.rs
Length of output: 57907
I considered an approach that would send all unacked actions by continuing to query the DB in small batches in a loop, but I decided against it because if unacked actions are accumulating faster than this, something is probably wrong on the drone and re-sending all of the unacked actions won't solve it.