-
Notifications
You must be signed in to change notification settings - Fork 604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add support for atomic state transitions #4893
Conversation
WalkthroughThe changes in this pull request involve modifications to the Changes
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (15)
fiftyone/factory/repos/delegated_operation.py (1)
247-247
: Update the method docstring to reflect the new return behavior.The
update_run_state
method can now returnNone
if no matching records are found due to therequired_state
filter. Consider updating the docstring to indicate this possibility.Apply this change to the docstring:
def update_run_state( self, _id: ObjectId, run_state: ExecutionRunState, result: ExecutionResult = None, run_link: str = None, progress: ExecutionProgress = None, required_state: ExecutionRunState = None, ) -> DelegatedOperationDocument: + """Update the run state of an operation. + + Returns: + DelegatedOperationDocument if the update is successful, or None if + no matching operation is found. + """fiftyone/operators/delegated.py (10)
93-94
: Ensure consistent documentation of default parameter valuesIn the docstring for
set_running
, therequired_state
parameter does not mention its default value ofNone
, whereas other optional parameters likeprogress (None)
andrun_link (None)
do. For consistency, consider adding(None)
afterrequired_state
in the docstring.
104-104
: Maintain consistent parameter formatting in method callsIn the return statement of
set_running
, the parameters are spread over multiple lines, which improves readability. However, inset_queued
andset_scheduled
, the parameters are on a single line. For consistency, consider formatting the parameters inset_queued
andset_scheduled
over multiple lines as done here.
111-112
: Ensure consistent documentation of default parameter valuesSimilar to
set_running
, therequired_state
parameter inset_scheduled
does not mention its default value ofNone
in the docstring. For consistency across all methods, consider adding(None)
afterrequired_state
.
117-118
: Maintain consistent parameter formatting in method callsThe return statement in
set_scheduled
has parameters on a single line. For better readability and consistency with other methods likeset_running
, consider spreading the parameters over multiple lines.
120-132
: Add missingrequired_state
default value in docstring and format parametersIn the docstring for
set_queued
, therequired_state
parameter does not mention its default value ofNone
. Additionally, the return statement's parameters are on a single line.
- Update the docstring to include
(None)
afterrequired_state
for consistency.- Format the parameters in the return statement over multiple lines for readability.
155-156
: Ensure consistent documentation of default parameter valuesIn
set_completed
, therequired_state
parameter in the docstring does not include its default value(None)
. Align this with the documentation style used for other optional parameters.
168-168
: Maintain consistent parameter order and formattingThe parameters in the return statement of
set_completed
are arranged differently compared toset_running
. For consistency:
- Consider ordering the parameters consistently across methods.
- Spread the parameters over multiple lines if not already done for readability.
191-192
: Ensure consistent documentation of default parameter valuesIn the docstring for
set_failed
, include(None)
afterrequired_state
to indicate its default value, maintaining consistency with other method docstrings.
203-203
: Maintain consistent parameter order and formattingAs with other methods, ensure the parameters in the return statement of
set_failed
are ordered and formatted consistently for readability and maintainability.
Line range hint
177-203
: Catch specific exceptions instead of bareexcept
In the
execute_operation
method, there is a bareexcept:
block. It's a good practice to catch specific exceptions to avoid masking unexpected errors.Apply this diff to catch specific exceptions:
- except: + except Exception:This change ensures that system-exiting exceptions like
KeyboardInterrupt
orSystemExit
are not unintentionally caught.tests/unittests/delegated_operators_tests.py (4)
295-295
: Consider adding assertion messages for better clarityAdding custom messages to your assertions can help identify issues more quickly when tests fail, improving debugging efficiency.
Example:
self.assertEqual( len(queued), len(dynamic_docs) + len(static_docs) + initial_queued, "The total number of queued operations should match the expected count" )
303-303
: Consider adding assertion messages for better clarityAdding messages to your assertions enhances test readability and aids in troubleshooting.
Example:
self.assertEqual( len(queued), len(dynamic_docs) + initial_operator_queued, "Queued operations count for the operator should match the expected value" )
311-311
: Consider adding assertion messages for better clarityIncluding custom messages in assertions makes it easier to trace failures during test execution.
Example:
self.assertEqual( len(queued), len(static_docs) + initial_queued, "After setting dynamic docs to running, queued operations should only include static docs" )
315-315
: Consider adding assertion messages for better clarityCustom messages in assertions can significantly aid in diagnosing test failures.
Example:
self.assertEqual( len(running), len(dynamic_docs) + initial_running, "Running operations count should include dynamic docs transitioned to running" )
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (3)
- fiftyone/factory/repos/delegated_operation.py (4 hunks)
- fiftyone/operators/delegated.py (9 hunks)
- tests/unittests/delegated_operators_tests.py (3 hunks)
🔇 Additional comments (7)
fiftyone/factory/repos/delegated_operation.py (1)
304-311
: Ensure consistent handling of theQUEUED
state transition.The added block correctly handles the
QUEUED
run state by settingqueued_at
andupdated_at
timestamps. Verify that this is consistent with handling of other run states and that any additional required fields are also updated.Run the following script to check for consistency across run state updates:
tests/unittests/delegated_operators_tests.py (6)
349-352
: Assertion correctly validates scheduled operations countThe assertion accurately checks that the subset of documents transitioned to
SCHEDULED
state are accounted for.
357-360
: Assertion correctly validates queued operations after state transitionsThe test ensures that only the static documents remain in the
QUEUED
state after transitioning dynamic documents.
361-374
: Effective verification ofset_queued
behavior withrequired_state
The test successfully verifies that
set_queued
transitions documents only when their current state matches the specifiedrequired_state
. Checking the return values confirms that the method behaves as expected.
375-378
: Assertion confirms correct count of queued operations after transitionThe assertion ensures that the total number of queued operations reflects the transitions made, maintaining test accuracy.
379-382
: Assertion validates that no additional documents are scheduledThe test correctly confirms that only the initial scheduled documents remain after transitioning others back to queued.
383-385
: Assertion confirms running operations count remains consistentThe test verifies that the number of running operations is as expected after attempting state transitions, ensuring test reliability.
update=[update], | ||
return_document=pymongo.ReturnDocument.AFTER, | ||
) | ||
|
||
return DelegatedOperationDocument().from_pymongo(doc) | ||
return DelegatedOperationDocument().from_pymongo(doc) if doc is not None else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle potential None
returns from update_run_state
.
Since update_run_state
may now return None
, ensure that all callers properly handle this case to prevent AttributeError
or other issues when accessing properties on a NoneType
object.
Consider auditing the codebase for calls to update_run_state
:
#!/bin/bash
# Description: Find all calls to `update_run_state` and check for handling of `None` returns.
# Find all usages of `update_run_state` in the codebase.
rg --type py 'update_run_state\('
# Review each call site to ensure `None` is handled appropriately.
collection_filter = {"_id": _id} | ||
if required_state is not None: | ||
collection_filter["run_state"] = required_state | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Missing required_state
Parameter in update_run_state
Calls
Several calls to update_run_state
do not include the required_state
parameter, which may prevent proper enforcement of atomic state transitions.
Affected locations:
fiftyone/operators/delegated.py
fiftyone/factory/repos/delegated_operation.py
🔗 Analysis chain
Confirm that the required_state
filter works as intended.
The collection_filter
now includes run_state
when required_state
is provided. Ensure that this correctly enforces atomic state transitions and does not inadvertently affect other operations.
Run the following script to identify usage of update_run_state
and ensure callers are aware of the required_state
parameter:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find all calls to `update_run_state` and check if `required_state` is used appropriately.
# Find all usages of `update_run_state` in the codebase.
rg --type py 'update_run_state\('
# Verify that calls to `update_run_state` either pass `required_state` when needed or handle `None` returns properly.
Length of output: 685
Script:
#!/bin/bash
# Description: Verify that all calls to `update_run_state` appropriately handle the `required_state` parameter.
# Find all calls to `update_run_state` and check if `required_state` is provided
rg --type py 'update_run_state\s*\(' | grep -v 'required_state='
Length of output: 714
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc = self._collection.find_one_and_update( | ||
filter={"_id": _id}, | ||
filter=collection_filter, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect usage of update
parameter in find_one_and_update
The update
parameter in find_one_and_update
expects a dictionary, but a list containing update
is being passed. This will lead to a TypeError
or unintended behavior. Please pass the update
dictionary directly without wrapping it in a list.
Apply this diff to fix the issue:
-doc = self._collection.find_one_and_update(
- filter=collection_filter,
- update=[update],
- return_document=pymongo.ReturnDocument.AFTER,
-)
+doc = self._collection.find_one_and_update(
+ filter=collection_filter,
+ update=update,
+ return_document=pymongo.ReturnDocument.AFTER,
+)
Committable suggestion was skipped due to low confidence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure why we are passing it as a list. But I just tested it and both forms work.
I think I'd prefer if we not wrap in a list, because that feels more standard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@swheaton, understood. Passing the update
parameter directly without wrapping it in a list is indeed more standard and avoids potential issues. Thanks for confirming.
(_/)
(• _ •)
( )
✏️ Learnings added
Learnt from: swheaton
PR: voxel51/fiftyone#4893
File: fiftyone/factory/repos/delegated_operation.py:328-328
Timestamp: 2024-10-09T02:11:07.684Z
Learning: In the `MongoDelegatedOperationRepo.update_run_state` method, when calling `find_one_and_update`, pass the `update` parameter directly as a dictionary, not wrapped in a list.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
formatting and docstrings
doc = self._collection.find_one_and_update( | ||
filter={"_id": _id}, | ||
filter=collection_filter, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you filter on more than Id then you potentially end up with multiple entries per id if upsert is true or you will not update the doc at all if not upserting... is this what should happen or should that be considered an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
upsert
is False
here so I don't think that's a case we need to worry about, but definitely something to keep in mind. The intended behavior is for no update to occur if the filter doesn't match.
update=[update], | ||
return_document=pymongo.ReturnDocument.AFTER, | ||
) | ||
|
||
return DelegatedOperationDocument().from_pymongo(doc) | ||
return DelegatedOperationDocument().from_pymongo(doc) if doc is not None else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the method is update run state then if it fails to update shouldn't it error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This behavior is consistent with the underlying find_one_and_update
, which returns None
if no records match the filter. I generally avoid exceptions when there's a reasonable alternative but could be convinced otherwise if you feel strongly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (5)
fiftyone/operators/delegated.py (5)
Line range hint
83-113
: Improve docstring formatting for consistencyThe docstring for the
required_state
parameter could be formatted more consistently with other docstrings in the codebase.Consider updating the docstring as follows:
- required_state (None): an optional - :class:`fiftyone.operators.executor.ExecutionRunState` required - state of the operation. If provided, the update will only be - performed if the referenced operation matches this state. + required_state (None): an optional ExecutionRunState. If provided, + the update will only be performed if the referenced operation + matches this state
116-133
: Improve docstring formatting for consistencyThe docstring for the
required_state
parameter could be formatted more consistently with other docstrings in the codebase.Consider updating the docstring as follows:
- required_state (None): an optional - :class:`fiftyone.operators.executor.ExecutionRunState` required - state of the operation. If provided, the update will only be - performed if the referenced operation matches this state. + required_state (None): an optional ExecutionRunState. If provided, + the update will only be performed if the referenced operation + matches this state
135-153
: Approve new method and suggest docstring improvementThe addition of the
set_queued
method is a good implementation that aligns with the PR objectives. It follows the pattern of other similar methods in the class.Consider updating the docstring for the
required_state
parameter as follows for consistency:- required_state (None): an optional - :class:`fiftyone.operators.executor.ExecutionRunState` required - state of the operation. If provided, the update will only be - performed if the referenced operation matches this state. + required_state (None): an optional ExecutionRunState. If provided, + the update will only be performed if the referenced operation + matches this state
Line range hint
155-191
: Improve docstring formatting for consistencyThe docstring for the
required_state
parameter could be formatted more consistently with other docstrings in the codebase.Consider updating the docstring as follows:
- required_state (None): an optional - :class:`fiftyone.operators.executor.ExecutionRunState` required - state of the operation. If provided, the update will only be - performed if the referenced operation matches this state. + required_state (None): an optional ExecutionRunState. If provided, + the update will only be performed if the referenced operation + matches this state
Line range hint
193-229
: Improve docstring formatting for consistencyThe docstring for the
required_state
parameter could be formatted more consistently with other docstrings in the codebase.Consider updating the docstring as follows:
- required_state (None): an optional - :class:`fiftyone.operators.executor.ExecutionRunState` required - state of the operation. If provided, the update will only be - performed if the referenced operation matches this state. + required_state (None): an optional ExecutionRunState. If provided, + the update will only be performed if the referenced operation + matches this state
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (2)
- fiftyone/factory/repos/delegated_operation.py (4 hunks)
- fiftyone/operators/delegated.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- fiftyone/factory/repos/delegated_operation.py
🧰 Additional context used
🔇 Additional comments (1)
fiftyone/operators/delegated.py (1)
Line range hint
1-529
: Summary of changes and their impactThe modifications to the
DelegatedOperationService
class successfully implement the newrequired_state
parameter across multiple methods (set_running
,set_scheduled
,set_completed
,set_failed
) and introduce a newset_queued
method. These changes align well with the PR objectives, enhancing control over state transitions in delegated operations.The implementation is consistent and correct across all modified methods. The addition of the
required_state
parameter allows for more granular control over state transitions, which can help prevent race conditions in concurrent scenarios.The new
set_queued
method complements the existing state transition methods, providing a more complete set of operations for managing the lifecycle of delegated operations.Overall, these changes improve the flexibility and robustness of the
DelegatedOperationService
class. The only suggestions for improvement are minor adjustments to the docstrings for consistency with the rest of the codebase.
See my PR where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* use required_state in DO.execute_operation, for atomicity * minor * unecessary f-strings
let merge after tests pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
tests/unittests/delegated_operators_tests.py (1)
245-246
: Improved test structure, but could benefit from additional clarity.The restructuring of the
test_list_operations
method has significantly improved its comprehensiveness by testing various states of operations (queued, running, scheduled). However, a few enhancements could make the test even more robust and clear:
Consider adding comments to explain the purpose and difference between
dynamic_docs
andstatic_docs
. This would improve code readability and maintainability.The initial counts (
initial_queued
,initial_running
, etc.) are obtained before creating new operations, but they're used in assertions after creating new operations. This might lead to inconsistent results if there are any existing operations in the database. Consider moving these initial count checks to right before the assertions or resetting the database state before the test.To improve clarity, consider adding comments like this:
# dynamic_docs: Operations that will transition through different states dynamic_docs = [] # static_docs: Operations that will remain in the queued state static_docs = []Also applies to: 275-275, 291-291, 294-309, 311-343
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (2)
- fiftyone/operators/delegated.py (8 hunks)
- tests/unittests/delegated_operators_tests.py (11 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- fiftyone/operators/delegated.py
🧰 Additional context used
🔇 Additional comments (4)
tests/unittests/delegated_operators_tests.py (4)
345-397
: Comprehensive testing of set_queued functionality.The additions to the
test_set_run_states
method significantly improve its coverage by testing the newset_queued
functionality, including the use ofrequired_state
. This ensures that the state transitions are working as expected.To further improve readability, consider adding a comment to explain the purpose of
subset_size
:# Define a subset of documents to transition to 'scheduled' state # This allows testing of set_queued with different current states subset_size = 4
620-641
: Valuable new test for execution state requirements.The addition of
test_queued_state_required_to_execute
is a great improvement to the test suite. It verifies that only operations in the queued state can be executed, which helps prevent potential race conditions or inconsistent states.To slightly improve clarity, consider adding an assertion message:
self.assertEqual(doc.run_state, ExecutionRunState.RUNNING, "Operation should remain in RUNNING state after failed execution attempt")This will make it clearer what the test is checking if it fails.
697-697
: Improved handling of renamed datasets in tests.The modifications to
test_rerun_failed
andtest_execute_with_renamed_dataset
are excellent additions. They ensure that the system can handle dataset renames gracefully, both when rerunning failed operations and executing queued operations.For consistency across the test file, consider using a constant for the delegation target string:
DELEGATION_TARGET = "test_target" # Then use it in all relevant places: delegation_target=DELEGATION_TARGET,This would make it easier to update the delegation target across all tests if needed.
Also applies to: 747-747, 812-812, 994-994, 1225-1225, 1255-1255
Line range hint
1-1260
: Overall, excellent improvements to the test suite.The changes to this test file significantly enhance its coverage and effectiveness. Key improvements include:
- More comprehensive testing of operation states and transitions.
- New tests for the
set_queued
functionality and its interaction withrequired_state
.- Robust testing for scenarios involving renamed datasets.
These changes will help ensure the reliability and correctness of the DelegatedOperationService. The minor suggestions provided earlier will further improve the readability and maintainability of the test suite.
What changes are proposed in this pull request?
This PR adds a new
set_queued
method to theDelegatedOperationService
which provides a mechanism for transitioning operators to thequeued
run state.In addition, this updates the existing
update_run_state
to include an optionalrequired_state
field to provide a means to control state transitions. This new field allows actors to perform a state transition only if the operator matches therequired_state
. This enables concurrency-safe state transitions where multiple actors may attempt to transition the same DO. With this new parameter, only the first attempted transition will succeed; subsequent attempts will fail due to arequired_state
mismatch.Important note!!
update_run_state
can now returnNone
if no matching records are found; this is consistent with the behavior of the underlyingfind_one_and_update
call and can be used to determine whether a state transition occurred.How is this patch tested? If it is not, please explain why.
Unit tests
Release Notes
Is this a user-facing change that should be mentioned in the release notes?
notes for FiftyOne users.
(Details in 1-2 sentences. You can just refer to another PR with a description
if this PR is part of a larger change.)
What areas of FiftyOne does this PR affect?
fiftyone
Python library changesSummary by CodeRabbit
New Features
set_queued
method for managing operation states.required_state
parameter across multiple methods.Bug Fixes
required_state
.Tests
set_queued
method and refined assertions for operation state transitions, ensuring accurate tracking of operation statuses.