Main -> develop #5434
Conversation
Cherry pick builtin use lists
Add support for custom evaluation metrics
fix test and make safer
wrap in try finally
use oid from var
Add Model Evaluation panel callbacks for segmentation tasks
Update release-notes for v1.3.0
Release v1.3.0
Walkthrough

This pull request introduces comprehensive enhancements to FiftyOne's evaluation and metrics infrastructure. The changes span multiple components, including core dataset functionality, evaluation methods, and UI components. Key modifications include adding support for custom metrics across various evaluation types (detection, classification, segmentation), improving sample retrieval methods, and enhancing the flexibility of evaluation configurations. The changes aim to provide more detailed and customizable performance analysis for machine learning models.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant EvaluationConfig
    participant EvaluationMethod
    participant Results
    User->>EvaluationConfig: Configure evaluation
    User->>EvaluationConfig: Specify custom metrics
    EvaluationConfig->>EvaluationMethod: Initialize with config
    EvaluationMethod->>Results: Compute standard and custom metrics
    Results-->>User: Return comprehensive evaluation results
```
Actionable comments posted: 9
🧹 Nitpick comments (45)
plugins/panels/model_evaluation/__init__.py (3)
17-17: Remove the unused import to keep the codebase tidy.

The import `import fiftyone.core.fields as fof` is not used anywhere in the file. Removing it will resolve the static analysis warning (F401) and reduce clutter:

```diff
-import fiftyone.core.fields as fof
```

589-660: Validate array indexing for unrecognized classes.

The segmentation logic appears correct, but it relies on `results._classes_map[x]` and `results._classes_map[y]` existing. If the user specifies a class label that is not present in `_classes_map`, a `KeyError` could occur. Consider adding a check that gracefully handles unknown classes by returning an empty view or a user-facing error message:

```diff
+if x not in results._classes_map or y not in results._classes_map:
+    ctx.ops.notify(
+        f"Class '{x}' or '{y}' not found in segmentation classes",
+        variant="warning",
+    )
+    return
```

681-803: Consolidate docstrings and type checks in the new segmentation helper functions.

The newly introduced functions (`_get_mask_targets`, `_init_segmentation_results`, `_get_segmentation_class_ids`, `_get_segmentation_conf_mat_ids`, `_get_segmentation_tp_fp_fn_ids`, `_to_object_ids`) are coherent and well-structured. However, adding short docstrings and minimal type checks will improve readability and reduce runtime errors if unexpected data is passed in:

```diff
 def _get_mask_targets(dataset, gt_field):
+    """Retrieves the mask targets from the dataset for the given
+    ground-truth field. Returns None if no mask targets are defined.
+    """
     ...

 def _to_object_ids(ids):
+    """Converts a list of string IDs to bson.ObjectId."""
     return [ObjectId(_id) for _id in ids]
```

fiftyone/operators/evaluation_metric.py (3)
12-43: Ensure explicit type hints for additional clarity.

While the docstring provides thorough explanations of the parameters, embedding type hints (PEP 484) would increase maintainability and clarity for developers consuming this class:

```diff
-    def __init__(
-        self,
-        name,
-        label=None,
-        description=None,
-        eval_types=None,
-        aggregate_key=None,
-        lower_is_better=True,
-        **kwargs,
-    ):
+    def __init__(
+        self,
+        name: str,
+        label: Optional[str] = None,
+        description: Optional[str] = None,
+        eval_types: Optional[List[str]] = None,
+        aggregate_key: Optional[str] = None,
+        lower_is_better: bool = True,
+        **kwargs,
+    ) -> None:
```

81-94: Avoid returning an empty list in get_fields when usage isn't planned.

If no fields are intended to be populated by default, ensure you have tests verifying that this empty list doesn't cause unexpected behavior downstream.

95-115: Consider concurrency and multi-user scenarios.

Some evaluation contexts might involve parallel runs or concurrent modifications of dataset fields. If concurrency is possible, ensure that rename operations are thread-safe or protected by appropriate locks or transaction logic.
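For illustration, rename operations on shared state can be serialized with a simple lock. This is a generic sketch with hypothetical names, not FiftyOne's actual concurrency model:

```python
import threading

_rename_lock = threading.Lock()

def rename_metric_field(dataset_fields, old_name, new_name):
    """Atomically renames a computed metric field in a shared dict."""
    with _rename_lock:
        if old_name not in dataset_fields:
            raise KeyError(f"Field '{old_name}' does not exist")
        if new_name in dataset_fields:
            raise ValueError(f"Field '{new_name}' already exists")
        dataset_fields[new_name] = dataset_fields.pop(old_name)
```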
fiftyone/utils/eval/regression.py (8)
43-44: Ensure parameter ordering matches the docstring.

The docstring references extra parameters, but `custom_metrics` was added after `method`. Consider aligning the parameter order in the docstring to reflect the code.

83-84: Clarify custom_metrics usage in the docstring.

It's beneficial to explain in detail how to provide multiple metrics (e.g., list vs dict). This helps users pass arguments correctly.

99-105: Validate the method parameter earlier.

Parsing the config after the docstring checks is fine, but you might want to validate the method parameter's correctness and fall back earlier for better clarity.

115-115: Monitor potential overhead in compute_custom_metrics.

If the number of custom metrics or the dataset size is large, consider implementing caching or concurrency to optimize performance in `compute_custom_metrics`.

254-255: Use consistent docstring references.

The phrase "an optional list of custom metrics" is repeated across multiple classes. Factor out a shared definition or inline reference to avoid confusion or drift in descriptions across classes.
258-268: Add type checks or validation for metric.

The metric parameter can be a string or a callable. Consider at least a runtime check, or a robust docstring note on expected signatures, to guide user implementations.
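For instance, a lightweight runtime guard might look like this (a hypothetical helper, not the module's actual code):

```python
def validate_metric(metric):
    """Ensures a metric is an operator URI string or a callable."""
    if isinstance(metric, str) or callable(metric):
        return metric

    raise TypeError(
        "metric must be an operator URI string or a callable, not %s"
        % type(metric)
    )
```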
587-590: Adopt the static analysis suggestion (SIM910) for retrieving "custom_metrics":

```diff
-        custom_metrics = kwargs.get("custom_metrics", None)
+        custom_metrics = kwargs.get("custom_metrics")
```

641-644 (line range hint): Adopt the same suggestion here as well.

fiftyone/utils/eval/detection.py (5)
46-46: Improve docstring coverage for custom_metrics.

Document the shape of `custom_metrics` (list vs dict) at the function level for clarity.

139-140: Maintain alignment between the docstring and the function signature.

As with `evaluate_regressions`, mention the newly added `custom_metrics` parameter in the docstring so that the generated documentation is consistent.

485-486: Avoid repeated logic in rename_custom_metrics.

Check whether the rename logic can be consolidated with other rename operations to improve maintainability and reduce duplication.
531-532: Include robust error handling in cleanup.

Cleanup operations can become destructive if there is partial data or unexpected schema changes. Consider adding checks or logs describing which fields were successfully removed.

Do you want help generating a fallback approach that logs missing fields instead of throwing errors?
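Such a fallback might look like the following sketch (a hypothetical helper; `get_field_schema()` and `delete_sample_field()` are existing dataset methods):

```python
import logging

logger = logging.getLogger(__name__)

def cleanup_fields(dataset, field_names):
    """Deletes fields if they exist, logging any that are missing."""
    existing = set(dataset.get_field_schema().keys())
    for name in field_names:
        if name in existing:
            dataset.delete_sample_field(name)
            logger.info("Deleted field '%s'", name)
        else:
            logger.warning("Field '%s' not found; skipping", name)
```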
641-644: Adopt the SIM910 suggestion here as well:

```diff
-        custom_metrics = kwargs.get("custom_metrics", None)
+        custom_metrics = kwargs.get("custom_metrics")
```
fiftyone/utils/eval/base.py (6)
75-80: Narrow the broad exception catch.

Catching all exceptions might hide critical errors and make debugging difficult. Consider catching more specific exceptions (e.g., `KeyError`, `AttributeError`) so that problems can be handled properly:

```diff
 try:
     operator = foo.get_operator(metric)
     ...
 except Exception as e:
-    logger.warning("Failed to compute metric '%s': Reason: %s", metric, e)
+    # For instance, catch KeyError or PluginNotFoundError, etc.
+    logger.warning("Failed to compute metric '%s': %s", metric, e)
```

85-85: Remove the unnecessary .keys() usage (SIM118).

Use `for metric in self._get_custom_metrics():` instead of iterating over `.keys()`. This is cleaner and aligns with Python best practices:

```diff
-for metric in self._get_custom_metrics().keys():
+for metric in self._get_custom_metrics():
```

101-101: Use direct dictionary iteration.

Likewise, omit the `.keys()` call here for consistency and clarity.

113-113: Streamline iteration.

Again, removing the `.keys()` call results in simpler, more Pythonic code.

179-179: Rename the unused loop variable (B007).

Since `uri` is never used in the loop body, rename it to `_uri` to avoid confusion or linter warnings:

```diff
-for uri, metric in self.custom_metrics.items():
+for _uri, metric in self.custom_metrics.items():
```

734-748: Table printing method is well designed.

The `_print_dict_as_table` function neatly formats numeric values. Consider adding optional headers or table borders for even better clarity.
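To illustrate the kind of helper being praised here, a minimal standalone sketch of printing a metrics dict as an aligned table (a hypothetical version, not FiftyOne's actual implementation):

```python
def print_dict_as_table(d, value_fmt="%.4f"):
    # Left-align keys to the longest name so values line up in a column
    width = max(len(str(k)) for k in d)
    for key, value in d.items():
        if isinstance(value, float):
            value = value_fmt % value  # format numeric values consistently
        print(f"{str(key).ljust(width)}  {value}")

print_dict_as_table({"accuracy": 0.9314, "support": 1000})
```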
fiftyone/utils/eval/segmentation.py (2)

432-442: Potential memory overhead for pixel-level matches.

Collecting pixel-level matches in a list can be memory-intensive for large segmentation masks. Consider a more aggregated approach (e.g., storing a histogram or summary statistics) to reduce memory usage.
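One possible aggregation strategy, sketched with NumPy (a hypothetical function, not the module's actual code): accumulate a per-class confusion matrix per mask instead of appending per-pixel match tuples:

```python
import numpy as np

def accumulate_conf_mat(conf_mat, gt_mask, pred_mask, num_classes):
    """Adds one mask pair's pixel counts into a running confusion matrix."""
    # Flatten masks and count (gt, pred) pairs in a single vectorized pass
    idx = gt_mask.ravel() * num_classes + pred_mask.ravel()
    counts = np.bincount(idx, minlength=num_classes * num_classes)
    conf_mat += counts.reshape(num_classes, num_classes)
    return conf_mat

conf_mat = np.zeros((3, 3), dtype=np.int64)
gt = np.random.randint(0, 3, (4, 4))
pred = np.random.randint(0, 3, (4, 4))
conf_mat = accumulate_conf_mat(conf_mat, gt, pred, num_classes=3)
```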
602-602: Use kwargs.get("custom_metrics") instead of kwargs.get("custom_metrics", None) (SIM910).

Remove the explicit `None` default to comply with Python best practices and the static analysis suggestion:

```diff
-custom_metrics = kwargs.get("custom_metrics", None)
+custom_metrics = kwargs.get("custom_metrics")
```
fiftyone/utils/eval/classification.py (1)
926-926: Prefer kwargs.get("custom_metrics") without a default (SIM910).

This small tweak simplifies the code and aligns with best practices noted by static analysis tools:

```diff
-custom_metrics = kwargs.get("custom_metrics", None)
+custom_metrics = kwargs.get("custom_metrics")
```
plugins/operators/__init__.py (4)
247-268: Centralize error handling for nested field validation.

The code checks for `.` in the field name and bypasses the existence check, which might allow invalid nested paths to slip through. Consider centralizing the nested field logic so that, whether or not a field name has dots, it is properly validated.

644-670: Leverage membership checks and reduce boilerplate.

- You can replace `for key in schema.keys():` with `for key in schema:` for more concise iteration (SIM118).
- This logic closely mirrors the "Clear frame field" logic. Consider factoring the shared patterns into reusable helper functions to keep the code DRY and maintainable.

```diff
-    for key in schema.keys():
+    for key in schema:
```

904-930: Promote consistent dictionary iteration.

In `_delete_sample_field_inputs()`, you can rely on `for key in schema:` rather than `for key in schema.keys():`. This maintains consistency with modern Python idioms and improves readability.

979-1009: Streamline the delete frame field logic.

Replace `schema.keys()` with `schema` for iteration, and consider extracting a shared function for both sample and frame field deletion. This reduces duplication and future-proofs maintainability.
fiftyone/core/dataset.py (1)
1228-1228: Consider whether this override is necessary.

This method simply returns `super().first()`. If no additional functionality is needed, you might remove this override and rely on the parent class implementation for clarity.

fiftyone/operators/__init__.py (1)
15-15: Clarify whether these imports are reexports or actual usage (F401).

Static analysis flags `EvaluationMetricConfig` and `EvaluationMetric` as unused. If they are meant to be reexported, they are already included in `__all__` by default. Otherwise, remove them or add a reference to avoid confusion.
fiftyone/__public__.py (1)
166-166: Remove or justify the unused Sample import (F401).

If `Sample` is not actually consumed or intended for reexport, remove it to satisfy static analysis. Otherwise, confirm that it is needed in the public interface.
tests/unittests/utils_tests.py (1)
447-490: Improve database isolation and test reliability.

This integration-style test modifies the production database, which could cause interference in shared environments or parallel testing scenarios. Consider using a dedicated test database or mocking database calls to provide isolation and avoid race conditions.
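One possible isolation pattern (a sketch, not the repo's actual fixtures): each test creates a throwaway, non-persistent dataset and deletes it on teardown:

```python
import unittest

import fiftyone as fo

class IsolatedDatasetTests(unittest.TestCase):
    def setUp(self):
        # Non-persistent dataset with a random name; never touches shared data
        self.dataset = fo.Dataset()

    def tearDown(self):
        # Explicit cleanup so parallel runs don't accumulate state
        self.dataset.delete()

    def test_add_sample(self):
        self.dataset.add_sample(fo.Sample(filepath="/tmp/image.png"))
        self.assertEqual(len(self.dataset), 1)
```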
fiftyone/utils/eval/activitynet.py (1)

361-361: Encourage further example-based documentation.

Consider adding a usage example or referencing the supported structure for `custom_metrics` to help developers implement custom logic.
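For instance, the documentation could include an example along these lines (the plugin operator URI here is hypothetical; the parameter shapes follow the list-or-dict convention described in this PR):

```python
import fiftyone as fo

dataset = fo.load_dataset("my-dataset")  # assumes this dataset exists

# Simplest form: a list of metric operator URIs
dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval",
    custom_metrics=["@my-plugin/my-metric"],
)

# Or a dict mapping metric URIs to kwargs for each metric
dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval2",
    custom_metrics={"@my-plugin/my-metric": {"threshold": 0.5}},
)
```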
fiftyone/utils/eval/openimages.py (2)

72-73: Clarify the usage of custom metrics in the docstring.

You mention that `custom_metrics` can be "an optional list of custom metrics or dict mapping metric names to kwargs dicts," but it would be helpful to give examples or clarify how these custom metrics are actually interpreted in the evaluation pipeline.
298-298: Align the docstring with the actual support for custom metrics.

In `OpenImagesDetectionResults`, the docstring and parameter name differ slightly from the usage in `OpenImagesEvaluationConfig`. If you do allow multiple formats (list/dict), ensure the docstrings and the parameter type hints match the actual use cases. Otherwise, restrict the docstring to the correct type to avoid confusion.

Also applies to: 313-313, 323-323
fiftyone/core/fields.py (1)
1654-1668: Validate shape before converting in rgb_array_to_int.

The logic is correct for NxMx3 arrays, but consider a shape check or error message if the input is not the expected 3-channel layout, which can lead to mismatched indexing.
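A sketch of the kind of guard being suggested (a hypothetical standalone version, not the module's actual code):

```python
import numpy as np

def rgb_array_to_int(mask):
    """Packs an (N, M, 3) uint8 RGB array into an (N, M) int array."""
    if mask.ndim != 3 or mask.shape[-1] != 3:
        raise ValueError(
            "Expected an array of shape (N, M, 3); got shape %s" % (mask.shape,)
        )
    mask = mask.astype(np.uint32)
    # Pack channels as 0xRRGGBB
    return (mask[..., 0] << 16) | (mask[..., 1] << 8) | mask[..., 2]

ints = rgb_array_to_int(np.zeros((4, 4, 3), dtype=np.uint8))
```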
fiftyone/core/odm/mixins.py (1)

1872-1878: Improve the clarity and efficiency of _remove_nested_paths.

This helper function filters out paths that are prefixes of others. If performance becomes a concern with large path arrays, consider using sorting or efficient subpath checks. For most use cases, this approach is straightforward and easy to understand.
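For reference, a minimal sketch of such a prefix filter (an illustration, not the actual helper):

```python
def remove_nested_paths(paths):
    """Keeps only top-level paths, dropping any path nested under another."""
    return [
        path
        for path in paths
        if not any(path.startswith(p + ".") for p in paths if p != path)
    ]

print(remove_nested_paths(["ground_truth", "ground_truth.detections", "tags"]))
# ['ground_truth', 'tags']
```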
app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (3)

1-1: Check the bundle size impact of the new lodash import.

Loading the entire lodash library can increase bundle size. If feasible, import only the methods you need, e.g., `import get from "lodash/get";`, to optimize performance.
1762-1771: Extend or unify the type definitions for custom metrics.

The `CustomMetric` and `CustomMetrics` types provide clarity on shape. If you expect unions or more elaborate metric structures, consider robust type checks or advanced TypeScript features, such as discriminated unions, to handle them.
1773-1782: Prevent naming collisions in SummaryRow.

Using top-level property names like `id` or `property` can conflict if the user's data has such fields. Consider a more specific naming scheme or adding a prefix to reduce confusion.

docs/source/release-notes.rst (1)
1-5 (line range hint): Consider adding a table of contents for easier navigation.

The release notes document is quite long, with many versions. A table of contents at the top would help users quickly find specific versions. Consider adding:

```diff
+Table of Contents
+================
+
+- FiftyOne Teams 2.5.0
+- FiftyOne 1.3.0
+- FiftyOne Teams 2.4.0
+...
```

Also applies to: 143-3876
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (25)
- app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (3 hunks)
- docs/source/release-notes.rst (1 hunk)
- fiftyone/__public__.py (1 hunk)
- fiftyone/core/collections.py (5 hunks)
- fiftyone/core/dataset.py (7 hunks)
- fiftyone/core/fields.py (1 hunk)
- fiftyone/core/odm/mixins.py (4 hunks)
- fiftyone/core/view.py (0 hunks)
- fiftyone/operators/__init__.py (1 hunk)
- fiftyone/operators/evaluation_metric.py (1 hunk)
- fiftyone/operators/registry.py (4 hunks)
- fiftyone/server/utils.py (1 hunk)
- fiftyone/utils/eval/activitynet.py (5 hunks)
- fiftyone/utils/eval/base.py (6 hunks)
- fiftyone/utils/eval/classification.py (20 hunks)
- fiftyone/utils/eval/coco.py (5 hunks)
- fiftyone/utils/eval/detection.py (16 hunks)
- fiftyone/utils/eval/openimages.py (5 hunks)
- fiftyone/utils/eval/regression.py (18 hunks)
- fiftyone/utils/eval/segmentation.py (23 hunks)
- plugins/operators/__init__.py (20 hunks)
- plugins/panels/model_evaluation/__init__.py (6 hunks)
- tests/unittests/dataset_tests.py (0 hunks)
- tests/unittests/operators/registry_tests.py (2 hunks)
- tests/unittests/utils_tests.py (1 hunk)
💤 Files with no reviewable changes (2)
- fiftyone/core/view.py
- tests/unittests/dataset_tests.py
🧰 Additional context used
📓 Path-based instructions (1)
app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (1)
Pattern `**/*.{ts,tsx}`: Review the TypeScript and React code for conformity with best practices in React, Recoil, GraphQL, and TypeScript. Highlight any deviations.
🪛 Ruff (0.8.2)

fiftyone/operators/__init__.py

15-15: `.evaluation_metric.EvaluationMetricConfig` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)

15-15: `.evaluation_metric.EvaluationMetric` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)

fiftyone/__public__.py

166-166: `.core.sample.Sample` imported but unused; remove unused import (F401)

plugins/panels/model_evaluation/__init__.py

17-17: `fiftyone.core.fields` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)

fiftyone/utils/eval/detection.py

641-641: Use `kwargs.get("custom_metrics")` instead of `kwargs.get("custom_metrics", None)` (SIM910)

fiftyone/utils/eval/classification.py

926-926: Use `kwargs.get("custom_metrics")` instead of `kwargs.get("custom_metrics", None)` (SIM910)

fiftyone/utils/eval/regression.py

587-587: Use `kwargs.get("custom_metrics")` instead of `kwargs.get("custom_metrics", None)` (SIM910)

fiftyone/utils/eval/base.py

85-85, 101-101, 113-113: Use `key in dict` instead of `key in dict.keys()`; remove `.keys()` (SIM118)

179-179: Loop control variable `uri` not used within loop body; rename unused `uri` to `_uri` (B007)

fiftyone/core/dataset.py

1239-1239: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)

fiftyone/utils/eval/segmentation.py

602-602: Use `kwargs.get("custom_metrics")` instead of `kwargs.get("custom_metrics", None)` (SIM910)

plugins/operators/__init__.py

647-647, 755-755, 907-907, 982-982: Use `key in dict` instead of `key in dict.keys()`; remove `.keys()` (SIM118)

fiftyone/core/collections.py

929-929, 1019-1019: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.11)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.9)
- GitHub Check: build
🔇 Additional comments (54)
plugins/panels/model_evaluation/__init__.py (2)
151-156: Graceful fallback for custom_metrics.

The new `get_custom_metrics` method cleanly handles absent attributes by returning `None`. This is an effective strategy to prevent unexpected errors when `custom_metrics` may not exist. The current approach looks good, but consider logging or capturing these exceptions more explicitly if debugging information is needed.
346-354: Double-check results validity before proceeding.

Please verify that `ctx.dataset.load_evaluation_results` cannot return `None` or an invalid `results` object. If there's any possibility of a failed load or missing data, consider short-circuiting or handling those conditions explicitly to avoid downstream attribute errors:

```diff
 results = ctx.dataset.load_evaluation_results(computed_eval_key)
+if results is None:
+    ctx.panel.set_data(
+        f"evaluation_{computed_eval_key}_error",
+        {"error": "No evaluation results returned"},
+    )
+    return
```

Also applies to: 367-367
fiftyone/operators/evaluation_metric.py (1)
45-80: Verify placeholders for required methods.

The methods `get_parameters`, `parse_parameters`, and `compute` are stubs. Ensure they are implemented or removed before production to avoid runtime errors in subclasses.
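To make the expectation concrete, a subclass would fill in the stubs along these lines. This is a hypothetical sketch based only on the names discussed in this review; the real signatures live in fiftyone/operators/evaluation_metric.py:

```python
from fiftyone.operators.evaluation_metric import (
    EvaluationMetric,
    EvaluationMetricConfig,
)

class MeanConfidenceMetric(EvaluationMetric):
    """Hypothetical metric that records the mean prediction confidence."""

    @property
    def config(self):
        return EvaluationMetricConfig(
            name="mean_confidence",
            label="Mean confidence",
            eval_types=["classification"],
            lower_is_better=False,
        )

    def compute(self, samples, results, **kwargs):
        # FiftyOne's mean() aggregation averages a field across the collection
        return samples.mean("predictions.confidence")
```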
fiftyone/utils/eval/regression.py (2)

201-202: Validate that custom metric fields exist.

When calling `get_custom_metric_fields`, ensure no exceptions occur if a user accidentally supplies a metric name that's not actually computed.

415-421: Consider deeper validation for custom_metrics.

Before storing it, you could add checks ensuring all custom metrics conform to the expected `EvaluationMetric` interface. This minimizes runtime issues if developers pass invalid objects.

fiftyone/utils/eval/detection.py (2)

23-27: Check imports from base for versioning.

Ensure the imported classes (`BaseEvaluationMethod`, etc.) are always available in the target environment. If older versions of the library are used, users might face import errors.

173-173: Validate method usage.

Before `_parse_config` is called, validate that `method` is either a string or a class to preemptively catch errors if an invalid type is passed.

fiftyone/utils/eval/base.py (3)

27-36: Well-structured base configuration class.

This newly introduced `BaseEvaluationMethodConfig` class cleanly encapsulates configuration details for evaluation methods. The docstring is concise and the inheritance from `foe.EvaluationMethodConfig` is clearly defined.

45-53: Efficiently manage custom metrics retrieval.

The `_get_custom_metrics` method is a good approach for unifying the parsing of custom metrics from either a list or a dictionary. Great job ensuring flexibility for different metric definitions.

159-161: Consistent merging of standard and custom metrics.

It is helpful to see how custom metrics are aggregated into the final dictionary using `_get_custom_metrics()`. This design consolidates all metrics in one place, which aids maintainability.

fiftyone/utils/eval/segmentation.py (2)

259-260: Include custom metric fields.

Appending `self.get_custom_metric_fields` to the list of evaluation fields is consistent with the unified approach to custom metrics. Good integration!

491-495: Neat addition of custom_metrics to SegmentationResults.

The constructor parameter `custom_metrics` is integrated cleanly, aligning it with the rest of the framework. This approach maintains consistency across various result classes.

fiftyone/utils/eval/classification.py (2)

41-41: Enhanced support for custom metrics.

Allowing `custom_metrics` in `evaluate_classifications` expands the evaluation possibilities. This flexible design supports a wide variety of user-defined metrics.

713-730 (line range hint): Clean inheritance from BaseClassificationResults.

`ClassificationResults` now inherits from `BaseClassificationResults`, bringing in a consistent and extensible approach to evaluating custom metrics alongside standard classification metrics.

plugins/operators/__init__.py (5)

593-598: Validate empty or null lists before clearing fields.

Ensure `field_names` isn't empty or null when calling `clear_sample_fields`. While this code enforces a "required" parameter, consider adding a graceful handling path in case an empty list is still passed unexpectedly.

1124-1167: Drop index logic appears solid.

No significant issues detected with dropping multiple indexes. The approach is straightforward and correctly iterates through the provided index list.

1443-1481: Delete summary field logic aligns with the multi-field approach.

The updated design for deleting multiple summary fields together is consistent with the multi-field pattern used elsewhere. Looks good.

1619-1662: Group slice deletion approach is coherent.

Handling multiple slices and reverting to the default slice if the current slice is deleted is a reasonable design choice. The logic is clearly expressed and robust.
2184-2227: Workspace deletion handles the current workspace gracefully.

The code correctly resets the workspace to the default if the current workspace is removed. This ensures a consistent user experience.

fiftyone/core/dataset.py (4)
1255-1258: Implementation for head() looks good.

Returning a list comprehension of the first few samples is straightforward and clear.

1272-1275: Symmetry with head() is consistent.

This tail implementation closely matches your `head()` approach, ensuring consistent usage of sample slicing.

1306-1306: Docstring example is clear.

The example snippet for handling the `exact` parameter effectively demonstrates the usage.

1318-1319: Concise query approach.

Using `view = self.match(expr)` followed by `_aggregate()` is direct and readable. The approach is valid and maintains a clean separation of concerns.

tests/unittests/operators/registry_tests.py (2)
8-23: Test coverage for invalid operator types was removed.

The previous test for invalid operator types is no longer present, so potential edge cases may go untested. Consider reintroducing this coverage to maintain robustness.
Would you like me to propose a revised test method to handle invalid operator types?
55-56: Great approach to verifying the empty contexts scenario.

This block effectively confirms that the registry handles no-plugin contexts gracefully. No issues found.
fiftyone/server/utils.py (1)
52-65: Ensure concurrency safety and robust error handling.

Although caching is beneficial, consider verifying concurrency behavior in asynchronous or multithreaded cases. Also, confirm that the dataset name is valid before insertion.

fiftyone/operators/registry.py (3)
fiftyone/operators/registry.py (3)
9-9
: Proper usage ofetau
utilities
The new import foreta.core.utils
is clear and no issues are detected.
46-47
: Docstring clarifiestype
usage
The updated docstring describing thattype
can be a subclass offiftyone.operators.Operator
is well done.
91-92
: Accurate docstring for list_operators
The docstring expansion here is consistent with the function’s updated capabilities.fiftyone/utils/eval/activitynet.py (5)
43-44: Docstring addition for custom metrics.

Specifying `custom_metrics` in the docstring is helpful. Adding an example usage in the docstring or reference documentation could benefit users.

55-55: New parameter in constructor.

The `custom_metrics=None` parameter is integrated cleanly, enhancing the flexibility of the evaluation config.

59-64: Passing custom_metrics to the superclass.

No issues found with forwarding the `custom_metrics` argument to the base constructor.

334-334: Added docstring for custom_metrics in results.

The added docstring line is consistent with the rest of the code's structure.

351-351: Optional parameter in results constructor.

The inclusion of `custom_metrics=None` maintains backwards compatibility and improves extensibility.

fiftyone/utils/eval/coco.py (6)
70-71: Accurate docstring for custom_metrics.

Defining that `custom_metrics` can be a list or dict sets clear expectations.

88-88: Constructor extension.

Adding `custom_metrics=None` aligns with the broader evaluation framework changes.

92-97: Forwarding custom metrics to the base class.

Forwarding `custom_metrics` to the superclass maintains consistency with other evaluation classes.

274-274: Documentation for custom metrics in results.

Providing a mention of `custom_metrics` in the docstring is beneficial for clarity and completeness.

291-291: Additional parameter for results.

Defining `custom_metrics=None` here is logically consistent with the config changes and ensures a uniform evaluation pipeline.

301-301: Maintain consistency in the constructor.

Passing `custom_metrics` to the superclass ensures that any custom evaluation logic is seamlessly integrated across COCO-style evaluations.

fiftyone/core/fields.py (3)
1624-1637: Ensure robust error handling in hex_to_int.

While converting from hex to int is correct, consider verifying the input format (e.g., presence of `#`). A malformed or too-short string could cause unexpected errors.

1639-1652: Check color range boundaries in int_to_hex.

The bitwise approach correctly extracts channels from the integer. Still, validating that the input integer is within the 24-bit range (0-16777215) might prevent silent overflow or negative shifts.

1670-1684: Consider out-of-range integer values in int_array_to_rgb.

This function cleanly decodes each channel by shifting bits. However, if the input array contains large or negative integers, slicing them into color components might lead to unintended results.
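To make the suggested guards concrete, standalone sketches of the two scalar conversions with validation (hypothetical versions, not the module's actual code):

```python
def hex_to_int(hex_str):
    """Converts '#RRGGBB' to a 24-bit int, validating the format first."""
    if not hex_str.startswith("#") or len(hex_str) != 7:
        raise ValueError("Expected a color string like '#RRGGBB'")
    return int(hex_str[1:], 16)

def int_to_hex(value):
    """Converts a 24-bit int back to '#RRGGBB', validating the range."""
    if not 0 <= value <= 0xFFFFFF:
        raise ValueError("Expected an integer in [0, 16777215]")
    return "#%02x%02x%02x" % ((value >> 16) & 255, (value >> 8) & 255, value & 255)

assert int_to_hex(hex_to_int("#ff00aa")) == "#ff00aa"
```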
fiftyone/core/odm/mixins.py (3)

673-674: Prevent misuse of nested paths in _clear_fields.

By calling `_remove_nested_paths(paths)` you ensure only top-level paths are cleared. This is a helpful safeguard against partial or conflicting modifications of nested fields.

721-722: Maintain consistent behavior for _delete_fields.

Using `_remove_nested_paths` here ensures a uniform approach when deleting fields, just like in `_clear_fields`. This consistency strengthens field integrity.

826-827: Avoid accidental removal of nested fields in _remove_dynamic_fields.

Applying `_remove_nested_paths(paths)` prevents inadvertently removing embedded dynamic fields hidden within larger paths. This is a sensible improvement.

app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (2)
469-469: Ensure type safety for formatCustomMetricRows.

You're calling `formatCustomMetricRows` with `evaluation` and `compareEvaluation` objects. Confirm these objects match the expected shape, especially because the function uses nested property access with `_.get`.

1784-1808: Check for missing operator URIs in formatCustomMetricRows.

You rely on `operatorUri` as an object key. If a user's custom metric array lacks this field, it could cause indexing issues. Also consider adding fallback logic or placeholders for metrics that are partially defined.
938-938
: LGTM: Efficient implementation oflast()
The implementation correctly uses slicing and
first()
to get the last element efficiently.
950-952
: LGTM: Clear docstring and implementationThe docstring clearly documents the return type and the implementation efficiently uses list comprehension with slicing.
964-966
: LGTM: Consistent withhead()
implementationThe implementation mirrors the
head()
method's approach, maintaining code consistency.
971-998
: LGTM: Well-documented examplesThe docstring provides comprehensive examples that clearly demonstrate the usage of the
one()
method with different scenarios.docs/source/release-notes.rst (2)
6-14: LGTM! Well-structured release notes for Teams v2.5.0.

The release notes are clear and concise, and they follow a consistent format. They effectively communicate the key changes and bug fixes.

16-141: LGTM! Comprehensive release notes for v1.3.0.

The release notes are well organized by component (App, SDK, Brain, etc.) and provide detailed information about new features, improvements, and bug fixes. The format is consistent and easy to follow.
Also includes any custom metrics from :attr:`custom_metrics`.
Be mindful of naming collisions.
Merging custom metrics into the existing metrics dict can cause name conflicts. Consider prefixing custom metric keys or verifying that names don't collide with built-in metrics:

```diff
-        metrics.update(self._get_custom_metrics())
+        for key, value in self._get_custom_metrics().items():
+            if key in metrics:
+                raise ValueError(
+                    f"Custom metric name '{key}' conflicts with a built-in metric"
+                )
+            metrics[key] = value
```
Committable suggestion skipped: line range outside the PR's diff.
```python
fields.extend(self.get_custom_metric_fields(samples, eval_key))
```
Guard against collisions in custom metric fields.
When extending the list of fields with `get_custom_metric_fields`, confirm that these fields don't conflict with existing or future detection fields in the dataset:

```diff
-        fields.extend(self.get_custom_metric_fields(samples, eval_key))
+        cm_fields = self.get_custom_metric_fields(samples, eval_key)
+        for f in cm_fields:
+            if f in fields:
+                raise ValueError(
+                    f"Custom metric field '{f}' conflicts with existing fields"
+                )
+        fields.extend(cm_fields)
```
```diff
-        field_name = ctx.params.get("field_name", None)
+        field_names = ctx.params.get("field_names", None) or []
 
-        field_prop = inputs.enum(
-            "field_name",
-            field_selector.values(),
+        field_choices = types.AutocompleteView(allow_duplicates=False)
+        for key in schema.keys():
+            if not any(key.startswith(f + ".") for f in field_names):
+                field_choices.add_choice(key, label=key)
+
+        field_prop = inputs.list(
+            "field_names",
+            types.String(),
             label="Frame field",
-            description="The frame field to clear",
-            view=field_selector,
+            description="The frame field(s) to clear",
+            required=True,
+            view=field_choices,
         )
 
-        field_name = ctx.params.get("field_name", None)
-        if field_name not in schema:
-            return
+        for field_name in field_names:
+            if field_name not in schema and "." not in field_name:
+                field_prop.invalid = True
+                field_prop.error_message = (
+                    f"Frame field '{field_name}' does not exist"
+                )
+                return
 
-        field = ctx.dataset.get_field(ctx.dataset._FRAMES_PREFIX + field_name)
-        if field is not None and field.read_only:
-            field_prop.invalid = True
-            field_prop.error_message = f"Frame field '{field_name}' is read-only"
+            field = ctx.dataset.get_field(ctx.dataset._FRAMES_PREFIX + field_name)
+            if field is not None and field.read_only:
+                field_prop.invalid = True
+                field_prop.error_message = (
+                    f"Frame field '{field_name}' is read-only"
+                )
+                return
```
🛠️ Refactor suggestion
Adopt consistent membership checks and consider deduplication
Similar to the "Clear sample field" handling, lines like `for key in schema.keys():` can be replaced with simpler iteration. Furthermore, this entire flow is nearly identical to the sample field variant; factoring out shared logic could reduce potential bugs and effort when updates are needed.
```diff
-        for key in schema.keys():
+        for key in schema:
```
"Expected one matching sample, but found %d matches" | ||
% len(view) |
🛠️ Refactor suggestion
Avoid a potentially expensive len(view)
Calling `len(view)` on a large view could be expensive because it must iterate over all matches. If you're only verifying that more than one match exists, you can streamline by catching the second match without iterating the entire view. For instance:

```python
try:
    next(matches)
    raise ValueError("Expected one matching sample, but found multiple matches")
except StopIteration:
    pass
```
Committable suggestion skipped: line range outside the PR's diff.
```diff
     sample_view = self[-1:].first()
 except ValueError:
     raise ValueError("%s is empty" % self.__class__.__name__)
 
-return self._make_sample(d)
+return fos.Sample.from_doc(sample_view._doc, dataset=self)
```
Refine exception handling for clarity
Instead of re-raising a new `ValueError` without context, capture the original exception and raise a new error chained from it. This makes debugging simpler:

```diff
 try:
     sample_view = self[-1:].first()
-except ValueError:
-    raise ValueError("%s is empty" % self.__class__.__name__)
+except ValueError as err:
+    raise ValueError("%s is empty" % self.__class__.__name__) from err
```
```python
if etau.is_str(type):
    type = etau.get_class(type)

operators = [op for op in operators if isinstance(op, type)]
```
Guard against potential arbitrary class loading
Dynamically loading a class via `etau.get_class(type)` can pose security risks if user-provided input is not sanitized or constrained. Consider limiting permissible modules or adding validation checks to avoid unintended code execution. Here's a possible approach that checks a known-safe whitelist of modules:

```diff
 if etau.is_str(type):
+    allowed_modules = ["fiftyone.operators"]
+    if not any(type.startswith(m + ".") for m in allowed_modules):
+        raise ValueError(f"Type `{type}` is not in the list of allowed modules")
     type = etau.get_class(type)
 
 operators = [op for op in operators if isinstance(op, type)]
```
```diff
@@ -88,10 +90,16 @@ def __init__(
         neg_label_field=None,
         expand_gt_hierarchy=True,
         expand_pred_hierarchy=False,
+        custom_metrics=None,
```
🛠️ Refactor suggestion
Maintain consistent parameter documentation for custom_metrics.

The code sets `custom_metrics` in the constructor and passes it to the superclass. However, the docstring mentions that it can be a list or dict, while the underlying usage seems to require a dict. Consider harmonizing these references and possibly validating the type if both forms are truly accepted.
Also applies to: 97-102
```python
view = self.match(expr)
matches = iter(view)

try:
    sample = next(matches)
except StopIteration:
    raise ValueError("No samples match the given expression")

if exact:
    try:
        next(matches)
        raise ValueError(
            "Expected one matching sample, but found %d matches"
            % len(view)
        )
    except StopIteration:
        pass

return sample
```
🛠️ Refactor suggestion
Add error context when raising exceptions
The error handling should preserve the context of the original error by using `raise ... from err` syntax. Apply this diff to improve error handling:

```diff
 try:
     sample = next(matches)
-except StopIteration:
-    raise ValueError("No samples match the given expression")
+except StopIteration as err:
+    raise ValueError("No samples match the given expression") from err
 
 if exact:
     try:
         next(matches)
         raise ValueError(
             "Expected one matching sample, but found %d matches"
             % len(view)
         )
     except StopIteration:
         pass
 
 return sample
```
```python
try:
    return next(iter(self))
except StopIteration:
    raise ValueError("%s is empty" % self.__class__.__name__)
```
🛠️ Refactor suggestion
Add error context when raising exceptions
The error handling should preserve the context of the original error by using `raise ... from err` syntax. Apply this diff:

```diff
 try:
     return next(iter(self))
-except StopIteration:
-    raise ValueError("%s is empty" % self.__class__.__name__)
+except StopIteration as err:
+    raise ValueError("%s is empty" % self.__class__.__name__) from err
```
Addressed merge conflicts in #5435.
What changes are proposed in this pull request?
Merging main back into develop after the 1.3.0 release.
How is this patch tested? If it is not, please explain why.
(Details)
Release Notes
Is this a user-facing change that should be mentioned in the release notes?
Give a description of this change to be included in the release notes for FiftyOne users.
(Details in 1-2 sentences. You can just refer to another PR with a description
if this PR is part of a larger change.)
What areas of FiftyOne does this PR affect?
fiftyone (Python library changes)

Summary by CodeRabbit
Based on the comprehensive summary of changes, here are the release notes:
Release Notes
New Features
Improvements
Bug Fixes
Performance Optimizations
Breaking Changes
`first`, `last`, `head`, `tail`, and `one` methods from `DatasetView`