
test tests #5426

Closed
wants to merge 11 commits into from

Conversation

@kaixi-wang (Contributor) commented Jan 23, 2025

What changes are proposed in this pull request?

(Please fill in changes proposed in this fix)

How is this patch tested? If it is not, please explain why.

(Details)

Release Notes

Is this a user-facing change that should be mentioned in the release notes?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release
    notes for FiftyOne users.

(Details in 1-2 sentences. You can just refer to another PR with a description
if this PR is part of a larger change.)

What areas of FiftyOne does this PR affect?

  • App: FiftyOne application changes
  • Build: Build and test infrastructure changes
  • Core: Core fiftyone Python library changes
  • Documentation: FiftyOne documentation changes
  • Other

Summary by CodeRabbit

  • Tests
    • Enhanced unit tests for dataset collection addition by ensuring unique dataset names.
    • Updated dataset instantiation to include unique names using uuid.
    • Improved assertions for clearer and more consistent field value retrieval in tests.
  • Bug Fixes
    • Added error handling in the sample insertion process to ensure all samples are accounted for.
  • Chores
    • Introduced logging for better tracking of potential issues during sample synchronization.

coderabbitai bot (Contributor) commented Jan 23, 2025

Walkthrough

The pull request introduces modifications to the test_add_collection_new_ids method in the DatasetTests class and updates the VideoTests class within the unit test files. Key changes include initializing Dataset instances with unique names generated using uuid.uuid4(), adding a save operation immediately after creation, updating the dataset cloning process to also use unique names, and modifying assertions to utilize the values() method for retrieving field contents. Additionally, the add_samples method in the Dataset class now includes error handling for sample insertion, and logging has been added to the _sync_docs_for_sample method in the FrameSingleton class.
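As a rough sketch of the naming pattern described in the walkthrough (the name prefixes here are illustrative, not the exact test code):

import uuid

import fiftyone as fo

# Give each test dataset a name that cannot collide with datasets
# left behind by previous or concurrent test runs
dataset = fo.Dataset(name=f"test-{uuid.uuid4()}")
dataset.save()  # persist immediately after creation

# Clones receive unique names for the same reason
dataset2 = dataset.clone(name=f"clone-{uuid.uuid4()}")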

Changes

  • tests/unittests/dataset_tests.py: Updated Dataset initialization to use uuid.uuid4() for unique names, added a save operation after creation, modified dataset cloning to include uuid.uuid4() in the name, and added print statements for sample outputs.
  • tests/unittests/video_tests.py: Updated Dataset initialization to use uuid.uuid4() for unique names in multiple places, ensured cloned datasets also have unique names, and updated assertions for clarity.
  • fiftyone/core/dataset.py: Enhanced the add_samples method with error handling for sample insertion, ensuring correct _id assignment from inserted IDs.
  • fiftyone/core/singletons.py: Added a logging import and a warning message in _sync_docs_for_sample for missing collection names.

Possibly related PRs

  • fix tagger e2e tests #4783: This PR modifies the instantiation of fo.Dataset() to include a unique name generated by uuid.uuid4(), similar to the changes in the main PR where unique names are assigned to datasets in the test_add_collection_new_ids method.
  • Fix COCO category IDs #4884: This PR includes changes to the handling of category IDs in COCO datasets, which may relate to dataset management and testing, although it does not directly modify the same functions as the main PR.
  • Execution store interface cleanup #5086: This PR enhances the SampleCollection class with new methods for managing execution stores, which could relate to dataset operations, but does not directly overlap with the changes made in the main PR.

Suggested reviewers

  • brimoor
  • ritch

Poem

🐰 A Test of Rabbit Might
Assertions dance, a coding delight,
values() method shines so bright.
From first to last, our data's clear,
With new IDs, no need to fear.
A test that hops with pure insight! 🧪



coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/unittests/dataset_tests.py (1)

10-10: Remove unused import

The uuid module is imported but never used in the code.

-import uuid
🧰 Tools
🪛 Ruff (0.8.2)

10-10: uuid imported but unused

Remove unused import: uuid

(F401)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e606113 and 2d71e70.

📒 Files selected for processing (1)
  • tests/unittests/dataset_tests.py (4 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/unittests/dataset_tests.py

10-10: uuid imported but unused

Remove unused import: uuid

(F401)

🔇 Additional comments (3)
tests/unittests/dataset_tests.py (3)

2573-2575: Ensure unique dataset names using datetime

Good practice to use datetime for generating unique dataset names and explicitly saving the dataset after creation.


2578-2578: Maintain consistency in dataset naming

Good practice to use the same datetime-based naming pattern for the cloned dataset.


2604-2605: Improve test assertions using values() method

Good refactor to use the values() method for retrieving field contents consistently.
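For context, values() retrieves a field's contents for all samples in a single aggregation, which is what makes the refactored assertions compact. A minimal sketch with an illustrative field name:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(filepath="/tmp/image1.jpg", foo="bar"),
        fo.Sample(filepath="/tmp/image2.jpg", foo="baz"),
    ]
)

# One call returns the field values for every sample, in order
assert dataset.values("foo") == ["bar", "baz"]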

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
fiftyone/core/singletons.py (1)

339-342: Enhance the warning message for better debugging.

The warning message is incomplete and could provide more context. Consider using f-strings for better readability.

-            logging.warning(
-                "Cannot sync frames to corresponding sample collection `%s` ",
-                collection_name,
-            )
+            logging.warning(
+                f"Cannot sync frames to corresponding sample collection `{collection_name}` "
+                f"as it was not found in {cls.__name__}._instances"
+            )
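For reference, the expanded message also works in lazy %-style form, which defers string interpolation until the record is actually emitted (a standalone sketch; the wrapper function is illustrative):

import logging

logger = logging.getLogger(__name__)

def _warn_missing_collection(cls, collection_name):
    # %-style arguments are only interpolated if the record is emitted,
    # which linters such as pylint prefer over f-strings in logging calls
    logger.warning(
        "Cannot sync frames to corresponding sample collection `%s` "
        "as it was not found in %s._instances",
        collection_name,
        cls.__name__,
    )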
fiftyone/core/dataset.py (2)

3377-3383: Improve error handling for MongoDB insert operation.

The error handling for insert_many has been enhanced to properly capture and handle MongoDB bulk write errors.

Consider adding a more descriptive error message that includes the number of samples that failed to insert:

-            msg = bwe.details["writeErrors"][0]["errmsg"]
-            raise ValueError(msg) from bwe
+            msg = bwe.details["writeErrors"][0]["errmsg"]
+            raise ValueError(
+                f"Failed to insert samples. First error: {msg}. "
+                f"Number of errors: {len(bwe.details['writeErrors'])}"
+            ) from bwe

3388-3390: Add validation to ensure all samples were inserted.

The code now validates that all samples were successfully inserted by comparing the number of inserted IDs to the number of samples.

Consider providing more context in the error message:

-            raise BulkWriteError("Failed to insert all samples")
+            raise BulkWriteError(
+                f"Failed to insert all samples: expected {len(samples)} "
+                f"but inserted {len(new_ids)}"
+            )
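Taken together, the two suggestions amount to the following pattern (a self-contained sketch against a plain pymongo collection; the helper name and messages are illustrative, not FiftyOne's actual internals):

from pymongo.errors import BulkWriteError

def insert_all_or_raise(collection, docs):
    """Inserts ``docs`` and returns the IDs that MongoDB assigned,
    raising if any document failed to insert."""
    try:
        result = collection.insert_many(docs)
    except BulkWriteError as bwe:
        errors = bwe.details["writeErrors"]
        raise ValueError(
            "Failed to insert samples. First error: %s. Number of errors: %d"
            % (errors[0]["errmsg"], len(errors))
        ) from bwe

    new_ids = result.inserted_ids
    if len(new_ids) != len(docs):
        # pymongo's BulkWriteError is constructed from a results dict, so
        # a plain RuntimeError is clearer for a simple count mismatch
        raise RuntimeError(
            "Failed to insert all samples: expected %d but inserted %d"
            % (len(docs), len(new_ids))
        )

    return new_ids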
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 89dbb2c and ee3f15f.

📒 Files selected for processing (3)
  • fiftyone/core/dataset.py (1 hunks)
  • fiftyone/core/singletons.py (2 hunks)
  • tests/unittests/dataset_tests.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unittests/dataset_tests.py
🔇 Additional comments (2)
fiftyone/core/singletons.py (1)

8-8: LGTM!

The logging import is appropriately placed with other standard library imports.

fiftyone/core/dataset.py (1)

3391-3396: LGTM: Sample document creation and video frame handling.

The code correctly:

  • Assigns new IDs to the sample documents
  • Creates backing documents for samples
  • Handles video frame saving for video samples

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
tests/unittests/dataset_tests.py (3)

2573-2577: Consider removing or converting print statements to a logger.
Printing datasets and samples is helpful for debugging, but using test assertions or logging might be more consistent with typical testing practices.


2586-2588: Prefer structured assertions over printing.
Relying on print statements alone to verify correctness might hide issues; assertions provide clearer test outcomes.


2608-2610: Improve test clarity by using assertions.
Instead of manually inspecting prints, consider verifying that dataset2 contains the expected samples, with or without fields, to ensure correctness.
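A hedged example of the direction the reviewer is suggesting, assuming the surrounding DatasetTests method and the dataset1/dataset2 variables from the test:

# Rather than printing samples for manual inspection:
#     print([s for s in dataset2.iter_samples()])
# assert the expected state directly:
self.assertEqual(len(dataset2), len(dataset1))
self.assertListEqual(
    dataset2.values("filepath"), dataset1.values("filepath")
)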

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ee3f15f and 793ea24.

📒 Files selected for processing (1)
  • tests/unittests/dataset_tests.py (3 hunks)
🔇 Additional comments (4)
tests/unittests/dataset_tests.py (4)

2580-2582: Unique dataset naming approach is acceptable.
Generating random names via uuid.uuid4() can minimize collisions and works well for ephemeral tests.


2602-2604: Fine usage of ephemeral dataset clones.
Cloning with unique UUID names avoids conflicts when running tests in parallel.


2605-2606: Looks good.
Capturing created_at and last_modified_at for subsequent comparisons is a standard approach in these tests.


2607-2607: Verify consistency of new IDs.
When adding a collection with new_ids=True, ensure any references are handled properly so downstream steps remain consistent.
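For reference, the behavior under test can be sketched as follows (dataset contents are illustrative; per the test, add_collection with new_ids=True copies the samples under freshly generated IDs):

import fiftyone as fo

src = fo.Dataset()
src.add_samples([fo.Sample(filepath="/tmp/image1.jpg")])

dst = fo.Dataset()
dst.add_collection(src, new_ids=True)

# Same samples, but the copies do not share IDs with the originals
assert len(dst) == len(src)
assert dst.first().id != src.first().id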

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/unittests/dataset_tests.py (1)

2576-2577: Consider removing debug print statements.

While print statements are helpful during development, they should be removed or replaced with proper logging for production code.

-        print("dataset1")
-        print([s for s in dataset1.iter_samples()])
-        print("exact clone") 
-        print([s for s in dataset.iter_samples()])
-        print("merged dataset1")
-        print([s for s in dataset.iter_samples()])
-        print("dataset2 exact clone")
-        print([s for s in dataset1.iter_samples()])
-        print("merge view (ds with no foo)")
-        print([s for s in dataset2.iter_samples()])

Also applies to: 2581-2582, 2586-2587, 2603-2604, 2608-2609

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 793ea24 and 6853b32.

📒 Files selected for processing (1)
  • tests/unittests/dataset_tests.py (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.11)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
  • GitHub Check: test / test-app
  • GitHub Check: lint / eslint
  • GitHub Check: e2e / test-e2e
  • GitHub Check: build / build
  • GitHub Check: build / changes
  • GitHub Check: build
🔇 Additional comments (4)
tests/unittests/dataset_tests.py (4)

10-10: Import uuid module for generating unique identifiers.

The addition of the uuid module import is appropriate for generating unique dataset names to prevent collisions in tests.


2573-2575: Use uuid for unique dataset names and ensure persistence.

Good practice to:

  1. Use uuid.uuid4() for guaranteed unique dataset names
  2. Call save() immediately after creation to persist the dataset

2580-2580: Use uuid for clone names to prevent collisions.

Good practice to use uuid.uuid4() when cloning datasets to ensure unique names.

Also applies to: 2602-2602


2605-2617: Test logic for dataset cloning and merging.

The test logic appropriately verifies the following (see the sketch after this list):

  1. Dataset creation timestamps
  2. Last modified timestamps
  3. Field values and counts
  4. Dataset uniqueness
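
A sketch of timestamp-based assertions along these lines, assuming FiftyOne's built-in read-only created_at and last_modified_at sample fields (the foo field is illustrative):

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples([fo.Sample(filepath="/tmp/image1.jpg")])

created = dataset.values("created_at")
modified = dataset.values("last_modified_at")

# Editing a sample should advance last_modified_at but not created_at
sample = dataset.first()
sample["foo"] = "bar"
sample.save()

assert dataset.values("created_at") == created
assert all(
    m2 >= m1
    for m1, m2 in zip(modified, dataset.values("last_modified_at"))
)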

Base automatically changed from release/v1.3.0 to main January 24, 2025 20:00
@kaixi-wang kaixi-wang closed this Feb 20, 2025
@kaixi-wang kaixi-wang deleted the kacey/test-add-collection-test branch February 20, 2025 05:15