Updated snippets for exporting dataset to use load_dataset
#4545
Walkthrough
These changes enhance the error handling and functionality of the FiftyOne dataset management system.
Sequence Diagram(s)
sequenceDiagram
    participant User
    participant FiftyOne
    User->>FiftyOne: Call load_dataset(name)
    alt Dataset exists
        FiftyOne->>User: Return existing dataset
    else Dataset does not exist
        alt Creation allowed
            FiftyOne->>User: Create and return new dataset
        else Creation not allowed
            FiftyOne->>User: Raise DatasetNotFoundError
        end
    end
Thanks @manivoxel51 and @allenleetc for helping me to debug my export/import snippet 💯
Hmm I think this is partially just user learning curve. On the other side of the coin, if the dataset did not exist then calling The intention is that it's not a fully executable code snippet, but that Perhaps there is another way to denote that we aren't dictating how to load the dataset or view in the snippet but the focus is on what happens afterwards.
@swheaton: since the tutorial is about exporting a dataset, I think it is safe to assume that the dataset exists. The same code snippet is also shown when clicking the export button in FiftyOne.
fiftyone/core/dataset.py
Outdated
try:
    return Dataset(name, _create=False)
except MissingDatasetError as ex:
    if create_if_missing:
        return Dataset(name)
    else:
        raise ex
I'd use `fo.dataset_exists()` over try-except.
Good idea, done!
I disagree with @sashankaryal's comment. While it might look nicer, this causes an extra database lookup for the happy path.
We should assume it exists and fall back to creating rather than checking for existence.
Hmm, how much do we want to optimize database lookup? I don't think the users will be calling `load_dataset` very often, probably once at the beginning of the script.
you might be surprised. especially when going through pymongo proxy (dont wanna get into it lol). where each call to mongodb must be absolutely necessary haha.
I see, @sashankaryal what do you think?
Chatted with @sashankaryal, let's go with @swheaton's suggestion
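The trade-off discussed in this thread can be made concrete by counting database round trips. The sketch below is only an illustration under stated assumptions: `FakeDB` is a hypothetical stand-in that counts lookups, and the two loader functions are simplified versions of the "check first" and "try first" approaches, reusing the `MissingDatasetError` and `create_if_missing` names from the outdated snippet above.

```python
class MissingDatasetError(Exception):
    """Raised when the requested dataset is not in the database."""


class FakeDB:
    """Hypothetical stand-in for the database that counts round trips."""

    def __init__(self, names):
        self.names = set(names)
        self.lookups = 0

    def exists(self, name):
        self.lookups += 1
        return name in self.names

    def fetch(self, name):
        self.lookups += 1
        if name not in self.names:
            raise MissingDatasetError(name)
        return {"name": name}

    def create(self, name):
        self.names.add(name)
        return {"name": name}


def load_check_first(db, name, create_if_missing=False):
    # Existence check, then fetch: two lookups when the dataset exists
    if db.exists(name):
        return db.fetch(name)
    if create_if_missing:
        return db.create(name)
    raise MissingDatasetError(name)


def load_try_first(db, name, create_if_missing=False):
    # Assume the dataset exists: one lookup when it does
    try:
        return db.fetch(name)
    except MissingDatasetError:
        if create_if_missing:
            return db.create(name)
        raise
```

In this toy model the happy path costs two lookups with `load_check_first` and one with `load_try_first`, which is the extra lookup the reviewers are weighing.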
21f7a8c to a4fa2b9
@sashankaryal @benjaminpkane @swheaton for the UI path to generate code to export dataset, do you know where it lives? Somehow I can't find it in either
487a5fc to 87b446d
Actionable comments posted: 0
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files selected for processing (3)
- fiftyone/core/dataset.py (4 hunks)
- tests/unittests/dataset_tests.py (1 hunks)
- tests/unittests/utils_tests.py (5 hunks)
Files skipped from review as they are similar to previous changes (2)
- fiftyone/core/dataset.py
- tests/unittests/dataset_tests.py
Additional comments not posted (3)
tests/unittests/utils_tests.py (3)
462-466: Ensure consistent mocking of `dataset_exists`. The `dataset_exists` mock patch is correctly added to the `test_load_dataset_by_id` method. Ensure that the mock patch is consistently applied across all relevant test methods.
488-492: Ensure consistent mocking of `dataset_exists`. The `dataset_exists` mock patch is correctly added to the `test_load_dataset_by_alt_id` method. Ensure that the mock patch is consistently applied across all relevant test methods.
513-515: Ensure consistent mocking of `dataset_exists`. The `dataset_exists` mock patch is correctly added to the `test_load_dataset_by_name` method. Ensure that the mock patch is consistently applied across all relevant test methods.
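For readers unfamiliar with the pattern these comments refer to, here is a minimal sketch of patching an existence check with `unittest.mock`. It does not reproduce the tests in `utils_tests.py`: `Backend`, `describe`, and `run_with_mock` are hypothetical names used only to show how a `dataset_exists` lookup can be replaced so the test never touches a database.

```python
from unittest import mock


class Backend:
    """Hypothetical stand-in for the module that owns `dataset_exists`."""

    @staticmethod
    def dataset_exists(name):
        raise RuntimeError("would query a real database")


def describe(name):
    """Code under test: consults `dataset_exists` before doing anything else."""
    return "found" if Backend.dataset_exists(name) else "missing"


def run_with_mock(name, exists):
    # Replace the lookup for the duration of the call only
    with mock.patch.object(
        Backend, "dataset_exists", return_value=exists
    ) as patched:
        result = describe(name)

    # The mock records how it was called, which the test can verify
    patched.assert_called_once_with(name)
    return result
```

Applying the same patch in every test that reaches the lookup keeps the tests independent of database state, which is the consistency the bot is asking for.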
disagree with review comment
@minhtuev so that actually is coming from the Teams app which is in
Actionable comments posted: 0
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files selected for processing (1)
- fiftyone/core/dataset.py (4 hunks)
Files skipped from review as they are similar to previous changes (1)
- fiftyone/core/dataset.py
0c50588 to 32b93af
Actionable comments posted: 1
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files selected for processing (4)
- fiftyone/__public__.py (1 hunks)
- fiftyone/core/dataset.py (4 hunks)
- fiftyone/migrations/runner.py (1 hunks)
- tests/unittests/dataset_tests.py (1 hunks)
Files skipped from review due to trivial changes (1)
- fiftyone/core/dataset.py
Files skipped from review as they are similar to previous changes (1)
- tests/unittests/dataset_tests.py
Additional context used
Ruff
fiftyone/__public__.py
36-36: `.core.dataset.DatasetNotFoundError` imported but unused. Remove unused import (F401)
Additional comments not posted (1)
fiftyone/migrations/runner.py (1)
62-62: Improved error handling. Raising `DatasetNotFoundError` instead of `ValueError` improves error specificity and clarity.
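One way to raise a more specific error without breaking callers that already catch `ValueError` is to derive the new exception from it. The sketch below assumes that design for illustration; whether the real `DatasetNotFoundError` subclasses `ValueError` is not stated in this thread, and `get_dataset_version` is a hypothetical helper, not the code in `runner.py`.

```python
class DatasetNotFoundError(ValueError):
    """Specific error for a missing dataset (assumed to extend ValueError)."""

    def __init__(self, name):
        self.name = name
        super().__init__(f"Dataset '{name}' not found")


def get_dataset_version(versions, name):
    """Hypothetical lookup that raises the specific error when `name` is absent."""
    if name not in versions:
        raise DatasetNotFoundError(name)
    return versions[name]
```

Under this assumption, existing `except ValueError` handlers keep working, while new code can catch `DatasetNotFoundError` and read the missing name from the exception.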
32b93af to df3333d
Actionable comments posted: 1
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files selected for processing (4)
- fiftyone/__public__.py (1 hunks)
- fiftyone/core/dataset.py (4 hunks)
- fiftyone/migrations/runner.py (1 hunks)
- tests/unittests/dataset_tests.py (1 hunks)
Files skipped from review as they are similar to previous changes (3)
- fiftyone/core/dataset.py
- fiftyone/migrations/runner.py
- tests/unittests/dataset_tests.py
Additional context used
Ruff
fiftyone/__public__.py
36-36: `.core.dataset.DatasetNotFoundError` imported but unused. Remove unused import (F401)
@@ -33,6 +33,7 @@
from .core.config import AppConfig
from .core.dataset import (
    Dataset,
    DatasetNotFoundError,
Remove unused import.
The `DatasetNotFoundError` import is unused in this file and should be removed to avoid clutter.
- DatasetNotFoundError,
New `load_dataset` LGTM!!
great thanks
What changes are proposed in this pull request?
- The dataset export snippets use `fo.Dataset(...)`, which is incorrect because this is the syntax for creating a new dataset. To load an existing dataset, use `fo.load_dataset(...)` instead.
Example snippet:
Error:
Correct snippet should be:
- Adds `create_if_necessary` to `load_dataset` to return an empty dataset if none currently exists.
How is this patch tested? If it is not, please explain why.
- Passing `create_if_necessary` to `load_dataset` should create a new dataset if one does not already exist.
Testing:
Release Notes
Is this a user-facing change that should be mentioned in the release notes?
(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)
What areas of FiftyOne does this PR affect?
- `fiftyone` Python library changes
Summary by CodeRabbit
- Introduced `DatasetNotFoundError` for missing datasets.
- Replaced `ValueError` with `DatasetNotFoundError` when datasets are not found.