-
Notifications
You must be signed in to change notification settings - Fork 603
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding support for remotely-sourced zoo models #4786
Conversation
WalkthroughThe pull request introduces significant enhancements to the FiftyOne Model Zoo, focusing on the management of remote model sources. Key updates include new CLI commands for listing, registering, and deleting remote sources, as well as modifications to existing commands to accommodate remote models. Documentation has been updated to reflect these changes, providing clearer guidance on using the Model Zoo and its API. Additionally, the version requirement for the Changes
Possibly related PRs
Suggested labels
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
33f9685
to
592f183
Compare
ab0f212
to
80d632d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range and nitpick comments (1)
fiftyone/core/cli.py (1)
Line range hint
2293-2320
: Looks good, but consider using a ternary operatorAdding the "remote" column to indicate if a model is remotely-sourced looks great and fits well with the goal of supporting remote models.
As suggested by the static analysis tool, you could simplify the code by replacing the
if
-else
block with a ternary operator:is_remote = "✓" if isinstance(model, fozm.RemoteZooModel) else ""This is a non-blocking suggestion, feel free to keep the current code if you prefer it for clarity.
Tools
Ruff
2303-2306: Use ternary operator
is_remote = "✓" if isinstance(model, fozm.RemoteZooModel) else ""
instead ofif
-else
-blockReplace
if
-else
-block withis_remote = "✓" if isinstance(model, fozm.RemoteZooModel) else ""
(SIM108)
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files selected for processing (9)
- docs/scripts/make_model_zoo_docs.py (1 hunks)
- docs/source/cli/index.rst (9 hunks)
- docs/source/user_guide/model_zoo/api.rst (0 hunks)
- docs/source/user_guide/model_zoo/design.rst (1 hunks)
- docs/source/user_guide/model_zoo/index.rst (6 hunks)
- docs/source/user_guide/model_zoo/remote.rst (1 hunks)
- fiftyone/core/cli.py (12 hunks)
- fiftyone/zoo/models/init.py (18 hunks)
- setup.py (1 hunks)
Files not reviewed due to no reviewable changes (1)
- docs/source/user_guide/model_zoo/api.rst
Files skipped from review due to trivial changes (1)
- docs/scripts/make_model_zoo_docs.py
Additional context used
Ruff
fiftyone/zoo/models/__init__.py
275-278: Use ternary operator
name = model_name if model_name is not None else name_or_url
instead ofif
-else
-blockReplace
if
-else
-block withname = model_name if model_name is not None else name_or_url
(SIM108)
648-651: Use ternary operator
repo = GitHubRepository(url_or_gh_repo) if "github" in url else None
instead ofif
-else
-blockReplace
if
-else
-block withrepo = GitHubRepository(url_or_gh_repo) if "github" in url else None
(SIM108)
754-754: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
762-762: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
fiftyone/core/cli.py
2303-2306: Use ternary operator
is_remote = "✓" if isinstance(model, fozm.RemoteZooModel) else ""
instead ofif
-else
-blockReplace
if
-else
-block withis_remote = "✓" if isinstance(model, fozm.RemoteZooModel) else ""
(SIM108)
Additional comments not posted (38)
setup.py (1)
78-78
: Verify compatibility with the updatedvoxel51-eta
version.Updating the
voxel51-eta
version requirement is a good change. However, please ensure that:
- The codebase has been thoroughly tested with version 0.13.0 of
voxel51-eta
.- Any necessary changes have been made to leverage new features or updates in
voxel51-eta
0.13.0.- There are no breaking changes in
voxel51-eta
0.13.0 that could negatively impact the functionality of this project.You can verify the compatibility by running the following commands:
Also, review the changelog or release notes of
voxel51-eta
0.13.0 to identify any breaking changes or updates that may require corresponding changes in this codebase.Verification successful
Version update confirmed, no direct usage found
The update to
voxel51-eta>=0.13.0,<0.14
insetup.py
is confirmed. However, no direct imports or usage ofvoxel51.eta
were found in the codebase. This suggests thatvoxel51-eta
might be used as a dependency for other packages or modules.To ensure compatibility:
- Thoroughly test the project with the new version of
voxel51-eta
.- Review the changelog for
voxel51-eta
0.13.0 to identify any breaking changes or new features that might affect your project indirectly.- Pay attention to any warnings or deprecation notices during testing.
The lack of direct usage reduces the risk of incompatibility, but careful testing is still recommended.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Verify that the codebase imports and uses `voxel51.eta` correctly rg --type python -A 5 $'import voxel51.eta' # Check if there are any references to deprecated or removed features from `voxel51.eta` rg --type python -i $'voxel51.eta'Length of output: 140
Script:
#!/bin/bash # Find Python files and search for voxel51.eta imports fd -e py -x grep -Hn "import.*voxel51\.eta" {} + # Find Python files and search for any mentions of voxel51.eta fd -e py -x grep -Hn "voxel51\.eta" {} + # Check for any potential usage of voxel51-eta in other file types (e.g., configuration files) grep -r "voxel51-eta" .Length of output: 26798
docs/source/user_guide/model_zoo/index.rst (5)
8-13
: LGTM!The introduction provides a clear and concise overview of the FiftyOne Model Zoo's capabilities, highlighting its key features and supported model sources. The content aligns well with the AI-generated summary.
29-45
: LGTM!The "Built-in models" section effectively focuses on the pre-trained models available in the Model Zoo. The custom button provides a clear call-to-action for users to explore the available models, and the note about custom models is a relevant and informative addition. The content aligns well with the AI-generated summary.
46-65
: LGTM!The addition of the "Remotely-sourced models" and "Model interface" sections enhances the documentation by introducing key aspects of the Model Zoo's functionality and design. The content aligns well with the AI-generated summary, and the custom buttons provide clear navigation to pages with more detailed information.
70-75
: LGTM!The "API reference" section effectively informs users about the dual access methods (Python library and CLI) for managing models in the Model Zoo. The custom button provides a clear link to the API reference page for more detailed information. The content aligns well with the AI-generated summary.
Line range hint
88-239
: LGTM!The "Basic recipe" section provides clear and comprehensive guidance on using the Model Zoo for prediction, embeddings, and logits. The code samples and explanations are well-structured and easy to follow. The content aligns well with the AI-generated summary, particularly regarding the improved instructions on checking for logits and storing them during predictions.
docs/source/user_guide/model_zoo/design.rst (7)
1-10
: LGTM!The introduction provides a clear and concise overview of the Model Zoo design and the Model interface. It highlights the key aspects of the Model class, such as providing a common interface for loading models and generating predictions with defined input and output data formats, which is important for consistency and ease of use.
12-25
: Informative note for custom model integration!The note provides useful information for users who want to integrate their custom models with FiftyOne. It encourages implementing the Model interface to leverage built-in methods and mentions the TorchImageModel class as a utility for deploying models in custom frameworks, which is helpful for users.
27-201
: Comprehensive explanation of the prediction process!The code segment provides a detailed explanation of the prediction process using the Model interface. The code examples for image and video models, along with the description of the predict() method's input and output formats, provide clarity on how to use the interface correctly. The mentions of the context manager convention, predict_all() method for batching, and using DataLoaders for PyTorch models are helpful for users to leverage the full capabilities of the Model interface.
203-234
: Clear explanation of embeddings computation!The code segment provides a clear explanation of how models can expose embeddings computation capability using the EmbeddingsMixin. The similarity to the predict() method in terms of usage and the convention of returning a numpy array make it easy for users to understand and use the embed() method. The note about embedding shapes and the mention of the embed_all() method for batching provide flexibility and efficiency for embeddings computation.
236-249
: Clear explanation of logits computation!The code segment provides a clear explanation of how models can expose logits computation capability using the LogitsMixin. The mention of the store_logits property and how logits are stored in the Label instances during inference provides clarity on how users can access and utilize the logits generated by the model.
251-349
: Detailed explanation of using custom models!The code segment provides a detailed explanation of how to use custom models, particularly Torch models, with FiftyOne. The example of loading a pretrained model from torchvision and the explanation of the TorchImageModelConfig class and its parameters provide a clear understanding of the process. The descriptions of model loading, preprocessing options, output processors, and the embeddings_layer parameter cover all the essential aspects of integrating custom models with FiftyOne, making it easier for users to leverage their own models.
351-360
: Useful information on registering custom models!The note provides useful information for users who want to register and reuse their custom models. The ability to register custom models under a chosen name allows for easy reference and reusability. The example code snippet clearly demonstrates how to load and use a registered custom model, making it convenient for users to leverage their custom models across different projects.
docs/source/user_guide/model_zoo/remote.rst (5)
1-194
: Excellent documentation!The introduction and usage sections provide a comprehensive guide on working with remotely-sourced zoo models. The instructions are clear, well-structured, and easy to follow. The inclusion of code snippets for both Python and CLI usage is particularly helpful.
195-211
: Great overview of creating remotely-sourced models!The section provides a clear and concise overview of the components required for creating a remote source of models. The directory structure and required files are well-explained without overwhelming the reader with too much detail.
212-319
: Comprehensive description of the manifest.json file!The section provides a detailed and informative description of the
manifest.json
file and its fields. The table with field descriptions is well-structured and easy to understand. The inclusion of an examplemanifest.json
file is a great addition to help users grasp the structure and content of the file.
320-354
: Clear explanation of the download_model() function!The section provides a concise and informative description of the
download_model()
function, its signature, and its purpose. The explanation of when the function is called under-the-hood is particularly helpful for understanding its role in the overall process.
355-412
: Excellent documentation of the load_model() function!The section provides a comprehensive and well-structured explanation of the
load_model()
function, its signature, and its purpose. The mention of optional keyword arguments for customized loading is a great addition, as it highlights the flexibility of the function. The reference to an example remote model source is also a valuable resource for users who want to see a real-world implementation.fiftyone/zoo/models/__init__.py (10)
Line range hint
35-69
: LGTM!The addition of the
source
parameter to filter models based on registered remote sources is a useful enhancement. The implementation looks good.
Line range hint
72-88
: Looks good!The function has been updated to correctly handle the new
source
parameter and filter the manifest accordingly. The implementation is straightforward and easy to follow.
Line range hint
124-159
: Great work!The updates to the
download_zoo_model
function provide more flexibility in specifying the model to download by accepting either a model name or a remote source URL. The added note about using a GitHub personal access token for private repositories is also helpful for users.The implementation looks solid and handles the different scenarios correctly.
Line range hint
219-308
: Excellent updates!The enhancements to the
load_zoo_model
function provide more flexibility by allowing users to specify either a model name or a remote source URL. The added note about using a GitHub personal access token for private repositories is a helpful addition.The implementation handles various scenarios, such as caching loaded models, downloading models if necessary, installing requirements, and loading remote models. The code is well-structured and easy to follow.
Great job on these improvements!
363-370
: New function looks good!The addition of the
list_zoo_model_sources
function is a helpful utility for users to easily retrieve a list of all registered remote model sources. The implementation is straightforward and leverages the existing_load_zoo_models_manifest
function to get the necessary data.Nice work on adding this functionality!
373-393
: Great addition!The
register_zoo_model_source
function is a valuable addition that allows users to easily register a new remote model source. The flexibility in accepting various formats for theurl_or_gh_repo
parameter, such as GitHub repo URLs, refs, or public archive URLs, makes it convenient for users to specify the remote source.The inclusion of the
overwrite
flag is a good touch, giving users control over whether to overwrite existing files during the registration process.The note about using a GitHub personal access token for private repositories is a helpful reminder for users.
Overall, this function is a great enhancement to the FiftyOne Model Zoo functionality.
396-421
: Solid implementation!The
delete_zoo_model_source
function is a useful addition that allows users to remove a registered remote source and its associated downloaded models. The flexibility in accepting various formats for theurl_or_gh_repo
parameter, such as GitHub repo URLs or refs, makes it convenient for users to specify the remote source to delete.The function handles different scenarios appropriately:
- If the manifest is found and its directory is not the top-level model zoo directory, it deletes the directory.
- If the manifest is found but its directory is the top-level model zoo directory, it logs a warning message to prevent accidental deletion.
- If the manifest is not found, it raises a
ValueError
with a clear message.The implementation is clean and easy to understand.
Great job on adding this functionality to manage remote sources in the FiftyOne Model Zoo!
497-504
: Nice addition!The
RemoteZooModel
class is a good addition to represent remotely-sourced models in the FiftyOne Model Zoo. By inheriting from theZooModel
class, it inherits all the necessary attributes and methods.The override of the
__init__
method is clean and handles the case when themanager
attribute isNone
by creating a defaultRemoteModelManagerConfig
instance with the model name. This ensures that remotely-sourced models have a manager configuration even if one is not explicitly provided.The class is well-defined and fits nicely into the existing model zoo architecture.
Great work on adding this class to support remotely-sourced models!
506-510
: Looks good!The
RemoteModelManagerConfig
class is a straightforward addition to represent the configuration for a remote model manager. By inheriting from theetam.ModelManagerConfig
class, it inherits the necessary functionality for a model manager configuration.The override of the
__init__
method is simple and clear. It calls the superclass constructor with the provided dictionary and then extracts themodel_name
using theparse_string
method, assigning it to themodel_name
attribute. This ensures that the model name is easily accessible as a separate attribute.The class is well-defined and serves its purpose of storing the configuration for a remote model manager.
Nice work on adding this class to support the configuration of remote model managers!
512-515
: Good implementation!The
RemoteModelManager
class is a nice addition to handle thedocs/source/cli/index.rst (4)
2560-2560
: LGTM!The addition of the
list-sources
command to list remote zoo model sources is a valuable enhancement that aligns with the PR objectives.
2572-2583
: LGTM!The addition of the
list-sources
,register-source
, anddelete-source
commands to manage remote zoo model sources is a valuable enhancement that aligns with the PR objectives.
Line range hint
2590-2635
: LGTM!The addition of the
--source
argument to thefiftyone zoo models list
command for listing models from a specific remote source is a valuable enhancement that aligns with the PR objectives.
2744-2789
: LGTM!The updates to the
fiftyone zoo models download
command to support downloading remotely-sourced zoo models, along with the addition of the--model-name
argument, are valuable enhancements that align with the PR objectives.fiftyone/core/cli.py (6)
2202-2210
: LGTM!The new subcommands for managing remote model sources look good.
Line range hint
2218-2235
: Looks good!The updated docstring and new
--source
argument for filtering models by remote source look great.
2258-2263
: LGTM!Adding the
source
argument to handle the--source
CLI option looks good.
2269-2275
: Looks good!Extracting the
tags
andsource
arguments, and splitting thetags
on commas, looks correct and consistent with the CLI options.
2279-2287
: LGTM!Retrieving the filtered models using
fozm._list_zoo_models()
with thetags
andsource
arguments, and passing the result to_print_zoo_models_list()
, looks great.
2480-2536
: LGTM!The changes to support downloading remotely-sourced zoo models look great:
- The updated docstring clearly explains the supported formats for specifying remote models.
- Renaming the argument to
name_or_url
and updating the help text matches the new behavior.- The
--model-name
argument is a good addition to handle remote sources with multiple models.- Calling
fozm.download_zoo_model()
with thename_or_url
andmodel_name
arguments is the right approach.Overall, this looks like a solid implementation of the remote model downloading feature.
Reviewed live w/ ML team |
Adds support for loading zoo models whose definitions are stored in a GitHub repository or publicly accessible URL! 🎉
Example datasets published in this format:
Notes
Requires voxel51/eta#633
Usage
Creating remotely-sourced models
This is extensively documented in the PR, but the basic ingredients to defining a remotely-sourced zoo model are a
manifest.json
file:and an
__init__.py
file that containsdownload_model()
andload_model()
methods:Summary by CodeRabbit
New Features
list-sources
,register-source
, anddelete-source
.Documentation
Bug Fixes