v0.4.0
Changes
General
- Loaders for Train and Evaluate now have the same format
- The functions above have an identical interface for both PyTorch and Sklearn
Estimators
- Fixed: Model Saving & Loading
- loaded models can continue to be trained
- when loading a model, you can override its previous config at initialization, such as n_epoch and lr
- optimizer state dict is correctly saved and loaded
- optimizer state is moved to cpu or gpu depending on the setup (catalyst doesn't do it on its own, which caused issues when evaluating a loaded model)
- Added dummy estimators for loading (otherwise all estimators have load and save)
- load_estimator() for torch
- load_sklearn_estimator() for sklearn
- Reworked how the models are initialized
- upon calling the estimator, ex: estimator = CNN3d(loaders=loaders)
- before: when training the model, upon estimator.train
- model initialization now requires loaders to be provided
- All self vars in ModelConfig() get recorded in tracking by default
- Added options in ModelConfig()
- lr and min_lr - learning rate parameters are no longer hard-coded
- device - sets a specific device to run the models on, either cpu or cuda
- Added sklearn_backend and torch_backend to be used by all estimators
- sklearn-based estimators have a structure close to torch-based
- renamed: pytorch_estimator -> torch_backend
- cleared up variable name conventions throughout
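The optimizer-state fix listed under Model Saving & Loading can be pictured with a short sketch. This is not Sapsan's code: the helper move_state_to_device is a hypothetical name, and the only assumption is that the saved state is a nested dict or list whose tensor-like leaves expose a .to(device) method, as PyTorch tensors do. A FakeTensor stand-in keeps the example runnable without PyTorch.

```python
def move_state_to_device(state, device):
    """Recursively move tensor-like leaves of a nested state to `device`."""
    if isinstance(state, dict):
        return {key: move_state_to_device(value, device) for key, value in state.items()}
    if isinstance(state, (list, tuple)):
        return type(state)(move_state_to_device(value, device) for value in state)
    if hasattr(state, "to"):
        # Tensor-like leaf: ask it to relocate itself.
        return state.to(device)
    # Plain values (ints, floats, strings) are returned unchanged.
    return state


class FakeTensor:
    """Stand-in for a tensor (assumption: only .device and .to() matter here)."""

    def __init__(self, device="cpu"):
        self.device = device

    def to(self, device):
        return FakeTensor(device)


# State saved on a GPU machine, then loaded on a CPU-only one.
saved_state = {
    "state": {0: {"step": 10, "exp_avg": FakeTensor("cuda")}},
    "param_groups": [{"lr": 1e-3}],
}
cpu_state = move_state_to_device(saved_state, "cpu")
```

The helper returns a new structure rather than mutating the input, so the saved state stays intact if the move has to be repeated for a different device.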
Evaluation
- Evaluate and Data loader accept data without target
- useful when there is no ground truth to compare to
- will still output pdf, cdf, and spatial, without comparison metrics
- Evaluate.run() now outputs a dict of "pred_cube" and "target_cube" (if the latter is provided)
- PDF and CDF plots are now combined under a single figure
- recorded as 'pdf_cdf.png' in MLflow
- Fixed: definition of n_output_channel in Evaluate()
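Because Evaluate.run() now returns a dict in which "target_cube" appears only when a target was supplied, downstream code can branch on that key. The sketch below uses plain dicts as stand-ins for the returned value. The summarize helper and the flat lists used as cubes are illustrative assumptions, not part of Sapsan.

```python
def summarize(results):
    """Describe an evaluation result dict whose target is optional."""
    pred = results["pred_cube"]
    target = results.get("target_cube")  # None when no ground truth was given
    summary = {"n_pred": len(pred), "has_target": target is not None}
    if target is not None:
        # Mean absolute error over flat lists of values.
        summary["mae"] = sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)
    return summary


# Stand-ins for the dict returned with and without a target.
with_target = summarize({"pred_cube": [1.0, 2.0, 4.0], "target_cube": [1.0, 3.0, 2.0]})
without_target = summarize({"pred_cube": [1.0, 2.0, 4.0]})
```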
Command Line Interface (CLI)
- Added new option: sapsan create --ddp, which copies torch_backend.py
- gives ability to customize Catalyst Runner
- adjust DDP settings based on the linked Catalyst DDP tutorial in the Wiki
- will be useful when running on HPC
- refer to Parallel GPU Training on the Wiki for more details
- Fixed: CLI click initialization
Graphical User Interface (GUI)
- Up to date with Streamlit 0.87.0
- PDF and CDF plots are now shown as well
- Fixed: data loading issue related to train_fraction
MLflow
- Evaluate runs will be nested under the most recent train run
- significantly aids organization
- Added estimator.model.forward() to be recorded by MLflow (if torch is used)
Plotting
- Plotting routines return an Axes object
- All parameters are now set on the Axes instead of plt, which allows individual tweaking after return
- figsize and ax arguments added to most plotting routines
- useful if you create a figure and subplots outside of the plotting routines
- Universal plotting params were expanded and made easily accessible through plot_params()
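The pattern behind the Axes-based changes can be sketched with plain matplotlib. The plot_line function is a hypothetical stand-in for a plotting routine, not one of Sapsan's own. It only shows the convention described above: accept optional ax and figsize arguments, draw on the given Axes, and return it.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_line(values, ax=None, figsize=(6, 4)):
    """Hypothetical routine: draw on `ax` if given, otherwise create one."""
    if ax is None:
        _, ax = plt.subplots(figsize=figsize)
    ax.plot(range(len(values)), values)
    ax.set_xlabel("index")
    return ax


# Create the figure and subplots outside, then pass each Axes in.
fig, (left, right) = plt.subplots(1, 2, figsize=(10, 4))
ax_left = plot_line([1, 2, 3], ax=left)
ax_right = plot_line([3, 2, 1], ax=right)

# Tweak an individual panel after the routine returns.
ax_left.set_title("rising")
ax_right.set_yscale("log")
```

Returning the Axes rather than calling plt functions keeps each panel independent, so a change to one subplot does not leak into whichever figure happens to be current.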
Other
- Edited the examples, tests, and estimator template to reflect model initialization changes
- Requirements Updated:
- streamlit >= 0.87.0
- plotly >= 5.2.0
- tornado >= 6.1.0
- notebook >= 6.4.3 (fixes security vulnerabilities)
- Added a few data_loader warnings
- Cleaned up debug prints throughout the code
- Expanded code comments