v0.4.0

@github-actions github-actions released this 27 Aug 23:52
· 197 commits to master since this release
59d8f63

Changes

General

  • Loaders for Train and Evaluate now have the same format
  • The functions above have an identical interface for both PyTorch and Sklearn

Estimators

  • Fixed: Model Saving & Loading

    • loaded models can continue to be trained
    • when a model is loaded, its saved config (e.g., n_epoch, lr) can be redefined at initialization
    • optimizer dict state is correctly saved and loaded
    • optimizer state is moved to cpu or gpu depending on the setup (catalyst doesn't do this on its own, which caused issues when evaluating a loaded model)
    • Added dummy estimators for loading (otherwise all estimators have load and save)
      • load_estimator() for torch
      • load_sklearn_estimator() for sklearn
  • Reworked how the models are initialized

    • now: the model is initialized when the estimator is constructed, e.g., estimator = CNN3d(loaders=loaders)
      • before: the model was initialized at training time, upon estimator.train
    • model initialization now requires loaders to be provided
  • All self vars in ModelConfig() get recorded in tracking by default

  • Added options in ModelConfig()

    • lr and min_lr - learning rate parameters are no longer hard-coded
    • device - sets a specific device to run the models on, either cpu or cuda
  • Added sklearn_backend and torch_backend to be used by all estimators

    • sklearn-based estimators have a structure close to torch-based
    • pytorch_estimator -> torch_backend
    • cleared up variable name conventions throughout
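
The saving and loading fixes above (optimizer state saved, restored, and moved to the right device) can be illustrated with plain PyTorch. This is only a minimal sketch of the general technique, not Sapsan's actual implementation: the helper names save_checkpoint, optimizer_to, and load_checkpoint are hypothetical, and the checkpoint layout is an assumption.

```python
import io

import torch
from torch import nn


def save_checkpoint(model, optimizer, target):
    """Write model and optimizer state dicts to a path or file-like object."""
    torch.save(
        {"model": model.state_dict(), "optimizer": optimizer.state_dict()},
        target,
    )


def optimizer_to(optimizer, device):
    """Move every tensor stored in the optimizer state to `device`."""
    for state in optimizer.state.values():
        for key, value in state.items():
            if torch.is_tensor(value):
                state[key] = value.to(device)


def load_checkpoint(model, optimizer, source, device):
    """Restore both state dicts, then place model and optimizer state on `device`."""
    # Load onto cpu first so the checkpoint can be read on machines without a gpu.
    checkpoint = torch.load(source, map_location="cpu")
    model.load_state_dict(checkpoint["model"])
    model.to(device)
    optimizer.load_state_dict(checkpoint["optimizer"])
    optimizer_to(optimizer, device)
```

With both state dicts restored this way, training can resume from the loaded model, and a device such as torch.device("cuda" if torch.cuda.is_available() else "cpu") can be chosen at load time.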

Evaluation

  • Evaluate and Data loader accept data without target
    • useful when there is no ground truth to compare to
    • will still output pdf, cdf, and spatial, without comparison metrics
    • Evaluate.run() now outputs a dict of "pred_cube" and "target_cube" (if the latter is provided)
  • PDF and CDF plots are now combined under a single figure
    • recorded as 'pdf_cdf.png' in MLflow
  • Fixed: definition of n_output_channel in Evaluate()
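
The target-free evaluation described above can be pictured with a short NumPy sketch. It is an illustration under assumptions, not Sapsan's Evaluate code: the function names pdf_cdf and evaluate_sketch, the "pred_stats", "target_stats", and "mse" keys, and the choice of MSE as the comparison metric are all hypothetical. Only the "pred_cube" and "target_cube" keys come from the notes above.

```python
import numpy as np


def pdf_cdf(values, bins=50):
    """Return bin centers, a normalized histogram (PDF), and its cumulative sum (CDF)."""
    density, edges = np.histogram(np.ravel(values), bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    cdf = np.cumsum(density * np.diff(edges))
    return centers, density, cdf


def evaluate_sketch(pred_cube, target_cube=None, bins=50):
    """Always report prediction statistics; add comparisons only if a target exists."""
    results = {"pred_cube": pred_cube, "pred_stats": pdf_cdf(pred_cube, bins)}
    if target_cube is not None:
        results["target_cube"] = target_cube
        results["target_stats"] = pdf_cdf(target_cube, bins)
        # Hypothetical comparison metric, computed only when ground truth is given.
        results["mse"] = float(np.mean((pred_cube - target_cube) ** 2))
    return results
```

Calling evaluate_sketch(pred) without a target still yields the PDF and CDF of the prediction, while the comparison entries are simply absent from the returned dict.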

Command Line Interface (CLI)

  • Added new option: sapsan create --ddp copies torch_backend.py

    • makes it possible to customize the Catalyst Runner
    • adjust DDP settings based on the linked Catalyst DDP tutorial in the Wiki
    • will be useful when running on HPC
    • refer to Parallel GPU Training on the Wiki for more details
  • Fixed: CLI click initialization

Graphical User Interface (GUI)

  • Up to date with Streamlit 0.87.0
  • PDF and CDF plots are now shown as well
  • Fixed: data loading issue related to train_fraction

MLflow

  • Evaluate runs are now nested under the most recent train run
    • significantly aids organization
  • Added estimator.model.forward() to be recorded by MLflow (if torch is used)

Plotting

  • Plotting routines return an Axes object
  • All parameters are set on the Axes instead of plt, which allows individual tweaking after return
  • figsize and ax arguments added to most plotting routines
    • useful if you create a figure and subplots outside of the plotting routines
  • Universal plotting params were expanded and made easily accessible through plot_params()
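
The pattern behind these plotting changes (accept an optional ax, draw on it, and return it) can be sketched with Matplotlib as follows. The function plot_pdf_sketch is a hypothetical stand-in rather than one of Sapsan's plotting routines, and its arguments other than figsize and ax are assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np


def plot_pdf_sketch(values, ax=None, figsize=(6, 4), bins=50, label=None):
    """Draw a normalized histogram on `ax` (created if missing) and return the Axes."""
    if ax is None:
        # figsize only matters when the routine creates its own figure.
        _, ax = plt.subplots(figsize=figsize)
    ax.hist(np.ravel(values), bins=bins, density=True, histtype="step", label=label)
    # Labels are set on the Axes, not through plt, so other subplots stay untouched.
    ax.set_xlabel("value")
    ax.set_ylabel("PDF")
    return ax


# Usage: place the plot in an externally created subplot, then tweak it after return.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
ax = plot_pdf_sketch(np.random.default_rng(0).normal(size=1000), ax=axes[0])
ax.set_yscale("log")
```

Because the routine never touches global plt state after creating the figure, each returned Axes can be adjusted independently of the others.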

Other

  • Edited the examples, tests, and estimator template to reflect model initialization changes
  • Requirements Updated:
    • streamlit >= 0.87.0
    • plotly >= 5.2.0
    • tornado >= 6.1.0
    • notebook >= 6.4.3 (fixes security vulnerabilities)
  • Added a few data_loader warnings
  • Cleaned up debug prints throughout the code
  • Expanded code comments