Chinese translation (microsoft#1795)

chicm-ms · Dec 18, 2019 · 7953911
1 parent 6d9f545
Showing 39 changed files with 1,216 additions and 922 deletions.
diff --git a/README_zh_CN.md b/README_zh_CN.md
diff --git a/docs/zh_CN/CommunitySharings/community_sharings.rst b/docs/zh_CN/CommunitySharings/community_sharings.rst
@@ -12,3 +12,4 @@
     神经网络结构搜索（NAS）的对比<NasComparision>
     超参调优算法的对比<HpoComparision>
     TPE 的并行优化<ParallelizingTpeSearch>
+    使用 NNI 自动调优系统 <TuningSystems>
diff --git a/docs/zh_CN/Compressor/AutoCompression.md b/docs/zh_CN/Compressor/AutoCompression.md
@@ -84,7 +84,7 @@ config_list_agp = [{'initial_sparsity': 0, 'final_sparsity': conv0_sparsity,
                    {'initial_sparsity': 0, 'final_sparsity': conv1_sparsity,
                     'start_epoch': 0, 'end_epoch': 3,
                     'frequency': 1,'op_name': 'conv1' },]
-PRUNERS = {'level':LevelPruner(model, config_list_level)，'agp':AGP_Pruner(model, config_list_agp)}
+PRUNERS = {'level':LevelPruner(model, config_list_level), 'agp':AGP_Pruner(model, config_list_agp)}
 pruner = PRUNERS(params['prune_method']['_name'])
 pruner.compress()
 ... # fine tuning

diff --git a/docs/zh_CN/Compressor/Overview.md b/docs/zh_CN/Compressor/Overview.md
@@ -6,16 +6,23 @@ NNI 提供了易于使用的工具包来帮助用户设计并使用压缩算法
 
 ## 支持的算法
 
-NNI 提供了两种朴素压缩算法以及三种流行的压缩算法，包括两种剪枝算法以及三种量化算法：
+NNI 提供了几种压缩算法，包括剪枝和量化算法：
+
+**剪枝**
+
+| 名称                                              | 算法简介                                                                                                                                   |
+| ----------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
+| [Level Pruner](./Pruner.md#level-pruner)        | 根据权重的绝对值，来按比例修剪权重。                                                                                                                     |
+| [AGP Pruner](./Pruner.md#agp-pruner)            | 自动的逐步剪枝（是否剪枝的判断：基于对模型剪枝的效果）[参考论文](https://arxiv.org/abs/1710.01878)                                                                    |
+| [L1Filter Pruner](./Pruner.md#l1filter-pruner)  | 剪除卷积层中最不重要的过滤器 (PRUNING FILTERS FOR EFFICIENT CONVNETS)[参考论文](https://arxiv.org/abs/1608.08710)                                        |
+| [Slim Pruner](./Pruner.md#slim-pruner)          | 通过修剪 BN 层中的缩放因子来修剪卷积层中的通道 (Learning Efficient Convolutional Networks through Network Slimming)[参考论文](https://arxiv.org/abs/1708.06519) |
+| [Lottery Ticket Pruner](./Pruner.md#agp-pruner) | "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks" 提出的剪枝过程。 它会反复修剪模型。 [参考论文](https://arxiv.org/abs/1803.03635) |
+| [FPGM Pruner](./Pruner.md#fpgm-pruner)          | Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration [参考论文](https://arxiv.org/pdf/1811.00250.pdf)   |
+
+**量化**
 
 | 名称                                                  | 算法简介                                                                                                                                                                       |
 | --------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| [Level Pruner](./Pruner.md#level-pruner)            | 根据权重的绝对值，来按比例修剪权重。                                                                                                                                                         |
-| [AGP Pruner](./Pruner.md#agp-pruner)                | 自动的逐步剪枝（是否剪枝的判断：基于对模型剪枝的效果）[参考论文](https://arxiv.org/abs/1710.01878)                                                                                                        |
-| [L1Filter Pruner](./Pruner.md#l1filter-pruner)      | 剪除卷积层中最不重要的过滤器 (PRUNING FILTERS FOR EFFICIENT CONVNETS)[参考论文](https://arxiv.org/abs/1608.08710)                                                                            |
-| [Slim Pruner](./Pruner.md#slim-pruner)              | 通过修剪 BN 层中的缩放因子来修剪卷积层中的通道 (Learning Efficient Convolutional Networks through Network Slimming)[参考论文](https://arxiv.org/abs/1708.06519)                                     |
-| [Lottery Ticket Pruner](./Pruner.md#agp-pruner)     | "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks" 提出的剪枝过程。 它会反复修剪模型。 [参考论文](https://arxiv.org/abs/1803.03635)                                     |
-| [FPGM Pruner](./Pruner.md#fpgm-pruner)              | Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration [参考论文](https://arxiv.org/pdf/1811.00250.pdf)                                       |
 | [Naive Quantizer](./Quantizer.md#naive-quantizer)   | 默认将权重量化为 8 位                                                                                                                                                               |
 | [QAT Quantizer](./Quantizer.md#qat-quantizer)       | 为 Efficient Integer-Arithmetic-Only Inference 量化并训练神经网络。 [参考论文](http://openaccess.thecvf.com/content_cvpr_2018/papers/Jacob_Quantization_and_Training_CVPR_2018_paper.pdf) |
 | [DoReFa Quantizer](./Quantizer.md#dorefa-quantizer) | DoReFa-Net: 通过低位宽的梯度算法来训练低位宽的卷积神经网络。 [参考论文](https://arxiv.org/abs/1606.06160)                                                                                              |
@@ -24,25 +31,26 @@ NNI 提供了两种朴素压缩算法以及三种流行的压缩算法，包括
 
 通过简单的示例来展示如何修改 Trial 代码来使用压缩算法。 比如，需要通过 Level Pruner 来将权重剪枝 80%，首先在代码中训练模型前，添加以下内容（[完整代码](https://github.com/microsoft/nni/tree/master/examples/model_compress)）。
 
-TensorFlow 代码
+PyTorch 代码
 
 ```python
-from nni.compression.tensorflow import LevelPruner
+from nni.compression.torch import LevelPruner
 config_list = [{ 'sparsity': 0.8, 'op_types': ['default'] }]
-pruner = LevelPruner(tf.get_default_graph(), config_list)
+pruner = LevelPruner(model, config_list)
 pruner.compress()
 ```
 
-PyTorch 代码
+TensorFlow 代码
 
 ```python
-from nni.compression.torch import LevelPruner
+from nni.compression.tensorflow import LevelPruner
 config_list = [{ 'sparsity': 0.8, 'op_types': ['default'] }]
-pruner = LevelPruner(model, config_list)
+pruner = LevelPruner(tf.get_default_graph(), config_list)
 pruner.compress()
 ```
 
-可使用 `nni.compression` 中的其它压缩算法。 此算法分别在 `nni.compression.torch` 和 `nni.compression.tensorflow` 中实现，支持 PyTorch 和 TensorFlow。 参考 [Pruner](./Pruner.md) 和 [Quantizer](./Quantizer.md) 进一步了解支持的算法。
+
+可使用 `nni.compression` 中的其它压缩算法。 此算法分别在 `nni.compression.torch` 和 `nni.compression.tensorflow` 中实现，支持 PyTorch 和 TensorFlow。 参考 [Pruner](./Pruner.md) 和 [Quantizer](./Quantizer.md) 进一步了解支持的算法。 此外，如果要使用知识蒸馏算法，可参考 [KD 示例](../TrialExample/KDExample.md)
 
 函数调用 `pruner.compress()` 来修改用户定义的模型（在 Tensorflow 中，通过 `tf.get_default_graph()` 来获得模型，而 PyTorch 中 model 是定义的模型类），并修改模型来插入 mask。 然后运行模型时，这些 mask 即会生效。 mask 可在运行时通过算法来调整。
 

diff --git a/docs/zh_CN/FeatureEngineering/Overview.md b/docs/zh_CN/FeatureEngineering/Overview.md
@@ -240,16 +240,17 @@ print("Pipeline Score: ", pipeline.score(X_train, y_train))
 
 # 基准测试
 
-`Baseline` 表示没有进行特征选择，直接将数据传入 LogisticRegression。 此基准测试中，仅用了 10% 的训练数据作为测试数据。
-
-| 数据集           | Baseline | GradientFeatureSelector | TreeBasedClassifier | 训练次数       | 特征数量      |
-| ------------- | -------- | ----------------------- | ------------------- | ---------- | --------- |
-| colon-cancer  | 0.7547   | 0.7368                  | 0.7223              | 62         | 2,000     |
-| gisette       | 0.9725   | 0.89416                 | 0.9792              | 6,000      | 5,000     |
-| avazu         | 0.8834   | N/A                     | N/A                 | 40,428,967 | 1,000,000 |
-| rcv1          | 0.9644   | 0.7333                  | 0.9615              | 20,242     | 47,236    |
-| news20.binary | 0.9208   | 0.6870                  | 0.9070              | 19,996     | 1,355,191 |
-| real-sim      | 0.9681   | 0.7969                  | 0.9591              | 72,309     | 20,958    |
-
-此基准测试可在[这里](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/)下载
-
+`Baseline` 表示没有进行特征选择，直接将数据传入 LogisticRegression。 此基准测试中，仅用了 10% 的训练数据作为测试数据。 对于 GradientFeatureSelector，仅使用了前 20 个特征。 下列指标是在给定测试数据和标签上的平均精度。
+
+| 数据集           | 所有特征 + LR (acc, time, memory) | GradientFeatureSelector + LR (acc, time, memory) | TreeBasedClassifier + LR (acc, time, memory) | 训练次数       | 特征数量      |
+| ------------- | ----------------------------- | ------------------------------------------------ | -------------------------------------------- | ---------- | --------- |
+| colon-cancer  | 0.7547, 890ms, 348MiB         | 0.7368, 363ms, 286MiB                            | 0.7223, 171ms, 1171 MiB                      | 62         | 2,000     |
+| gisette       | 0.9725, 215ms, 584MiB         | 0.89416, 446ms, 397MiB                           | 0.9792, 911ms, 234MiB                        | 6,000      | 5,000     |
+| avazu         | 0.8834, N/A, N/A              | N/A, N/A, N/A                                    | N/A, N/A, N/A                                | 40,428,967 | 1,000,000 |
+| rcv1          | 0.9644, 557ms, 241MiB         | 0.7333, 401ms, 281MiB                            | 0.9615, 752ms, 284MiB                        | 20,242     | 47,236    |
+| news20.binary | 0.9208, 707ms, 361MiB         | 0.6870, 565ms, 371MiB                            | 0.9070, 904ms, 364MiB                        | 19,996     | 1,355,191 |
+| real-sim      | 0.9681, 433ms, 274MiB         | 0.7969, 251ms, 274MiB                            | 0.9591, 643ms, 367MiB                        | 72,309     | 20,958    |
+
+此基准测试可在[这里](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/)下载
+
+代码参考 `/examples/feature_engineering/gradient_feature_selector/benchmark_test.py`。
diff --git a/docs/zh_CN/NAS/DARTS.md b/docs/zh_CN/NAS/DARTS.md
@@ -0,0 +1,18 @@
+# NNI 中的 DARTS
+
+## 介绍
+
+论文 [DARTS: Differentiable Architecture Search](https://arxiv.org/abs/1806.09055) 通过可微分的方式来解决架构搜索中的伸缩性挑战。 此方法基于架构的连续放松的表示，从而允许在架构搜索时能使用梯度下降。
+
+为了实现，作者在小批量中交替优化网络权重和架构权重。 还进一步探讨了使用二阶优化（unroll）来替代一阶，来提高性能的可能性。
+
+NNI 的实现基于[官方实现](https://github.com/quark0/darts)以及一个[第三方实现](https://github.com/khanrc/pt.darts)。 目前，在 CIFAR10 上从头训练的一阶和二阶优化均已实现。
+
+## 重现结果
+
+为了重现本文的结果，我们做了一阶和二阶优化的实验。 由于时间限制，我们仅从第二阶段重新训练了*一次**最佳架构*。 我们的结果目前与论文的结果相当。 稍后会增加更多结果
+
+|              | 论文中           | 重现   |
+| ------------ | ------------- | ---- |
+| 一阶 (CIFAR10) | 3.00 +/- 0.14 | 2.78 |
+| 二阶（CIFAR10）  | 2.76 +/- 0.09 | 2.89 |
diff --git a/docs/zh_CN/NAS/ENAS.md b/docs/zh_CN/NAS/ENAS.md
@@ -0,0 +1,7 @@
+# NNI 中的 ENAS
+
+## 介绍
+
+论文 [Efficient Neural Architecture Search via Parameter Sharing](https://arxiv.org/abs/1802.03268) 通过在子模型之间共享参数来加速 NAS 过程。 在 ENAS 中，Contoller 学习在大的计算图中搜索最有子图的方式来发现神经网络。 Controller 通过梯度策略训练，从而选择出能在验证集上有最大期望奖励的子图。 同时对与所选子图对应的模型进行训练，以最小化规范交叉熵损失。
+
+NNI 的实现基于 [Tensorflow 的官方实现](https://github.com/melodyguan/enas)，包括了 CIFAR10 上的 Macro/Micro 搜索空间。 NNI 中从头训练的代码还未完成，当前还没有重现结果。
diff --git a/docs/zh_CN/NAS/NasInterface.md b/docs/zh_CN/NAS/NasInterface.md
@@ -2,8 +2,6 @@
 
 我们正在尝试通过统一的编程接口来支持各种 NAS 算法，当前处于试验阶段。 这意味着当前编程接口可能会进行重大变化。
 
-*先前的 [NAS annotation](../AdvancedFeature/GeneralNasInterfaces.md) 接口会很快被弃用。*
-
 ## 模型的编程接口
 
 在两种场景下需要用于设计和搜索模型的编程接口。
@@ -55,7 +53,7 @@ def forward(self, x):
     out = self.input_switch([in_tensor1, in_tensor2, in_tensor3])
     ...
 ```
-`InputChoice` 是一个 PyTorch module，初始化时需要元信息，例如，从多少个输入后选中选择多少个输入，初始化的 `InputChoice` 名称。 真正候选的输入张量只能在 `forward` 函数中获得。 在 `InputChoice` 中，`forward` 会在调用时传入实际的候选输入张量。
+`InputChoice` 是一个 PyTorch module，初始化时需要元信息，例如，从多少个输入后选中选择多少个输入，以及初始化的 `InputChoice` 名称。 真正候选的输入张量只能在 `forward` 函数中获得。 在 `forward` 函数中，`InputChoice` 模块需要在 `__init__` 中创建 (如, `self.input_switch`)，其会在有了实际候选输入 Tensor 的时候被调用。
 
 一些 [NAS Trainer](#one-shot-training-mode) 需要知道输入张量的来源层，因此在 `InputChoice` 中添加了输入参数 `choose_from` 来表示每个候选输入张量的来源层。 `choose_from` 是 str 的 list，每个元素都是 `LayerChoice` 和`InputChoice` 的 `key`，或者 module 的 name (详情参考[代码](https://github.com/microsoft/nni/blob/master/src/sdk/pynni/nni/nas/pytorch/mutables.py))。
 
@@ -75,7 +73,7 @@ class Cell(mutables.MutableScope):
 
 ## 两种训练模式
 
-在使用上述 API 在模型中嵌入 搜索空间后，下一步是从搜索空间中找到最好的模型。 有两种驯良模式：[one-shot 训练模式](#one-shot-training-mode) and [经典的分布式搜索](#classic-distributed-search)。
+在使用上述 API 在模型中嵌入 搜索空间后，下一步是从搜索空间中找到最好的模型。 有两种训练模式：[one-shot 训练模式](#one-shot-training-mode) and [经典的分布式搜索](#classic-distributed-search)。
 
 ### One-shot 训练模式
 
@@ -100,9 +98,7 @@ trainer.export(file='./chosen_arch')
 
 不同的 Trainer 可能有不同的输入参数，具体取决于其算法。 详细参数可参考具体的 [Trainer 代码](https://github.com/microsoft/nni/tree/master/src/sdk/pynni/nni/nas/pytorch)。 训练完成后，可通过 `trainer.export()` 导出找到的最好的模型。 无需通过 `nnictl` 来启动 NNI Experiment。
 
-[这里](./Overview.md#supported-one-shot-nas-algorithms)是所有支持的 Trainer。 [这里](https://github.com/microsoft/nni/tree/master/examples/nas/simple/train.py)是使用 NNI NAS API 的简单示例。
-
-[这里]()是完整示例的代码。
+[这里](Overview.md#supported-one-shot-nas-algorithms)是所有支持的 Trainer。 [这里](https://github.com/microsoft/nni/tree/master/examples/nas/simple/train.py)是使用 NNI NAS API 的简单示例。
 
 ### 经典分布式搜索
 
@@ -174,4 +170,4 @@ NNI 中的 NAS Tuner 需要自动生成搜索空间。 `LayerChoice` 和 `InputC
         "_idex": [1]
     }
 }
-```
+```
diff --git a/docs/zh_CN/NAS/Overview.md b/docs/zh_CN/NAS/Overview.md
@@ -6,11 +6,11 @@
 
 以此为动力，NNI 的目标是提供统一的体系结构，以加速NAS上的创新，并将最新的算法更快地应用于现实世界中的问题上。
 
-通过 [统一的接口](NasInterface.md)，有两种方式进行架构搜索。 [第一种](#supported-one-shot-nas-algorithms)称为 one-shot NAS，基于搜索空间构建了一个超级网络，并使用 one-shot 训练来生成性能良好的子模型。 [第二种](.ClassicNas.md)是传统的搜索方法，搜索空间中每个子模型作为独立的 Trial 运行，将性能结果发给 Tuner，由 Tuner 来生成新的子模型。
+通过[统一的接口](./NasInterface.md)，有两种方式进行架构搜索。 [第一种](#supported-one-shot-nas-algorithms)称为 one-shot NAS，基于搜索空间构建了一个超级网络，并使用 one-shot 训练来生成性能良好的子模型。 [第二种](./NasInterface.md#classic-distributed-search)是传统的搜索方法，搜索空间中每个子模型作为独立的 Trial 运行，将性能结果发给 Tuner，由 Tuner 来生成新的子模型。
 
 * [支持的 One-shot NAS 算法](#supported-one-shot-nas-algorithms)
-* [使用 NNI Experiment 的经典分布式 NAS](.NasInterface.md#classic-distributed-search)
-* [NNI NAS 编程接口](.NasInterface.md)
+* [使用 NNI Experiment 的经典分布式 NAS](./NasInterface.md#classic-distributed-search)
+* [NNI NAS 编程接口](./NasInterface.md)
 
 ## 支持的 One-shot NAS 算法
 
@@ -37,7 +37,7 @@ NNI 现在支持以下 NAS 算法，并且正在添加更多算法。 用户可
 
 #### 用法
 
-NNI 中的 ENAS 还在开发中，当前仅支持在 CIFAR10 上 Macro/Micro 搜索空间的搜索阶段。 在 PTB 上从头开始训练及其搜索空间尚未完成。
+NNI 中的 ENAS 还在开发中，当前仅支持在 CIFAR10 上 Macro/Micro 搜索空间的搜索阶段。 在 PTB 上从头开始训练及其搜索空间尚未完成。 [详细说明](ENAS.md)。
 
 ```bash
 ＃如果未克隆 NNI 代码。 如果代码已被克隆，请忽略此行并直接进入代码目录。
@@ -58,7 +58,7 @@ python3 search.py -h
 
 ### DARTS
 
-[DARTS: Differentiable Architecture Search](https://arxiv.org/abs/1806.09055) 在算法上的主要贡献是，引入了一种在两级网络优化中使用的可微分算法。
+[DARTS: Differentiable Architecture Search](https://arxiv.org/abs/1806.09055) 在算法上的主要贡献是，引入了一种在两级网络优化中使用的可微分算法。 [详细说明](DARTS.md)。
 
 #### 用法
 
@@ -97,8 +97,6 @@ python3 retrain.py --arc-checkpoint ../pdarts/checkpoints/epoch_2.json
 
 注意，我们正在尝试通过统一的编程接口来支持各种 NAS 算法，当前处于试验阶段。 这意味着当前编程接口将来会有变化。
 
-*先前的 [NAS annotation](../AdvancedFeature/GeneralNasInterfaces.md) 接口会很快被弃用。*
-
 ### 编程接口
 
 在两种场景下需要用于设计和搜索模型的编程接口。