Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge master #272

Merged
merged 13 commits into from
Sep 21, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 5 additions & 3 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -12,10 +12,12 @@ _END := $(shell echo -e '\033[0m')
UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S), Linux)
OS_SPEC := linux
NODE_URL := https://nodejs.org/dist/v10.22.1/node-v10.22.1-linux-x64.tar.xz
else ifeq ($(UNAME_S), Darwin)
OS_SPEC := darwin
NODE_URL := https://nodejs.org/dist/v10.22.1/node-v10.22.1-darwin-x64.tar.xz
else
$(error platform $(UNAME_S) not supported)
$(error platform $(UNAME_S) not supported)
endif

## Install directories
Expand Down Expand Up @@ -143,11 +145,11 @@ clean:

$(NNI_NODE_TARBALL):
#$(_INFO) Downloading Node.js $(_END)
wget https://aka.ms/nni/nodejs-download/$(OS_SPEC) -O $(NNI_NODE_TARBALL)
wget $(NODE_URL) -O $(NNI_NODE_TARBALL)

$(NNI_YARN_TARBALL):
#$(_INFO) Downloading Yarn $(_END)
wget https://aka.ms/yarn-download -O $(NNI_YARN_TARBALL)
wget https://github.com/yarnpkg/yarn/releases/download/v1.22.5/yarn-v1.22.5.tar.gz -O $(NNI_YARN_TARBALL)

.PHONY: install-dependencies
install-dependencies: $(NNI_NODE_TARBALL) $(NNI_YARN_TARBALL)
Expand Down
9 changes: 5 additions & 4 deletions README_zh_CN.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@

**NNI (Neural Network Intelligence)** 是一个轻量但强大的工具包,帮助用户**自动**的进行[特征工程](docs/zh_CN/FeatureEngineering/Overview.md),[神经网络架构搜索](docs/zh_CN/NAS/Overview.md),[超参调优](docs/zh_CN/Tuner/BuiltinTuner.md)以及[模型压缩](docs/zh_CN/Compressor/Overview.md)。

NNI 管理自动机器学习 (AutoML) 的 Experiment,**调度运行**由调优算法生成的 Trial 任务来找到最好的神经网络架构和/或超参,支持**各种训练环境**,如[本机](docs/zh_CN/TrainingService/LocalMode.md),[远程服务器](docs/zh_CN/TrainingService/RemoteMachineMode.md),[OpenPAI](docs/zh_CN/TrainingService/PaiMode.md),[Kubeflow](docs/zh_CN/TrainingService/KubeflowMode.md),[基于 K8S 的 FrameworkController(如,AKS 等)](docs/zh_CN/TrainingService/FrameworkControllerMode.md), [DLWorkspace (又称 DLTS)](docs/zh_CN/TrainingService/DLTSMode.md) 和其它云服务
NNI 管理自动机器学习 (AutoML) 的 Experiment,**调度运行**由调优算法生成的 Trial 任务来找到最好的神经网络架构和/或超参,支持**各种训练环境**,如[本机](docs/zh_CN/TrainingService/LocalMode. md),[远程服务器](docs/zh_CN/TrainingService/RemoteMachineMode. md),[OpenPAI](docs/zh_CN/TrainingService/PaiMode. md),[Kubeflow](docs/zh_CN/TrainingService/KubeflowMode. md),[基于 K8S 的 FrameworkController(如,AKS 等)](docs/zh_CN/TrainingService/FrameworkControllerMode. md), [DLWorkspace](docs/zh_CN/TrainingService/DLTSMode. md) (又称 DLTS)</a>, [AML](docs/zh_CN/TrainingService/AMLMode.md) (Azure Machine Learning) 以及其它环境

## **使用场景**

Expand All @@ -19,7 +19,7 @@ NNI 管理自动机器学习 (AutoML) 的 Experiment,**调度运行**由调优
* 想要更容易**实现或试验新的自动机器学习算法**的研究员或数据科学家,包括:超参调优算法,神经网络搜索算法以及模型压缩算法。
* 在机器学习平台中**支持自动机器学习**。

### **[NNI v1.6 已发布!](https://github.com/microsoft/nni/releases) &nbsp;[<img width="48" src="docs/img/release_icon.png" />](#nni-released-reminder)**
### **[NNI v1.8 已发布!](https://github.com/microsoft/nni/releases) &nbsp;[<img width="48" src="docs/img/release_icon.png" />](#nni-released-reminder)**

## **NNI 功能一览**

Expand Down Expand Up @@ -164,6 +164,7 @@ NNI 提供命令行工具以及友好的 WebUI 来管理训练的 Experiment。
<ul>
<li><a href="docs/zh_CN/TrainingService/LocalMode.md">本机</a></li>
<li><a href="docs/zh_CN/TrainingService/RemoteMachineMode.md">远程计算机</a></li>
<li><a href="docs/zh_CN/TrainingService/AMLMode.md">AML(Azure Machine Learning)</a></li>
<li><b>基于 Kubernetes 的平台</b></li>
<ul><li><a href="docs/zh_CN/TrainingService/PaiMode.md">OpenPAI</a></li>
<li><a href="docs/zh_CN/TrainingService/KubeflowMode.md">Kubeflow</a></li>
Expand Down Expand Up @@ -208,7 +209,7 @@ NNI 提供命令行工具以及友好的 WebUI 来管理训练的 Experiment。

### **安装**

NNI 支持并在 Ubuntu >= 16.04, macOS >= 10.14.1, 和 Windows 10 >= 1809 通过了测试。 在 `python 64-bit >= 3.5` 的环境中,只需要运行 `pip install` 即可完成安装。
NNI 支持并在 Ubuntu >= 16.04, macOS >= 10.14.1, 和 Windows 10 >= 1809 通过了测试。 在 `python 64-bit >= 3.6` 的环境中,只需要运行 `pip install` 即可完成安装。

Linux 或 macOS

Expand Down Expand Up @@ -239,7 +240,7 @@ Linux 和 macOS 下 NNI 系统需求[参考这里](https://nni.readthedocs.io/zh
* 通过克隆源代码下载示例。

```bash
git clone -b v1.6 https://github.com/Microsoft/nni.git
git clone -b v1.8 https://github.com/Microsoft/nni.git
```

* 运行 MNIST 示例。
Expand Down
3 changes: 2 additions & 1 deletion azure-pipelines.yml
Original file line number Diff line number Diff line change
Expand Up @@ -107,7 +107,8 @@ jobs:

steps:
- script: |
echo "##vso[task.setvariable variable=PATH]/usr/local/Cellar/python@3.7/3.7.8_1/bin:${HOME}/Library/Python/3.7/bin:${PATH}"
export PYTHON37_BIN_DIR=/usr/local/Cellar/python@3.7/`ls /usr/local/Cellar/python@3.7`/bin
echo "##vso[task.setvariable variable=PATH]${PYTHON37_BIN_DIR}:${HOME}/Library/Python/3.7/bin:${PATH}"
python3 -m pip install --upgrade pip setuptools
displayName: 'Install python tools'
- script: |
Expand Down
20 changes: 11 additions & 9 deletions deployment/docker/README_zh_CN.md
Original file line number Diff line number Diff line change
@@ -1,18 +1,20 @@
# Dockerfile
# Dockerfile

## 1. 说明

这是 NNI 项目的 Dockerfile 文件。 其中包含了 NNI 以及多个流行的深度学习框架。 在 `Ubuntu 16.04 LTS` 上进行过测试:

CUDA 9.0, CuDNN 7.0
numpy 1.14.3,scipy 1.1.0
TensorFlow-gpu 1.10.0
Keras 2.1.6
PyTorch 0.4.1
scikit-learn 0.20.0
CUDA 9.0
CuDNN 7.0
numpy 1.14.3
scipy 1.1.0
tensorflow-gpu 1.15.0
keras 2.1.6
torch 1.4.0
scikit-learn 0.23.2
pandas 0.23.4
lightgbm 2.2.2
NNI v0.7
nni


此 Dockerfile 可作为定制的参考。
Expand Down Expand Up @@ -47,4 +49,4 @@

使用下列命令从 docker Hub 中拉取 NNI docker 映像。

docker pull msranni/nni:latest
docker pull msranni/nni:latest
6 changes: 4 additions & 2 deletions deployment/pypi/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,11 @@ UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S), Linux)
OS_SPEC := linux
WHEEL_SPEC := manylinux1_x86_64
NODE_URL := https://nodejs.org/dist/v10.22.1/node-v10.22.1-linux-x64.tar.xz
else ifeq ($(UNAME_S), Darwin)
OS_SPEC := darwin
WHEEL_SPEC := macosx_10_9_x86_64
NODE_URL := https://nodejs.org/dist/v10.22.1/node-v10.22.1-darwin-x64.tar.xz
else
$(error platform $(UNAME_S) not supported)
endif
Expand All @@ -28,11 +30,11 @@ NNI_YARN := PATH=$(CWD)node-$(OS_SPEC)-x64/bin:$${PATH} $(NNI_YARN_FOLDER)/bin/y
build:
# Building version $(NNI_VERSION_VALUE)
python3 -m pip install --user --upgrade setuptools wheel
wget -q https://aka.ms/nni/nodejs-download/$(OS_SPEC) -O $(CWD)node-$(OS_SPEC)-x64.tar.xz
wget -q $(NODE_URL) -O $(CWD)node-$(OS_SPEC)-x64.tar.xz
rm -rf $(CWD)node-$(OS_SPEC)-x64
mkdir $(CWD)node-$(OS_SPEC)-x64
tar xf $(CWD)node-$(OS_SPEC)-x64.tar.xz -C node-$(OS_SPEC)-x64 --strip-components 1
wget -q https://aka.ms/yarn-download -O $(NNI_YARN_TARBALL)
wget -q https://github.com/yarnpkg/yarn/releases/download/v1.22.5/yarn-v1.22.5.tar.gz -O $(NNI_YARN_TARBALL)
rm -rf $(NNI_YARN_FOLDER)
mkdir $(NNI_YARN_FOLDER)
tar -xf $(NNI_YARN_TARBALL) -C $(NNI_YARN_FOLDER) --strip-components 1
Expand Down
5 changes: 3 additions & 2 deletions deployment/pypi/install.ps1
Original file line number Diff line number Diff line change
Expand Up @@ -7,10 +7,12 @@ $OS_SPEC = "windows"
if($version_os -eq 64){
$OS_VERSION = 'win64'
$WHEEL_SPEC = 'win_amd64'
$NODE_URL = 'https://nodejs.org/download/release/v10.22.1/node-v10.22.1-win-x64.zip'
}
else{
$OS_VERSION = 'win32'
$WHEEL_SPEC = 'win32'
$NODE_URL = 'https://nodejs.org/download/release/v10.22.1/node-v10.22.1-win-x86.zip'
}

$TIME_STAMP = date -u "+%y%m%d%H%M"
Expand All @@ -28,11 +30,10 @@ $NNI_VERSION_TEMPLATE = "999.0.0-developing"

python -m pip install --upgrade setuptools wheel

$nodeUrl = "https://aka.ms/nni/nodejs-download/" + $OS_VERSION
$NNI_NODE_ZIP = "$CWD\node-$OS_SPEC.zip"
$NNI_NODE_FOLDER = "$CWD\node-$OS_SPEC"
$unzipNodeDir = "node-v*"
(New-Object Net.WebClient).DownloadFile($nodeUrl, $NNI_NODE_ZIP)
(New-Object Net.WebClient).DownloadFile($NODE_URL, $NNI_NODE_ZIP)
if(Test-Path $NNI_NODE_FOLDER){
Remove-Item $NNI_NODE_FOLDER -Recurse -Force
}
Expand Down
1 change: 1 addition & 0 deletions docs/requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ sphinx-markdown-tables==0.0.9
sphinx-rtd-theme==0.4.2
sphinxcontrib-websupport==1.1.0
recommonmark==0.5.0
pygments==2.7.1
hyperopt
json_tricks
numpy
Expand Down
2 changes: 1 addition & 1 deletion docs/zh_CN/Assessor/CustomizeAssessor.md
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,7 @@ assessor:

注意在 **2** 中, `trial_history` 对象与 Trial 通过 `report_intermediate_result` 函数返回给 Assessor 的对象完全一致。

Assessor 的工作目录是`<home>/nni/experiments/<experiment_id>/log` 可从环境变量 `NNI_LOG_DIRECTORY` 中获取。
Assessor 的工作目录是`<home>/nni-experiments/<experiment_id>/log` 可从环境变量 `NNI_LOG_DIRECTORY` 中获取。

更多示例,可参考:

Expand Down
Original file line number Diff line number Diff line change
@@ -1,15 +1,14 @@
# 超参数优化的对比

*匿名作者*

超参优化算法(HPO)在几个问题上的对比。

超参数优化算法如下:

- [Random Search(随机搜索)](../Tuner/BuiltinTuner.md)
- [Grid Search(遍历搜索)](../Tuner/BuiltinTuner.md)
- [Random Search](../Tuner/BuiltinTuner.md)
- [Grid Search](../Tuner/BuiltinTuner.md)
- [Evolution](../Tuner/BuiltinTuner.md)
- [Anneal(退火算法)](../Tuner/BuiltinTuner.md)
- [Anneal](../Tuner/BuiltinTuner.md)
- [Metis](../Tuner/BuiltinTuner.md)
- [TPE](../Tuner/BuiltinTuner.md)
- [SMAC](../Tuner/BuiltinTuner.md)
Expand All @@ -20,15 +19,16 @@

环境:

OS: Linux Ubuntu 16.04 LTS
CPU: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz 2600 MHz
Memory: 112 GB
NNI Version: v0.7
NNI 模式(local|pai|remote): local
Python 版本: 3.6
使用的虚拟环境: Conda
是否在 Docker 中运行: no

```
OS: Linux Ubuntu 16.04 LTS
CPU: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz 2600 MHz
Memory: 112 GB
NNI Version: v0.7
NNI 模式(local|pai|remote): local
Python 版本: 3.6
使用的虚拟环境: Conda
是否在 Docker 中运行: no
```

## AutoGBDT 示例

Expand Down Expand Up @@ -67,40 +67,40 @@
}
```

总搜索空间为 1, 204, 224 次,将最大 Trial 次数设置为1000。 时间限制为 48 小时。
总搜索空间为 1, 204, 224 次,将最大 Trial 次数设置为 1000。 时间限制为 48 小时。

### 结果

| 算法 | 最好的损失值 | 最好的 5 次损失的平均值 | 最好的 10 次损失的平均 |
| ------------- | ------------ | ------------- | ------------- |
| Random Search | 0.418854 | 0.420352 | 0.421553 |
| Random Search | 0.417364 | 0.420024 | 0.420997 |
| Random Search | 0.417861 | 0.419744 | 0.420642 |
| Grid Search | 0.498166 | 0.498166 | 0.498166 |
| Evolution | 0.409887 | 0.409887 | 0.409887 |
| Evolution | 0.413620 | 0.413875 | 0.414067 |
| Evolution | 0.409887 | 0.409887 | 0.409887 |
| Anneal | 0.414877 | 0.417289 | 0.418281 |
| Anneal | 0.409887 | 0.409887 | 0.410118 |
| Anneal | 0.413683 | 0.416949 | 0.417537 |
| Metis | 0.416273 | 0.420411 | 0.422380 |
| Metis | 0.420262 | 0.423175 | 0.424816 |
| Metis | 0.421027 | 0.424172 | 0.425714 |
| TPE | 0.414478 | 0.414478 | 0.414478 |
| TPE | 0.415077 | 0.417986 | 0.418797 |
| TPE | 0.415077 | 0.417009 | 0.418053 |
| SMAC | **0.408386** | **0.408386** | **0.408386** |
| SMAC | 0.414012 | 0.414012 | 0.414012 |
| SMAC | **0.408386** | **0.408386** | **0.408386** |
| BOHB | 0.410464 | 0.415319 | 0.417755 |
| BOHB | 0.418995 | 0.420268 | 0.422604 |
| BOHB | 0.415149 | 0.418072 | 0.418932 |
| HyperBand | 0.414065 | 0.415222 | 0.417628 |
| HyperBand | 0.416807 | 0.417549 | 0.418828 |
| HyperBand | 0.415550 | 0.415977 | 0.417186 |
| GP | 0.414353 | 0.418563 | 0.420263 |
| GP | 0.414395 | 0.418006 | 0.420431 |
| GP | 0.412943 | 0.416566 | 0.418443 |
| 算法 | 最好的损失值 | 最好的 5 次损失的平均值 | 最好的 10 次损失的平均值 |
| ------------- | ------------ | ------------- | -------------- |
| Random Search | 0.418854 | 0.420352 | 0.421553 |
| Random Search | 0.417364 | 0.420024 | 0.420997 |
| Random Search | 0.417861 | 0.419744 | 0.420642 |
| Grid Search | 0.498166 | 0.498166 | 0.498166 |
| Evolution | 0.409887 | 0.409887 | 0.409887 |
| Evolution | 0.413620 | 0.413875 | 0.414067 |
| Evolution | 0.409887 | 0.409887 | 0.409887 |
| Anneal | 0.414877 | 0.417289 | 0.418281 |
| Anneal | 0.409887 | 0.409887 | 0.410118 |
| Anneal | 0.413683 | 0.416949 | 0.417537 |
| Metis | 0.416273 | 0.420411 | 0.422380 |
| Metis | 0.420262 | 0.423175 | 0.424816 |
| Metis | 0.421027 | 0.424172 | 0.425714 |
| TPE | 0.414478 | 0.414478 | 0.414478 |
| TPE | 0.415077 | 0.417986 | 0.418797 |
| TPE | 0.415077 | 0.417009 | 0.418053 |
| SMAC | **0.408386** | **0.408386** | **0.408386** |
| SMAC | 0.414012 | 0.414012 | 0.414012 |
| SMAC | **0.408386** | **0.408386** | **0.408386** |
| BOHB | 0.410464 | 0.415319 | 0.417755 |
| BOHB | 0.418995 | 0.420268 | 0.422604 |
| BOHB | 0.415149 | 0.418072 | 0.418932 |
| HyperBand | 0.414065 | 0.415222 | 0.417628 |
| HyperBand | 0.416807 | 0.417549 | 0.418828 |
| HyperBand | 0.415550 | 0.415977 | 0.417186 |
| GP | 0.414353 | 0.418563 | 0.420263 |
| GP | 0.414395 | 0.418006 | 0.420431 |
| GP | 0.412943 | 0.416566 | 0.418443 |

此例中,所有算法都使用了默认参数。 Metis 算法因为其高斯计算过程的复杂度为 O(n^3) 而运行非常慢,因此仅执行了 300 次 Trial。

Expand All @@ -114,21 +114,22 @@

#### 计算机配置

RocksDB: version 6.1
CPU: 6 * Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
CPUCache: 35840 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000

```
RocksDB: version 6.1
CPU: 6 * Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
CPUCache: 35840 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
```

#### 存储性能

**延迟**:每个 IO 请求都需要一些时间才能完成,这称为平均延迟。 有几个因素会影响此时间,包括网络连接质量和硬盘IO性能。

**IOPS**: **每秒的 IO 操作数量**,这意味着可以在一秒钟内完成的*读取或写入操作次数*
**IOPS**:**每秒的 IO 操作数量**,这意味着可以在一秒钟内完成的_读取或写入操作次数_

**IO 大小**: **每个 IO 请求的大小**。 根据操作系统和需要磁盘访问的应用程序、服务,它将同时发出读取或写入一定数量数据的请求。
**IO 大小**:**每个 IO 请求的大小**。 根据操作系统和需要磁盘访问的应用程序、服务,它将同时发出读取或写入一定数量数据的请求。

**吞吐量(以 MB/s 为单位)= 平均 IO 大小 x IOPS **

Expand Down Expand Up @@ -200,7 +201,7 @@ IOPS 与在线处理能力有关,我们在实验中使用 IOPS 作为指标。
| SMAC | 491067 | 490472 | **491136** |
| Metis | 444920 | 457060 | 454438 |

Figure:
图:

![](../../img/hpo_rocksdb_fillrandom.png)

Expand All @@ -215,6 +216,6 @@ Figure:
| SMAC | 2270874 | 2284904 | 2282266 |
| Metis | **2287696** | 2283496 | 2277701 |

Figure:
图:

![](../../img/hpo_rocksdb_readrandom.png)
![](../../img/hpo_rocksdb_readrandom.png)
Loading