This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Remove outputDir and dataDir in config file #1361

Merged
merged 8 commits into from
Jul 30, 2019
Changes from 5 commits
6 changes: 0 additions & 6 deletions docs/en_US/TrainingService/PaiMode.md
@@ -32,8 +32,6 @@ trial:
cpuNum: 1
memoryMB: 8196
image: msranni/nni:latest
Contributor

So if I want to output my models, where is the output path?

Contributor

PAI no longer has dataDir and outputDir in its docs, so we removed these two paths accordingly. In other training modes, users use OUTPUTDIR in trial code to output the data they want. @SparkSnail, could you check the content of OUTPUTDIR in PAI mode?

Contributor

Is there any doc that explains this?

Contributor Author

@SparkSnail SparkSnail Jul 25, 2019

Users can use PAI_OUTPUT_DIR to get the output directory of PAI; the default value is $PAI_DEFAULT_FS_URI/Output/$jobName. See https://github.com/microsoft/pai/blob/b2324866d0280a2d22958717ea6025740f71b9f0/docs/job_tutorial.md.
In NNI, we use the variable NNI_OUTPUT_DIR to store the log data of trial_keeper, and this data is uploaded to HDFS after the trial finishes. So users can use os.environ['NNI_OUTPUT_DIR'] in their code to store model data, and that data will be uploaded to HDFS as well. See https://github.com/microsoft/nni/blob/master/docs/en_US/TrainingService/PaiMode.md#run-an-experiment
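
For illustration, here is a minimal sketch of what trial code could look like when it writes a model under NNI_OUTPUT_DIR, as suggested above. The helper names `get_output_dir` and `save_model`, the pickle format, and the fallback to a temporary directory when the variable is unset are all assumptions made for this example; they are not part of NNI.

```python
import os
import pickle
import tempfile


def get_output_dir():
    """Return the directory where the trial should write its outputs.

    NNI_OUTPUT_DIR is the variable named in the comment above. The fallback
    to a temporary directory is an assumption, so the sketch also runs
    outside of an NNI trial.
    """
    output_dir = os.environ.get("NNI_OUTPUT_DIR")
    if not output_dir:
        output_dir = tempfile.mkdtemp(prefix="nni_output_")
    os.makedirs(output_dir, exist_ok=True)
    return output_dir


def save_model(model, filename="model.pkl"):
    """Pickle `model` into the output directory and return the file path."""
    path = os.path.join(get_output_dir(), filename)
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return path
```

A trial would call something like `save_model(trained_model)` at the end of training. According to the comment above, files written under NNI_OUTPUT_DIR are uploaded to HDFS once the trial finishes.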

dataDir: hdfs://10.1.1.1:9000/nni
outputDir: hdfs://10.1.1.1:9000/nni
# Configuration to access OpenPAI Cluster
paiConfig:
userName: your_pai_nni_user
@@ -51,10 +49,6 @@ Compared with [LocalMode](LocalMode.md) and [RemoteMachineMode](RemoteMachineMod
* image
* Required key. In pai mode, your trial program will be scheduled by OpenPAI to run in [Docker container](https://www.docker.com/). This key is used to specify the Docker image used to create the container in which your trial will run.
* We already build a docker image [nnimsra/nni](https://hub.docker.com/r/msranni/nni/) on [Docker Hub](https://hub.docker.com/). It contains NNI python packages, Node modules and javascript artifact files required to start experiment, and all of NNI dependencies. The docker file used to build this image can be found at [here](https://github.com/Microsoft/nni/tree/master/deployment/docker/Dockerfile). You can either use this image directly in your config file, or build your own image based on it.
* dataDir
* Optional key. It specifies the HDFS data direcotry for trial to download data. The format should be something like hdfs://{your HDFS host}:9000/{your data directory}
* outputDir
* Optional key. It specifies the HDFS output directory for trial. Once the trial is completed (either succeed or fail), trial's stdout, stderr will be copied to this directory by NNI sdk automatically. The format should be something like hdfs://{your HDFS host}:9000/{your output directory}
* virtualCluster
* Optional key. Set the virtualCluster of OpenPAI. If omitted, the job will run on default virtual cluster.
* shmMB
6 changes: 0 additions & 6 deletions docs/zh_CN/PaiMode.md
@@ -34,8 +34,6 @@ trial:
cpuNum: 1
memoryMB: 8196
image: msranni/nni:latest
dataDir: hdfs://10.1.1.1:9000/nni
outputDir: hdfs://10.1.1.1:9000/nni
# Configuration to access the OpenPAI cluster
paiConfig:
userName: your_pai_nni_user
@@ -54,10 +52,6 @@ paiConfig:
* image
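* Required. In pai mode, the Trial program is scheduled by OpenPAI to run in a [Docker container](https://www.docker.com/). This key specifies the Docker image used for the Trial program's container.
* A prebuilt NNI Docker image [nnimsra/nni](https://hub.docker.com/r/msranni/nni/) is available on [Docker Hub](https://hub.docker.com/). It contains all the Python packages, Node modules, and JavaScript that starting an NNI Experiment depends on. The file used to build this Docker image is [here](https://github.com/Microsoft/nni/tree/master/deployment/docker/Dockerfile). You can use this image directly, or refer to it to build your own image.
* dataDir
* Optional. Specifies the HDFS data directory from which the Trial downloads data. The format should be hdfs://{your HDFS host}:9000/{data directory}
* outputDir
* Optional. Specifies the HDFS output directory for the Trial. After the Trial completes (whether it succeeds or fails), the Trial's stdout and stderr are automatically copied to this directory by NNI. The format should be hdfs://{your HDFS host}:9000/{output directory}
* virtualCluster
* Optional. Sets the virtualCluster (virtual cluster) of OpenPAI. If this parameter is not set, the default virtual cluster is used.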
* shmMB
4 changes: 0 additions & 4 deletions examples/trials/auto-gbdt/config_pai.yml
@@ -23,10 +23,6 @@ trial:
memoryMB: 8196
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
dataDir: hdfs://10.10.10.10:9000/username/nni
#The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
#The username to login pai
userName: username
4 changes: 0 additions & 4 deletions examples/trials/cifar10_pytorch/config_pai.yml
@@ -23,10 +23,6 @@ trial:
memoryMB: 8196
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
dataDir: hdfs://10.10.10.10:9000/username/nni
#The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
#The username to login pai
userName: username
4 changes: 0 additions & 4 deletions examples/trials/ga_squad/config_pai.yml
@@ -23,10 +23,6 @@ trial:
memoryMB: 32869
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
dataDir: hdfs://10.10.10.10:9000/username/nni
#The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
#The username to login pai
userName: username
4 changes: 0 additions & 4 deletions examples/trials/mnist-advisor/config_pai.yml
@@ -27,10 +27,6 @@ trial:
memoryMB: 8196
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
dataDir: hdfs://10.10.10.10:9000/username/nni
#The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
#The username to login pai
userName: username
4 changes: 0 additions & 4 deletions examples/trials/mnist-annotation/config_pai.yml
@@ -22,10 +22,6 @@ trial:
memoryMB: 8196
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
dataDir: hdfs://10.10.10.10:9000/username/nni
#The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
#The username to login pai
userName: username
4 changes: 0 additions & 4 deletions examples/trials/mnist-batch-tune-keras/config_pai.yml
@@ -20,10 +20,6 @@ trial:
memoryMB: 8196
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
dataDir: hdfs://10.10.10.10:9000/username/nni
#The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
#The username to login pai
userName: username
4 changes: 0 additions & 4 deletions examples/trials/mnist-keras/config_pai.yml
@@ -23,10 +23,6 @@ trial:
memoryMB: 8196
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
dataDir: hdfs://10.10.10.10:9000/username/nni
#The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
#The username to login pai
userName: username
4 changes: 0 additions & 4 deletions examples/trials/mnist/config_pai.yml
@@ -23,10 +23,6 @@ trial:
memoryMB: 8196
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
dataDir: hdfs://10.10.10.10:9000/username/nni
#The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
#The username to login pai
userName: username
4 changes: 0 additions & 4 deletions examples/trials/network_morphism/FashionMNIST/config_pai.yml
@@ -30,10 +30,6 @@ trial:
memoryMB: 8196
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
dataDir: hdfs://10.10.10.10:9000/username/nni
#The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
#The username to login pai
userName: username
4 changes: 0 additions & 4 deletions examples/trials/network_morphism/cifar10/config_pai.yml
@@ -30,10 +30,6 @@ trial:
memoryMB: 8196
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
dataDir: hdfs://10.10.10.10:9000/username/nni
#The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
#The username to login pai
userName: username
4 changes: 0 additions & 4 deletions examples/trials/sklearn/classification/config_pai.yml
@@ -23,10 +23,6 @@ trial:
memoryMB: 8196
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
dataDir: hdfs://10.10.10.10:9000/username/nni
#The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
#The username to login pai
userName: username
4 changes: 0 additions & 4 deletions examples/trials/sklearn/regression/config_pai.yml
@@ -23,10 +23,6 @@ trial:
memoryMB: 8196
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
dataDir: hdfs://10.10.10.10:9000/username/nni
#The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
#The username to login pai
userName: username
14 changes: 2 additions & 12 deletions src/nni_manager/training_service/pai/paiConfig.ts
@@ -69,10 +69,6 @@ export class PAIJobConfig {
public readonly jobName: string;
// URL pointing to the Docker image for all tasks in the job
public readonly image: string;
// Data directory existing on HDFS
public readonly dataDir: string;
// Output directory on HDFS
public readonly outputDir: string;
// Code directory on HDFS
public readonly codeDir: string;

@@ -90,12 +86,10 @@ export class PAIJobConfig {
* @param outputDir Output directory on HDFS
* @param taskRoles List of taskRole, one task role at least
*/
constructor(jobName: string, image : string, dataDir : string, outputDir : string, codeDir : string,
constructor(jobName: string, image : string, codeDir : string,
taskRoles : PAITaskRole[], virtualCluster: string) {
this.jobName = jobName;
this.image = image;
this.dataDir = dataDir;
this.outputDir = outputDir;
this.codeDir = codeDir;
this.taskRoles = taskRoles;
this.virtualCluster = virtualCluster;
@@ -130,22 +124,18 @@ export class NNIPAITrialConfig extends TrialConfig {
public readonly cpuNum: number;
public readonly memoryMB: number;
public readonly image: string;
public readonly dataDir: string;
public outputDir: string;

//The virtual cluster job runs on. If omitted, the job will run on default virtual cluster
public virtualCluster?: string;
//Shared memory for one task in the task role
public shmMB?: number;

constructor(command : string, codeDir : string, gpuNum : number, cpuNum: number, memoryMB: number,
image: string, dataDir: string, outputDir: string, virtualCluster?: string, shmMB?: number) {
image: string, virtualCluster?: string, shmMB?: number) {
super(command, codeDir, gpuNum);
this.cpuNum = cpuNum;
this.memoryMB = memoryMB;
this.image = image;
this.dataDir = dataDir;
this.outputDir = outputDir;
this.virtualCluster = virtualCluster;
this.shmMB = shmMB;
}
3 changes: 0 additions & 3 deletions src/nni_manager/training_service/pai/paiData.ts
@@ -70,9 +70,6 @@ export const PAI_TRIAL_COMMAND_FORMAT: string =
--pai_hdfs_output_dir '{9}' --pai_hdfs_host '{10}' --pai_user_name {11} --nni_hdfs_exp_dir '{12}' --webhdfs_path '/webhdfs/api/v1' \
--nni_manager_version '{13}' --log_collection '{14}'`;

export const PAI_OUTPUT_DIR_FORMAT: string =
`hdfs://{0}:9000/`;

// tslint:disable:no-http-string
export const PAI_LOG_PATH_FORMAT: string =
`http://{0}/webhdfs/explorer.html#{1}`;