Support private registry in PAI, Kubeflow and FrameworkController mode #1354
@@ -198,6 +198,8 @@ Trial configuration in kubeflow mode have the following configuration keys:
* image
* Required key. In kubeflow mode, your trial program will be scheduled by Kubernetes to run in [Pod](https://kubernetes.io/docs/concepts/workloads/pods/pod/). This key is used to specify the Docker image used to create the pod where your trail program will run.
* We already build a docker image [msranni/nni](https://hub.docker.com/r/msranni/nni/) on [Docker Hub](https://hub.docker.com/). It contains NNI python packages, Node modules and javascript artifact files required to start experiment, and all of NNI dependencies. The docker file used to build this image can be found at [here](https://github.com/Microsoft/nni/tree/master/deployment/docker/Dockerfile). You can either use this image directly in your config file, or build your own image based on it.
* privateRegistryFilePath
How about privateRegistryAuthPath, privateRegistryTokenPath, or privateRegistryCredentialPath?
changed to privateRegistryAuthPath
@@ -59,6 +59,8 @@ Compared with [LocalMode](LocalMode.md) and [RemoteMachineMode](RemoteMachineMod
* Optional key. Set the virtualCluster of OpenPAI. If omitted, the job will run on default virtual cluster.
* shmMB
* Optional key. Set the shmMB configuration of OpenPAI, it set the shared memory for one task in the task role.
* authFile
Why is the name of this field different from the one in kubeflow mode?
Because the private registry logic differs between PAI and Kubernetes. In PAI, users pass an authFile field in the job config and then submit the job (see https://github.com/microsoft/pai/blob/12dd8d3a7379c264b337b308a320776127ed3ee4/docs/zh_CN/job_tutorial.md). In Kubernetes, users do not pass a file; they need to create a secret from the auth file and pass the secret name in the job config (see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/).
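To make the Kubernetes side of this reply concrete, here is a minimal sketch of the secret that has to exist before a job can refer to it by name. The function and interface names are hypothetical and are not taken from this PR; only the `kubernetes.io/dockerconfigjson` secret type and the `.dockerconfigjson` data key come from the Kubernetes documentation linked above. The sketch assumes the caller has already base64-encoded the contents of the auth file.

```typescript
// Hypothetical shape of a registry-credential Secret manifest.
interface RegistrySecretManifest {
    apiVersion: string;
    kind: string;
    metadata: { name: string };
    type: string;
    data: { [key: string]: string };
}

// Build the manifest for a Secret holding Docker registry credentials.
// `dockerConfigBase64` is assumed to be the base64-encoded auth file content.
function buildRegistrySecretManifest(secretName: string, dockerConfigBase64: string): RegistrySecretManifest {
    return {
        apiVersion: 'v1',
        kind: 'Secret',
        metadata: { name: secretName },
        type: 'kubernetes.io/dockerconfigjson',
        data: { '.dockerconfigjson': dockerConfigBase64 }
    };
}

// The job config never carries the file itself, only the secret name.
const exampleSecret = buildRegistrySecretManifest('example-registry-secret', 'e30=');
const examplePullSecrets = [{ name: exampleSecret.metadata.name }];
```

In PAI mode none of this applies, since the auth file path is passed directly in the job config, which is why the two modes ended up with different key names.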
@@ -59,6 +59,8 @@ Compared with [LocalMode](LocalMode.md) and [RemoteMachineMode](RemoteMachineMod
* Optional key. Set the virtualCluster of OpenPAI. If omitted, the job will run on default virtual cluster.
* shmMB
* Optional key. Set the shmMB configuration of OpenPAI, it set the shared memory for one task in the task role.
* authFile
* Optional key, Set the auth file path for private registry in OpenPAI, [Refer](https://github.com/microsoft/pai/blob/2ea69b45faa018662bc164ed7733f6fdbb4c42b3/docs/faq.md#q-how-to-use-private-docker-registry-job-image-when-submitting-an-openpai-job).
"in OpenPAI" is misleading.
Changed to "while using PAI mode".
@@ -329,8 +329,8 @@ class FrameworkControllerTrainingService extends KubernetesTrainingService imple
* @param frameworkcontrollerJobName job name
* @param podResources pod template
*/
private generateFrameworkControllerJobConfig(trialJobId: string, trialWorkingFolder: string,
frameworkcontrollerJobName : string, podResources : any) : any {
private async generateFrameworkControllerJobConfig(trialJobId: string, trialWorkingFolder: string,
Do you await when calling this function?
fixed.
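For readers following this thread, a minimal sketch of why the question matters: once a function is marked `async`, every caller receives a Promise and must `await` it to get the actual config. The names below are hypothetical stand-ins, not the PR's real functions.

```typescript
// Hypothetical stand-in for a config generator that became async.
async function generateJobConfig(jobName: string): Promise<{ name: string }> {
    return { name: jobName };
}

// Correct caller: awaits the Promise before using the config.
async function submitJob(jobName: string): Promise<string> {
    const config = await generateJobConfig(jobName);
    return config.name;
}

// Without await, the caller holds a Promise rather than the config object,
// so reading a property such as `name` from it would yield undefined at runtime.
const pendingConfig: Promise<{ name: string }> = generateJobConfig('trial-1');
```

The TypeScript compiler catches some of these mistakes through the return type, but not when the result is typed as `any`, as in the original signature.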
Better to add a unit test.
spec = {
containers: containers,
initContainers: initContainers,
restartPolicy: 'OnFailure',
Duplicated code; consider eliminating the duplication by reusing the common part.
fixed.
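One possible shape for the deduplicated code, shown only as a sketch: a single helper builds the pod spec and attaches the pull secret when a secret name is present. The helper name and the `PodSpec` interface are assumptions and may not match what the PR actually did. The `imagePullSecrets` field is the standard Kubernetes pod spec field for referencing registry secrets.

```typescript
// Hypothetical, simplified pod spec type for illustration.
interface PodSpec {
    containers: object[];
    initContainers: object[];
    restartPolicy: string;
    imagePullSecrets?: { name: string }[];
}

// Shared builder: both branches that previously duplicated the spec literal
// can call this and differ only in whether a secret name is supplied.
function buildPodSpec(containers: object[], initContainers: object[], privateRegistrySecretName?: string): PodSpec {
    const spec: PodSpec = {
        containers: containers,
        initContainers: initContainers,
        restartPolicy: 'OnFailure'
    };
    if (privateRegistrySecretName) {
        spec.imagePullSecrets = [{ name: privateRegistrySecretName }];
    }
    return spec;
}
```

An empty string is treated the same as a missing name, which matches the convention in the diff of returning `''` when no auth file path is configured.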
template: {
metadata: {
// tslint:disable-next-line:no-null-keyword
creationTimestamp: null
duplicated code, can be refactored.
fixed
}
]);

if(privateRegistrySecretName) {
missing a space after 'if'
fixed.
protected async createRegistrySecret(filePath: string | undefined): Promise<string> {
if(filePath === undefined || filePath === '') {
return Promise.resolve('');
No need to wrap '' in Promise.resolve; just returning '' is cleaner.
fixed
}
}
);
return Promise.resolve(registrySecretName);
Not necessary to wrap the return value.
fixed
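The point behind the two comments on `Promise.resolve` can be sketched as follows: inside an `async` function, a plain `return` value is wrapped in a Promise automatically, so the explicit wrapper adds nothing. The function names and the returned secret name below are hypothetical and only mirror the signature in the diff.

```typescript
// Explicit wrapping: works, but redundant inside an async function.
async function secretNameWrapped(filePath: string | undefined): Promise<string> {
    if (filePath === undefined || filePath === '') {
        return Promise.resolve('');
    }
    return Promise.resolve('example-registry-secret');
}

// Plain returns: the async function wraps the value in a Promise for us.
async function secretNamePlain(filePath: string | undefined): Promise<string> {
    if (filePath === undefined || filePath === '') {
        return '';
    }
    return 'example-registry-secret';
}
```

Callers see no difference between the two; both resolve to the same string when awaited.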
#755