Refactor code storage logic for trial #2403
SparkSnail
commented
May 5, 2020
- On the local platform, copy code files to the trial's working dir.
- On other platforms, first copy code files to the remote machine, then copy them on the remote machine into the trial's working folder, instead of submitting them from local every time.
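The two bullets above can be illustrated with a minimal local-filesystem sketch: stage the code once per experiment, then give each trial a private copy from the staged folder. The helper names (`copyDirSync`, `stageExperimentCode`, `prepareTrialWorkingDir`) and the folder names `staged-code` and `trials` are hypothetical and chosen only for illustration; they are not taken from this PR.

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Recursively copy a directory using long-standing Node.js fs calls.
function copyDirSync(src: string, dest: string): void {
    fs.mkdirSync(dest, { recursive: true });
    for (const entry of fs.readdirSync(src, { withFileTypes: true })) {
        const from = path.join(src, entry.name);
        const to = path.join(dest, entry.name);
        if (entry.isDirectory()) {
            copyDirSync(from, to);
        } else {
            fs.copyFileSync(from, to);
        }
    }
}

// Hypothetical helper: copy the user's code once into a folder shared by the experiment.
// The folder name 'staged-code' is an assumption made for this sketch.
function stageExperimentCode(codeDir: string, expRootDir: string): string {
    const expCodeDir = path.join(expRootDir, 'staged-code');
    copyDirSync(codeDir, expCodeDir);
    return expCodeDir;
}

// Hypothetical helper: give one trial its own copy, taken from the staged folder
// rather than from the user's original code directory.
function prepareTrialWorkingDir(expCodeDir: string, expRootDir: string, trialId: string): string {
    const trialDir = path.join(expRootDir, 'trials', trialId);
    copyDirSync(expCodeDir, trialDir);
    return trialDir;
}
```

With this layout, a trial that modifies files in its own working folder does not affect other trials, and the original code directory is read only once.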
merge master
Update evolution doc (microsoft#1493)
merge master
augment pylintrc (microsoft#1643)
fix console.log (microsoft#1636)
merge master
Filter prune algo implementation (microsoft#1655)
merge master
Support monitor mode when creating or resuming a new experiment (microsoft#1933)
Add test for documentation build (microsoft#1924)
fix pipeline status badge (microsoft#1942)
merge master
}
const azureKubeflowClusterConfig: FrameworkControllerClusterConfigAzure = <FrameworkControllerClusterConfigAzure>this.fcClusterConfig;
return await this.uploadFolderToAzureStorage(srcDirectory, destDirectory, azureKubeflowClusterConfig.uploadRetryCount);
} else if (this.fcClusterConfig.storage === 'nfs' || this.fcClusterConfig.storage === undefined) {
Or the storage is undefined? Shouldn't this be an error?
In the original design, the storage field is optional; if it is not set, the default value is nfs.
But when storage is undefined, there is no config for storage, so how do you mount it?
This is supposed to be checked on the nnictl side: if users didn't set the storage field, nnictl should check whether they set the nfs configuration.
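The behaviour discussed in this thread (an optional storage field that defaults to nfs, plus a check that the matching configuration actually exists) could be sketched as follows. This is only an illustration: the `ClusterConfigSketch` shape, the `'azureStorage'` value, the nested field names, and `resolveStorage` are assumptions, not the actual NNI config types, and the thread says the real check is meant to live in nnictl rather than in the training service.

```typescript
// Assumed storage kinds for this sketch; only 'nfs' appears verbatim in the diff above.
type StorageKind = 'nfs' | 'azureStorage';

// Hypothetical, simplified config shape used only for illustration.
interface ClusterConfigSketch {
    storage?: StorageKind;
    nfs?: { server: string; path: string };
    azureStorage?: { accountName: string; azureShare: string };
}

// Resolve the effective storage kind and fail early if its config section is missing.
function resolveStorage(config: ClusterConfigSketch): StorageKind {
    // An unset storage field falls back to 'nfs', as described in the thread.
    const storage: StorageKind = config.storage === undefined ? 'nfs' : config.storage;
    if (storage === 'nfs' && config.nfs === undefined) {
        throw new Error('storage resolves to "nfs" but no nfs configuration was provided');
    }
    if (storage === 'azureStorage' && config.azureStorage === undefined) {
        throw new Error('storage is "azureStorage" but no azureStorage configuration was provided');
    }
    return storage;
}
```

Validating up front like this would let the `storage === undefined` branch in the training service assume that an nfs section is present.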
...anager/training_service/kubernetes/frameworkcontroller/frameworkcontrollerTrainingService.ts
@SparkSnail should also verify the code dir logic in the DLTS training service, and update it if needed.
It would also be better to make sure IT is tested for this PR.
src/nni_manager/training_service/kubernetes/kubeflow/kubeflowTrainingService.ts
src/nni_manager/training_service/kubernetes/kubernetesTrainingService.ts
For different training services, the trial working directory should have exactly the same directory structure. I suggest using a folder named
src/nni_manager/training_service/pai/paiK8S/paiK8STrainingService.ts
Sure, there are two places to store code for a trial. One is the common folder for all trials under the experiment folder; the path is
Yes, the trial jobs in DLTS share the same codeDir folder and also need to be refactored.
We can leave this refactor to the next PR.
for (const [rmMeta, executorManager] of this.machineExecutorManagerMap.entries()) {
const executor: ShellExecutor = await executorManager.getAvailableExecutor();
if (executor !== undefined) {
await executor.createFolder(this.remoteExpCodeDir);
This line can also be put in the async promise.
fixed.
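One way to read the suggestion in this thread is to start the folder creation for every machine as its own promise and await them together, instead of awaiting each machine in turn. The sketch below assumes simplified stand-in interfaces (`ExecutorSketch`, `ExecutorManagerSketch`) and a hypothetical function `createFolderOnAllMachines`; it is not the code that was actually committed.

```typescript
// Simplified stand-ins for the executor types seen in the diff; the shapes are assumptions.
interface ExecutorSketch {
    createFolder(folder: string): Promise<void>;
}

interface ExecutorManagerSketch {
    getAvailableExecutor(): Promise<ExecutorSketch | undefined>;
}

// Create remoteDir on every machine concurrently.
// Returns how many machines had an available executor.
async function createFolderOnAllMachines(
    managers: ExecutorManagerSketch[],
    remoteDir: string
): Promise<number> {
    const tasks: Promise<boolean>[] = managers.map(async (manager) => {
        const executor = await manager.getAvailableExecutor();
        if (executor === undefined) {
            return false; // No executor available on this machine; skip it.
        }
        await executor.createFolder(remoteDir);
        return true;
    });
    // Promise.all rejects as soon as any task fails, so errors still surface to the caller.
    const results = await Promise.all(tasks);
    return results.filter((created) => created).length;
}
```

For a `Map` such as the one iterated in the diff, the managers could be passed as `Array.from(map.values())`.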
Please explain what has changed with the folders in the PR description.