support tf2 NAS with non-weight-sharing mode #2541
QuanluZhang
commented
Jun 9, 2020
- support non-weight-sharing mode for tf2
- change naive-tf example to run in non-weight-sharing mode
- add IT for classic nas pytorch
if epoch % 1 == 0:
    print("Epoch {:03d}: Loss: {:.3f}, Accuracy: {:.3%}".format(epoch,
                                                                epoch_loss_avg.result(),
                                                                epoch_accuracy.result()))
Should we report intermediate results here?
No need to report intermediate results, because this example does not use them. Do you think this example should use an assessor?
I think it's for demo purposes. It's fine without it.
test/nni_test/nnitest/run_tests.py
Outdated
@@ -39,6 +39,20 @@ def update_training_service_config(config, training_service):
    deep_update(config, it_ts_config['all'])
    deep_update(config, it_ts_config[training_service])

def nnictl_generate_search_space(test_yml_config, test_case_config, args):
    trial_command = test_yml_config['trial']['command']
    code_dir = args.nni_source_dir + test_case_config['ssgenCodeDir']
Use os.path.join
test/nni_test/nnitest/run_tests.py
Outdated
def nnictl_generate_search_space(test_yml_config, test_case_config, args):
    trial_command = test_yml_config['trial']['command']
    code_dir = args.nni_source_dir + test_case_config['ssgenCodeDir']
    ss_file_path = args.nni_source_dir + test_case_config['ssFilePath']
Use os.path.join
Good suggestion. I just followed the existing code.
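A minimal sketch of what the `os.path.join` suggestion buys over plain concatenation. The directory values below are hypothetical stand-ins for `args.nni_source_dir` and `test_case_config['ssgenCodeDir']`, not values taken from the test configuration:

```python
import os

# Hypothetical stand-ins for args.nni_source_dir and test_case_config['ssgenCodeDir']
nni_source_dir = "/home/user/nni"          # note: no trailing separator
ssgen_code_dir = "examples/nas/classic_nas"

# Plain concatenation silently glues the two parts together
concatenated = nni_source_dir + ssgen_code_dir
print(concatenated)

# os.path.join inserts the separator only when it is needed
joined = os.path.join(nni_source_dir, ssgen_code_dir)
print(joined)

# Caveat: if the second component is absolute, the first one is discarded
discarded = os.path.join(nni_source_dir, "/examples/nas/classic_nas")
print(discarded)
```

Concatenation only works when the source directory happens to end with a separator, which is why the join form is the safer default. The caveat about absolute second components is worth keeping in mind if the config values may start with a slash.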
test/nni_test/nnitest/run_tests.py
Outdated
@@ -51,6 +65,12 @@ def prepare_config_file(test_case_config, it_config, args):
    if sys.platform == 'win32' and args.ts == 'local':
        test_yml_config['trial']['command'] = test_yml_config['trial']['command'].replace('python3', 'python')

    # generate search space file for classic nas
    if test_case_config.get('doSsgen') is not None:
doSsGen?
ssgenCodeDir: examples/nas/classic_nas
# this file is automatically generated by nnictl ss_gen
ssFilePath: test/config/examples/nni-nas-search-space.json
We can split this test case into 2 cases; then we will not need to handle special keys such as 'doSsgen' just for this case.

- name: classic-nas-gen-ss
  configFile: test/config/examples/classic-nas-pytorch.yml
  launchCommand: nnictl ss_gen --trial_command="python3 mnist.py --epochs 1" --trial_dir=../examples/nas/classic_nas --file=test/config/examples/nni-nas-search-space.json
  stopCommand:
  experimentStatusCheck: False

- name: classic-nas-pytorch
  configFile: test/config/examples/classic-nas-pytorch.yml
  # remove search space file
  stopCommand: nnictl stop; python3 -c 'import os; os.remove("test/config/examples/nni-nas-search-space.json")'
good point
by the way, are the two cases guaranteed to be tested sequentially one after the other?
Yes, the order is guaranteed. You can run the 2 cases on your local system to check whether they work:
python3 nni_test/nnitest/run_tests.py --config config/integration_tests.yml --case classic-nas
It seems stopCommand cannot be a combination of two commands.
Yes, it seems the combination of two commands does not work for stopCommand because of the shlex.split. It seems it can be fixed by replacing proc = subprocess.run(shlex.split(launch_command)) with:
proc = subprocess.run(launch_command, shell=True)
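The difference between the two calls can be seen with a made-up two-command string (not one of the real test commands). This is only a sketch and assumes a POSIX shell is available:

```python
import shlex
import subprocess

# Hypothetical command string with two commands separated by ';'
command = "echo first; echo second"

# shlex.split only tokenizes on whitespace: ';' stays attached to 'first'
# and would be passed along as an ordinary argument, not a command separator.
tokens = shlex.split(command)
print(tokens)

# With shell=True the whole string is handed to the shell, which interprets ';'
# and runs both commands in order.
result = subprocess.run(command, shell=True, capture_output=True, text=True)
print(result.stdout)
```

Since `shell=True` lets the shell interpret the entire string, it is best reserved for commands that come from trusted configuration files such as the test YAML here.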
test/config/integration_tests.yml
Outdated
@@ -72,6 +72,14 @@ testCases:
- name: nested-ss
  configFile: test/config/examples/mnist-nested-search-space.yml

- name: classic-nas-pytorch
Is this a tf2 test case? If yes, it needs to be moved into test/config/integration_tests_tf2.yml
This one can be tested with tf 1.x. I will add another one in test/config/integration_tests_tf2.yml
test/nni_test/nnitest/run_tests.py
Outdated
@@ -75,7 +75,9 @@ def run_test_case(test_case_config, it_config, args):
    stop_command = get_command(test_case_config, 'stopCommand')
    print('Stop command:', stop_command, flush=True)
    if stop_command:
        subprocess.run(shlex.split(stop_command))
        for command in stop_command.split('&'):
            print('Command:', command, flush=True)
better to fix it like this to support multiple commands:
subprocess.run(launch_command, shell=True)
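A small sketch of why splitting the stop command on '&' (as in the hunk above) can be fragile compared with letting the shell parse it. Both command strings are made up for illustration:

```python
# Hypothetical stop command chaining two commands with '&&'
stop_command = "nnictl stop && echo cleaned"

# Splitting on a single '&' cuts the '&&' operator in half,
# leaving an empty fragment between the two commands.
fragments = stop_command.split('&')
print(fragments)

# An '&' inside a quoted argument is split as well, breaking the argument apart.
quoted_command = "python3 -c 'print(\"a&b\")'"
quoted_fragments = quoted_command.split('&')
print(quoted_fragments)
```

Handing the unsplit string to the shell avoids both problems, because the shell's own parser handles operators and quoting.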