This repository has been archived by the owner on Sep 18, 2024. It is now read-only.
PPO tuner for NAS, supports NNI's NAS interface (#1380)

Merged: QuanluZhang merged 32 commits into microsoft:dev-nas-tuner from QuanluZhang:dev-ppo-tuner on Aug 7, 2019.

Changes shown are from 31 of the 32 commits, all authored by zhangql08hit:

- 1adc820 ppo tuner
- b28fd20 pass simplified mnist-nas example
- be20b10 general ppo_tuner, is debugging
- 8513c69 pass mnist-nas example, converge
- 9806a50 remove unused files
- 36f96f2 move the specified logic from tuner.py to ppo_tuner.py
- 0da4d60 fix bug
- 78796f3 remove unused function
- a51130f remove useless comments and print
- 7e3253e fix python syntax error
- 1b07b2c add comments
- d6bdb10 add optional arguments
- f823d3f add requirements
- a5fd738 support package install
- 4584e54 update doc
- d8a40c6 support unified search space
- 86dc53d Merge branch 'master' of github.com:Microsoft/nni into dev-ppo-tuner
- ddc54e8 fix bug
- 7131458 fix pylint in ppo_tuner.py
- 98d234a fix pylint in policy.py
- 1410a39 fix pylint in util.py
- 1bed65f fix pylint in distri.py
- 99de362 fix pylint in model.py
- bbcfef7 remove newlines
- c719f96 update doc
- 7f72174 update doc
- 65cc38c fix bug
- 5df3195 fix bug
- 0770849 add one arg for ppotuner, add callback in msg_dispatcher
- b209e04 add fault tolerance to tolerate trial failure
- 8d9d44a fix bug
- 5e78027 trivial change
New file (20 lines): an example experiment configuration that selects the PPOTuner.

```yaml
authorName: NNI-example
experimentName: example_mnist
trialConcurrency: 1
maxExecDuration: 100h
maxTrialNum: 10000
#choice: local, remote, pai
trainingServicePlatform: local
#choice: true, false
useAnnotation: true
tuner:
  #choice: TPE, Random, Anneal, Evolution, BatchTuner, MetisTuner
  #SMAC (SMAC should be installed through nnictl)
  #codeDir: ~/nni/nni/examples/tuners/random_nas_tuner
  builtinTunerName: PPOTuner
  classArgs:
    optimize_mode: maximize
trial:
  command: python3 mnist.py
  codeDir: .
  gpuNum: 0
```
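Before handing a file like this to nnictl, its structure can be sanity-checked programmatically. A minimal sketch, assuming PyYAML is available; the inline string mirrors the configuration above:

```python
import yaml  # PyYAML, assumed to be installed

# Inline copy of the experiment configuration shown above.
CONFIG = """
authorName: NNI-example
experimentName: example_mnist
trialConcurrency: 1
maxExecDuration: 100h
maxTrialNum: 10000
trainingServicePlatform: local
useAnnotation: true
tuner:
  builtinTunerName: PPOTuner
  classArgs:
    optimize_mode: maximize
trial:
  command: python3 mnist.py
  codeDir: .
  gpuNum: 0
"""

conf = yaml.safe_load(CONFIG)

# Basic structural checks on the fields the tuner and trial depend on.
assert conf["tuner"]["builtinTunerName"] == "PPOTuner"
assert conf["tuner"]["classArgs"]["optimize_mode"] in ("maximize", "minimize")
assert conf["trialConcurrency"] >= 1
```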
The PR also adds an empty file, plus the following new 198-line module (likely distri.py, per the pylint commit messages), which implements the distributions used for sampling from the hidden state:
```python
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge,
# to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

"""
Functions for sampling from hidden state.
"""

import tensorflow as tf

from .util import fc


class Pd:
    """
    A particular probability distribution.
    """
    def flatparam(self):
        raise NotImplementedError

    def mode(self):
        raise NotImplementedError

    def neglogp(self, x):
        # Usually it's easier to define the negative log-probability.
        raise NotImplementedError

    def kl(self, other):
        raise NotImplementedError

    def entropy(self):
        raise NotImplementedError

    def sample(self):
        raise NotImplementedError

    def logp(self, x):
        return -self.neglogp(x)

    def get_shape(self):
        return self.flatparam().shape

    @property
    def shape(self):
        return self.get_shape()

    def __getitem__(self, idx):
        return self.__class__(self.flatparam()[idx])


class PdType:
    """
    Parametrized family of probability distributions.
    """
    def pdclass(self):
        raise NotImplementedError

    def pdfromflat(self, flat, mask, nsteps, size, is_act_model):
        return self.pdclass()(flat, mask, nsteps, size, is_act_model)

    def pdfromlatent(self, latent_vector, init_scale, init_bias):
        raise NotImplementedError

    def param_shape(self):
        raise NotImplementedError

    def sample_shape(self):
        raise NotImplementedError

    def sample_dtype(self):
        raise NotImplementedError

    def param_placeholder(self, prepend_shape, name=None):
        return tf.placeholder(dtype=tf.float32, shape=prepend_shape + self.param_shape(), name=name)

    def sample_placeholder(self, prepend_shape, name=None):
        return tf.placeholder(dtype=self.sample_dtype(), shape=prepend_shape + self.sample_shape(), name=name)
```
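To make the Pd interface above concrete outside of TensorFlow, here is a hedged NumPy illustration (not the PR's implementation) of a categorical distribution exposing the same method names; the class name `NumpyCategoricalPd` is invented for this sketch:

```python
import numpy as np

class NumpyCategoricalPd:
    """Illustrative counterpart of a categorical Pd, over a 1-D logits vector."""
    def __init__(self, logits):
        self.logits = np.asarray(logits, dtype=np.float64)

    def mode(self):
        # Most probable category.
        return int(np.argmax(self.logits))

    def _probs(self):
        a = self.logits - self.logits.max()   # max-shift for numerical stability
        e = np.exp(a)
        return e / e.sum()

    def neglogp(self, x):
        # -log softmax(logits)[x], computed via log-sum-exp.
        a = self.logits - self.logits.max()
        logz = np.log(np.exp(a).sum())
        return float(logz - a[x])

    def entropy(self):
        p = self._probs()
        return float(-(p * np.log(p)).sum())

    def sample(self, rng):
        # Gumbel-max trick, as in the TensorFlow sample() above.
        u = rng.uniform(size=self.logits.shape)
        return int(np.argmax(self.logits - np.log(-np.log(u))))

pd = NumpyCategoricalPd([2.0, 0.5, 0.5])
assert pd.mode() == 0
# exp(-neglogp(x)) recovers the softmax probability of x.
assert abs(np.exp(-pd.neglogp(0)) - pd._probs()[0]) < 1e-12
```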
```python
class CategoricalPd(Pd):
    """
    Categorical probability distribution.
    """
    def __init__(self, logits, mask_npinf, nsteps, size, is_act_model):
        self.logits = logits
        self.mask_npinf = mask_npinf
        self.nsteps = nsteps
        self.size = size
        self.is_act_model = is_act_model

    def flatparam(self):
        return self.logits

    def mode(self):
        return tf.argmax(self.logits, axis=-1)

    @property
    def mean(self):
        return tf.nn.softmax(self.logits)

    def neglogp(self, x):
        """
        return tf.nn.sparse_softmax_cross_entropy_with_logits(logits=self.logits, labels=x)
        Note: we can't use sparse_softmax_cross_entropy_with_logits because
        the implementation does not allow second-order derivatives...
        """
        if x.dtype in {tf.uint8, tf.int32, tf.int64}:
            # one-hot encoding
            x_shape_list = x.shape.as_list()
            logits_shape_list = self.logits.get_shape().as_list()[:-1]
            for xs, ls in zip(x_shape_list, logits_shape_list):
                if xs is not None and ls is not None:
                    assert xs == ls, 'shape mismatch: {} in x vs {} in logits'.format(xs, ls)

            x = tf.one_hot(x, self.logits.get_shape().as_list()[-1])
        else:
            # already encoded
            assert x.shape.as_list() == self.logits.shape.as_list()

        return tf.nn.softmax_cross_entropy_with_logits_v2(
            logits=self.logits,
            labels=x)

    def kl(self, other):
        """KL divergence between this distribution and ``other``."""
        a0 = self.logits - tf.reduce_max(self.logits, axis=-1, keepdims=True)
        a1 = other.logits - tf.reduce_max(other.logits, axis=-1, keepdims=True)
        ea0 = tf.exp(a0)
        ea1 = tf.exp(a1)
        z0 = tf.reduce_sum(ea0, axis=-1, keepdims=True)
        z1 = tf.reduce_sum(ea1, axis=-1, keepdims=True)
        p0 = ea0 / z0
        return tf.reduce_sum(p0 * (a0 - tf.log(z0) - a1 + tf.log(z1)), axis=-1)

    def entropy(self):
        """Compute entropy."""
        a0 = self.logits - tf.reduce_max(self.logits, axis=-1, keepdims=True)
        ea0 = tf.exp(a0)
        z0 = tf.reduce_sum(ea0, axis=-1, keepdims=True)
        p0 = ea0 / z0
        return tf.reduce_sum(p0 * (tf.log(z0) - a0), axis=-1)

    def sample(self):
        """Sample from logits using the Gumbel-max trick."""
        if not self.is_act_model:
            re_res = tf.reshape(self.logits, [-1, self.nsteps, self.size])
            masked_res = tf.math.add(re_res, self.mask_npinf)
            re_masked_res = tf.reshape(masked_res, [-1, self.size])

            u = tf.random_uniform(tf.shape(re_masked_res), dtype=self.logits.dtype)
            return tf.argmax(re_masked_res - tf.log(-tf.log(u)), axis=-1)
        else:
            u = tf.random_uniform(tf.shape(self.logits), dtype=self.logits.dtype)
            return tf.argmax(self.logits - tf.log(-tf.log(u)), axis=-1)

    @classmethod
    def fromflat(cls, flat):
        return cls(flat)
```
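The kl and entropy methods above both rely on subtracting the per-row maximum (the log-sum-exp trick) before exponentiating. A NumPy sketch checking that this shifted formula agrees with the textbook definition of KL divergence; the function names are invented for this illustration:

```python
import numpy as np

def stable_kl(logits0, logits1):
    """KL(p0 || p1) computed with max-shifted logits, mirroring CategoricalPd.kl."""
    a0 = logits0 - logits0.max(axis=-1, keepdims=True)
    a1 = logits1 - logits1.max(axis=-1, keepdims=True)
    ea0, ea1 = np.exp(a0), np.exp(a1)
    z0 = ea0.sum(axis=-1, keepdims=True)
    z1 = ea1.sum(axis=-1, keepdims=True)
    p0 = ea0 / z0
    return (p0 * (a0 - np.log(z0) - a1 + np.log(z1))).sum(axis=-1)

def naive_kl(logits0, logits1):
    """Textbook sum p*log(p/q) via explicit softmax, for comparison."""
    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    p, q = softmax(logits0), softmax(logits1)
    return (p * (np.log(p) - np.log(q))).sum(axis=-1)

rng = np.random.default_rng(0)
l0 = rng.normal(size=(4, 6))
l1 = rng.normal(size=(4, 6))
assert np.allclose(stable_kl(l0, l1), naive_kl(l0, l1))
assert np.allclose(stable_kl(l0, l0), 0.0)  # KL of a distribution with itself is zero
```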
```python
class CategoricalPdType(PdType):
    """
    Factory for CategoricalPd.
    """
    def __init__(self, ncat, nsteps, np_mask, is_act_model):
        self.ncat = ncat
        self.nsteps = nsteps
        self.np_mask = np_mask
        self.is_act_model = is_act_model

    def pdclass(self):
        return CategoricalPd

    def pdfromlatent(self, latent_vector, init_scale=1.0, init_bias=0.0):
        """Add a fully connected layer and create a CategoricalPd."""
        pdparam, mask, mask_npinf = _matching_fc(latent_vector, 'pi', self.ncat, self.nsteps,
                                                 init_scale=init_scale, init_bias=init_bias,
                                                 np_mask=self.np_mask, is_act_model=self.is_act_model)
        return self.pdfromflat(pdparam, mask_npinf, self.nsteps, self.ncat, self.is_act_model), pdparam, mask, mask_npinf

    def param_shape(self):
        return [self.ncat]

    def sample_shape(self):
        return []

    def sample_dtype(self):
        return tf.int32


def _matching_fc(tensor, name, size, nsteps, init_scale, init_bias, np_mask, is_act_model):
    """
    Add an fc op, and add a mask op when not in act mode.
    """
    if tensor.shape[-1] == size:
        assert False  # this branch is not expected to be taken
        return tensor
    else:
        mask = tf.get_variable("act_mask", dtype=tf.float32, initializer=np_mask[0], trainable=False)
        mask_npinf = tf.get_variable("act_mask_npinf", dtype=tf.float32, initializer=np_mask[1], trainable=False)
        res = fc(tensor, name, size, init_scale=init_scale, init_bias=init_bias)
        if not is_act_model:
            re_res = tf.reshape(res, [-1, nsteps, size])
            masked_res = tf.math.multiply(re_res, mask)
            re_masked_res = tf.reshape(masked_res, [-1, size])
            return re_masked_res, mask, mask_npinf
        else:
            return res, mask, mask_npinf
```
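The masking idea used in sample() and _matching_fc, adding a mask of -inf entries to the logits so that invalid choices can never win the Gumbel-max argmax, can be checked in isolation. A NumPy sketch, with function and variable names invented for this illustration:

```python
import numpy as np

def masked_gumbel_sample(logits, neginf_mask, rng):
    """Sample one category via the Gumbel-max trick, with invalid choices
    suppressed by adding -inf, mirroring CategoricalPd.sample above."""
    masked = logits + neginf_mask                  # invalid entries become -inf
    u = rng.uniform(size=logits.shape)
    return int(np.argmax(masked - np.log(-np.log(u))))

rng = np.random.default_rng(42)
logits = np.array([1.0, 2.0, 3.0, 4.0])
mask = np.array([0.0, -np.inf, 0.0, -np.inf])      # positions 1 and 3 are invalid

samples = [masked_gumbel_sample(logits, mask, rng) for _ in range(1000)]
assert set(samples) <= {0, 2}                      # masked choices are never drawn
```

This is why the tuner keeps a separate `mask_npinf` alongside the multiplicative mask: multiplying a logit by zero still leaves it eligible for argmax, while adding -inf removes it entirely.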
Review comment: maybe the installation command should be mentioned here.