Add option to take parameters from bottoms #2166
Closed
After #2165.
This PR is the counterpart to #2079; together they propose a (temporary?) solution to #1474.
An option is added to `LayerParameter` called `param_bottoms`, which gives a number of parameter blobs to take from (additional) bottoms. This allows one to break the artificial division between bottoms and parameters. It's implemented in the `Layer` base class, so one can use existing layers unmodified in this way.
One issue with this: while `Forward_*`, `Backward_*`, and `LayerSetUp` calls are wrapped by calls from `Layer`, `Reshape` is not, and is directly passed `bottom`. Thus we cannot perform any option-based finagling of bottoms without changing this call. The somewhat hacky solution here is to add an extra nullary wrapper method, `ReshapeOnly`, that lets the layer decide for itself what its bottoms and tops are. A better (but more aggressive) solution might be to go ahead and change all the method types to reflect the fact that `bottom` and `top` aren't actually allowed to change between set up and forward/backward/reshape calls.
This PR provides an immediate, but perhaps unsatisfying solution to #1474. The issue of extra semantic/naming information between bottoms and params is not addressed (but maybe it's not as important given a higher-level way to write nets, e.g., #2086). Neither is the issue of computing param sizes, which is a little less automatic when using this option. (Further, size checks don't exist for params-from-bottoms.)
We could merge this PR, or something like it, as a temporary way of making layers more flexible until a final solution is built. Or we could keep this as a PR until the param/bottom distinction is cleanly removed, and layer code is updated accordingly.
Finally, a usage note: while you can write any number for `param_bottoms`, the only values that will work are zero or the number of parameter blobs expected by the layer (which is not otherwise specified). It's also possible to use a non-zero `param_bottoms` while also loading params, but that's not likely to work and should probably just be disabled.
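The constraint in the usage note could be written down as a check along the following lines. This is only a sketch and not part of this PR: `ParamBottomsValid` is a hypothetical helper, and it assumes the expected number of parameter blobs is available from somewhere, which, as noted, layers do not currently specify.

```cpp
#include <cstddef>

// Hypothetical check, not part of this PR: returns true when param_bottoms
// is usable for a layer that expects expected_params parameter blobs.
bool ParamBottomsValid(std::size_t param_bottoms, std::size_t expected_params,
                       bool loading_params) {
  if (param_bottoms == 0) return true;      // ordinary layer-owned params
  if (loading_params) return false;         // do not mix with loaded params
  return param_bottoms == expected_params;  // all params, or none
}
```

Rejecting the combination with loaded params corresponds to the "should probably just be disabled" suggestion above.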