Modular model definitions #1290
```cpp
CHECK(parse) << "Failed to parse NetParameter file: " << import.net();
CHECK(layer.has_name() && layer.name().length() > 0)
    << "Import layer must have a name";
LoadImports(net, target, ResolveImportName(layer.name(), pwd));
```
Doesn't look like this recursion checks for self-inclusion. That is, if foo.prototxt imports bar.prototxt, and bar.prototxt imports foo.prototxt, the recursion doesn't end until the stack overflows.
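One minimal way to guard against such cycles, sketched here with hypothetical names (`ImportGuard`, `Enter`, `Leave` are not from this PR), is to track the set of files currently on the import stack and fail fast on re-entry:

```cpp
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical sketch: keep the chain of files currently being imported,
// so foo.prototxt -> bar.prototxt -> foo.prototxt fails with an error
// instead of recursing until the stack overflows.
class ImportGuard {
 public:
  // Call before loading a file; throws if that file is already being loaded.
  void Enter(const std::string& filename) {
    if (!active_.insert(filename).second) {
      throw std::runtime_error("Cyclic import detected: " + filename);
    }
  }
  // Call after the file and all of its imports have been loaded.
  void Leave(const std::string& filename) { active_.erase(filename); }

 private:
  std::set<std::string> active_;  // files on the current import stack
};
```

LoadImports would Enter the resolved filename before recursing and Leave it afterwards; a diamond (two siblings importing the same module) would then still be allowed, while a true cycle is rejected.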
I'm a bit confused by the variable naming scheme. In the lenet example, I don't quite get why

Using different paths for cp1 and cp2 was only to show the two ways a module can be imported. I think adding modules from the same file would be great. No problem with
```diff
@@ -252,6 +252,7 @@ message LayerParameter {
   HINGE_LOSS = 28;
   IM2COL = 11;
   IMAGE_DATA = 12;
+  IMPORT = 39;
```
The use case of this layer is very similar to the HTML template framework commonly used in web development where the variables are usually replaced by data fetched from the databases. Calling it TEMPLATE layer sounds more natural.
Think about how definitions are reused in C++. Header files are "include"d and namespaces are "use"d. So instead of "file" and "reference", "include" and "use" are more familiar choices.
@cypof well done! I like the file system analogy for naming and the general variable substitution, since it encompasses any variation to bottoms / tops, configuration fields, and weight sharing. While all-in-one nets de-duped related definitions, there is still plenty of duplication within model definitions, like the convolution + pooling pairs you've highlighted.

To keep in line with unified model definitions, one could ideally define a module as (1) a group of layers in the same prototxt or (2) a separate prototxt file for import, as suggested by @jyegerlehner. (1) could be done by introducing a

Naming the layer

Thanks for bundling an example and test. To clarify the example usage, you could include comments to explain the absolute / relative naming inline in the definition.

@jeffdonahue @longjon and I are all in deadline mode for CVPR but are certainly interested in richer and modular net definitions.
```diff
@@ -269,7 +269,7 @@ endif

 # Debugging
 ifeq ($(DEBUG), 1)
-	COMMON_FLAGS += -DDEBUG -g -O0
+	COMMON_FLAGS += -DDEBUG -g -O0 -DBOOST_NOINLINE='__attribute__ ((noinline))'
```
What's the story with `-DBOOST_NOINLINE='__attribute__ ((noinline))'`?
I understood the example is showing both relative and absolute ways of naming. My question was about something different, but it doesn't seem like it's confusing anyone else. I pulled your dev fork and ran your modulized-lenet and all seems well.

Not sure if the intent is to add the ModuleParameter scheme to this PR, as has been discussed above and in #1169. If not, I can try adding that after this PR is merged. Or if our maintainers decide adding that is a prerequisite to merging, I could conceivably submit a PR to your fork.
```diff
@@ -65,6 +65,18 @@ void WriteProtoToBinaryFile(const Message& proto, const char* filename) {
   CHECK(proto.SerializeToOstream(&output));
 }

+string ReadFile(const string& filename) {
+  std::ifstream in(filename.c_str(), std::ios::in | std::ios::binary);
```
You can use `std::ifstream in(filename.c_str(), std::ios::in | std::ios::binary | std::ios::ate);` to open the file and seek to the end of the file at the same time.
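For reference, here is a self-contained sketch of ReadFile using the suggested `ate` flag (my rendering of the suggestion, not necessarily the PR's final code): opening at end-of-file makes `tellg()` return the size, so the whole file can be read in one allocation.

```cpp
#include <fstream>
#include <string>

// Sketch: open at end-of-file (std::ios::ate) so tellg() yields the size,
// then seek back and read the whole file into one pre-sized string.
std::string ReadFile(const std::string& filename) {
  std::ifstream in(filename.c_str(),
                   std::ios::in | std::ios::binary | std::ios::ate);
  if (!in) {
    return std::string();  // real code would CHECK or throw here
  }
  const std::streamsize size = in.tellg();
  std::string contents(static_cast<std::string::size_type>(size), '\0');
  in.seekg(0, std::ios::beg);
  in.read(&contents[0], size);
  return contents;
}
```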
@cypof I have defined GoogleNet using
I don't want to retard this momentum, but I have some reservations about this (which I've discussed with @shelhamer previously):
I know modularity is important and currently we don't provide it, and some are ready to use this PR right now, so I don't want to stall this effort, but I also don't feel comfortable committing to this path. Maybe we can merge to a feature branch? (Personally my view is that rather than confining our net definition language to protobuf, or some hacks on top of protobuf, we should treat protobuf as a human-readable intermediate language, and provide interfaces ("DSLs") in real languages (at least Python) for building nets.)
@shelhamer @longjon has some good points. Where are we on the effort to define nets in Python? Does it still make sense to finish this one?
@cypof I do agree that @longjon's points block merge to canonical Caffe and

Right now one can wield the Python protobuf bindings to make a net like

My proposal is to make a caffe.model submodule of pycaffe with helpers and
@longjon I understand your concerns; however, currently there is no good alternative to avoid redundant prototxt, which is prone to errors. What about creating a tool within Caffe that could read templates and generate the prototxt files, while the Python interface comes up to speed? This would keep Caffe networks as pure prototxt, and serve as another tool to generate networks.
@sguada I'm happy with the idea of providing tools to produce prototxts, as it's decoupled from defining the input language, and I'm happy with the idea of providing stopgap tools before things are eventually done the right way. But shell is already a better tool for this than what we have here, as far as I can tell. Witness:

```shell
#!/bin/bash
read -d '' MODULE << END
layers {
  name: 'example'
  type: BORING
}
END

function module_func {
cat << END
layers {
  name: 'functional'
  type: EXCITING
  amount_of_awesome: $1
}
END
}

cat << END
layers {
  name: 'input'
}
$MODULE
$MODULE
$(module_func "'a lot'")
END
```

which produces, as you would expect:
Everyone has shell, and you can do anything you want instead of just string substitution, while the latter is still easy. Or you can do the same in Perl or Python if that's more your taste... |
@longjon thanks for your illustrative bash example; however, I think writing scripts that know nothing about protobuf will become cumbersome pretty quickly. The chances of introducing errors are high, and it would easily need to become a protobuf parser.
@sguada do you plan to extract the import code into a separate tool? Net.Init could share this code. Sorry for not helping finish this one; right now I'm trying to get ImageNet results on the distrib training PR.
@cypof that's a good division of labor. We're gearing up to take a closer look.
@cypof if you don't mind I will reuse pieces of your code, but renaming it as

Usage:
@cypof I'm glad to hear that you are working on the distrib training PR. Let us know when we can take a look.
That's exactly my point. Unless I've badly misread it, the code that's doing module insertion here knows exactly as much as shell about protobuf, i.e., nothing. (Okay, it parses modules before insertion, so you can be slightly more abusive with shell in ways no reasonable person would attempt, but string substitution is performed on strings, not any parsed thing, just like shell.)
@longjon partially agreed: the module insertion only needs to worry about one layer; the rest is untouched and parsed as prototxt, and after the string substitution the net is parsed again, which allows errors to be discovered sooner. Just to be clear, I'm not against adding other tools to achieve similar results, and this tool could be deprecated in the future if it is superseded by others.
To get an inception module with something similar to the one-liner of Torch7

Then, an inception module equivalent to the above Torch7 example can be defined as simply as the following.

The module definition can be expanded into more verbose layer definitions by an InceptionModuleExpander.
Using templates #1518 one can define Inception layers as
A way to import a protobuf model into another one, by specifying a layer of type IMPORT. We went for a file system analogy, to allow referencing layers and blobs from different parts of the model:
Imports can be configured using ${variables}, which are applied during load using a simple string replace.
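A minimal sketch of what such a substitution pass could look like (`ApplyVariables` is a hypothetical name; the PR describes the step only as a simple string replace):

```cpp
#include <map>
#include <string>

// Hypothetical sketch of the load-time substitution described above:
// every occurrence of ${name} in the imported prototxt text is replaced
// by the value configured on the IMPORT layer.
std::string ApplyVariables(std::string text,
                           const std::map<std::string, std::string>& vars) {
  for (std::map<std::string, std::string>::const_iterator it = vars.begin();
       it != vars.end(); ++it) {
    const std::string token = "${" + it->first + "}";
    std::string::size_type pos = 0;
    while ((pos = text.find(token, pos)) != std::string::npos) {
      text.replace(pos, token.size(), it->second);
      pos += it->second.size();  // skip past the value to avoid re-matching
    }
  }
  return text;
}
```

After substitution the resulting text would be parsed as a NetParameter again, so a bad or missing substitution surfaces as a parse error at load time rather than later.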
We modified mnist/lenet_train_test.prototxt as an example, by exporting the Conv/Pool part as a module that is imported twice.
In lenet_train_test.prototxt:
lenet_conv_pool.prototxt: