SSD support in NNVM #1214
Conversation
I added the SSD inference model json file to avoid cloning the mxnet repo and then creating the symbol. Do we have a better place to store it? dmlc/web?
Do not include the json file in the repo. The test environment does have mxnet installed, so you can directly import mxnet.
We need to call some API in the mxnet example folder. Or we can export the inference model and save it somewhere.
OK, then maybe it makes sense to create a github repo, or a gist, to host the json. In the long term, it would be nice to just use the standard gluon-cv API.
@ZihengJiang can you also help review?
Can we write something like: block = model_zoo.get_model('ssd_512_resnet50_v1_voc', pretrained=True)
It seems that gluoncv implements SSD slightly differently: multibox_detection is split into smaller operators and multibox_prior is replaced by a gluon block. Also, the gluon block now has three outputs. In this case from_mxnet(block) will give an error. For now maybe we just use the legacy implementation.
In that case, it would be REALLY nice if we could simply upgrade from_mxnet to support the model from gluoncv; this would make users' lives much easier.
This is the list of operators exported by gluoncv SSD network.
As long as we have NMS extracted in TVM, we are good to load gluoncv SSD models.
.gitignore
Outdated
@@ -98,7 +98,6 @@ build_*
Win32
*.dir
perf
nnvm
Could you rebase to master? This is already merged.
for input_name in extra_op_graph.symbol.list_input_names():
    if input_name in params:
        extra_op_params[input_name] = params[input_name]
        params.remove(input_name)
The extra_lib op can be used generically.
Removing params!! What if a param is shared?
A bit unrealistic, but something to be aware of.
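The shared-parameter concern above can be illustrated with a small pure-Python sketch (the graph objects and names here are stand-ins, not the actual nnvm structures): if a parameter is used by both the main graph and the split-off op, removing it from `params` would break the main graph, so a safer partition keeps shared params in both dicts.

```python
def partition_params(params, main_inputs, extra_inputs):
    """Split `params` between the main graph and the extra (split-off) op.

    A param whose name appears in both input sets is kept in BOTH result
    dicts instead of being removed from the main one.
    """
    main_params = {k: v for k, v in params.items() if k in main_inputs}
    extra_params = {k: v for k, v in params.items() if k in extra_inputs}
    return main_params, extra_params


params = {"w1": 1, "w2": 2, "shared": 3}
main_params, extra_params = partition_params(
    params, main_inputs={"w1", "shared"}, extra_inputs={"w2", "shared"})
```

Note that the original `params` dict is left untouched, so later callers that still hold a reference to it are unaffected.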
nnvm/src/top/vision/nms.cc
Outdated
CHECK_EQ(in_attrs->size(), 2U) << "Inputs: [data, valid_count]";
TShape dshape = in_attrs->at(0);
TShape vshape = in_attrs->at(1);
CHECK_EQ(dshape.ndim(), 3U) << "Provided: " << dshape;
The error message could be more informative, to differentiate which check failed.
TShape lshape = in_attrs->at(1);
TShape ashape = in_attrs->at(2);
CHECK_EQ(cshape.ndim(), 3U) << "Provided: " << cshape;
CHECK_EQ(lshape.ndim(), 2U) << "Provided: " << lshape;
The error message could be more informative, to differentiate which check failed.
@kevinthesun Thanks, you brought up the point of dmlc/web. @tqchen
@@ -146,3 +147,39 @@ def gradients(ys, xs, grad_ys=None):
        if isinstance(xs, list) else len(xs.list_output_names())
    ret = [grad_g.symbol[i] for i in range(nx)]
    return ret


def split_last_op(graph):
Hopefully we will have a formal graph-splitting function in the future rather than this hacky one.
Yes. A general subgraph interface requires more thought and a more complete design. This is a short-term solution for limited use cases.
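As a rough illustration of what `split_last_op` is meant to do, here is a pure-Python sketch over a toy graph (a topologically ordered list of node dicts — these are stand-ins, not the actual nnvm graph structures). It also includes the failsafe for a last node without an op that a reviewer asks for below.

```python
def split_last_op(nodes):
    """Split a toy graph (topologically ordered list of op nodes) into
    (main_graph, last_op), so the last op can be compiled separately.

    Raises if the graph is empty or the last node carries no op.
    """
    if not nodes or nodes[-1].get("op") is None:
        raise ValueError("last node has no op; nothing to split")
    return nodes[:-1], nodes[-1]


graph = [
    {"name": "conv0", "op": "conv2d"},
    {"name": "relu0", "op": "relu"},
    {"name": "nms0", "op": "non_max_suppression"},
]
main_graph, last_op = split_last_op(graph)
```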
nnvm/include/nnvm/top/nn.h
Outdated
.describe("Clip out-of-boundary boxes.");
DMLC_DECLARE_FIELD(threshold).set_default(0.01)
.describe("Threshold to be a positive prediction.");
DMLC_DECLARE_FIELD(nms_threshold).set_default(0.5)
is NMS stripped out of this op or not? I saw a standalone NMS operator below.
Yes. NMS is now a standalone operator. multibox_detection is still preserved so that MultiBoxDetection in MXNet can be directly converted.
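For reference, the core of non-maximum suppression can be sketched in a few lines of pure Python. This is a simplified greedy variant over `(xmin, ymin, xmax, ymax)` boxes; the real operator additionally carries class ids and a `valid_count` input.

```python
def iou(a, b):
    """Intersection-over-union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(boxes, scores, threshold=0.5):
    """Greedy NMS: keep a box unless it overlaps an already-kept,
    higher-scoring box by more than `threshold` IoU.  Returns indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= threshold for j in keep):
            keep.append(i)
    return keep
```

This O(n^2) loop is exactly the part that is hard to parallelize naively, which is why having NMS as a standalone operator (and the GPU story discussed in this PR) matters.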
The conversion process does not have to be one-to-one, so we can break it into a sequence; maybe @zhreshold can have specific thoughts on what that sequence is.
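The one-to-many idea above can be sketched as a converter table that maps a single front-end op to a sequence of back-end ops (pure Python; the op names in the table are illustrative, not the actual converter map in the PR):

```python
# Hypothetical converter table: one front-end op may expand into a
# sequence of back-end ops rather than exactly one.
CONVERT_MAP = {
    "MultiBoxDetection": ["transform_loc", "non_max_suppression"],
    "SoftmaxActivation": ["softmax"],
}


def convert(front_end_ops):
    """Expand each front-end op into its back-end op sequence,
    raising for unsupported operators (as from_mxnet does)."""
    back_end_ops = []
    for op in front_end_ops:
        if op not in CONVERT_MAP:
            raise NotImplementedError("Operator: %s is not supported" % op)
        back_end_ops.extend(CONVERT_MAP[op])
    return back_end_ops
```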
@@ -294,7 +331,9 @@ def build(graph, target=None, shape=None, dtype="float32",
    if params is None:
        params = {}
    params.update(init_var)
    if not build_extra:
        return graph, libmod, params
I think this should be the opposite, as the default is to return 3 values.
A.T.
Better to always return 4 values; no need for the if check.
A.T.
I agree that in the long term a consistent interface would be better. For now, varying the number of return values is for backward compatibility; we don't need to modify every existing tutorial.
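The backward-compatibility trade-off being discussed can be sketched with a toy stand-in for `nnvm.compiler.build` (names and return values here are illustrative, not the real signature):

```python
def build(graph, build_extra=False):
    """Toy stand-in for nnvm.compiler.build.

    Returns 3 values by default, so existing callers keep working, and a
    4th value (the extra library) only when build_extra is requested.
    """
    libmod, params, extra_libmod = "lib", {}, "extra_lib"
    if not build_extra:
        return graph, libmod, params
    return graph, libmod, params, extra_libmod
```

Always returning 4 values (the alternative suggested above) would give a single consistent interface, at the cost of breaking every existing `graph, lib, params = build(...)` call site.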
""" | ||
graph_idx = graph.index | ||
last_op_node = graph_idx.nodes[-1] | ||
last_op_func = getattr(sym, last_op_node["op"]) |
Add a failsafe for the case where the last node doesn't have any op.
A.T.
if __name__ == "__main__":
    test_precompute_prune()
    test_compile()
    test_run()
    test_dtypes()
    test_compile_extra_lib()
Please add a test case verifying proper graph splitting when extra_lib_target is enabled.
A.T.
This test case covers graph splitting. It compiles the main network with cuda and the last op with llvm, then compares the result of the split graph against the result of running the whole graph on cpu. Is there anything missing here?
@kevinthesun: what I meant is validating the graph output from build() post split, not the runtime output; the runtime output validation in your test case is good.
You can create a sample model with a few nodes, run your build for the split, and check the graph output; you can refer to other test cases in nnvm/tests/python/compiler.
@@ -32,6 +32,8 @@ class BuildConfig(object):
    defaults = {
        "opt_level": 2,
        "add_pass": None,
        "extra_lib_op": None,
How about naming it split_lib_op? The term extra doesn't fit well: basically it's not extra, it's a split-off part of the original graph. Rename extra_lib_target accordingly as well.
A.T.
One reason to use extra here is that instead of returning a single lib, there is now an extra lib compiled from the split operator. Both names make sense in some respects. Since this is not a general, complete solution, I named it extra for now.
@@ -146,3 +147,39 @@ def gradients(ys, xs, grad_ys=None):
        if isinstance(xs, list) else len(xs.list_output_names())
    ret = [grad_g.symbol[i] for i in range(nx)]
    return ret


def split_last_op(graph):
I also agree with @zhreshold; let's not limit the split to the last op node. We can make the function more generic and take input from the user, like op_name and the sequence of the occurrence counted from the last node, so that the function can be used for formal splitting at any place, not only the last node.
For example: op_name="flatten", sequence="1" would split at the first occurrence of flatten counted from the last node. In this way both last-node splits and other-node splits can be achieved.
~/A.T.
Agree. But we need a more complete design to cover all cases, especially for networks with many branches. Simple logic like "the last X operators" might not work well. A rough idea: the user selects a set of nodes as subgraph inputs and another set as outputs; we then do some checks to make sure this is a valid split, and compile the different subgraphs separately.
@kevinthesun: you are right, it is a little tricky; there are cases where we cannot split the graph so simply, and many of them need to be handled. Let's create a prototype and discuss on top of that. I am not sure whether it needs to be handled in your PR, but I brought up the point because it is a very important feature. Thanks!
@tqchen: please provide your opinion on this. Thanks!
My suggestion is to name it more generally, but for now only support splitting on the last op (and raise an exception otherwise). Once you create an API, you lose the chance to remove or rename it.
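That suggestion could look like the following pure-Python sketch over a toy node list (the `op_name`/`sequence` parameters are the hypothetical generic interface; only the last-op case is actually implemented, everything else raises so the name can stay stable while the implementation grows):

```python
def split_graph(nodes, op_name=None, sequence=1):
    """Generic-looking split API that, for now, only supports splitting
    off the last op.  `op_name`/`sequence` reserve the future interface:
    any request other than the last node raises NotImplementedError.
    """
    if not nodes:
        raise ValueError("empty graph")
    if op_name is not None and (nodes[-1]["op"] != op_name or sequence != 1):
        raise NotImplementedError("only splitting the last op is supported")
    return nodes[:-1], nodes[-1]
```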
nnvm/src/top/vision/nms.cc
Outdated
TShape oshape = TShape(3);
oshape[0] = dshape[0];
oshape[1] = dshape[1];
oshape[2] = 6;  // [id, prob, xmin, ymin, xmax, ymax]
Assign dshape[2], since the value 6 is already checked in a previous statement.
In fact there is no need for three assignments; directly assign dshape to oshape.
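The simplification being suggested, shown as a pure-Python sketch of the NMS shape-inference rule (the real code is C++ in nnvm/src/top/vision/nms.cc; function and message names here are illustrative):

```python
def nms_infer_shape(dshape):
    """NMS output shape equals the (batch, num_anchors, 6) input shape.

    Instead of copying the three dimensions one by one into a fresh
    shape, validate the input once and return it directly.
    """
    assert len(dshape) == 3, "Provided: %s" % (dshape,)
    assert dshape[2] == 6, "[id, prob, xmin, ymin, xmax, ymax] expected"
    return tuple(dshape)
```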
Thanks for all the reviewers' comments. I see two major things that we want to improve here.
created #1242
@kevinthesun I tested an mxnet ssd model; all layers are normal except SoftmaxActivation(mode=channel). Did you find this problem?
@sanallen Did you test with the ssd in the mxnet example folder? That should work, since that model is used in the tutorial. SoftmaxActivation conversion is also added in this PR.
Tested deploy_ssd.py with the mxnet mobilenet-ssd-512 model and got the following error: _contrib_MultiBoxTarget is not supported in nnvm Traceback (most recent call last): Any idea? Thanks,
@kaishijeng you are more than welcome to bring more discussions to https://discuss.tvm.ai
_contrib_MultiBoxTarget is for training. You need to use the inference model.
After running deploy.py to remove _contrib_MultiBoxTarget, I got a new error: NotImplementedError: Operator: L2Normalization is not supported in nnvm.
@kevinthesun can we compile the last operator of the network on a different target device now?
This happens to me, too.
gluon-cv depends on scipy, which is not supported on ARMv7l.
You can follow the tutorial on deploying to an edge device. You don't necessarily need to use rpc: simply upload the compiled graph json, lib.so and param files to the rasp, and install the tvm runtime on the rasp to load and run them.
Add SSD support in NNVM. Create a tutorial showing how to deploy an MXNet SSD model on CPU. This PR also enhances nnvm.compiler.build so that the user can specify that the last operator of the network be compiled for a different target device. This can be a short-term solution for running SSD on GPU, since currently it's not easy to implement the multibox operators directly on GPU. After multibox_prior is implemented on GPU (@Laurawly), we can extend the tutorial to show how to run the major part of SSD on GPU and the final multibox_detection operator on CPU.