Conversation
Since this is an important change, I suggest we later add a tutorial on how to use customized layouts, especially what kind of work and how much effort a developer needs to support their own layout. From there we can continue iterating on layout-related features.
@@ -237,7 +247,6 @@ def _upsampling(inputs, attrs):
'min_axis' : _rename('min'),
'reshape' : _reshape,
'sum_axis' : _rename('sum'),
'UpSampling' : _upsampling
Why remove this?
Oh, I was removing SSD stuff but removed this by mistake...
include/nnvm/top/nn.h (outdated)
}
};

struct MultiBoxPriorParam : public dmlc::Parameter<MultiBoxPriorParam> {
We can remove SSD-related stuff from this PR. I will make a separate PR to add SSD operators and tests after the TVM SSD-related PR is merged.
python/nnvm/compiler/build_module.py (outdated)
@@ -204,8 +218,8 @@ def build(graph, target=None, shape=None, dtype="float32", params=None, target_h
By default, llvm is used if it is enabled,
otherwise a stackvm intepreter is used.

initialize : bool, optional
Whether to initialize variables in global dict _all_var_init.
layout : dict of str to str or str
Does the user need to specify opt_level=3 to enable the layout transform? Also, how does the user specify the "internal" layout if the target layout is not NCHWXc?
opt_level=3 is for AlterOpLayout. The param here is for the user to specify the input layout. The user does not specify internal layouts; they are inferred from operators that have layouts, e.g., conv and pool.
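For reference, a minimal sketch of the user-facing call (the toy network, shapes, and target below are made up for illustration; the point is only the layout argument and opt_level=3):

```python
import nnvm.symbol as sym
import nnvm.compiler

# A toy network; names and shapes are illustrative only.
data = sym.Variable("data")
net = sym.conv2d(data, channels=16, kernel_size=(3, 3), padding=(1, 1), name="conv1")
net = sym.relu(net)

# opt_level=3 turns on AlterOpLayout; the `layout` argument only describes the
# layout of the graph inputs. Internal layouts are inferred from operators
# that carry layout information (conv, pool, ...).
with nnvm.compiler.build_config(opt_level=3):
    graph, lib, params = nnvm.compiler.build(
        net, target="llvm", shape={"data": (1, 3, 224, 224)},
        dtype="float32", layout="NCHW")
```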
Layout in, last_in, out;
deduce(&in, in_layouts, in_size, "input");
deduce(&last_in, last_in_layouts, in_size, "input (last infer pass)");
deduce(&out, out_layouts, out_size, "output");
Assume an elemwise_add op has two inputs with layouts "NCHW16c" and "NCHW4c". Will they be transformed back to "NCHW", or will one of them be transformed to the other?
The right one is transformed to the left one's layout; this is defined in ElemwiseBinaryKeepLeftLayout.
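As an illustration, a small Python sketch of that "keep left" rule (this is not the actual C++ ElemwiseBinaryKeepLeftLayout code, just the behavior it describes):

```python
def elemwise_binary_keep_left(lhs_layout, rhs_layout):
    """Return the layouts (lhs, rhs, out) a binary elementwise op requests."""
    if lhs_layout:             # the left layout, if defined, wins
        return lhs_layout, lhs_layout, lhs_layout
    if rhs_layout:             # otherwise fall back to the right layout
        return rhs_layout, rhs_layout, rhs_layout
    return None, None, None    # neither input has a defined layout

# Inputs with layouts "NCHW16c" and "NCHW4c": the right input is asked to
# transform to "NCHW16c"; nothing is converted back to plain "NCHW".
print(elemwise_binary_keep_left("NCHW16c", "NCHW4c"))
```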

template<>
struct extension_class_info<nnvm::compiler::SymbolArray> {
  static const int code = 19;
Cannot use 19, because it is taken by the MXNet TVM bridge: https://github.com/apache/incubator-mxnet/blob/master/include/mxnet/tensor_blob.h#L49
It seems to be quite dangerous to expose a vector of pointers here.
std::vector<TLayoutInfo> *olayouts)>;
using FTVMAlterOpLayout = std::function<
  Symbol(const NodeAttrs& attrs,
         const SymbolArray& inputs,
do we really need the input symbols, or is tinfo enough for now?
ignore this comment, I now get what is going on
std::vector<TLayoutInfo> *ilayouts,
std::vector<TLayoutInfo> *olayouts)>;
using FTVMAlterOpLayout = std::function<
  Symbol(const NodeAttrs& attrs,
Can we just return the resulting output layouts as a vector of strings?
ignore this
src/pass/infer_correct_layout.cc (outdated)
* \brief A simple layout infer pass that will
*        insert layout transform nodes automatically.
*/
nnvm::Graph InferCorrectLayout(nnvm::Graph src) {
Maybe we can just call it CorrectLayout
ok. will change.
python/nnvm/top/symbol_array.py (outdated)
_sym_arr_get = tvm.get_global_func("nnvm.compiler._symbol_array_get")
_sym_arr_size = tvm.get_global_func("nnvm.compiler._symbol_array_size")

class SymbolArray(object):
We can just use Symbol for this, because we can group multiple nodes into a single Symbol.
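For example, grouping works roughly like this (a small sketch of the suggestion, not code from this PR):

```python
import nnvm.symbol as sym

a = sym.Variable("a")
b = sym.Variable("b")

# A single Symbol can hold several outputs, so it can play the role of
# a "symbol array" without a dedicated SymbolArray type.
grouped = sym.Group([a, b])
print(grouped.list_output_names())   # ['a', 'b']
```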

class Layout {
 public:
  using LayoutDim = char;
@tqchen
Do you want me to change the name 'dim' to 'axis', as you suggested in https://discuss.tvmlang.org/t/datalayout-structure/80/3?
Also, to my understanding we don't have to wait for that DataLayout structure to be implemented in TVM. We can merge this PR first (so that we can move forward with a significant improvement, at least a 50% speedup, for CNNs) and switch to the TVM implementation later, right?
Yes, we can merge this in first. Please fix the problems I mentioned, as well as the tests.
@merrymercy Can you also do a review when you have time?
@tqchen The segmentation fault in CI looks weird; I cannot reproduce it. It happened in PR #435 as well: http://mode-gpu.cs.washington.edu:8080/blue/organizations/jenkins/dmlc%2Fnnvm/detail/PR-435/13/pipeline
src/pass/correct_layout.cc (outdated)
const Layout& produce = produce_ilayouts[i];
const Layout& request = request_ilayouts[i];
if (produce != request && produce.defined()) {
  LOG(INFO) << "Insert layout transformer for " << i << "-th input of "
This seems to produce a lot of messages. Maybe avoid printing this, since it is the expected behavior of the pass.
Thanks for improving the code through the review! This is now merged.
* [RUNTIME] Enable extension type to PackedFunc.
* More comments
This PR allows NNVM to:

- Replace an operator via AlterOpLayout. For x86 it is more efficient to calculate convolution in the NCHW16c layout. Here is an example of how it is used (a rough sketch of such a hook also follows below): https://github.com/yzhliu/topi-intel/blob/master/e2e_general_pack/schedule_pack/avx512_conv_fwd.py#L128-L155. Note that kernel (pre-)packing is supported as well: https://github.com/dmlc/nnvm/issues/303
- Infer and correct layouts automatically. The InferCorrectLayout pass can infer the layout for each operator (both inputs and outputs). Wherever the produced and requested layouts differ, a __layout_transform__ operator will be inserted. Each operator registers its layout inference through FInferLayout. Once a model is imported, we do an InferCorrectLayout pass and store the original layouts for each operator. After AlterOpLayout, InferCorrectLayout runs again. This time each operator sees the original layouts it inferred before, which it can use to decide whether to keep the original one. For example, softmax can still generate correct results after the input layout changes, while flatten cannot. So flatten claims it needs the original layout and triggers a layout transform, which makes the network produce correct results.
- Even for a model that uses the NHWC layout, users can still pass NCHW input; as long as the input layout is specified, a layout transform happens automatically.

Moreover, the convolution kernel layout becomes much clearer: #372. Now we have the kernel layouts OIHW and HWIO for NCHW and NHWC respectively.

Will shoot a pull request for the corresponding TVM changes.
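Below is a rough sketch of what such an AlterOpLayout hook can look like. It is modeled loosely on the linked avx512 example; the registration decorator, its signature, and the attribute handling are assumptions for illustration rather than a verified API reference, and the real example replaces conv2d with a specialized pre-packed convolution operator rather than a plain conv2d.

```python
import nnvm.symbol as sym
from nnvm.top import registry as reg

# Hypothetical hook (assumed decorator name and signature): called once per
# conv2d node during AlterOpLayout; it returns a replacement symbol.
@reg.register_alter_op_layout("conv2d")
def alter_conv2d_layout(attrs, inputs, tinfos):
    new_attrs = {
        # Copy over the attributes the replacement still needs.
        "channels": attrs["channels"],
        "kernel_size": attrs["kernel_size"],
        "padding": attrs["padding"],
        "strides": attrs["strides"],
        # Switch to a blocked data layout and a pre-packed kernel layout.
        "layout": "NCHW16c",
        "kernel_layout": "OIHW16i16o",
    }
    # The correct-layout pass afterwards inserts whatever __layout_transform__
    # nodes are needed around the replacement.
    return sym.conv2d(*inputs, **new_attrs)
```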
@yidawang @kevinthesun