comparison to NN-512/ direct (C) code generation #1668
Comments
I would look into using the MLIR C backend. Let us know how your experiment goes; I am sure others will be interested in the outcome too. Note that there are efforts to link onnx-mlir to MLIR via HLO or TOSA, and a proposal for Torch/MLIR.
Thanks for the response! The MLIR C backend sounds interesting, although it might generate C++ in many cases, which would then need to be wrapped once more in C to make interop with other languages possible.

It seems that the current onnx-mlir API is very weakly typed and basically trusts the user to get it right (https://github.com/onnx/onnx-mlir/blob/main/docs/mnist_example/Mnist.java); there is no step which actually inspects the ONNX file's inputs and generates language-specific types from them. (This is the first important step for me.) Maybe it is still a better idea to write more language-specific wrappers for onnx-mlir, similar to what is already present for e.g. Java. The main thing I need to look at here is how to generate proper types, so that the function arguments from e.g. Rust can be properly typed to provide safe and reliable model execution; see the sketch after this comment. This could be a build script which generates the object file, a corresponding C file, and a Rust file, and does the linking automagically. For this it would be really good if onnx-mlir were more easily available as a library rather than a binary, as mentioned in #1597. Another problem I foresee is differing LLVM versions between the Rust/Zig/whatever compiler and the onnx-mlir project, which might lead to incompatibilities, perhaps also around LTO, which seems to be especially difficult in such mixed scenarios. (This is another reason why I think it would be a good idea to generate source code in language X rather than linking some object file into a project written in language X.)

The krnl ops (https://github.com/onnx/onnx-mlir/blob/42c543dd73e282049059f23fa97e1d6fbdf8883a/docs/Dialects/krnl.md) look interesting, although I'm not sure how soon I'll have time to actually build the project and modify it to the extent that I could attempt that experiment; I'd definitely be very interested in trying it. (This would be the second important step for me, although it would probably also include the library-fication of onnx-mlir.) Some ops seem a little ad hoc (find_index) and some a bit high-level (strlen), but something along the lines of the krnl ops would definitely be important. The main question is whether implementing the ONNX ops directly after parsing the protobuf is easier and more or less performant than implementing the krnl ops. I think a smaller set of ops along the lines of https://github.com/geohot/tinygrad/blob/master/tinygrad/llops/ops_cpu.py would be preferable.
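To make the typing idea concrete, here is a minimal sketch of the kind of C header a generator could emit after inspecting the model's input and output tensors. Everything here is hypothetical (an MNIST-like model is assumed); nothing like this exists in onnx-mlir today:

```c
/* Hypothetical generator output: fixed shapes lifted from the ONNX file
 * become part of the type, so callers cannot pass mis-shaped buffers.
 * Purely illustrative; not an existing onnx-mlir feature. */
typedef struct { float data[1][1][28][28]; } mnist_input_t;
typedef struct { float data[1][10]; } mnist_output_t;

/* Generated entry point wrapping the compiled model object file. */
void mnist_run(const mnist_input_t *input, mnist_output_t *output);
```

A Rust binding generated from the same metadata could mirror these structs one-to-one, making the FFI boundary statically checked on both sides.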
Given that MLIR does not yet seem to have widespread use compared to traditional LLVM IR, I think converting the models directly into existing languages would make widespread adoption easier (although MLIR certainly looks very promising in the long run). If there is a meeting I could attend where some of these ideas could be discussed, I'd also try to join: #1666
Added #1676 for meeting invite.
Thanks, it's great to see that the meeting is so open. Unfortunately it seems that it takes place in the middle of the night in Germany (where I'm from), so I'll have to pass. If you happen to have any feedback I look forward to hearing it, and I will keep following this project 👍
I figured I would be able to attend on Sept. 20th and made some preliminary slides. I'd be especially interested in feedback/discussion after the (short) presentation.
Thanks, added a pointer to this issue to our wiki agenda.
I managed to attend the last meeting. The conclusion was that a language-specific codegen option would be too large a maintenance burden for onnx-mlir and would thus need to be added to MLIR directly. The existing C codegen option of MLIR should be explored; it could then be used to generate code which only requires a C compiler for each user of a model, instead of the onnx-mlir compiler.

Regarding better language bindings with stronger type definitions, it was said that the object files onnx-mlir already produces contain a function which, when called, produces a JSON description of the types of the model's inputs and outputs. This information could be extracted at compile time and used to generate more precise types for the language-specific bindings. A separate issue with a more precise proposal would be welcome on this; a sketch of reading that signature follows below.
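As a rough illustration, a binding generator could read that JSON signature at build time roughly like this. This is a sketch, assuming the compiled model library exports an omInputSignature(entryPointName) function and a run_main_graph entry point as in the onnx-mlir runtime docs; verify the exact symbol names and signatures against your onnx-mlir version:

```c
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    /* Load the model library produced by onnx-mlir. */
    void *lib = dlopen("./model.so", RTLD_LAZY);
    if (!lib) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    /* Assumed symbol: returns a JSON string describing the input types
       of the given entry point. Check the name for your version. */
    const char *(*input_sig)(const char *) =
        (const char *(*)(const char *))dlsym(lib, "omInputSignature");
    if (!input_sig) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    /* "run_main_graph" is the default entry point name onnx-mlir emits. */
    printf("%s\n", input_sig("run_main_graph"));
    dlclose(lib);
    return 0;
}
```

A binding generator could run this as a build step and translate the JSON into the typed structs sketched earlier.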
Regarding the portability concerns, sonos/tract#580 and sonos/tract#393 (comment) are relevant user stories that fit into this issue. I just tried to build onnx-mlir natively on macOS and it's rather difficult: compiling LLVM myself or using Docker is quite a bit of overhead, which would be added for every developer interacting with a library that uses a neural network and relies on onnx-mlir for the inference part. (This could be solved within onnx-mlir if the C backend worked, but I did not have enough time to test that yet.)
I was looking into deploying ONNX models as native code and came across NN-512, which was basically exactly what I was looking for, except that the input format is custom text instead of a common format like ONNX, and the output is not very portable.
A basic comparison to ONNX: https://jonatron.github.io/test_nn512/
Is a direct C backend for ONNX planned/possible as well? This would make compilation easier, as one would just need an ONNX-to-C compiler and could then use existing tooling to produce the binary. If the resulting C code could optionally be generated without SIMD intrinsics, it would immediately be portable to WASM and embedded targets, because it would basically just be iteration and arithmetic without special operations; see the sketch below.
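To make "just iteration and arithmetic" concrete, here is a minimal sketch of the kind of plain C such a backend might emit for a MatMul/Gemm node; this is illustrative only and not the output of any existing tool:

```c
/* Naive row-major float32 matrix multiply: C = A * B.
 * No intrinsics or platform-specific operations, so it compiles
 * unchanged for WASM, embedded targets, or any host with a C compiler. */
static void matmul_f32(const float *a, const float *b, float *c,
                       int m, int k, int n) {
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = acc;
        }
    }
}
```

An optimizing backend would tile and vectorize this, but even the naive form gives a dependency-free, fully portable baseline.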
A C backend could also probably be extended relatively easily to Rust/Zig (and maybe Halide, https://github.com/halide/Halide), making it easy to deploy models with minimal friction in each language ecosystem. This would effectively make an ONNX-to-C compiler something like https://github.com/shinh/elvm, but for ONNX models, with the opcodes being https://github.com/onnx/onnx-mlir/blob/d0b60c3c1e948afd12af5f0e0f0b969c0cee5cfa/docs/Dialects/onnx.md.
Generating C code instead of LLVM IR is not uncommon for language implementations, such as https://github.com/pervognsen/bitwise/blob/master/ion/gen.c (even LLVM once had a C backend).
onnx-mlir seems to go one step further, in that it targets multiple languages at once and may be able to perform better optimisations, but after looking at NN-512 I got the impression that even a basic/primitive translation to native code could already provide much value.
Maybe this already exists for ONNX and I just did not find it?
My understanding of the current state of standard ONNX inference is that a runtime loads a protobuf file at runtime and then interprets it (much like Python loads a file and interprets it). This wastes the opportunity for compile-time optimisations and also makes deployment more difficult, because multiple files and IO (non-pure computation) are involved. These are similar to the reasons one might choose C over Python in the first place, so it seems counterintuitive to me to call ONNX models this way from a compiled language. (A sketch of the compiled alternative onnx-mlir offers today follows below.)
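For contrast, here is a rough sketch of the compiled path onnx-mlir already provides, where the model is an ahead-of-time compiled library and inference is a direct function call with no protobuf parsing or file IO at runtime. This follows the API shown in the onnx-mlir MNIST example docs; function names and the entry point are assumptions to verify against your version:

```c
#include <OnnxMlirRuntime.h>

/* Entry point emitted by onnx-mlir into the compiled model library
   (default name; assumed here, verify for your build). */
extern OMTensorList *run_main_graph(OMTensorList *);

int main(void) {
    /* A 1x1x28x28 float input, as for an MNIST-like model. */
    static float img[1 * 1 * 28 * 28];
    int64_t shape[] = {1, 1, 28, 28};
    OMTensor *x = omTensorCreate(img, shape, 4, ONNX_TYPE_FLOAT);
    OMTensor *inputs[] = {x};
    OMTensorList *in = omTensorListCreate(inputs, 1);

    /* Plain function call: the weights live in the linked object file. */
    OMTensorList *out = run_main_graph(in);
    OMTensor *y = omTensorListGetOmtByIndex(out, 0);
    float *scores = (float *)omTensorGetDataPtr(y);
    /* ... use scores[0..9] ... */
    return 0;
}
```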