graph::SetDefaultDevice with shared library build #136
Hi, does it work correctly for you if you move … right after … ?
Thanks! Yes, it does work that way. Edit: but it does not print the nodes after that fix, so it is actually not setting the device type properly. More importantly, why does compiling the original program (without your fix) from within tensorflow using bazel work, but not when using a shared tensorflow library? I have also tried the same program on Windows using tensorflow.dll files I got from here and hit the same problem. Could it be related to the following issues:
Sorry, I am not really sure. You can try freezing your graph before loading it, since freezing drops the assigned device from the individual nodes. It could be related to the issues you mention. Let us know if you manage to make it work.
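For readers who are unsure what "freezing" involves: graphs are typically frozen with TensorFlow's Python `freeze_graph` tool. A hypothetical invocation might look like the following; all file names and the output node name here are placeholders, not details from this thread:

```shell
# Hypothetical example of freezing a graph before loading it from C++.
# Paths and node names are placeholders.
python -m tensorflow.python.tools.freeze_graph \
  --input_graph=graph.pbtxt \
  --input_checkpoint=model.ckpt \
  --output_graph=frozen_graph.pb \
  --output_node_names=output_node
```

The resulting frozen_graph.pb has variables converted to constants and device assignments stripped, which is why it can sidestep device-placement errors at load time.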
Hi, you can try the latest master, which is based on TF 1.12, if you still have the issue. Feel free to reopen if it persists.
@dshawul Were you able to resolve this?
@FloopCZ I am having this issue too: my bazel build is not able to call SetDefaultDevice. Is there a solution for this that you know of?
@ttdd11 No, but I used a workaround: compiling the program directly with bazel using a BUILD file.
@dshawul Can you explain a bit more? Instead of linking tensorflow_cc to your app, you are building your app with bazel from the source tree?
Yes, that is correct.
Thanks for the advice. My application is unfortunately somewhat complicated, and we regularly need to debug it in Visual Studio. However, I may need to do this, given that tensorflow has had this issue for a while and we require RTX support, which means CUDA 10 and later versions of tensorflow.
@dshawul I think I am going to try what you are doing and just maintain two builds. Can you provide some advice on using the BUILD file, or copy some samples showing how you got the tensorflow source into the BUILD file?
First I put my app somewhere inside the tensorflow source tree, e.g. tensorflow/cc/myapp; this is a must for building with bazel. Then I write a BUILD file and put it in the myapp directory. Example BUILD file:
My app is actually a shared object, not an exe. Put all your *.cpp and *.h files in "srcs". You can define preprocessor macros in "defines" -- if you want, for example, a -DTENSORFLOW flag to be passed to the compiler. Then execute:
Then you should get myapp.dll, a standalone binary (no libtensorflow dependency), in tensorflow/bazel-bin/tensorflow/cc/myapp. Hope that helps.
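For readers landing here, a minimal sketch of what such a BUILD file could look like. The target name, source glob, and deps below are illustrative guesses, not the original poster's actual file:

```python
# Hypothetical tensorflow/cc/myapp/BUILD -- a sketch, not the original file.
load("//tensorflow:tensorflow.bzl", "tf_cc_shared_object")

tf_cc_shared_object(
    name = "myapp.dll",
    srcs = glob(["*.cpp", "*.h"]),
    defines = ["TENSORFLOW"],  # passed to the compiler as -DTENSORFLOW
    deps = [
        "//tensorflow/cc:cc_ops",
        "//tensorflow/cc:client_session",
        "//tensorflow/core:tensorflow",
    ],
)
```

Because the target is built inside the tensorflow tree, bazel resolves the `//tensorflow/...` labels against the checked-out source, which is why the app must live under the source tree for this workaround.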
@dshawul Okay, this could actually work. I just need to isolate all the tensorflow code into a few files, build them as a dll, and then link against that to get tensorflow working. Is this basically what you did?
Note that you can directly build an exe (your app) with bazel; my app was a dll from the beginning. Building a shared library is not required for the workaround. Good luck.
Okay, sounds good. I'm just thinking that, for ease of debugging, if I build a dll then I can link it and test other code pretty easily. Does this sound like it should work?
Yes, that sounds good. It is also good to have all tensorflow code in a separate module.
@dshawul This has been very helpful. How did you configure tensorflow with different options (such as CUDA 10) before running your BUILD file? Did you just run configure.py first and then run the build as you described above?
@dshawul I've gotten the lib to build but sadly I am unable to use it. Here is my code for the lib. It's very simple; I just want to see if SetDefaultDevice works on Windows. This is my header (excerpt, as posted):

```cpp
#ifndef INFERENCELIB_H
#ifdef TF_API
//Standard includes
//tensorflow includes
class TF_API InferenceLib
private:
#endif //INFERENCELIB_H
```

This is the cpp (excerpt, as posted):

```cpp
#include "InferenceLib.h"
InferenceLib::InferenceLib(const char* strPathGraph, bool
InferenceLib::loadGraphAndSession(const char* strPathGraph, int nGPU /*= -1*/)
}
InferenceLib::~InferenceLib()
}
```

My BUILD file is:

```python
load("//tensorflow:tensorflow.bzl", "tf_cc_shared_object")
tf_cc_shared_object(
```

I run this in PowerShell: … And the lib builds. But I get link errors when calling any function, and the .lib seems to be corrupted (at least Depends says it's not a valid 32 or 64 bit lib). Does this look correct? I've been at this for about a month now and don't seem to be any closer; any help would be greatly appreciated. The symbols I need are actually in the dll, so I'm thinking it's how I built it.
I think you need to put more dependencies in 'deps' for your case, so that they cover all your tensorflow includes. I had only three, because I only included these:
You have many more includes than me. A good test to see if the bazel solution works is to compile the example I provided in my original post, by putting it in //tensorflow/cc/example/example.cc and using …
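The exact command was elided above; on a typical setup it would be something along these lines (the target label follows the example path mentioned, but the flags are assumptions):

```shell
# Hypothetical bazel invocation for the example target; flags are assumptions.
bazel build --config=opt //tensorflow/cc/example:example
```

This assumes an example BUILD target exists in tensorflow/cc/example alongside example.cc.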
I don't think I need all those includes; I will try again without them. Here is the link error for reference: unresolved symbol __cdecl tensorflow::core::GetVarint32PtrFallback
@dshawul I think this is working now, apart from that one symbol. It is located in source\tensorflow\core\lib\core. I looked at the BUILD file for core, and I should be able to add "//tensorflow/core:lib" to get those symbols, but that did not work. Are you running inference from a loaded .pb with your app? The trouble line seems to be where I use TensorShape, which your example doesn't use. Any ideas on how to get this tensorflow::core::GetVarint32PtrFallback symbol into the lib?
Hi,
I have a problem using graph::SetDefaultDevice with the shared library build. I understand why the static build will not work for my example below (it does more than inference), but the tensorflow_cc shared library should work the same as one built with bazel. If I build the code from within tensorflow using bazel, it works fine, but if I use libtensorflow_cc with cmake it doesn't. Why?
I get the following error when running the resulting program:
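For context, the pattern under discussion is roughly the following. This is a minimal sketch, not the poster's actual code: the model path and device string are placeholders, and it assumes the TF 1.x C++ API headers and libraries are available:

```cpp
// Minimal sketch: load a GraphDef, pin it to a device, create a session.
// "model.pb" and "/gpu:0" are placeholders; error handling is abbreviated.
#include <memory>

#include "tensorflow/core/graph/default_device.h"
#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/public/session.h"

int main() {
  tensorflow::GraphDef graph_def;
  TF_CHECK_OK(tensorflow::ReadBinaryProto(tensorflow::Env::Default(),
                                          "model.pb", &graph_def));

  // Assign every node in the graph to one device before session creation.
  tensorflow::graph::SetDefaultDevice("/gpu:0", &graph_def);

  std::unique_ptr<tensorflow::Session> session(
      tensorflow::NewSession(tensorflow::SessionOptions()));
  TF_CHECK_OK(session->Create(graph_def));
  return 0;
}
```

SetDefaultDevice mutates the GraphDef in place, so it must run before Session::Create; calling it afterwards has no effect on the already-created session.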