link tensorflow with open_spiel. #195
Comments
I noticed the issue (#172), and that is why I am trying the TensorFlow c_api lib. If you are interested in the c_api lib and how to reproduce the problem, I can provide more details.
Hi @Liuweiming, as mentioned in the issue you linked, I've been working on getting TensorFlow compiled with CMake using tensorflow_cc (see https://github.com/FloopCZ/tensorflow_cc). Compiling against a prebuilt TensorFlow c_api sounds promising; I like simple solutions. I would be quite sad if the problem turned out to be caused by two different versions of absl. Yes, I am quite curious about the details, so I would be happy to try to reproduce the problem. No rush, though: I unfortunately don't have much time to work on this, so responses might be slow.
Curious what program you ran to assess this, and what makes you think it's caused by absl::Mutex? Was it a simple example or our C++ AlphaZero? I'd love to know if it runs for a very simple example. You can also look at
@lanctot, I wrote some code based on the C++ AlphaZero. Specifically, I wrote a c_api wrapper so that I can call the c_api lib easily. If you are interested, please take a look here. The implementation is based on these two projects:

To make the problem clear, I also wrote a very simple project that depends only on the Absl lib and the TensorFlow c_api lib. I put the code here.

The setup is as follows. The Absl version is lts_2020_02_25. The c_api lib was downloaded from https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-gpu-linux-x86_64-1.15.0.tar.gz (GPU version). The system and the compiler are Ubuntu 16.04 and gcc-7.5.0. The GPU is a 1080 Ti and the driver version is 440.64.00.
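When reproducing a setup like this, it can help to confirm which TensorFlow build the prebuilt library actually reports. The following is a minimal sketch, not part of the original report: it loads the shared library with ctypes and calls `TF_Version`, which the C API header (`tensorflow/c/c_api.h`) declares as `const char* TF_Version(void)`. The helper name `tf_c_api_version` and the library path are hypothetical; adjust the path to wherever the tarball was unpacked.

```python
import ctypes


def tf_c_api_version(lib_path):
    """Return the version string reported by a TensorFlow C library.

    Returns None if the shared library cannot be loaded (for example,
    a wrong path or missing dependencies such as CUDA libraries).
    """
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        return None
    # TF_Version takes no arguments and returns a C string.
    lib.TF_Version.restype = ctypes.c_char_p
    lib.TF_Version.argtypes = []
    return lib.TF_Version().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical location of the unpacked library; adjust as needed.
    print(tf_c_api_version("./lib/libtensorflow.so"))
```

A `None` result only says that the dynamic loader could not open the file; it does not say anything about Absl.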
Here are the results.

Example 1: Load a simple model and run one inference. The model takes input_a and input_b and computes result = input_a + input_b. The Python code is shown below.

import tensorflow as tf  # TensorFlow 1.x API

# Two simple inputs
a = tf.placeholder(tf.float32, shape=(1, 100), name="input_a")
b = tf.placeholder(tf.float32, shape=(1, 100), name="input_b")
# Output
c = tf.add(a, b, name='result')
# To add an init operation to the model
i = tf.initializers.global_variables()
# Write the model definition
with open('loading_example.pb', 'wb') as f:
    f.write(tf.get_default_graph().as_graph_def().SerializeToString())

Then I import the graph and run an inference in C++. The code is here. Please note the second line in the main function, where I created an absl::Mutex object:
Example 2: I build a one-layer neural network and try to run inference on it and train it from C++. The Python code is still simple:

import tensorflow as tf  # TensorFlow 1.x API
tfkl = tf.keras.layers  # assumed alias; not defined in the original snippet

# inputs
input_ = tf.placeholder(tf.float32, shape=(1, 100), name="input")
target_ = tf.placeholder(tf.float32, shape=(1, 3), name="target")
# Output
output_ = tfkl.Activation("relu")(tfkl.Dense(64)(input_))
output_ = tfkl.Dense(3)(output_)
output_ = tf.identity(output_, name="output")
loss = 0.5 * tf.reduce_mean(tf.squared_difference(output_, target_))
loss = tf.identity(loss, name="loss")
optimizer = tf.train.AdamOptimizer(0.01)
train = optimizer.minimize(loss, name="train")
init = tf.variables_initializer(
    tf.global_variables(), name="init")
# Write the model definition
with open('training.pb', 'wb') as f:
    f.write(tf.get_default_graph().as_graph_def().SerializeToString())

The C++ code is here. In this example, the result is different.
When running normally, the output should look like this:
However, when running on GPU, it was stuck here:
But anyway, I don't plan to dig into this issue any further, because I am getting busy now.
Dear authors, I am currently trying the C++ version of AlphaZero, and I have been trying to link TensorFlow with open_spiel for some time. I tried compiling the TensorFlow C++ libs, but linking them from open_spiel failed. The reported problems concern the Absl lib and the Eigen lib.
I also tried to link with the prebuilt TensorFlow c_api lib. That works without linking errors. However, the program may hang at run time. The problem seems to be caused by absl::Mutex.
I guess it is because open_spiel and TensorFlow depend on different versions of Absl, like below.
What do you think? Could you please give me some advice?
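One way to probe the "two Absl versions" guess is to look at which Abseil namespaces show up in the symbols of each binary. Abseil LTS releases such as lts_2020_02_25 place their symbols in an inline namespace named after the release, so a demangled symbol looks like `absl::lts_2020_02_25::Mutex::Lock()`, while a build without that namespace shows plain `absl::Mutex::Lock()`. The sketch below is only an illustration under that assumption: the helper `absl_namespaces` and the sample lines are hypothetical, and real input would come from something like `nm -D -C` run on the library and on the executable. Whether the prebuilt c_api lib exports any Abseil symbols at all is not something this thread establishes.

```python
import re
from collections import Counter

# Inline namespace used by Abseil LTS releases, e.g. absl::lts_2020_02_25::
_LTS_RE = re.compile(r"\babsl::(lts_\d{4}_\d{2}_\d{2})::")
# Any reference to the absl namespace at all.
_ANY_ABSL_RE = re.compile(r"\babsl::")


def absl_namespaces(demangled_symbols):
    """Count demangled symbol lines per Abseil namespace flavour.

    Lines that mention an LTS inline namespace are counted under that
    tag; other lines that mention absl:: are counted as "unversioned".
    Lines without any Abseil reference are ignored.
    """
    counts = Counter()
    for line in demangled_symbols:
        tags = _LTS_RE.findall(line)
        if tags:
            counts.update(set(tags))
        elif _ANY_ABSL_RE.search(line):
            counts["unversioned"] += 1
    return counts


# Hypothetical sample lines in the style of demangled nm output.
sample = [
    "0000000000012340 T absl::lts_2020_02_25::Mutex::Lock()",
    "0000000000012380 T absl::lts_2020_02_25::Mutex::Unlock()",
    "0000000000045670 T absl::Mutex::Lock()",
    "0000000000078900 T TF_NewGraph",
]
print(dict(absl_namespaces(sample)))
```

If both an LTS tag and unversioned symbols appear in the same process, two Abseil builds are present, which is at least consistent with the guess above; it does not by itself prove that absl::Mutex is the cause of the hang.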