This repository has been archived by the owner on Jan 3, 2023. It is now read-only.
I have been trying for several days to get ngraph-tf running under Manjaro and have run into multiple problems.
The goal is to use ngraph-tf with the PlaidML backend.
I am testing with the following code:
import tensorflow as tf
import os
import sys

if os.environ.get("USE_TF_KERAS", "1") == "1":
    import tensorflow.keras as keras
    print("Using tensorflow keras version")
else:
    import keras
    print("Using keras with backend %s" % keras.backend.backend())

if len(sys.argv) < 2:
    backend = "CPU"
else:
    backend = sys.argv[1]

if backend == "NONE":
    print("NOT using ngraph")
else:
    import ngraph_bridge
    print("Supported ngraph backend:\n %s" % "\n ".join(ngraph_bridge.list_backends()))
    ngraph_bridge.set_backend(backend)
    print("Using ngraph backend %s" % ngraph_bridge.get_currently_set_backend_name())

mnist = keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(512, activation="relu"),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation="softmax")
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

print("Predict:", model.predict(x_train[:1]))
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
When I run it with tensorflow.keras and the ngraph backend set to PLAIDML (USE_TF_KERAS=1 KERAS_BACKEND="tensorflow" python test_ngrapg_tf.py PLAIDML), I get either a segfault or the following stack trace (which of the two varies from run to run):
Traceback (most recent call last):
  File "test_ngrapg_tf.py", line 39, in <module>
    model.fit(x_train, y_train, epochs=5)
  File "/run/media/nope/data/home/nope/workspace/test/fs/ngraph-tf_master/build_cmake/venv-tf-py3/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 880, in fit
    validation_steps=validation_steps)
  File "/run/media/nope/data/home/nope/workspace/test/fs/ngraph-tf_master/build_cmake/venv-tf-py3/lib/python3.5/site-packages/tensorflow/python/keras/engine/training_arrays.py", line 329, in model_iteration
    batch_outs = f(ins_batch)
  File "/run/media/nope/data/home/nope/workspace/test/fs/ngraph-tf_master/build_cmake/venv-tf-py3/lib/python3.5/site-packages/tensorflow/python/keras/backend.py", line 3076, in __call__
    run_metadata=self.run_metadata)
  File "/run/media/nope/data/home/nope/workspace/test/fs/ngraph-tf_master/build_cmake/venv-tf-py3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/run/media/nope/data/home/nope/workspace/test/fs/ngraph-tf_master/build_cmake/venv-tf-py3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: Caught exception while compiling op_backend: get_shape() must be called on a node with exactly one output ()
     [[{{node ngraph_cluster_44}}]]
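Because the failure alternates between this exception and a plain segfault, a Python-level traceback for the segfault case might help show whether both crashes originate from the same call. A minimal sketch, assuming the standard-library `faulthandler` module (Python 3.3+) is enabled before `ngraph_bridge` is imported; the helper name `enable_crash_traces` and the log path are hypothetical:

```python
import faulthandler


def enable_crash_traces(log_path):
    """Write the Python stack of all threads to log_path on a fatal signal.

    faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS and
    SIGILL. The returned file object must stay open for as long as the
    handler is active, so the caller should keep a reference to it.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling `trace_file = enable_crash_traces("segfault_trace.log")` at the top of the test script would be one way to use it. Running the script as `python -X faulthandler test_ngrapg_tf.py PLAIDML` enables the same handler without code changes and prints to stderr instead.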
When I run it with standalone Keras with the Keras backend set to tensorflow (USE_TF_KERAS=0 KERAS_BACKEND="tensorflow" python test_ngrapg_tf.py PLAIDML), I reliably get invalid OpenCL kernels generated by PlaidML (see plaidml/plaidml#322).
Both variants can execute the prediction step without crashing, although Keras with the tensorflow backend seems to produce wrong values.
With only TensorFlow or PlaidML via Keras (or, in the case of TensorFlow, also tf.keras) and without ngraph-tf, it runs without a problem (USE_TF_KERAS=1/0 KERAS_BACKEND="tensorflow" python test_ngrapg_tf.py NONE).
These tests were made with a self-built version of ngraph-tf, both with and without the --use_prebuilt_tensorflow parameter.
With the ngraph CPU backend it runs both with Keras (tensorflow as the Keras backend) and with tf.keras, although in both cases much slower than plain tensorflow-cpu without ngraph.
Additionally, when using Keras with the backend set to tensorflow, the results seem to be wrong.
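To put a number on "seem to be wrong", the predictions from a run with ngraph and a run without it could be compared element by element. A sketch in plain Python, assuming both runs start from identical weights (the test script above does not fix them, so they would have to be saved once and loaded in both runs) and that each prediction array was converted with `.tolist()`; `max_abs_diff` is a hypothetical helper:

```python
def max_abs_diff(reference, candidate):
    """Largest absolute element-wise difference between two nested lists.

    Both arguments must have the same shape; a mismatch raises ValueError.
    """
    ref_is_seq = isinstance(reference, (list, tuple))
    cand_is_seq = isinstance(candidate, (list, tuple))
    if ref_is_seq != cand_is_seq:
        raise ValueError("shape mismatch: sequence compared with scalar")
    if ref_is_seq:
        if len(reference) != len(candidate):
            raise ValueError("shape mismatch: different lengths")
        # Recurse into each pair of sub-lists; an empty list contributes 0.0.
        return max((max_abs_diff(r, c) for r, c in zip(reference, candidate)),
                   default=0.0)
    return abs(reference - candidate)


# Illustrative values only, not measured output.
reference = [[0.1, 0.7, 0.2]]
candidate = [[0.1, 0.6, 0.3]]
print("max abs diff:", max_abs_diff(reference, candidate))
```

A difference near floating-point noise would point to the comparison itself being flawed; a large one would support the suspicion that the backend computes something different.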
When I run it with the ngraph CPU backend using the PyPI version of ngraph-tf installed via pip, I get an Illegal instruction crash with both Keras on tensorflow and tf.keras.
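An `Illegal instruction` crash from a prebuilt wheel is often a sign that the binary uses CPU instructions (for example AVX2 or AVX-512) that the host CPU does not provide; whether that is the cause here is an assumption. A sketch that reports which of a few instruction-set flags the CPU advertises in `/proc/cpuinfo` on Linux; the flags checked are illustrative, not the list the wheel actually requires, and both function names are hypothetical:

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo content."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def report_flags(cpuinfo_text,
                 wanted=("sse4_1", "sse4_2", "avx", "avx2", "fma", "avx512f")):
    """Map each wanted flag name to True/False depending on whether the CPU reports it."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}
```

On the affected machine this could be used as `report_flags(open("/proc/cpuinfo").read())`; any flag reported as `False` would be a candidate for the instruction the wheel trips over.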
Additional info
I am using python 3.5.5 installed via pyenv.
# uname -a
Linux seima-pc 5.0.15-1-MANJARO #1 SMP PREEMPT Fri May 10 19:51:04 UTC 2019 x86_64 GNU/Linux
GPU: Radeon RX 580
When compiling ngraph-tf I need to create a symlink in the artifact dir so that lib resolves to lib64; otherwise the ngraph-tf build fails, because it expects the lib dir but creates the lib64 dir (not sure if this is relevant).
Sorry for the wall of text, but I really don't know where it goes wrong.
Please let me know if additional information is required.
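The workaround amounts to a single symlink. A sketch of it in Python, assuming the build output directory is passed in as `artifacts_dir` (a hypothetical parameter; the real path depends on the build setup):

```python
import os


def link_lib_to_lib64(artifacts_dir):
    """Create artifacts_dir/lib as a symlink to lib64 if lib does not exist yet.

    Returns True if the link was created, False if nothing was done.
    """
    lib64 = os.path.join(artifacts_dir, "lib64")
    lib = os.path.join(artifacts_dir, "lib")
    if os.path.isdir(lib64) and not os.path.lexists(lib):
        # Relative target, so the link stays valid if artifacts_dir is moved.
        os.symlink("lib64", lib)
        return True
    return False
```

Calling this between the step that produces `lib64` and the step that looks for `lib` would reproduce the manual workaround.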
After working around (in a very crude way) the invalid OpenCL kernels generated by PlaidML (plaidml/plaidml#322), I now get the same error with tensorflow.keras and with Keras with the Keras backend set to tensorflow, i.e.:
Traceback (most recent call last):
  File "test_ngrapg_tf.py", line 39, in <module>
    model.fit(x_train, y_train, epochs=5)
  File "/run/media/nope/data/home/nope/workspace/test/fs/ngraph-tf_master/build_cmake/venv-tf-py3/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 880, in fit
    validation_steps=validation_steps)
  File "/run/media/nope/data/home/nope/workspace/test/fs/ngraph-tf_master/build_cmake/venv-tf-py3/lib/python3.5/site-packages/tensorflow/python/keras/engine/training_arrays.py", line 329, in model_iteration
    batch_outs = f(ins_batch)
  File "/run/media/nope/data/home/nope/workspace/test/fs/ngraph-tf_master/build_cmake/venv-tf-py3/lib/python3.5/site-packages/tensorflow/python/keras/backend.py", line 3076, in __call__
    run_metadata=self.run_metadata)
  File "/run/media/nope/data/home/nope/workspace/test/fs/ngraph-tf_master/build_cmake/venv-tf-py3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/run/media/nope/data/home/nope/workspace/test/fs/ngraph-tf_master/build_cmake/venv-tf-py3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: Caught exception while compiling op_backend: get_shape() must be called on a node with exactly one output ()
     [[{{node ngraph_cluster_44}}]]
or a segfault (sometimes the one, sometimes the other).
I plan to try an Ubuntu-based distro tomorrow to see whether this is indeed Manjaro-related.