Hi, I've been trying to use ngraph to accelerate my tensorflow detector/testing pipeline, but unfortunately without any success so far. Inference either has the same performance or becomes painfully slow.
I'm not quite sure whether I'm installing and using ngraph correctly.
I'm also not sure this is the right place to ask, since it might just be something obvious that I've missed rather than an actual issue, but I couldn't find any other support channel. If there is a more appropriate one, please direct me to it.
For installation, I used pip inside my own Dockerfile to install ngraph-tensorflow-bridge, following the instructions in this repo. I also installed plaidml, since I noticed ngraph looks for it during initialization. I didn't build the ngraph library myself, because the bridge ships the .so and doesn't complain when loading it.
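The only sanity check I've done on the install itself is importing the bridge and printing its version, along the lines of the README example (I'm assuming `ngraph_bridge.__version__` is the right attribute to look at):

```python
import tensorflow as tf
import ngraph_bridge

# Quick check that the bridge loads at all; __version__ is my assumption
# based on the README example
print('TensorFlow version:', tf.__version__)
print('ngraph bridge version:', ngraph_bridge.__version__)
```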
Also, I've tried turning on XLA, but it has no effect.
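For reference, this is how I've been enabling XLA; it's the standard TF 1.x ConfigProto switch, nothing ngraph-specific:

```python
import tensorflow as tf

# Plain TF 1.x way of turning on the XLA JIT for the whole session
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
tf.keras.backend.set_session(tf.Session(config=config))
```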
Also, I've tested ngraph with intel-tensorflow. With ngraph off, intel-tf is about twice as fast as vanilla tf. With the ngraph bridge imported, performance is really, really low (I gave up waiting for an operation that takes a few seconds when ngraph is not used).
Also, I've tried both 'NCHW' and 'NHWC' data formats, with both the vanilla and the Intel distributions of tensorflow.
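Concretely, switching layouts in the Keras test below just means flipping the Keras image data format and the shape of the random input:

```python
import numpy as np
import tensorflow as tf

# NCHW ('channels_first') variant of the test input
tf.keras.backend.set_image_data_format('channels_first')
img = np.random.rand(128, 3, 224, 224)

# NHWC ('channels_last') variant, which is the Keras default
# tf.keras.backend.set_image_data_format('channels_last')
# img = np.random.rand(128, 224, 224, 3)
```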
For usage, I only added import ngraph_bridge after importing tensorflow. Is there something else I'm supposed to do?
I didn't get any stdout/stderr messages to help me figure out whether ngraph is actually active. I've looked through the output of tensorflow.python.client.device_lib.list_local_devices(), but nothing changes when I add the import. The only indication that ngraph is in use is that if I don't disable my GPU (os.environ["CUDA_VISIBLE_DEVICES"] = ""), I get an error message.
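This is roughly what I've been running to check. The device_lib part is standard TF; the ngraph_bridge calls are my guess at the bridge's introspection API, so please correct me if they are not the intended way to ask:

```python
from tensorflow.python.client import device_lib
import ngraph_bridge

# Standard TF device listing -- looks identical with or without the import
print(device_lib.list_local_devices())

# My assumption about the bridge's introspection hooks; these may not exist,
# or there may be a better way to confirm the bridge is active
print('ngraph enabled:', ngraph_bridge.is_enabled())
print('available backends:', ngraph_bridge.list_backends())
```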
Here is the code I've used for testing out ngraph (it's based on the keras example in this repo). I think the longest I've waited to see some training progress was 10 minutes; without ngraph, I get a progress bar update in under half a minute.
import numpy as np
import os

os.environ["CUDA_VISIBLE_DEVICES"] = ""
os.environ["KMP_BLOCKTIME"] = "0"
os.environ["OMP_NUM_THREADS"] = "4"
os.environ["KMP_AFFINITY"] = "granularity=fine,verbose,compact,1,0"
os.environ['KERAS_BACKEND'] = 'tensorflow'

import tensorflow as tf
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input, decode_predictions
import ngraph_bridge

# A simple script to run inference and training on resnet 50
config = tf.ConfigProto()
config.intra_op_parallelism_threads = 4
config.inter_op_parallelism_threads = 4
# config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
tf.keras.backend.set_session(tf.Session(config=config))

# tf.keras.backend.set_image_data_format('channels_first')
model = ResNet50(weights=None)

batch_size = 128
img = np.random.rand(batch_size, 224, 224, 3)
# img = np.random.rand(batch_size, 3, 224, 224)

preds = model.predict(preprocess_input(img))
print('Predicted:', decode_predictions(preds, top=3)[0])

model.compile(tf.keras.optimizers.SGD(), loss='categorical_crossentropy')
preds = model.fit(
    preprocess_input(img), np.zeros((batch_size, 1000), dtype='float32'))
print('Ran a train round')
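When comparing timings I've also appended a crude toggle to the end of that script; enable()/disable() are my reading of the bridge's Python API, so this may not be the intended way to switch it off:

```python
import time

# Appended to the end of the script above; reuses `model` and `img` from it.
# enable()/disable() are my assumption about the ngraph_bridge API.
for use_ngraph in (True, False):
    if use_ngraph:
        ngraph_bridge.enable()
    else:
        ngraph_bridge.disable()
    start = time.time()
    model.predict(preprocess_input(img))
    label = 'ngraph' if use_ngraph else 'plain TF'
    print(label, 'predict took', time.time() - start, 'seconds')
```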
I've also tried ngraph with a different codebase that doesn't use keras (it uses tensorflow's object_detection API instead). Speed is at least 20% lower with ngraph. For some models the process has a much larger memory footprint with ngraph than without; I noticed because my laptop started paging and crashed.
I've also noticed that while training the keras example, only one of the 8 logical cores of my CPU is used. The same happens when running inference on the detection model, although for a fraction of the time more than one core is saturated.
Thanks
If you run the code above with the TF bridge, it does not use the GPU but does use multiple CPU threads. I am not sure I built nGraph + TF correctly, but under my setup it is 10 times slower than TF + CPU.