Hi everyone,
I've developed allreduce, broadcast and allgather ops for TensorFlow based on Rabit ops. While digging into the Rabit ops, I realized that they are not thread-safe. As a workaround, I have so far limited TensorFlow to a single thread when computing the graph.
Now I wonder whether there is a way to execute several allreduce/broadcast/allgather calls in parallel. I looked through the XGBoost code for hints, but I could not find any parallel calls to Rabit ops. Is there any plan to make Rabit ops thread-safe?
Thanks in advance for your help.
@nateagr Can you elaborate on your use case a bit more? At least in XGBoost, those calls are sequential and are meant to be blocking. Underneath, yes, it is possible to init several Rabit instances, but you would need to spawn multiple trackers as well.
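For illustration, one way to keep a multi-threaded executor while still issuing the collective calls sequentially is to guard them with a single process-wide lock. The snippet below is only a minimal sketch under stated assumptions: `FakeCollective`, `serialized_allreduce` and `run_parallel` are hypothetical names, and `FakeCollective.allreduce` is a local stand-in that sums a list, not the Rabit API. Note also that a lock only prevents concurrent entry within one process; it does not by itself ensure that every worker issues its collectives in the same order.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeCollective:
    """Hypothetical stand-in for a non-thread-safe collective (NOT Rabit)."""

    def __init__(self):
        self.active = 0          # number of calls currently inside allreduce
        self.overlapped = False  # set to True if two calls ever overlap

    def allreduce(self, values):
        self.active += 1
        if self.active > 1:
            self.overlapped = True
        result = sum(values)     # placeholder for the real reduction
        self.active -= 1
        return result


collective = FakeCollective()

# Single lock shared by every thread that wants to run a collective.
_collective_lock = threading.Lock()


def serialized_allreduce(values):
    """Run the stand-in collective while holding the shared lock."""
    with _collective_lock:
        return collective.allreduce(values)


def run_parallel(inputs, workers=4):
    """Submit several reductions from a thread pool; the lock serializes them."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the inputs in its results.
        return list(pool.map(serialized_allreduce, inputs))
```

Calling `run_parallel([[1, 2], [3, 4], [5]])` submits three reductions from worker threads, but only one of them is inside the collective at any time. The rest of the graph can still run on several threads; only the collective section is serialized.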