
Performance Differences between Tensorflow and Pytorch #1

Closed
karanchahal opened this issue Jan 18, 2019 · 8 comments

@karanchahal commented Jan 18, 2019

I cloned your repo, ran the VPG algorithm, and compared its performance with the TensorFlow version. Averaging over 5 runs to account for random-seed variance, I saw some interesting results:

TensorFlow: avg. episode return 81
PyTorch: avg. episode return 31

Why do you think this might be the case?

Disclaimer: I haven't read your code thoroughly, so there may be some very small mistake on my end. But can the difference in performance of RL algorithms between TensorFlow and PyTorch really be this substantial?

@kashif (Owner) commented Jan 18, 2019

Thanks @karanchahal! Can you kindly run with cpu 1 to make sure I am not doing anything wrong in the MPI version of Adam? The other big difference is how the Dense layers are initialized in TensorFlow (Glorot uniform, if I remember correctly) versus how they are initialized in PyTorch... Perhaps also try running for a few more epochs.
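For reference, tf.keras Dense layers default to Glorot-uniform weights and zero biases, while PyTorch's nn.Linear defaults to Kaiming-uniform with a=sqrt(5). A minimal sketch of forcing the TensorFlow defaults onto a PyTorch network (the init_like_tf helper is hypothetical, not part of firedup):

```python
import torch.nn as nn

def init_like_tf(module):
    # Match tf.keras defaults: Glorot (Xavier) uniform weights, zero biases.
    # PyTorch's nn.Linear instead uses Kaiming-uniform with a=sqrt(5).
    if isinstance(module, nn.Linear):
        nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# e.g. a small MLP policy like those used in VPG
pi = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
pi.apply(init_like_tf)
```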

I should rename this project so that the TensorFlow and PyTorch versions can exist side by side, which might help too...

Thanks again for your help!

@chutaklee commented Aug 7, 2019

Hey guys, regarding weight initialization: OpenAI Baselines has a custom numpy-based orthogonal initializer. I haven't tested it thoroughly in firedup yet, but having it seems to help; at the very least it lets us set aside the differing default init schemes between PyTorch and TensorFlow.

https://github.com/openai/baselines/blob/master/baselines/a2c/utils.py
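The initializer in that file builds an orthogonal matrix via SVD of a Gaussian sample and rescales it. A condensed sketch of the same idea, adapted here for loading into a PyTorch layer (the adaptation is mine, not Baselines' or firedup's code):

```python
import numpy as np
import torch

def ortho_init(shape, scale=1.0):
    # In the spirit of baselines/a2c/utils.py: sample a Gaussian matrix,
    # take an orthogonal factor via SVD, then rescale.
    a = np.random.normal(0.0, 1.0, shape)
    u, _, v = np.linalg.svd(a, full_matrices=False)
    q = u if u.shape == shape else v  # pick the factor with the right shape
    return (scale * q).astype(np.float32)

# e.g. overwrite a Linear layer's weight with the orthogonal matrix
layer = torch.nn.Linear(64, 64)
with torch.no_grad():
    layer.weight.copy_(torch.from_numpy(ortho_init((64, 64), scale=np.sqrt(2))))
```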

@kashif (Owner) commented Aug 7, 2019

Thanks @chutaklee, I think PyTorch also has orthogonal_ which I could test with. Great work on your fork, BTW!
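That built-in, torch.nn.init.orthogonal_, can be applied in place; a short sketch (the gain value just mirrors the sqrt(2) scale Baselines commonly uses for hidden layers, and is an assumption here):

```python
import torch.nn as nn

layer = nn.Linear(64, 64)
nn.init.orthogonal_(layer.weight, gain=2 ** 0.5)  # PyTorch's built-in orthogonal init
nn.init.zeros_(layer.bias)
```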

@windweller

Has there been any follow-up on this? This is an amazing library... I would hate to go back to TensorFlow...

@kashif (Owner) commented Oct 28, 2019

@windweller thanks! I haven't had time to try out the orthogonal initialisation... if you want to give it a go, that would be great; otherwise I will try in a week or so.

@windweller

@kashif Ahhh, I'm mostly trying to extend the library to do something else... not trying to get to the top of the benchmark. chutaklee also emailed me explaining why it's difficult to do a direct performance comparison.

A slightly more relevant question: I notice that neither Spinning Up nor firedup trains on the GPU, right? I guess it's hard to allocate GPUs intelligently with the OpenMPI tooling...

@kashif (Owner) commented Oct 30, 2019

@windweller Right, the networks here are on the small side, so I thought it better to keep everything simple: no GPU/CPU device handling, and scale the training by using more MPI workers instead. It would not be too hard to add a device flag, though; on the MPI side we would copy to CPU, do the broadcast, and then copy back, or use PyTorch's distributed API...
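A minimal sketch of that copy-to-CPU / broadcast / copy-back pattern using mpi4py (the sync_params helper is hypothetical, not firedup's actual MPI code):

```python
import torch
from mpi4py import MPI

def sync_params(module, root=0):
    # Broadcast the root worker's parameters to all MPI workers,
    # staging through CPU so MPI never touches device memory.
    comm = MPI.COMM_WORLD
    for p in module.parameters():
        buf = p.data.cpu().numpy()                        # device -> CPU copy
        comm.Bcast(buf, root=root)                        # in-place broadcast on CPU
        p.data.copy_(torch.from_numpy(buf).to(p.device))  # CPU -> device copy
```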

@windweller

@kashif Makes sense :) I do appreciate how easy this codebase is to follow. Translating from TF to PyTorch is a big deal, and I appreciate so much that you've done it 👍

@kashif kashif closed this as completed Nov 1, 2019
@kashif kashif mentioned this issue Nov 13, 2019