issues with gym #5

Closed
justheuristic opened this issue Feb 23, 2017 · 4 comments

Comments

@justheuristic
Contributor

If there's something wrong with openai gym and chat didn't resolve it in 10 minutes, feel free to complain here.

@justheuristic
Contributor Author

Contributed by Sergey Kolesnikov:

no GLX on GPU nodes

GLXInfoException: pyglet requires an X server with GLX
Solution: reinstall the GPU drivers without their OpenGL components.
openai/gym#366
https://davidsanwald.github.io/2016/11/13/building-tensorflow-with-gpu-support.html

@justheuristic
Contributor Author

Game ends in 200 ticks

The newest version of gym force-stops the environment after 200 steps, even if you don't use env.monitor.
This may break CEM on MountainCar and other environments (week 1 homework).
To avoid this, strip the time limit with env = gym.make("MountainCar-v0").env
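
Why `.env` helps can be sketched without gym itself. The classes below (`ToyEnv`, `ToyTimeLimit`) and the `run_episode` helper are hypothetical stand-ins written for illustration, not gym's actual implementation; the only assumption carried over from the advice above is that the time-limited object keeps the unwrapped environment in its `.env` attribute.

```python
class ToyEnv:
    """Hypothetical environment that never terminates on its own."""

    def reset(self):
        return 0

    def step(self, action):
        # observation, reward, done, info
        return 0, -1.0, False, {}


class ToyTimeLimit:
    """Hypothetical time-limit wrapper; the wrapped env is stored in `.env`."""

    def __init__(self, env, max_episode_steps=200):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self._elapsed = 0

    def reset(self):
        self._elapsed = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._elapsed += 1
        # Force the episode to end once the step budget is used up.
        if self._elapsed >= self.max_episode_steps:
            done = True
        return obs, reward, done, info


def run_episode(env, t_max=1000):
    """Play one episode with a fixed action and return the total reward."""
    env.reset()
    total = 0.0
    for _ in range(t_max):
        _, reward, done, _ = env.step(0)
        total += reward
        if done:
            break
    return total


wrapped = ToyTimeLimit(ToyEnv())
wrapped_return = run_episode(wrapped)    # cut off by the wrapper at 200 steps
raw_return = run_episode(wrapped.env)    # runs until t_max in the caller's loop
```

With the wrapper in place the episode is cut at 200 steps regardless of `t_max`; going through `.env` leaves the episode length to your own loop, so keep a `t_max` of your own to avoid running forever.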

@oscartsai

oscartsai commented Jul 23, 2017

For the bonus part of the week 0 homework, what is the expected time to solve "Taxi-v1" with a genetic algorithm? The problem is that the game ends after 200 ticks, so in the end all policies in the pool are stuck at -200. I then tried ignoring the is_done flag and stopping when the reward is 20 instead. I found that it would probably take several days to reach a score of -100 on my notebook (CPU: i5), so I interrupted the process. I just want to check: is it normal for it to take this long?

Below is what I got in 8 hours (I set t_max = 11000).
Epoch 0:
best score: -44656.76
Epoch 1:
best score: -42678.38
Epoch 2:
best score: -42678.56
Epoch 3:
best score: -32778.74
Epoch 4:
best score: -32778.74
Epoch 5:
best score: -44657.84
Epoch 6:
best score: -42678.56
Epoch 7:
best score: -38718.74
Epoch 8:
best score: -38717.12
Epoch 9:
best score: -38718.74
Epoch 10:
best score: -36738.02
Epoch 11:
best score: -32777.84
Epoch 12:
best score: -38717.12
Epoch 13:
best score: -36738.02
Epoch 14:
best score: -32779.1
Epoch 15:
best score: -28819.28
Epoch 16:
best score: -30799.28
Epoch 17:
best score: -30799.64
Epoch 18:
best score: -32779.28
Epoch 19:
best score: -30800.0
Epoch 20:
best score: -32779.1
Epoch 21:
best score: -30799.1
Epoch 22:
best score: -32777.48
Epoch 23:
best score: -32778.38
Epoch 24:
best score: -30799.64

@justheuristic
Contributor Author

For the record, we fixed it by removing the time limit (env = gym.make('Taxi-v1').env).
