Can fail to gather a node #3

Open
hans-ekbrand opened this issue Feb 13, 2019 · 1 comment

hans-ekbrand commented Feb 13, 2019

In special cases, lc0-match seems to fail to gather any nodes. At least, that is what I believe to be the underlying problem behind this CUDA error:

0213 02:56:32.703446 139662739896064 /home/hans/src/lc0-match/src/utils/exception.h:39] Exception: CUDNN error: CUDNN_STATUS_INTERNAL_ERROR (../../src/neural/cuda/layers.cc:218)
0213 02:56:32.703450 139662834362112 /home/hans/src/lc0-match/src/utils/exception.h:39] Exception: CUDNN error: CUBLAS_STATUS_INTERNAL_ERROR (../../src/neural/cuda/layers.cc:552)

To reproduce, use a system with at least two GPUs and run:

lc0 --policy-softmax-temp=1 --cpuct=40 --fpu-value=0.55 --temperature=0.1 -w 32603 -l lc0log.txt --backend=multiplexing --backend-opts="(backend=cudnn,gpu=0),(backend=cudnn,gpu=1)" --minibatch-size=16 --smart-pruning-factor=0 --nncache=1000000 --verbose-move-stats

And provide the following UCI commands:
position startpos moves d2d4 f7f5 g2g3 g8f6 f1g2 e7e6 g1f3 d7d5 c2c4 c7c6 e1g1 f8d6 b1c3 e8g8 d1c2 f6e4 a1b1 d8e7 c1f4 d6f4 g3f4 b8d7 b1c1 g7g5 c3e4 f5e4 f3g5 f8f4 g5h3 f4h4 c2d2 d7b6 b2b3 e6e5 c1c3 c8h3 g2h3 g8h8 c3g3 a8g8 h3g2 d5c4 d4e5 c4b3 d2d4 b6d5 a2b3 g8e8 e5e6 e7f6 d4a7 d5f4 a7a2 f6e6 e2e3 f4d5 f1c1 h4h6 b3b4 h6f6 a2b2 e6e5 b2e5 e8e5 b4b5 f6e6 b5c6 e6c6 c1d1 c6c4 g3g4 b7b5
go nodes 262144
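
For anyone who wants to retry this repeatedly, the steps above could be scripted roughly as follows. This is only a sketch under assumptions: the engine options are copied from the command in this report, but the helper names (`build_uci_commands`, `run_repro`) and the binary path passed to `run_repro` are hypothetical, and the script simply reads the engine's output until a `bestmove` line appears or the process exits.

```python
import subprocess

# Engine options copied from the report. No shell is involved here,
# so the parentheses in --backend-opts need no quoting.
LC0_ARGS = [
    "--policy-softmax-temp=1", "--cpuct=40", "--fpu-value=0.55",
    "--temperature=0.1", "-w", "32603", "-l", "lc0log.txt",
    "--backend=multiplexing",
    "--backend-opts=(backend=cudnn,gpu=0),(backend=cudnn,gpu=1)",
    "--minibatch-size=16", "--smart-pruning-factor=0",
    "--nncache=1000000", "--verbose-move-stats",
]


def build_uci_commands(moves, nodes):
    """Return the two UCI commands from the report for a given move list."""
    return [
        "position startpos moves " + " ".join(moves),
        f"go nodes {nodes}",
    ]


def run_repro(lc0_path, moves, nodes=262144):
    """Hypothetical helper: run one search, return (output lines, exit code)."""
    proc = subprocess.Popen(
        [lc0_path, *LC0_ARGS],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    for cmd in build_uci_commands(moves, nodes):
        proc.stdin.write(cmd + "\n")
    proc.stdin.flush()

    lines = []
    # Read until the search finishes or the engine exits (e.g. on a crash).
    for line in proc.stdout:
        lines.append(line.rstrip("\n"))
        if line.startswith("bestmove"):
            break

    try:
        proc.stdin.write("quit\n")
        proc.stdin.flush()
    except (BrokenPipeError, OSError):
        pass  # the engine has already exited
    return lines, proc.wait()
```

Calling `run_repro("./lc0", moves)` with the move list above split on spaces should then either return output ending in a `bestmove` line or end early with whatever exit code the engine produced. The exception text itself should still be looked for in `lc0log.txt`.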

@hans-ekbrand

I still get this error with version 22a1b4e.
