[Bug report test 3] OSError: (External) CUDNN error(7), CUDNN_STATUS_MAPPING_ERROR. #6

Open
Ligoml opened this issue Jan 21, 2022 · 0 comments
Labels
status/following-up 跟进中 (following up), type/debug 帮用户debug (helping the user debug)

Comments

Ligoml (Owner) commented Jan 21, 2022

Bug reproduction environment

Title: CUDNN error reproduces consistently under specific CUDA versions

Version and environment information:
1) PaddlePaddle version: 2.2.1
2) CPU: ---
3) GPU: V100 16G/32G
4) System environment: Ubuntu 16.04, Python 3.7
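
The version fields above can be collected programmatically when filing a similar report. The following is a minimal sketch and not part of the original report: `collect_env` is a hypothetical helper name, and it assumes that `paddle.version.cuda()` and `paddle.version.cudnn()` return the CUDA and cuDNN versions the installed PaddlePaddle build was compiled against.

```python
import platform


def collect_env():
    """Gather the version fields a bug report like this one asks for.

    Hypothetical helper for illustration; returns a plain dict.
    """
    info = {
        "python": platform.python_version(),
        "os": platform.platform(),
    }
    try:
        import paddle
    except ImportError:
        # PaddlePaddle is not installed in this interpreter.
        info["paddle"] = None
        return info

    info["paddle"] = paddle.__version__
    info["compiled_with_cuda"] = paddle.is_compiled_with_cuda()
    # Assumed to describe the build, not the driver installed on the host.
    info["cuda"] = paddle.version.cuda()
    info["cudnn"] = paddle.version.cudnn()
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

The GPU model is not covered here; it would still need to be filled in by hand or from `nvidia-smi`.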

Bug reproduction steps and minimal code set

All operators in the code call the convolution blocks provided by Paddle directly; the main modules are Conv3D, BN3D, Conv3DTranspose, and so on.
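
The report did not attach the training code. As an illustration only, a script along the following lines runs one forward and backward pass through the three layer types named above. The network layout, tensor shapes, and the name `run_one_step` are all assumptions, `BN3D` is taken to mean `paddle.nn.BatchNorm3D`, and this sketch has not been confirmed to trigger the error.

```python
def run_one_step(batch_size=2):
    """One forward/backward pass through Conv3D, BatchNorm3D, Conv3DTranspose.

    Hypothetical stand-in for the reporter's model, not their actual code.
    """
    import paddle
    import paddle.nn as nn

    net = nn.Sequential(
        # Downsample by 2 in depth, height, and width.
        nn.Conv3D(1, 8, kernel_size=3, stride=2, padding=1),
        nn.BatchNorm3D(8),
        nn.ReLU(),
        # Upsample back to the input resolution.
        nn.Conv3DTranspose(8, 1, kernel_size=2, stride=2),
    )

    # Input layout is NCDHW: batch, channels, depth, height, width.
    x = paddle.randn([batch_size, 1, 16, 32, 32])
    loss = net(x).mean()
    # The traceback in this report originates from the backward call.
    loss.backward()
    return loss.item()
```

Calling `run_one_step()` on a GPU build under each CUDA version would be one way to check whether the failure depends on the model or only on the environment.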

Desired result

No error; training runs normally.

Actual result

With CUDA 10.1/10.2, the following error is raised consistently. With CUDA 11.0 or later, training completes normally.
  File "train_mgpu.py", line 284, in train
    loss.backward()
  File "", line 2, in backward
  File "/opt/_internal/cpython-3.7.0/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in impl
    return wrapped_func(*args, **kwargs)
  File "/opt/_internal/cpython-3.7.0/lib/python3.7/site-packages/paddle/fluid/framework.py", line 229, in impl
    return func(*args, **kwargs)
  File "/opt/_internal/cpython-3.7.0/lib/python3.7/site-packages/paddle/fluid/dygraph/varbase_patch_methods.py", line 249, in backward
    framework._dygraph_tracer())
OSError: (External) CUDNN error(7), CUDNN_STATUS_MAPPING_ERROR.
  [Hint: 'CUDNN_STATUS_MAPPING_ERROR'. An access to GPU memory space failed, which is usually caused by a failure to bind a texture. To correct, prior to the function call, unbind any previously bound textures. Otherwise, this may indicate an internal error/bug in the library. ] (at /paddle/paddle/fluid/operators/conv_cudnn_op.cu:758)
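
The observation above (failure on CUDA 10.1/10.2, success on 11.0 and later) can be encoded as a small pre-flight check. This is a sketch under stated assumptions: `classify_cuda_version` is a hypothetical helper, the outcome labels are invented for illustration, and any version the report does not mention is treated as untested rather than guessed.

```python
def classify_cuda_version(version: str) -> str:
    """Map a CUDA version string to the outcome observed in this report.

    Returns "fails" for 10.1 and 10.2, "works" for 11.0 and later,
    and "untested" for anything else, including unparsable input.
    """
    parts = version.strip().split(".")
    try:
        major = int(parts[0])
        minor = int(parts[1]) if len(parts) > 1 else 0
    except ValueError:
        return "untested"

    if (major, minor) in {(10, 1), (10, 2)}:
        return "fails"
    if (major, minor) >= (11, 0):
        return "works"
    return "untested"


if __name__ == "__main__":
    for v in ("10.1", "10.2", "11.0", "11.2", "10.0"):
        print(v, classify_cuda_version(v))
```

A training script could call this with the CUDA version of its PaddlePaddle build and print a warning before training starts when the result is "fails".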

Additional information

No response

Ligoml added the status/new-issue 新建, type/bug-report 报bug初始标签, direction/training 单机训练方向, and type/debug 帮用户debug labels and removed the type/bug-report 报bug初始标签 label on Jan 21, 2022.
Ligoml added the status/following-up 跟进中 label and removed the direction/training 单机训练方向 and status/new-issue 新建 labels on Mar 3, 2022.