
torch.lstm raised an error with backend #613

Open
Mithzyl opened this issue Jul 23, 2024 · 5 comments

Mithzyl commented Jul 23, 2024

I made a custom model with LSTM layers and found that it might be possible to run it with DirectML, but then encountered this error:

NotImplementedError: Could not run 'aten::_thnn_fused_lstm_cell' with arguments from the 'CPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::_thnn_fused_lstm_cell' is only available for these backends: [PrivateUse1, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

PrivateUse1: registered at C:__w\1\s\pytorch-directml-plugin\torch_directml\csrc\dml\dml_cpu_fallback.cpp:58 [backend fallback]

It seems like the DML device 'privateuseone' does not match the predefined 'PrivateUse1' backend that the LSTM op is registered for, which is very tricky.

Reproduce:
Simply define an LSTM network and run inference; the error will appear.
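A minimal repro sketch (my own reconstruction, not code from this thread): device selection falls back to CPU when torch_directml is absent, so the script runs anywhere; on the affected torch-directml builds, the forward pass on the DML device raises the NotImplementedError above.

```python
import torch
import torch.nn as nn

# Pick the DirectML device when the plugin is present; otherwise fall back
# to CPU so this sketch stays runnable on any machine.
try:
    import torch_directml
    device = torch_directml.device()
except ImportError:
    device = torch.device("cpu")

# A small single-layer LSTM; on the affected torch-directml builds the
# forward call below raises:
#   NotImplementedError: Could not run 'aten::_thnn_fused_lstm_cell' ...
lstm = nn.LSTM(input_size=8, hidden_size=16, num_layers=1,
               batch_first=True).to(device)
x = torch.randn(4, 10, 8, device=device)  # (batch, seq_len, features)

with torch.no_grad():
    out, (h_n, c_n) = lstm(x)

print(out.shape)  # torch.Size([4, 10, 16]) on CPU
```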

@yigedabuliu

Hello, excuse me, I had the same problem, did you solve it?


Mithzyl commented Aug 6, 2024

> Hello, excuse me, I had the same problem, did you solve it?

No, I had to train my model on another machine with an Nvidia GPU. I guess it is a compatibility issue that needs to be solved by the team.
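An interim workaround sketch (my own idea, not something proposed in this thread): keep the rest of the model on the DML device but evaluate the LSTM on CPU, moving tensors across explicitly. The wrapper name `LSTMOnCPU` is hypothetical; the per-step copies are slow, but gradients still flow through `.to()`/`.cpu()`, so this also works for training in a pinch.

```python
import torch
import torch.nn as nn

class LSTMOnCPU(nn.Module):
    """Run an nn.LSTM on CPU even when its inputs live on another device.

    Workaround sketch for the missing aten::_thnn_fused_lstm_cell kernel on
    the DML backend: weights stay on CPU, inputs are copied over, and the
    outputs are copied back to the input's device.
    """

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.lstm = nn.LSTM(*args, **kwargs)  # parameters remain on CPU

    def forward(self, x, hx=None):
        dev = x.device
        if hx is not None:
            hx = tuple(t.cpu() for t in hx)
        out, (h, c) = self.lstm(x.cpu(), hx)
        return out.to(dev), (h.to(dev), c.to(dev))

# Usage on CPU tensors (on a DML setup, x would live on the DML device):
layer = LSTMOnCPU(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(2, 5, 8)
out, (h, c) = layer(x)
print(out.shape)  # torch.Size([2, 5, 16])
```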

@yixin-cfd

I encountered the same issue. To add more details:

File "D:\Software\Anaconda3\envs\A770_directml\lib\site-packages\torch\nn\modules\rnn.py", line 911, in forward
    result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers,
NotImplementedError: Could not run 'aten::_thnn_fused_lstm_cell' with arguments from the 'CPU' backend.

@yixin-cfd

Based on my tests, the LSTM module works correctly in versions 230426 and 240521. However, the latest two versions, 240614 and 240715, have issues with the LSTM module. @Mithzyl @yigedabuliu


pin24 commented Sep 2, 2024

AMD Vega 64 and R9 Nano
torch-directml 0.2.2.dev240614

result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers,

NotImplementedError: Could not run 'aten::_thnn_fused_lstm_cell' with arguments from the 'CPU' backend.
