Hello,
The reformer-pytorch implementation of the Reformer model allows the full attention matrix to be reconstructed (https://github.com/lucidrains/reformer-pytorch#research): its Recorder class can expand the attention matrix back to its original form.
How can one get this full attention matrix for the Routing Transformer? The Recorder class is only compatible with the Reformer.
The full attention matrix is needed for transformer interpretability/explainability methods, such as the one described here: https://github.com/hila-chefer/Transformer-Explainability
I believe it would involve these lines: `routing-transformer/routing_transformer/routing_transformer.py`, lines 407 to 417 at commit `3f6c461`.
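For anyone attempting this, below is a minimal sketch of the scatter step such a reconstruction would need. It is not part of the routing-transformer API: it assumes you have patched the k-means attention module yourself so that it records, per head and per cluster, the cluster-local attention weights (`attn`) and the gathered token positions (`indices`); both names are hypothetical.

```python
import torch

def expand_to_full_attention(attn, indices, seq_len):
    # attn:    (b, h, w, c, c) -- attention within each of w clusters of c tokens
    # indices: (b, h, w, c)    -- original sequence position of each clustered
    #                             token (must be a LongTensor for scatter_)
    b, h, w, c, _ = attn.shape
    full = torch.zeros(b, h, seq_len, seq_len, dtype=attn.dtype, device=attn.device)
    rows = indices.unsqueeze(-1).expand(b, h, w, c, c)  # query positions i
    cols = indices.unsqueeze(-2).expand(b, h, w, c, c)  # key positions j
    flat = (rows * seq_len + cols).reshape(b, h, -1)    # linearize (i, j) pairs
    full.view(b, h, -1).scatter_(-1, flat, attn.reshape(b, h, -1))
    return full
```

Pairs the model never attends to stay zero, and if a (query, key) pair occurs in more than one cluster the last write wins, so treat the result as a sparse view of what was actually computed, not a dense softmax over the whole sequence.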
@KatarinaYuan Hi, unfortunately not; I don't think it's trivial. I decided to use the full attention matrix instead, but with more efficient implementations such as those in PyTorch 2.0 and DeepSpeed. Hope it helps!
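As a rough illustration of the workaround described in that comment: PyTorch 2.0's `torch.nn.functional.scaled_dot_product_attention` computes full attention with fused kernels (FlashAttention or memory-efficient backends where available), while the explicit formulation can be kept around for the cases where an explainability method needs the actual matrix. The shapes below are illustrative only.

```python
import torch
import torch.nn.functional as F

b, h, n, d = 2, 8, 1024, 64                      # illustrative sizes
q, k, v = (torch.randn(b, h, n, d) for _ in range(3))

# Fast path: fused full attention; the n x n matrix is never materialized.
out = F.scaled_dot_product_attention(q, k, v)

# Inspectable path: materialize the weights explicitly, e.g. when an
# explainability method needs the full attention matrix.
attn = torch.softmax((q @ k.transpose(-2, -1)) * d ** -0.5, dim=-1)
out_explicit = attn @ v

print(torch.allclose(out, out_explicit, atol=1e-4))  # True, up to fp tolerance
```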