[Bug Report] Review current matmul function usages #720

Open
1 task done
bryce13950 opened this issue Sep 10, 2024 · 0 comments
Labels
bug Something isn't working complexity-high Very complicated changes for people to address who are quite familiar with the code

Comments

@bryce13950
Collaborator

Describe the bug

Originally from @ArthurConmy via slack

We seem to use batch_addmm (link) because GPT-2 uses Conv1D (link), but this doesn't really fix the problem: Pythia (link), Llama (link), etc. do not use Conv1D, and I think their matmul implementations are not batch_addmm at all.

We probably need to review every matmul usage across the supported models to make sure each model gets the implementation it actually needs. As the project has grown, the implementations have been moved around a lot, and we need to figure out a way to route to the right component based on config so we can handle a wider variety of requirements.
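
To make the concern concrete, here is a rough sketch of what config-based routing could look like. It is not the current transformer_lens code: the `weight_layout` field, the `MATMUL_IMPLS` registry, and the `project` helper are all hypothetical names, and plain Python lists stand in for tensors so the sketch has no dependencies. It only illustrates that a Conv1D-style weight of shape [d_in, d_out] and a Linear-style weight of shape [d_out, d_in] need different products to give the same result.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a [n, k] and b [k, m]."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def add_bias(m: Matrix, bias: List[float]) -> Matrix:
    return [[v + b for v, b in zip(row, bias)] for row in m]


def conv1d_style(x: Matrix, weight: Matrix, bias: List[float]) -> Matrix:
    # Conv1D-style layout (as in GPT-2): weight is [d_in, d_out],
    # so the output is bias + x @ weight, i.e. the addmm form.
    return add_bias(matmul(x, weight), bias)


def linear_style(x: Matrix, weight: Matrix, bias: List[float]) -> Matrix:
    # Linear-style layout: weight is [d_out, d_in],
    # so the output is x @ weight.T + bias.
    return add_bias(matmul(x, transpose(weight)), bias)


# Hypothetical registry: maps a config value to a matmul implementation.
MATMUL_IMPLS: Dict[str, Callable[[Matrix, Matrix, List[float]], Matrix]] = {
    "conv1d": conv1d_style,
    "linear": linear_style,
}


@dataclass
class ModelConfig:
    weight_layout: str  # hypothetical field, not an existing config option


def project(cfg: ModelConfig, x: Matrix, weight: Matrix, bias: List[float]) -> Matrix:
    """Route to the implementation named by the config, failing loudly if unknown."""
    if cfg.weight_layout not in MATMUL_IMPLS:
        raise ValueError(f"unknown weight_layout: {cfg.weight_layout!r}")
    return MATMUL_IMPLS[cfg.weight_layout](x, weight, bias)


x = [[1.0, 2.0]]                                # [batch=1, d_in=2]
w_conv1d = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]]   # [d_in=2, d_out=3]
w_linear = transpose(w_conv1d)                  # [d_out=3, d_in=2]
bias = [0.5, 0.5, 0.5]

out_conv1d = project(ModelConfig("conv1d"), x, w_conv1d, bias)
out_linear = project(ModelConfig("linear"), x, w_linear, bias)
```

Both calls produce the same output here because the two weights are transposes of each other. Failing loudly on an unknown layout is a deliberate choice in this sketch, so a new architecture cannot silently fall through to the wrong implementation.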

System Info
Describe the characteristics of your environment:

  • Describe how transformer_lens was installed (pip, docker, source, ...)
  • What OS are you using? (Linux, MacOS, Windows)
  • Python version (We support 3.7-3.10 currently)

Additional context
Add any other context about the problem here.

Checklist

  • I have checked that there is no similar issue in the repo (required)
@bryce13950 bryce13950 added bug Something isn't working complexity-high Very complicated changes for people to address who are quite familiar with the code labels Sep 10, 2024