Issues: TransformerLensOrg/TransformerLens

Labels used in this list:
    bug: Something isn't working
    complexity-simple: Simple issues, which may be good for beginners
    complexity-moderate: Moderately complicated issues for people who have intermediate experience with the code
    complexity-high: Very complicated changes for people to address who are quite familiar with the code
    new-architecture: This card involves adding a new architecture
    implementation-inaccuracy: Any issues related to our implementation being off from the official version
    needs-investigation: Issues that need to be recreated, or investigated before work can be done
    enhancement: New feature or request
    tooling: Anything pertaining to outside tools used within the codebase
    documentation: Improvements or additions to documentation

#720 [Bug Report] Review current matmul function usages
    Labels: bug, complexity-high
    Opened Sep 10, 2024 by bryce13950

#719 [Proposal] Add frequency-based RoPE support for Llama 3.1 models
    Opened Sep 9, 2024 by frances720

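For context on #719: Llama 3.1 rescales each rotary inverse frequency depending on its wavelength, rather than applying one uniform scaling factor. The sketch below is a minimal, framework-independent illustration of that frequency-dependent scheme; the parameter values (`factor=8`, `low_freq_factor=1`, `high_freq_factor=4`, original context 8192, base 500000) are assumed from the released Llama 3.1 configuration and may need checking against the exact model config.

```python
import math

def llama31_scaled_inv_freq(head_dim=128, base=500_000.0, factor=8.0,
                            low_freq_factor=1.0, high_freq_factor=4.0,
                            original_ctx=8192):
    """Sketch of Llama 3.1-style frequency-dependent RoPE rescaling."""
    inv_freq = [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]
    low_freq_wavelen = original_ctx / low_freq_factor
    high_freq_wavelen = original_ctx / high_freq_factor
    scaled = []
    for f in inv_freq:
        wavelen = 2 * math.pi / f
        if wavelen < high_freq_wavelen:
            # High-frequency band: leave untouched.
            scaled.append(f)
        elif wavelen > low_freq_wavelen:
            # Low-frequency band: slow down uniformly by `factor`.
            scaled.append(f / factor)
        else:
            # Smoothly interpolate between the two regimes.
            smooth = (original_ctx / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor)
            scaled.append((1 - smooth) * f / factor + smooth * f)
    return inv_freq, scaled
```

High frequencies (short wavelengths, which encode local position) are preserved, while low frequencies are stretched to cover the longer context.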
#714 [Bug Report] Torch FutureWarning when calling utils.download_file_from_hf with torch==2.4.1
    Opened Sep 6, 2024 by albertsgarde

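For context on #714: recent torch versions emit a FutureWarning when `torch.load` is called without `weights_only`, because the default is slated to flip to `weights_only=True`. A minimal sketch of the usual remedy for plain tensor/state-dict checkpoints (the helper name is hypothetical, not TransformerLens code):

```python
import os
import tempfile
import torch

def roundtrip_state_dict(state_dict):
    """Save and reload a state dict, passing weights_only=True explicitly.

    weights_only=True restricts unpickling to tensors and simple containers,
    which both silences the FutureWarning and is safer for untrusted files.
    """
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "ckpt.pt")
        torch.save(state_dict, path)
        return torch.load(path, weights_only=True)
```

Checkpoints containing arbitrary Python objects cannot be loaded this way and need an explicit opt-out instead.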
#710 [Proposal] Add MVP Support For 1-2 Models Per-Modality
    Opened Aug 31, 2024 by 4gatepylon

#707 [Bug Report] tokenize_and_concatenate doesn't work with small datasets
    Opened Aug 23, 2024 by yash-srivastava19

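For context on #707: the failure mode is easy to reproduce in miniature. Preprocessing in the style of `tokenize_and_concatenate` flattens all token sequences and cuts them into fixed-length windows, so a corpus with fewer tokens than one window produces zero training examples. A TransformerLens-independent sketch (function name hypothetical):

```python
def concatenate_and_chunk(token_lists, seq_len):
    """Toy version of the concatenate-then-chunk step: flatten all token
    sequences, then cut into non-overlapping windows of length seq_len,
    dropping the remainder."""
    flat = [tok for toks in token_lists for tok in toks]
    n_chunks = len(flat) // seq_len  # 0 when the corpus is shorter than one window
    return [flat[i * seq_len:(i + 1) * seq_len] for i in range(n_chunks)]
```

With a small dataset and a 1024-token context, `n_chunks` is 0 and the result is empty, which matches the reported behavior.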
#704 [Proposal] Add support for TracrBench
    Labels: complexity-high, new-architecture
    Opened Aug 14, 2024 by HannesThurnherr

#697 How to get the activation cache while the LLM is generating new tokens?
    Labels: complexity-moderate
    Opened Aug 7, 2024 by Meehaohao

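For context on #697: in plain PyTorch terms, forward hooks fire on every forward pass, including the per-token passes inside a generation loop, so appending to a list from a hook accumulates one activation per generated token. A framework-independent sketch with a toy module (this is generic PyTorch, not the TransformerLens API):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 4))

cache = []  # one entry per forward pass, i.e. per generated token
handle = model[1].register_forward_hook(
    lambda module, inputs, output: cache.append(output.detach().clone()))

# Toy greedy "generation" loop: each step runs one forward pass,
# so the hook records the ReLU activations for every step.
x = torch.zeros(1, 4)
for _ in range(5):
    logits = model(x)
    x = torch.nn.functional.one_hot(logits.argmax(-1), 4).float()
handle.remove()
```

The same idea carries over to hooked models: register the hook once, run the generation loop, then remove the hook and inspect the accumulated list.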
#693 [Bug Report] Gemma-2-2b-it output logit doesn't match with huggingface
    Labels: complexity-high, implementation-inaccuracy
    Opened Aug 2, 2024 by yeutong

#691 [Proposal] Add Llama 3.1 support
    Labels: complexity-moderate, new-architecture
    Opened Jul 31, 2024 by ssuukk

#685 [Bug Report] Different results from HuggingFace when using the GPT2 small example
    Labels: complexity-high, implementation-inaccuracy, needs-investigation
    Opened Jul 27, 2024 by nreHieW

#684 [Question] Why does TransformerLens only support quantized LLaMA models?
    Opened Jul 26, 2024 by miguel-kjh

#683 [Bug Report] Qwen model implementation is too inaccurate
    Labels: complexity-high, implementation-inaccuracy, needs-investigation
    Opened Jul 23, 2024 by bryce13950

#671 [Proposal] Allow tied embeddings
    Labels: complexity-moderate, enhancement
    Opened Jul 12, 2024 by neelnanda-io

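For context on #671: tied embeddings share a single weight matrix between the input embedding and the output unembedding, as GPT-2 does. A minimal PyTorch sketch of the tying itself (class and sizes are illustrative, not TransformerLens internals):

```python
import torch
import torch.nn as nn

class TiedLM(nn.Module):
    """Minimal illustration of tied input/output embeddings: the unembedding
    reuses the embedding weight, so there is one (vocab, d_model) matrix."""
    def __init__(self, vocab=100, d_model=16):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.unembed = nn.Linear(d_model, vocab, bias=False)
        self.unembed.weight = self.embed.weight  # tie: same Parameter object

    def forward(self, tokens):
        return self.unembed(self.embed(tokens))

model = TiedLM()
```

Because both modules hold the same `Parameter`, `model.parameters()` reports it only once, and gradients from both uses accumulate into the one matrix.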
#669 Does the run_with_cache method support data parallelism? How can I do it?
    Opened Jul 12, 2024 by Yang-bug-star

#665 [Proposal] Allow recent versions of beartype
    Labels: complexity-simple, tooling
    Opened Jul 10, 2024 by jettjaniak

#661 [Bug Report] Pythia output inconsistent across batch sizes when use_split_qkv_input=True
    Labels: bug, complexity-high, implementation-inaccuracy
    Opened Jul 8, 2024 by oliveradk

#657 [Bug Report] Is RMSNormPre in TransformerLens different from the Llama source code?
    Opened Jul 6, 2024 by wangyifei0047

#655 Is it possible to use a locally downloaded model without accessing HF?
    Opened Jul 4, 2024 by ccp123456

#644 [Proposal] Documentation: Map the Act Names to the Transformer
    Labels: complexity-moderate, documentation
    Opened Jun 21, 2024 by JuVogt

#631 [Proposal] Remove the overhead caused by full_hook.__name__ = (hook.__repr__())?
    Opened Jun 8, 2024 by verlocks

#622 [Proposal] Add support for Baichuan1 and Baichuan2
    Labels: complexity-moderate
    Opened Jun 3, 2024 by StarrySeas1