In LlamaModeling.py, the LlamaRMSNorm forward multiplies the learned weight into the scaled hidden_states before returning them (i.e. it outputs `weight * scaled_hidden_states`).
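A minimal numpy sketch of what such a weight-applying RMSNorm computes (illustrative only, not the actual HuggingFace code; the function and variable names are my own):

```python
import numpy as np

def rms_norm_with_weight(x, weight, eps=1e-6):
    # Normalize by the root-mean-square over the last dimension,
    # then multiply by the learned per-channel weight.
    variance = np.mean(x ** 2, axis=-1, keepdims=True)
    return weight * (x / np.sqrt(variance + eps))
```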
By contrast, looking at the RMSNormPre definition in TransformerLens, that function appears to output only the scaled hidden_states, with no weight applied.
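A sketch of the weight-free behavior described above (again illustrative, not the actual TransformerLens source):

```python
import numpy as np

def rms_norm_pre(x, eps=1e-6):
    # Only rescale to unit RMS; no learned weight is multiplied in.
    scale = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / scale
```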
In TransformerBlock's forward pass, it is this RMSNormPre that gets used, so the LlamaRMSNorm weights still appear never to be applied.
I want to hook the values after RMSNorm is applied to each residual stream, so I tried to find the RMSNorm parameters and noticed this discrepancy.
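One likely explanation (my assumption, worth confirming against TransformerLens's weight-processing docs): when loading with weight folding enabled (`fold_ln=True`), the RMSNorm weight is absorbed into the rows of the next layer's weight matrix, so the normalization module itself no longer needs a weight and the hooked values are the pre-weight scaled states. A numpy sketch of why folding is mathematically equivalent:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))    # residual stream vectors
w_ln = rng.normal(size=8)      # RMSNorm weight (gamma)
W = rng.normal(size=(8, 16))   # next linear layer's weight

def rms_scale(x, eps=1e-6):
    return x / np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)

# Unfolded: apply the RMSNorm weight, then the linear layer.
out_unfolded = (rms_scale(x) * w_ln) @ W

# Folded: absorb the RMSNorm weight into the linear layer's rows,
# so the normalization step itself carries no weight.
W_folded = w_ln[:, None] * W
out_folded = rms_scale(x) @ W_folded
```

If this is what happens, hooking after RMSNormPre gives values that differ from HuggingFace's post-RMSNorm activations exactly by the (folded) weight.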
- [x] I have checked that there is no similar issue in the repo (required)