Separate TransformerEmbedding layer #33

Merged
merged 1 commit into from
Feb 2, 2024

@wconstab commented Feb 2, 2024

Make it easier to split the Transformer into pieces for pipeline parallelism (PP).

@facebook-github-bot added the CLA Signed label Feb 2, 2024

@wanchaol left a comment

lgtm!

@@ -278,6 +278,40 @@ def forward(self, x):
        return self.w2(F.silu(self.w1(x)) * self.w3(x))


class TransformerEmbedding(nn.Module):

nit: maybe call it RotaryEmbedding, since it not only does plain embedding but also computes the freqs_cis?
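
For context on the comment above: in Llama-style models, freqs_cis usually refers to a table of unit complex numbers used for rotary position embeddings. The following is a minimal plain-Python sketch of such a table, assuming the common formulation with base theta = 10000. The function name, signature, and use of the standard library instead of tensors are illustrative assumptions and are not taken from this PR's diff.

```python
import cmath
from typing import List


def precompute_freqs_cis(dim: int, end: int, theta: float = 10000.0) -> List[List[complex]]:
    """Return an end x (dim // 2) table of unit complex numbers exp(i * t * freq_k).

    Hypothetical helper for illustration only; real models build this as a tensor.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    # One rotation frequency per pair of channels: theta^(-k/dim) for k = 0, 2, 4, ...
    freqs = [1.0 / (theta ** (k / dim)) for k in range(0, dim, 2)]
    # Position t rotates channel pair k by the angle t * freqs[k].
    return [[cmath.rect(1.0, t * f) for f in freqs] for t in range(end)]


# Small example: 4 positions, head dimension 8 (so 4 channel pairs).
table = precompute_freqs_cis(dim=8, end=4)
```

Because the table depends only on the head dimension and the sequence length, not on the token embedding weights, it is a separate concern from the embedding lookup, which is the point the naming nit raises.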

Make it easier to chop Transformer into pieces for PP
@wconstab merged commit b99af33 into main Feb 2, 2024
1 of 2 checks passed
@wconstab deleted the whc/modular branch February 2, 2024 18:46
wconstab added a commit that referenced this pull request Apr 10, 2024
Avoid diverging the model structure (FQNs and checkpoint
interoperability) with similar models.

This reverts commit b99af33.

ghstack-source-id: 0f93b1b4b9fedcf48f768d2ab1159eaab18ec8c2
Pull Request resolved: #214
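
The revert message above cites FQNs (fully qualified names) and checkpoint interoperability. As a rough illustration of why wrapping a layer in a new module matters, the sketch below flattens two nested layouts into dotted names, similar in spirit to how checkpoint keys are derived from module nesting. It is a toy in plain Python; the attribute names `embeddings`, `tok_embeddings`, and `output` are hypothetical and not confirmed by this page.

```python
from typing import List


def flatten_names(tree: dict, prefix: str = "") -> List[str]:
    """Flatten a nested dict of submodules into dotted parameter names."""
    names = []
    for key, value in tree.items():
        full = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # A nested dict stands in for a submodule; recurse with the longer prefix.
            names.extend(flatten_names(value, full))
        else:
            # A leaf stands in for a parameter.
            names.append(full)
    return names


# Hypothetical layout with the embedding table at the top level of the model.
flat_model = {"tok_embeddings": {"weight": None}, "output": {"weight": None}}
# Hypothetical layout after wrapping the embedding table in a separate module.
wrapped_model = {
    "embeddings": {"tok_embeddings": {"weight": None}},
    "output": {"weight": None},
}

flat_keys = flatten_names(flat_model)
wrapped_keys = flatten_names(wrapped_model)
# Names a checkpoint saved from the flat layout has that the wrapped layout lacks.
missing = sorted(set(flat_keys) - set(wrapped_keys))
```

Under these assumptions, a checkpoint written with the flat layout would carry a key that the wrapped layout no longer exposes under the same name, so loading it would need a key-remapping step.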
wconstab added a commit that referenced this pull request Apr 10, 2024
Avoid diverging the model structure (FQNs and checkpoint
interoperability) with similar models.

This reverts commit b99af33.

ghstack-source-id: 8e27f08f9373f20e8ca73ac2534a6f97b15edd6f
Pull Request resolved: #214
wconstab added a commit that referenced this pull request Apr 10, 2024
Avoid diverging the model structure (FQNs and checkpoint
interoperability) with similar models.

This reverts commit b99af33.

ghstack-source-id: 3f2d31e5e4d2025d5ea0d5132cf377b92ed22105
Pull Request resolved: #214
wconstab added a commit that referenced this pull request Apr 10, 2024
Avoid diverging the model structure (FQNs and checkpoint
interoperability) with similar models.

This reverts commit b99af33.

ghstack-source-id: 9811f5fa99fdde387efe6018aa00afd28e7e923b
Pull Request resolved: #214
lessw2020 pushed a commit that referenced this pull request Apr 18, 2024
Make it easier to chop Transformer into pieces for PP
lessw2020 pushed a commit that referenced this pull request Apr 18, 2024
Avoid diverging the model structure (FQNs and checkpoint
interoperability) with similar models.

This reverts commit b99af33.

ghstack-source-id: 9811f5fa99fdde387efe6018aa00afd28e7e923b
Pull Request resolved: #214
philippguevorguian pushed a commit to YerevaNN/YNNtitan that referenced this pull request Aug 17, 2024
Avoid diverging the model structure (FQNs and checkpoint
interoperability) with similar models.

This reverts commit f30202c.

ghstack-source-id: 9811f5fa99fdde387efe6018aa00afd28e7e923b
Pull Request resolved: pytorch#214