
Renamed bsz to bs for consistency; removed dead code #299

Merged: 2 commits into gh/awgu/6/base, May 3, 2024

Conversation

@awgu (Contributor) commented May 3, 2024

Stack from ghstack (oldest at bottom):

some minor cleanups

awgu added a commit that referenced this pull request May 3, 2024
ghstack-source-id: 0b273e8f81013c1c632f0c505b7229d51af3e488
Pull Request resolved: #299
@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) May 3, 2024
@@ -132,7 +132,6 @@ class Attention(nn.Module):
    Attributes:
        n_kv_heads (int): Number of key and value heads.
        n_heads (int): Number of query heads.
-       n_local_kv_heads (int): Number of local key and value heads.
awgu (Contributor, Author) commented:
not an attribute (only one occurrence of n_local_kv_heads if you search in this file)
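For reference, a minimal sketch of the corrected docstring, reconstructed from the diff context above (abridged; the real class documents more attributes than shown):

```python
import torch.nn as nn

class Attention(nn.Module):
    """Multi-head attention module (abridged sketch).

    Attributes:
        n_kv_heads (int): Number of key and value heads.
        n_heads (int): Number of query heads.
    """
    # The "n_local_kv_heads (int): ..." line is dropped from the Attributes
    # list above: n_local_kv_heads is only a local variable inside this
    # class, not an attribute, so listing it there was misleading.
```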

@@ -183,12 +182,12 @@ def forward(
    torch.Tensor: Output tensor after attention.

    """
-   bsz, seqlen, _ = x.shape
+   bs, seqlen, _ = x.shape
awgu (Contributor, Author) commented:
all inline comments in this method use bs for batch size, so we can make this bs for consistency
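As a runnable illustration of the rename, with hypothetical example sizes (the real method goes on to compute attention):

```python
import torch

x = torch.randn(2, 16, 512)  # hypothetical (batch, sequence, model dim) input

# before: bsz, seqlen, _ = x.shape
bs, seqlen, _ = x.shape  # after: "bs" matches the inline shape comments
assert (bs, seqlen) == (2, 16)
```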

@@ -421,7 +420,7 @@ def forward(self, tokens: torch.Tensor):
    torch.Tensor: Output logits after applying the Transformer model.

    """
-   _bsz, seqlen = tokens.shape
awgu (Contributor, Author) commented:
similarly, _bsz is unused, so just remove

awgu (Contributor, Author) commented May 3, 2024:
if it helps readability to know the tokens.shape is (batch size, sequence length), I can keep it and maybe rename it to _bs?

tianyu-l (Contributor) commented:

Although not used, it improves code readability -- it tells how many dimensions tokens has and what they are, so IMO I'd wish it were kept. Also, the "unusedness" has already been indicated by the _ prefix.

tianyu-l (Contributor) commented:

> if it helps readability to know the tokens.shape is (batch size, sequence length), I can keep it and maybe rename it to _bs?

just saw this message, yeah I agree

awgu (Contributor, Author) commented:
changed it to _bs
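For illustration, a small sketch of the convention settled on here, with hypothetical values: the _ prefix marks the binding as deliberately unused (linters such as pylint recognize this by default), while the unpacking itself still documents that tokens is 2-D and what each dimension means:

```python
import torch

tokens = torch.randint(0, 32000, (2, 16))  # hypothetical (batch size, seq len) token ids

# _bs is never read again; keeping it with the _ prefix documents both
# the dimensionality of tokens and the meaning of each dimension.
_bs, seqlen = tokens.shape
print(seqlen)  # 16
```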

@awgu marked this pull request as ready for review May 3, 2024 00:12
@awgu requested review from wanchaol and tianyu-l May 3, 2024 00:12
tianyu-l (Contributor) left a review comment:
One comment inline.

@@ -421,7 +420,7 @@ def forward(self, tokens: torch.Tensor):
    torch.Tensor: Output logits after applying the Transformer model.

    """
-   _bsz, seqlen = tokens.shape
tianyu-l (Contributor) commented:
Although not used, it improves code readability -- it tells how many dimensions tokens has and what they are, so IMO I'd wish it were kept. Also, the "unusedness" has already been indicated by the _ prefix.

awgu added a commit that referenced this pull request May 3, 2024
ghstack-source-id: bbedad3819ab9ef90b233209c34dd1dbc846b06a
Pull Request resolved: #299
@awgu merged commit 6a3e7b9 into gh/awgu/6/base May 3, 2024
4 checks passed
awgu added a commit that referenced this pull request May 3, 2024
ghstack-source-id: bbedad3819ab9ef90b233209c34dd1dbc846b06a
Pull Request resolved: #299
@awgu deleted the gh/awgu/6/head branch May 3, 2024 00:48
tianyu-l pushed a commit to tianyu-l/torchtitan_intern24 that referenced this pull request Aug 16, 2024
ghstack-source-id: bbedad3819ab9ef90b233209c34dd1dbc846b06a
Pull Request resolved: pytorch#299
philippguevorguian pushed a commit to YerevaNN/YNNtitan that referenced this pull request Aug 17, 2024
ghstack-source-id: bbedad3819ab9ef90b233209c34dd1dbc846b06a
Pull Request resolved: pytorch#299
Labels: CLA Signed
4 participants