
possible issues in PositionEmbedding ? #92

Closed
Alexander-Barth opened this issue Jun 3, 2022 · 1 comment

Comments

@Alexander-Barth
Contributor

Thank you very much for making your work available as a Julia package!
It helps me a lot in learning about Transformers (a topic I don't know much about...)
I was comparing:

function PE(size, pos, i::Int)
    if rem(i, 2) == 0
        sin(pos/1e4^(i/size))
    else
        cos(pos/1e4^((i-1)/size))
    end
end

to the positional encoding described at
https://machinelearningmastery.com/a-gentle-introduction-to-positional-encoding-in-transformer-models-part-1/

It seems to me that the embedding contains cos(pos) (for i = 1) but not its pair sin(pos), because the index i is 1-based. If we shifted i by 1, we would get all the sin/cos pairs.

Of course, this is a very new topic to me and I might be completely wrong here (or it simply doesn't matter...).
I can make a PR if this is indeed an issue.
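
To make the off-by-one concrete, here is a small self-contained sketch (size = 4 is chosen just for illustration). In the paper's 0-based indexing, even indices 2k get sin(pos/1e4^(2k/size)) and odd indices 2k+1 get cos with the same exponent; with the 1-based code above, i = 1 yields cos(pos) while its partner sin(pos) never appears:

# current 1-based definition, copied from above
function PE(size, pos, i::Int)
    if rem(i, 2) == 0
        sin(pos / 1e4^(i / size))
    else
        cos(pos / 1e4^((i - 1) / size))
    end
end

# print which basis each index of a size-4 embedding uses, and its value
pos = 1.0
for i in 1:4
    basis = rem(i, 2) == 0 ? "sin(pos / 1e4^($i/4))" : "cos(pos / 1e4^($(i - 1)/4))"
    println("i = $i  =>  ", basis, "  =  ", PE(4, pos, i))
end

The output shows cos(pos) at i = 1 with no matching sin(pos), and a lone sin at the final even index.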

@chengchingwen
Owner

Generally it doesn't matter that much. That function just creates periodic functions to serve as bases, so as long as those bases have similar attributes, the training behavior will be similar.

OTOH, it is an issue if we want the implementation to match the original paper exactly. The index shift means we are actually using a completely different set of bases for the position encoding. We should change it so that the values match the paper's. A PR is appreciated :)
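
For reference, a minimal sketch of the shifted version (an illustration of the index fix described above, not necessarily the exact patch that was merged). Translating the paper's 0-based rule to Julia's 1-based indices, odd i corresponds to a 0-based even index 2k (sin) and even i to 2k + 1 (cos):

function PE(size, pos, i::Int)
    if rem(i, 2) == 0
        # 1-based even i is 0-based odd 2k + 1 -> cos with exponent (i - 2)/size
        cos(pos / 1e4^((i - 2) / size))
    else
        # 1-based odd i is 0-based even 2k -> sin with exponent (i - 1)/size
        sin(pos / 1e4^((i - 1) / size))
    end
end

With this shift, i = 1 and i = 2 give the sin(pos)/cos(pos) pair, matching the original formulation.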

Alexander-Barth added a commit to Alexander-Barth/Transformers.jl that referenced this issue Jun 3, 2022
chengchingwen pushed a commit that referenced this issue Jun 5, 2022
* addresses issue #92

* update order of basis functions in PositionEmbedding