Hi, thanks for your contribution.
I just had some questions:

1. In the usual Transformer implementation I don't see any mention of convolution layers (the self conv blocks you have used) inside the attention, and I don't see them in the paper either. The paper gives a clear model architecture description (please correct me if I'm missing something); there is only a small subsection comparing it with Tacotron. Could you tell me where this implementation comes from? Any paper or discussion?
2. You are concatenating the query vector with the attention output in the MHA blocks. Is there any discussion of this somewhere (a paper, an issue thread)? What happened without the query concatenation? (See the sketch after this list for what I mean.)
3. You have used negative values for the stop/end vector, but the decoder prenet uses ReLU activations (which zero out negative inputs). Although the model still learned, wouldn't it be better to change that?
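To make sure I'm reading the code correctly, here is a rough paraphrase of the block I mean in points 1 and 2: an MHA layer whose output is concatenated with the query before the output projection, followed by a conv block. The class and parameter names below are mine, not from the repo, and this is just my reading of the structure; please correct me if it misrepresents your implementation.

```python
import torch
import torch.nn as nn


class AttentionWithQueryConcat(nn.Module):
    """Sketch of the block I'm asking about (my naming, not the repo's)."""

    def __init__(self, d_model=256, n_heads=4, kernel_size=3):
        super().__init__()
        self.mha = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Point 2: the output projection takes [attention output ; query].
        self.out_proj = nn.Linear(d_model * 2, d_model)
        # Point 1: the "self conv block" applied after attention.
        self.conv = nn.Sequential(
            nn.Conv1d(d_model, d_model, kernel_size, padding=kernel_size // 2),
            nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size, padding=kernel_size // 2),
        )
        self.norm = nn.LayerNorm(d_model)

    def forward(self, query, key, value):
        attn_out, _ = self.mha(query, key, value)        # (B, T, d_model)
        concat = torch.cat([attn_out, query], dim=-1)    # (B, T, 2 * d_model)
        x = self.norm(self.out_proj(concat) + query)     # residual + norm
        # Conv runs over the time axis, so transpose to (B, d_model, T) and back.
        x = x + self.conv(x.transpose(1, 2)).transpose(1, 2)
        return x
```

My question is essentially why this differs from the plain Transformer block (attention output going straight into the output projection, no conv layers), and whether you compared the two.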
Thanks