This repository has been archived by the owner on Oct 31, 2023. It is now read-only.
Thanks for replying to my previous questions. I have a few queries about Fig. 3 of your paper.
In Average Span vs. Span Limit (central graph), you show that for the fixed-span model the span increases as the span limit increases. In your code base, however, spans are monitored through current_val only if adapt_span_enabled is set to True (line). For the fixed-span model that flag is False, so AdaptiveSpan won't monitor the span. How did you measure the span of the fixed-span model?
In FLOPS vs. Span Limit, you show that FLOPS keep increasing for the fixed-span model, while for the adaptive-span model they stay constant (approximately linear). After thorough inspection, I find that FLOPS are constant with adaptive span, but they don't seem to rise with standard attention either: FLOPS are the same in both cases. Could you please share some insights?
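For context, here is a rough sketch of the scaling I expected. It is a back-of-the-envelope estimate, not the repository's actual FLOPS accounting. The function names (`effective_span`, `attention_flops_per_token`) and the example numbers are hypothetical, and the estimate assumes that a fixed-span head attends over the full span limit and ignores projections, relative position embeddings, and the feed-forward layers.

```python
from typing import Optional


def effective_span(span_limit: int, adaptive: bool,
                   learned_avg_span: Optional[float] = None) -> float:
    """Span actually attended over by one head (hypothetical helper).

    Assumption: a fixed-span head always uses the full span limit,
    while an adaptive head uses its learned span, capped by the limit.
    """
    if not adaptive:
        return float(span_limit)
    if learned_avg_span is None:
        raise ValueError("learned_avg_span is required for adaptive heads")
    return min(float(learned_avg_span), float(span_limit))


def attention_flops_per_token(span: float, hidden_size: int) -> float:
    """Rough FLOPs for one token in one attention layer.

    Counts only the query-key scores and the weighted sum of values,
    each costing about span * hidden_size multiply-adds (2 FLOPs each).
    """
    qk_scores = 2 * span * hidden_size
    value_sum = 2 * span * hidden_size
    return qk_scores + value_sum


if __name__ == "__main__":
    hidden = 512  # example value, not taken from the paper
    for limit in (256, 1024, 4096):
        fixed = attention_flops_per_token(effective_span(limit, False), hidden)
        # 200.0 is an illustrative learned span, not a measured one.
        adapt = attention_flops_per_token(effective_span(limit, True, 200.0), hidden)
        print(limit, fixed, adapt)
```

Under these assumptions the fixed-span estimate grows linearly with the span limit, while the adaptive estimate stays flat once the learned span falls below the limit, which is why identical FLOPS in both cases surprised me.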
Thanks