[Draft] Support RoPE Op and fuse #13978
Conversation
const int32_t i0_n = input_shape.Dims(0);
const int32_t multihead_n = input_shape.Dims(1);
const int32_t i2_n = input_shape.Dims(2);
const int32_t head_n = input_shape.Dims(3);
I think the meaning of these variable names is ambiguous. `i0` and `i2` stand for input rank 0 and input rank 2. I would like to change them to clearer names, but since I don't exactly understand the meaning of these ranks, I used simple names for now (e.g. `i0`, `i2`). If you can recommend names with a clear meaning, I will change them.

Also, since this kernel was added to verify the fused op, optimization is not considered; it is implemented exactly as in the graph. And although it may already be optimized, there is no environment (input, golden data?) for measuring latency yet (or maybe I just don't know of one), so I don't know in what environment the data should be measured to be meaningful. I plan to investigate further.
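For reference, a minimal NumPy sketch of the rotation that a RoPE kernel computes (GPT-NeoX style, rotating the two halves of the last axis; shapes and the precomputed tables here are assumptions for illustration, not the kernel's actual layout):

```python
import numpy as np

def rope_neox(x, sin_table, cos_table):
    # x: [..., head_dim]; sin/cos tables broadcast over the leading axes.
    # NeoX-style RoPE pairs element i with element i + head_dim // 2.
    half = x.shape[-1] // 2
    x1, x2 = x[..., :half], x[..., half:]
    rotated = np.concatenate([-x2, x1], axis=-1)
    return x * cos_table + rotated * sin_table
```

With `sin = 0, cos = 1` the op is the identity; with `sin = 1, cos = 0` it yields the pure rotation, which makes it easy to sanity-check a fused kernel against this reference.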
> rank0 and input rank2

We usually use the term dim, as in dimension, for this. For the meaning of each dim, the source paper or the Python script for the model may explain it.

> for measuring latency yet

We don't measure this in the frontend :)

> i don't know what environment the data should be measured to be meaningful.

As of now, we don't. We just feed random input data and compare against the source model (onnx-runtime for ONNX, the tflite interpreter for tflite models), treating it as a computational black box.
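That black-box check can be sketched as a generic helper (the two callables stand in for the source-runtime and compiled-model invocations, which are omitted here):

```python
import numpy as np

def outputs_match(reference_fn, candidate_fn, input_shape,
                  rtol=1e-5, atol=1e-5, seed=0):
    # Feed identical random data to both "black boxes" and
    # compare the outputs numerically.
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(input_shape).astype(np.float32)
    return np.allclose(reference_fn(x), candidate_fn(x),
                       rtol=rtol, atol=atol)
```

In practice `reference_fn`/`candidate_fn` would wrap the interpreter calls; the random-input approach checks numerical equivalence but says nothing about latency.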
@@ -619,6 +620,7 @@ union BuiltinOptions {
  BitcastOptions,
  BitwiseXorOptions,
  RightShiftOptions,
  RoPEOptions = 250,
I will change the op number after RMSNorm is added, because this number overlaps with RMSNorm.
@@ -0,0 +1,211 @@
operand {
The RoPE part was extracted from the TG model (layers.tg.dynamic.1.q.opt.circle), and I created a tflite recipe by referring to the graph. The sin/cos table was divided into 0~80 in the range of -1 to 1, made into constants, and added as inputs.

The TG model is a quantized model whose data type is INT16. I tried to create a recipe with this data type, but an error occurred because the luci-interpreter's handling of INT16 is abnormal, so I created the recipe with FLOAT32. I think modifying the interpreter to support INT16 should be treated as a separate task; whether it is necessary needs to be checked first.

Through luci-pass-value-test, I confirmed that the tflite model and the circle model with the fused RoPE op produce the same results.
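One plausible way such constant tables are built (an assumption based on the description above: 80 positions, standard RoPE frequencies, so all values naturally fall in [-1, 1]; `head_dim` and `base` are illustrative parameters, not values taken from the model):

```python
import numpy as np

def make_rope_tables(max_pos=80, head_dim=80, base=10000.0):
    # Standard RoPE inverse frequencies, one per rotation pair.
    inv_freq = 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))
    # angles: [max_pos, head_dim // 2]
    angles = np.outer(np.arange(max_pos), inv_freq)
    # sin/cos are bounded in [-1, 1], so they can be baked into
    # the recipe as FLOAT32 constant inputs.
    return np.sin(angles).astype(np.float32), np.cos(angles).astype(np.float32)
```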
ONE-DCO-1.0-Signed-off-by: youngsik kim <[email protected]>
…into draft/compiler/rope
add param to support two modes(neox, gpt) of rope ONE-DCO-1.0-Signed-off-by: youngsik kim <[email protected]>
table RoPEOptions {
  mode: RoPEMode;
}

// An OperatorCode can be an enum value (BuiltinOperator) if the operator is a
add rope mode (neox, gpt)
#13972 (comment)
ONE-DCO-1.0-Signed-off-by: youngsik kim <[email protected]>
nnpackage/schema/circle_schema.fbs
@@ -1521,7 +1521,13 @@ table InstanceNormOptions {
  fused_activation_function:ActivationFunctionType;
}

enum RoPEMode : int {
  NEOX,
@seanshpark
When expressing this as an enum, should I exclude '0'? The mode in test.recipe is set to 0, and then the option is serialized abnormally:

flatc --json ./res/CircleSchema/0.9/circle_schema.fbs -- ./build/compiler/common-artifacts/RoPE_000.circle

builtin_options_type: "RoPEOptions",
builtin_options: {
}

If the values start from '1', it is normal. Also, is there missing code, or does it structurally not support 0? I thought you said something about this yesterday, but I don't remember. --;
> should I exclude '0'?

I think we can follow what other enums do. Some explicitly assign values like `NEOX = 0,` and some have `NONE = 0` or `DEFAULT = 0`. But as I understand it, we don't need NONE or DEFAULT here.

> and, Is there a missing code? Or does it not support 0 structurally?

I can't catch the problem right now. I'll look into the code...
@seanshpark
Thanks. I checked with circledump, and it is displayed normally:
./build/compiler/circledump/circledump ./build/compiler/common-artifacts/RoPE_000.circle
O(0:0) ROPE
mode(GPT_NEOX)
I T(0:0) ifm
I T(0:1) sin_table
I T(0:2) cos_table
O T(0:3) ofm
ONE-DCO-1.0-Signed-off-by: youngsik kim <[email protected]>
this commit supports the RoPE operation in luci IR ONE-DCO-1.0-Signed-off-by: youngsik kim <[email protected]> draft : Samsung#13978 issue : Samsung#13972
this commit supports RoPE for circle-chef ONE-DCO-1.0-Signed-off-by: youngsik kim <[email protected]> draft : Samsung#13978 issue : Samsung#13972
ONE-DCO-1.0-Signed-off-by: youngsik kim <[email protected]> draft: Samsung#13978 issue: Samsung#13972
This PR is a draft version of the compiler work related to RoPE op fusion.
Please refer to the issue below:
#13972
When I update it in the future, I will proceed according to the rules below.