text_generator_main.cc using the tinyllama model for inference can show garbled characters #109
Comments
Hi @nigelzzz, can you please provide more information so that we can reproduce it? For example, which version of Python are you using? Which branch are you using? Please also provide reproduction steps, like: python convert_to_tflite.py
<whatever commands you used to run the model> Thanks!
Hi @pkgoogle, I used /ai-edge-torch/ai_edge_torch/generative/examples/tiny_llama/convert_to_tflite.py:

```
python3 convert_to_tflite.py
```

Then I built text_generator_main.cc with the following modification:

```diff
 // Prepare helpers
 std::unique_ptr<tflite::FlatBufferModel> LoadModel() {
   std::unique_ptr<tflite::FlatBufferModel> model =
@@ -85,7 +93,13 @@ std::unique_ptr<tflite::Interpreter> BuildInterpreter(
   tflite::ops::builtin::BuiltinOpResolver resolver;
   // NOTE: We need to manually register optimized OPs for KV-cache and
   // Scaled Dot Product Attention (SDPA).
-  tflite::ops::custom::GenAIOpsRegisterer(&resolver);
+  resolver.AddCustom("odml.update_kv_cache",
+                     tflite::ops::custom::Register_KV_CACHE());
+  resolver.AddCustom("odml.scaled_dot_product_attention",
+                     tflite::ops::custom::Register_SDPA());
+
+  // tflite::ops::custom::GenAIOpsRegisterer(&resolver);
```
@pkgoogle, because I can't see the file in the llama Hugging Face repo.
@haozha111, thank you very much!!!
Hi @nigelzzz, which checkpoint data are you using from the original tiny_llama model? Thanks for your help.
@pkgoogle,
@pkgoogle, Thanks!!
Hi @nigelzzz, @hheydary is currently assigned to this case. I would first check whether you still get the same result after removing your modifications. If not, then you know it has something to do with your update. If so: you said it "can show" garbled characters, so does this happen often, or only once in a while? If it happens only in particular instances, that will be good data to share with us. If it happens all the time, it should show up in the loss when validating on a known dataset. Those would be good places to start. Hope that helps.
Hi @nigelzzz,
i.e., `<|user|> \n PROMPT \n <|assistant|>`.
Hi @hheydary and @pkgoogle,
Unfortunately, I am not able to reproduce the issue that you are seeing. Using the following command:
The model generates reasonable outputs. A few things:
@hheydary,
@hheydary,
in 0.2.0
@hheydary,
@pkgoogle @hheydary @haozha111,
Description of the bug:
*.tflite (no quantization): tiny_llama_seq512_kv1024.tflite; the output is garbled.

Actual vs expected behavior:
No response
Any other information you'd like to share?
No response