
Tiny-llama Encountered unresolved custom op: odml.update_kv_cache #175

Open · vignesh-spericorn opened this issue Aug 28, 2024 · 3 comments

vignesh-spericorn commented Aug 28, 2024

Description of the bug:

I converted the TinyLlama model using convert_to_tflite.py; the converted model is named tiny_llama_seq512_kv1024.tflite.

I tried to run inference using the following code:

import tflite_runtime.interpreter as tflite
from transformers import AutoTokenizer

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("tiny-llama")

# Input text
input_text = "write a poem about sun in 4 lines"

# Tokenize the input text and convert it to tensor format
input_tokens = tokenizer.encode(input_text, return_tensors='np')  # Returns numpy array

# Load the TFLite model
model_path = "output/tiny_llama_seq512_kv1024.tflite"
interpreter = tflite.InterpreterWithCustomOps(model_path=model_path)
interpreter.allocate_tensors()  # fails here with the error below

I got the following error:

RuntimeError: Encountered unresolved custom op: odml.update_kv_cache.
See instructions: https://www.tensorflow.org/lite/guide/ops_custom
Node number 49 (odml.update_kv_cache) failed to prepare.
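
For reference, I could confirm the custom op is actually baked into the converted model with the TFLite model analyzer. A minimal sketch, assuming the full tf_nightly package from the versions below (the analyzer is not part of tflite-runtime):

# Print the op list of the converted model; it should include a custom op
# named odml.update_kv_cache.
import tensorflow as tf

tf.lite.experimental.Analyzer.analyze(
    model_path="output/tiny_llama_seq512_kv1024.tflite"
)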

Versions
Python 3.11.9
tf_nightly==2.18.0.dev20240826
tflite-runtime==2.14.0
tflite-runtime-nightly==2.18.0.dev20240826
tokenizers==0.19.1
torch==2.4.0
torch-xla==2.4.0
transformers==4.44.2

Actual vs expected behavior:

No response

Any other information you'd like to share?

No response

@haozha111 (Contributor) commented
Hi,

Can you use our C++ example or the LLM Inference API to run model inference? The error indicates that a custom op (the KV cache update) is missing, so the interpreter fails to prepare. We can't link those custom ops in Python yet, but you can refer to this for how to run inference:
https://github.com/google-ai-edge/ai-edge-torch/tree/main/ai_edge_torch/generative#end-to-end-inference-pipeline
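
For context, the Python interpreter can only resolve custom ops if it is handed a registerer exported by a compiled native library. A sketch of what that would look like once such a library exists; the registerer name below is hypothetical, not a shipped API:

# Sketch only: assumes a hypothetical native library that links the ODML
# custom kernels and exports a registerer function. No such Python package
# is shipped today, which is why the error above occurs.
import tflite_runtime.interpreter as tflite

interpreter = tflite.InterpreterWithCustomOps(
    model_path="output/tiny_llama_seq512_kv1024.tflite",
    # Name of a registerer exported by the native library (hypothetical).
    custom_op_registerers=["TF_RegisterODMLOps"],
)
interpreter.allocate_tensors()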

@vignesh-spericorn (Author) commented

Thanks, I'll try this. But can we expect a Python implementation of the custom ops soon?

@haozha111 (Contributor) commented

> Thanks, I'll try this. But can we expect a Python implementation of the custom ops soon?

Yes, we are working on it. @majiddadashi FYI
