Trace model in model-explorer #254

Open
nigelzzzzzzz opened this issue Sep 24, 2024 · 13 comments
Comments

@nigelzzzzzzz

Description of the bug:

Hi @pkgoogle,
I have some questions about the computation graph of TinyLlama.

  • I can't see the rotary position encoding in the computation graph; I can only see the token embedding.
    image
  • I see a lot of StableHLO composite ops. How do they work? I know op fusion is an optimization method, but why were the ops below chosen for fusion?
    image
  • Why is the number of StableHLO composite ops 22?
  • What is the graph input?
    image

Actual vs expected behavior:

No response

Any other information you'd like to share?

No response

@pkgoogle pkgoogle self-assigned this Sep 24, 2024
@pkgoogle
Contributor

Hi @nigelzzzzzzz, I don't think I can explain it better than the code does, but effectively a model is equivalent to a program. Just as compilers convert programs to binary, sometimes by way of intermediate representations, model conversion goes through intermediate representations too. Here they are expressed in MLIR; StableHLO is a dialect of MLIR, and so is VHLO. During conversion there are sometimes opportunities to optimize the model/program/graph. It would take a long time to explain all of this, and I do not know every piece myself, so you might want to dig into the code to see how it fits together.

@pkgoogle pkgoogle added status:awaiting user response When awaiting user response type:support For use-related issues and removed type:bug Bug labels Sep 24, 2024

github-actions bot commented Oct 2, 2024

Marking this issue as stale since it has been open for 7 days with no activity. This issue will be closed if no further activity occurs.

@nigelzzzzzzz
Author

Hi @pkgoogle, thanks for your reply. I have checked the source code.
After the QKV projection, the result should get the RoPE cache via gather_nd, but when I check the op information, it always shows 0.
image
When I use toysinglemodel, however, it shows the correct values.
image

@pkgoogle
Contributor

pkgoogle commented Oct 3, 2024

Hi @nigelzzzzzzz, I don't understand what you mean by

it always shows 0

for the first screenshot... can you perhaps highlight what you mean or add more context there?

@nigelzzzzzzz
Author

Hi @pkgoogle,
OK, I got it. Let me try to describe my question.
As I understand it, the RoPE stage works like this:

  • get the RoPE cache from the graph input,
  • use gather_nd to get the sin and cos values,
  • then, after the QKV projection in the transformer block, apply the position embedding.

I think the gather_nd parameters should contain sin or cos values, because in the second picture they do.
See the red box for reference.
Screenshot from 2024-10-04 11-58-04
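
To make the three stages described above concrete, here is a minimal pure-Python sketch of one common RoPE formulation. It is not taken from the ai-edge-torch source: the function names (`build_rope_cache`, `gather_rows`, `apply_rope`), the base of 10000, and the split-half pairing of dimensions are all assumptions for illustration, and the converted TFLite graph may lay things out differently.

```python
import math

def build_rope_cache(max_positions, head_dim, base=10000.0):
    """Precompute cos/sin tables: one row per position m, one column per frequency."""
    half = head_dim // 2
    # Assumed frequency schedule: theta_i = base ** (-2i / head_dim)
    inv_freq = [base ** (-2.0 * i / head_dim) for i in range(half)]
    cos = [[math.cos(m * f) for f in inv_freq] for m in range(max_positions)]
    sin = [[math.sin(m * f) for f in inv_freq] for m in range(max_positions)]
    return cos, sin

def gather_rows(table, positions):
    """Stand-in for the gather step: select the table rows for the given positions."""
    return [table[p] for p in positions]

def apply_rope(x, cos_row, sin_row):
    """Rotate one head vector x, pairing x[i] with x[i + half] (assumed convention)."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    out1 = [a * c - b * s for a, b, c, s in zip(x1, x2, cos_row, sin_row)]
    out2 = [a * s + b * c for a, b, c, s in zip(x1, x2, cos_row, sin_row)]
    return out1 + out2

# Stage 1: build the cache (in a converted graph this would arrive as a constant or input).
cos_table, sin_table = build_rope_cache(max_positions=8, head_dim=4)

# Stage 2: gather the rows for the token positions being processed.
positions = [0, 3]
cos_rows = gather_rows(cos_table, positions)
sin_rows = gather_rows(sin_table, positions)

# Stage 3: apply the rotation to a query vector (after the QKV projection).
q = [1.0, 2.0, 3.0, 4.0]
q_rotated = [apply_rope(q, c, s) for c, s in zip(cos_rows, sin_rows)]
```

In this sketch the gathered rows hold cos and sin values, so for any position other than 0 they are generally neither all zeros nor all ones. At position 0 the rotation is the identity, so `q_rotated[0]` equals `q`.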

@pkgoogle pkgoogle removed the status:awaiting user response When awaiting user response label Oct 7, 2024
@pkgoogle
Contributor

pkgoogle commented Oct 7, 2024

Hi @nigelzzzzzzz, if I'm understanding you correctly, you're saying that the params values = [[ 1, 1, 1 ... 1]] is malformed, and you were expecting something more like [[0, 0],[0.8,0],[0.9,0.1],...]. Is your issue with the values, the shape, or both?

I'm running into issues visualizing the converted tiny_llama model, can you maybe give me your reproduction steps for producing the first visualization?

@pkgoogle pkgoogle added the status:awaiting user response When awaiting user response label Oct 7, 2024
@nigelzzzzzzz
Author

Hi @pkgoogle,
I just followed the steps from [https://github.com/google-ai-edge/model-explorer]

$ pip install ai-edge-model-explorer
$ model-explorer

I expect the values to look like sin or cos values, because the model does the following:
get token -> generate vector -> apply RoPE embedding
The q value is then multiplied by the RoPE matrix, as shown below.
image

So I think the gather_nd values should look like the cos mθ and sin mθ matrices.
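
For reference, the per-pair rotation that the cos mθ and sin mθ values feed into can be written as follows. This is the standard RoPE formulation, not something read off the converted graph; the base of 10000 and the pair notation x_i^{(1)}, x_i^{(2)} are assumptions, and implementations differ in which two dimensions form a pair.

```latex
\begin{pmatrix} x_i^{(1)\prime} \\ x_i^{(2)\prime} \end{pmatrix}
=
\begin{pmatrix}
\cos m\theta_i & -\sin m\theta_i \\
\sin m\theta_i & \cos m\theta_i
\end{pmatrix}
\begin{pmatrix} x_i^{(1)} \\ x_i^{(2)} \end{pmatrix},
\qquad
\theta_i = 10000^{-2i/d},
\quad i = 0, \dots, \tfrac{d}{2} - 1

% Special case m = 0: \cos 0 = 1 and \sin 0 = 0, so the matrix is the identity.
```

One consequence worth keeping in mind when reading gathered values: at m = 0 every cos entry is 1 and every sin entry is 0, so an all-ones cos row or an all-zeros sin row is exactly what position 0 would produce. Whether that is what the viewer is displaying here has not been confirmed in this thread.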

@pkgoogle pkgoogle removed the status:awaiting user response When awaiting user response label Oct 10, 2024
@pkgoogle
Contributor

It may be how I'm creating the tiny_llama model, which was with nightly... can you tell me which commit (the top commit in git log) you used to create the tiny_llama model?

@pkgoogle pkgoogle added the status:awaiting user response When awaiting user response label Oct 10, 2024

Marking this issue as stale since it has been open for 7 days with no activity. This issue will be closed if no further activity occurs.

@nigelzzzzzzz
Author

@pkgoogle, I use the master branch; sometimes the nightly doesn't work correctly.

@pkgoogle pkgoogle removed status:awaiting user response When awaiting user response status:stale labels Oct 18, 2024
@nigelzzzzzzz
Author

Hi @pkgoogle,

I am using the commit below, for your reference. Thanks.

commit b076be8c44041292b13a4b6a791465e3542f80d0 (HEAD -> main, origin/main, origin/HEAD)
Author: Google AI Edge <[email protected]>
Date:   Tue Oct 1 20:28:07 2024 -0700

    Stable diffusion pipeline updates to use int32 and float32.
    
    PiperOrigin-RevId: 681267676

@pkgoogle
Contributor

pkgoogle commented Oct 22, 2024

I was able to replicate this with the model produced from that commit. I haven't verified whether the value you see is degenerate.

@nigelzzzzzzz
Author

Hi @pkgoogle, thanks for your reply.

I actually know how RoPE works, but I need to double-check it against the computation graph. I am still confused about the RoPE operation in the TFLite computation graph. Can you tell me how the op works in the graph? Thanks.
