Replies: 1 comment
Does that even work with LLMs?
A more realistic figure would be about 0.3 tokens per second (my estimate), or even less. So there is probably no benefit to having Coral acceleration support... :(
Possible integration of the Google Coral acceleration card.