
Can someone explain what ggml_backend_sched_t does? #10182

Answered by slaren
Zijie-Tian asked this question in Q&A
  1. I am not sure I can add much to what is already written in the comment. If you are wondering why that is necessary, keep in mind that the scheduler does not search algorithmically for the optimal way to split the graph, because that would be far too expensive. Instead, it relies on a set of heuristics that work well in the common use cases. So the split is done the way it is because it solved some specific problem that was found during development.
  2. The copies are performed during ggml_backend_sched_compute_splits. There may be multiple copies of each tensor because that improves efficiency when using pipeline parallelism.
  3. reserve is only intended t…
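
To make the "heuristics instead of an optimal search" point from item 1 concrete, here is a toy sketch of a greedy, single-pass split. It is not ggml code and not the scheduler's actual heuristic: `toy_split` and `toy_make_splits` are hypothetical names made up for illustration, and the input is assumed to be a list of nodes that each already have a backend assigned. A new split is started whenever the backend changes from one node to the next, which is cheap (one pass) but makes no attempt to find the best possible partition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy type: this is NOT a ggml structure. */
typedef struct {
    int    backend_id; /* backend this split runs on */
    size_t start;      /* index of the first node in the split */
    size_t end;        /* one past the last node in the split */
} toy_split;

/*
 * Greedy single pass over the nodes: open a new split whenever the
 * backend assigned to the current node differs from the backend of
 * the split being built. Returns the number of splits written to out.
 */
static size_t toy_make_splits(const int *node_backend, size_t n_nodes,
                              toy_split *out, size_t max_splits) {
    size_t n_splits = 0;
    for (size_t i = 0; i < n_nodes; i++) {
        if (n_splits == 0 || out[n_splits - 1].backend_id != node_backend[i]) {
            /* Backend changed (or first node): start a new split. */
            assert(n_splits < max_splits);
            out[n_splits].backend_id = node_backend[i];
            out[n_splits].start      = i;
            out[n_splits].end        = i + 1;
            n_splits++;
        } else {
            /* Same backend as the current split: extend it. */
            out[n_splits - 1].end = i + 1;
        }
    }
    return n_splits;
}
```

In this toy model, a node sequence assigned to backends 0, 0, 1, 1, 1, 0 yields three splits. Each boundary between splits is where a real scheduler would need to copy input tensors from one backend to another.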

Answer selected by Zijie-Tian