Enhancement idea
Would it be possible to add `num_beams` and `do_sample` to llama.cpp, to make it easier to steer the sampling and decoding strategy?
For example, to get greedy decoding:

- Setting `temperature` to 0 makes the model deterministic by always concentrating on the most likely token. However, this setting alone does not control the overall decoding strategy.
- Setting `num_beams` to 1 ensures the model does not use beam search, a strategy that explores multiple candidate sequences to find the most probable one.
- Setting `do_sample` to False ensures the model does not use sampling methods such as multinomial sampling, which introduce randomness into token selection.
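The distinction between the two modes can be sketched in a few lines of plain Python (a minimal illustration of the concepts, not llama.cpp code; the function names here are hypothetical):

```python
import math
import random

def greedy_pick(logits):
    # do_sample=False behaviour: always return the argmax token, fully deterministic.
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_pick(logits, temperature=1.0, rng=random):
    # do_sample=True behaviour: multinomial sampling over the softmax distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1
```

Greedy picking always returns the same token for the same logits, while sampling can return any token with nonzero probability, which is why an explicit `do_sample=False` switch is a cleaner guarantee of determinism than relying on a low temperature.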
Currently, llama.cpp has no native support for these parameters: `Error: unknown parameter`
Please see https://huggingface.co/docs/transformers/main_classes/text_generation#transformers.GenerationConfig
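For reference, this is how the linked Transformers API expresses greedy decoding (requires the `transformers` package; shown only to illustrate the parameters this request is asking for):

```python
from transformers import GenerationConfig

# Greedy decoding in Hugging Face Transformers:
# no beam search, no multinomial sampling.
greedy_cfg = GenerationConfig(
    num_beams=1,     # disable beam search
    do_sample=False, # disable sampling -> deterministic argmax decoding
)
```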
What do you think about this idea (before making it an Enhancement in Issues)?