
[Feature Request] Per request sampling params #185

Open
qihqi opened this issue Sep 24, 2024 · 3 comments

qihqi (Collaborator) commented Sep 24, 2024

Currently, sampling params such as temperature are set as command-line flags when the server starts.

It would be nice if each request could pass in its own sampling params instead.
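
As a rough illustration only (field and type names here are hypothetical, not this project's actual API), per-request sampling could look something like this:

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical per-request sampling settings; today these values come from
# server-level command-line flags set once at startup.
@dataclass
class SamplingParams:
    temperature: float = 1.0              # 0.0 would mean greedy decoding
    top_k: Optional[int] = None
    top_p: Optional[float] = None
    max_output_len: Optional[int] = None  # per-request output cap

# Hypothetical request shape: each request carries its own sampling params.
@dataclass
class DecodeRequest:
    prompt: str
    sampling: SamplingParams = field(default_factory=SamplingParams)

# One request can decode greedily while another samples at temperature 0.8.
req_a = DecodeRequest(prompt="...", sampling=SamplingParams(temperature=0.0))
req_b = DecodeRequest(prompt="...", sampling=SamplingParams(temperature=0.8, top_p=0.95))
```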


kiratp commented Sep 27, 2024

Beyond sampling parameters, the following would be very helpful (a rough request-side sketch follows the list):

  1. Prompt token counts: Makes it easier to potentially trim the next request
  2. logprobs - Extremely useful for scenarios like LLM-as-judge or similar
  3. seed - getting deterministic responses is pretty useful during development and for certain use cases in end-user applications
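
A minimal sketch of how items 2 and 3 might extend the hypothetical per-request params above (names are illustrative only, not a proposed protocol):

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical extension of the per-request params sketched earlier:
# a flag to return logprobs and an optional seed for deterministic decoding.
@dataclass
class SamplingParams:
    temperature: float = 1.0
    top_k: Optional[int] = None
    top_p: Optional[float] = None
    max_output_len: Optional[int] = None
    return_logprobs: bool = False   # item 2: include per-token logprobs in the response
    seed: Optional[int] = None      # item 3: fixed seed -> reproducible sampling
```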

qihqi (Collaborator, Author) commented Oct 2, 2024

Hi @kiratp, a few questions:

On 1: would prompt_token_counts behave the same as a per-request max_output_len?
On 2: is logprobs a boolean input arg signifying that the logprobs should be returned in the response protocol buffer?
On 3: would this be a global seed passed as a command-line argument when starting the server, since the seed itself is global in torch?


kiratp commented Oct 3, 2024

  1. Yes. The idea would be to get the actual token counts for the prompt and the completion (something like this: https://platform.openai.com/docs/api-reference/making-requests); a rough response-side sketch follows this list.
  2. Yes
  3. That is fine
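
For illustration only, a response carrying the token counts and logprobs could be shaped roughly like the OpenAI-style usage block linked above (field names and values here are assumptions, not this server's protocol):

```python
# Illustrative response shape only; field names and numbers are made up.
response = {
    "text": "…generated completion…",
    "usage": {
        "prompt_tokens": 57,        # item 1: actual prompt token count
        "completion_tokens": 40,    # helps the client trim the next request
        "total_tokens": 97,
    },
    # item 2: per-token log-probabilities, e.g. for LLM-as-judge scoring
    "logprobs": [-0.12, -1.03, -0.45],
}
```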
