
Running against local model service #1688

Open
drisspg opened this issue Jun 23, 2023 · 4 comments
Labels
additions (New models or scenarios), competition (Support for the NeurIPS Large Language Model Efficiency Challenge), enhancement (New feature or request), models, p2 (Priority 2: Good to have for release)

Comments

@drisspg
Collaborator

drisspg commented Jun 23, 2023

For the NeurIPS competition we would like to be able to run HELM against a model running local to the machine, i.e., run HELM in one container and have it send requests to another container. I started implementing a new local HTTP client following the "adding a model" README.

However, I see that there already appears to be something pretty similar here:

class ServerService(Service):

Do you think this would fit our needs? How exactly does one use this with helm-run?
I would imagine it to be something like:

RUN echo 'entries: [{description: "mmlu:subject=philosophy,model=local_model", priority: 1}]' > run_specs.conf

helm-run --conf-paths run_specs.conf --suite v1 --max-eval-instances 10 --local-path="http://localhost"
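For concreteness, a minimal sketch of what such a local model service could look like, with the three endpoints discussed later in this thread. The endpoint paths, JSON shapes, and the toy tokenizer are all assumptions for illustration, not HELM's actual schema:

```python
# Hypothetical local model service sketch. The routes, request/response
# shapes, and the toy whitespace tokenizer are illustrative assumptions,
# not HELM's real API.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def tokenize(text):
    # Toy whitespace tokenizer standing in for a real model tokenizer.
    return text.split()


def decode(tokens):
    # Inverse of the toy tokenizer above.
    return " ".join(tokens)


def make_request(prompt):
    # Placeholder "model": a real service would run local inference here.
    return {"completions": [{"text": prompt.upper()}]}


ROUTES = {
    "/tokenize": lambda body: {"tokens": tokenize(body["text"])},
    "/decode": lambda body: {"text": decode(body["tokens"])},
    "/make_request": lambda body: make_request(body["prompt"]),
}


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        body = json.loads(self.rfile.read(length))
        handler = ROUTES.get(self.path)
        if handler is None:
            self.send_error(404)
            return
        payload = json.dumps(handler(body)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```

A container running this would answer the requests a local HTTP client sends during `helm-run`.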
@msaroufim
Collaborator

msaroufim commented Jun 23, 2023

cc @yifanmai - do you mind granting us label permissions so we can tag features as competition?

@msaroufim msaroufim added the competition Support for the NeurIPS Large Language Model Efficiency Challenge label Jun 24, 2023
@yifanmai
Collaborator

Granted permissions.

ServerService is probably too heavyweight. It basically runs a full "playground API" server (which we also call a "proxy" server elsewhere in the code), which has a lot of extra functionality that you don't need (e.g. user authentication, user quotas).

I think you would need to create a new kind of server and client that's basically a subset of the full playground API:

  • Make a new HTTP server library with make_request(), tokenize() and decode() endpoints, similar to the full playground server.
  • Make a new Client subclass for which the endpoint URL(s) is configurable, either by flag or environment variable, that only implements make_request(), tokenize() and decode(), similar to the full playground client.

I also think that the current abstractions would make it difficult to reuse code, so it might be better to make these separate implementations.

I'm not sure what the right names of these clients and servers would be... possibly CompetitionClient, or NativeClient (because it uses "native" CRFM JSON schemas).
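The client side of the split proposed above could be sketched roughly like this, with the endpoint URL configurable via an environment variable. The class name, env var name, paths, and JSON shapes are assumptions, not the implementation that was eventually merged:

```python
# Hypothetical client sketch: only make_request/tokenize/decode, with a
# configurable endpoint URL. All names and JSON shapes are illustrative.
import json
import os
import urllib.request


class LocalHTTPClient:
    def __init__(self, base_url=None):
        # Fall back to an env var so the URL is configurable per run.
        self.base_url = base_url or os.environ.get(
            "LOCAL_MODEL_URL", "http://localhost:8080"
        )

    def _post(self, path, body):
        # POST a JSON body and parse the JSON response.
        req = urllib.request.Request(
            self.base_url + path,
            data=json.dumps(body).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())

    def make_request(self, prompt):
        return self._post("/make_request", {"prompt": prompt})

    def tokenize(self, text):
        return self._post("/tokenize", {"text": text})

    def decode(self, tokens):
        return self._post("/decode", {"tokens": tokens})
```

Keeping this separate from the playground client, as suggested, avoids dragging in authentication and quota machinery the competition setup doesn't need.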

@yifanmai yifanmai added the p2 (Priority 2: Good to have for release), models, additions (New models or scenarios), and enhancement (New feature or request) labels Jun 28, 2023
@timothylimyl
Contributor

This feels like a feature that should be integrated into HELM. I think most people will want to set up their own local models to run HELM evaluation.

@drisspg
Collaborator Author

drisspg commented Sep 2, 2023

This landed in #1693; more changes are incoming to make this solution more generic.

4 participants