Updating make-me-say to be compatible with Solvers #1546

lennart-finke · 2024-08-18T10:15:07Z

This PR refactors make-me-say to be compatible with the Solvers API.

The eval now takes one solver (the con artist) and two completion functions (the mark and the summary model) instead of three completion functions. ~~We still assume gpt-4-32k and gpt-3.5-turbo-16k as defaults, as #1530 is not yet merged.~~ (Edit: A reviewer suggested using gpt-4o-mini as the default instead and changing the registry ourselves.)
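For readers unfamiliar with the split, here is a rough sketch of the new shape: one solver for the con artist, plus two plain completion functions for the mark and the summary model. Every name in it (`Solver`, `ScriptedSolver`, `MakeMeSaySetup`, `play_round`, the `CompletionFn` alias) is a hypothetical stand-in for illustration, not the actual classes or signatures from the evals library or from this PR.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-ins; these are NOT the real evals types.
Message = Dict[str, str]
CompletionFn = Callable[[List[Message]], str]


class Solver:
    """Illustrative solver interface: maps a conversation to the next reply."""

    def solve(self, messages: List[Message]) -> str:
        raise NotImplementedError


class ScriptedSolver(Solver):
    """Toy solver that always returns the same reply."""

    def __init__(self, reply: str) -> None:
        self.reply = reply

    def solve(self, messages: List[Message]) -> str:
        return self.reply


@dataclass
class MakeMeSaySetup:
    con_artist: Solver      # now a solver
    mark: CompletionFn      # still a completion function
    summary: CompletionFn   # still a completion function

    def play_round(self, history: List[Message]) -> List[Message]:
        # The con artist speaks first, then the mark answers.
        history = history + [
            {"role": "con_artist", "content": self.con_artist.solve(history)}
        ]
        history = history + [{"role": "mark", "content": self.mark(history)}]
        return history


setup = MakeMeSaySetup(
    con_artist=ScriptedSolver("What is your favourite tree?"),
    mark=lambda messages: "Probably the oak.",
    summary=lambda messages: f"{len(messages)} messages exchanged",
)
conversation = setup.play_round([])
print(setup.summary(conversation))  # prints "2 messages exchanged"
```

The point of the sketch is only the asymmetry: the con artist can carry solver-style behaviour, while the other two roles remain simple prompt-to-completion callables.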

Submission agreement

By contributing to Evals, you are agreeing to make your evaluation logic and data under the same MIT license as this repository. You must have adequate rights to upload any data used in an Eval. OpenAI reserves the right to use this data in future service improvements to our product. Contributions to OpenAI Evals will be subject to our usual Usage Policies (https://platform.openai.com/docs/usage-policies).

I agree that my submission will be made available under an MIT license and complies with OpenAI's usage policies.

.gitignore

evals/elsuite/make_me_say/autoeval.py

evals/registry/solvers/make-me-say.yaml

evals/registry/evals/make-me-say.yaml

lennart-finke · 2024-08-22T14:55:38Z

Thanks for the comments @danesherbs! Addressed them and reran tests, ready for further review or merge.

lennart-finke added 4 commits July 30, 2024 15:48

Refactored abstractions to conform to Solvers

5f876e4

Greening tests by conforming to refactor

f673103

Reformated prompts, added token estimate

30f6e2e

Clean up readme

5f595b5

lennart-finke requested review from andrew-openai, etr2460 and katyhshi as code owners August 18, 2024 10:15

danesherbs reviewed Aug 22, 2024

.gitignore (resolved)

danesherbs reviewed Aug 22, 2024

evals/elsuite/make_me_say/autoeval.py (outdated, resolved)

danesherbs reviewed Aug 22, 2024

evals/registry/solvers/make-me-say.yaml (outdated, resolved)

danesherbs reviewed Aug 22, 2024

evals/registry/evals/make-me-say.yaml (outdated, resolved)

Addressed comments for PR to main

950b2f3

lennart-finke requested a review from danesherbs August 28, 2024 19:27
