Make BBQ prompts identical to HELM's version #39
Conversation
In HELM this is the most common type of prompt. It is also how BBQ works, so I'll need it when fleshing out that Test.
Before this change, the difference was in how they sampled in-context learning examples. I've updated that to match HELM.
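For context, here is a minimal sketch of the kind of seeded, deterministic sampling being ported; the function and parameter names are illustrative, not the actual HELM/NewHELM API:

```python
import random
from typing import List, TypeVar

T = TypeVar("T")

def sample_in_context_examples(
    train_instances: List[T],
    num_examples: int,
    train_trial_index: int = 0,
) -> List[T]:
    """Deterministically pick in-context learning examples.

    Seeding the RNG with the trial index means any framework running this
    same logic selects the same training examples for the same trial.
    """
    rng = random.Random(train_trial_index)
    # Sample without replacement, capped at the number of available instances.
    return rng.sample(train_instances, min(num_examples, len(train_instances)))
```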
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
Should have mentioned this earlier, but an alternative to copying and pasting the sampling algorithm would be to hardcode the indexes of the sampled training examples.
Also, the sampled test items will still be different, right? I think it's sufficient to get close enough to BBQ, without needing to reproduce it exactly.
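(If it helps, the hardcoding idea could look something like this; the indexes below are placeholders, not the ones HELM actually samples.)

```python
from typing import List, TypeVar

T = TypeVar("T")

# Placeholder indexes: the real values would be whatever HELM's sampler picks.
BBQ_TRAIN_EXAMPLE_INDEXES = [12, 87, 140, 305, 422]

def fixed_in_context_examples(train_instances: List[T]) -> List[T]:
    """Return the pinned training examples instead of re-running the sampler."""
    return [train_instances[i] for i in BBQ_TRAIN_EXAMPLE_INDEXES]
```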
The test items are still the same as long as `max_eval_instances` is 1000 or more. In that situation, HELM does no sampling or shuffling of the eval instances.
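Roughly this behavior, sketched out (not HELM's actual code):

```python
import random
from typing import List, TypeVar

T = TypeVar("T")

def select_eval_instances(
    instances: List[T],
    max_eval_instances: int,
    seed: int = 0,
) -> List[T]:
    """Pick which eval instances to run."""
    # If the cap covers the whole split, every instance is used in its
    # original order, so two frameworks see identical test items.
    if len(instances) <= max_eval_instances:
        return instances
    # Otherwise a seeded subsample is taken, and results could diverge
    # unless both frameworks share the same seed and sampling code.
    return random.Random(seed).sample(instances, max_eval_instances)
```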
In PR #33 I mentioned that the only remaining difference was how training examples get sampled. In this PR I'm porting HELM's logic for that sampling.
My goal is to get NewHELM to produce the exact same values for the BBQ stats when using GPT2, as a way to ensure we have a fully functioning replacement.