
RAR-b results #7

Merged: 4 commits into main on Jul 11, 2024
Conversation

@gowitheflow-1998 (Contributor) commented on Jul 10, 2024

Uploading RAR-b results (10 models for now).

All open-source models were rerun with MTEB; I can confirm the results match my own exactly. The API models' results come from my own implementation, reformatted into the MTEB format.

I have currently differentiated the without- and with-instruction settings at the task level (e.g., RARbMath vs. RARbMathInstruct), but I'm happy to reformat them soon to differentiate at the model level, following the discussion in issue #1066, once we decide on it!

Notes:
1. Results for all-mpnet-base-v2, all-MiniLM-L6-v2, and GritLM-7B were already in the repo. For the first two, the existing results match my without-instruction runs, so I only added the f{Task}Instruct.json files (so no conflicts). For GritLM-7B, the existing f{Task}.json matches my f{Task}Instruct.json and would cause conflicts, so I added a temporary folder results/GritLM-7B/13f00a0e36500c80ce12870ea513846a066004af-temp to hold the results for now, just for the leaderboard.
2. I have put 0.0 as the evaluation time for the API models, since I don't have those numbers (see the sketch after this list).
3. dragon-plus is asymmetric: dragon-plus-query-encoder and dragon-plus-context-encoder are different models (and therefore have different model revisions), so for now I have named the folder dragon-plus with no model revision. Any ideas for storing results of asymmetric models in general?
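For illustration, a minimal sketch of writing such a placeholder result file: the field names and nesting are assumptions based on typical MTEB result files (not verified against this repo), and text-embedding-ada-002 is a hypothetical API-model folder name.

```python
# Hypothetical sketch: an MTEB-style result file for an API model with a
# placeholder evaluation time. Field names/nesting are assumed from typical
# MTEB output and may not match this repo exactly.
import json
from pathlib import Path

result = {
    "dataset_revision": None,      # often unknown for API models
    "mteb_version": "1.12.0",      # placeholder version string
    "test": {"ndcg_at_10": 0.0},   # actual scores would go here
    "evaluation_time": 0.0,        # placeholder: no timing available
}

out = Path("results/text-embedding-ada-002/revision_id/RARbMathInstruct.json")
out.parent.mkdir(parents=True, exist_ok=True)
out.write_text(json.dumps(result, indent=2))
```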


Updates:

I have reformatted the results into model-level differentiation according to issue #1066, while keeping the same task name for both settings:

e.g., all-mpnet-base-v2/revision_id/RARbCode and all-mpnet-base-v2-instruct/revision_id/RARbCode; GritLM-7B/revision_id/RARbCode and GritLM-7B-noinstruct/revision_id/RARbCode.
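As a rough illustration of how results end up under this model-level layout, here is a minimal sketch using MTEB's standard Python API; the output_folder path and the revision_id placeholder are assumptions for the example, not the exact commands used for this PR.

```python
# Minimal sketch of producing results under the model-level layout above.
# Assumes mteb and sentence-transformers are installed; the task names
# (RARbCode, RARbMath) follow this PR, and the output path is illustrative.
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

evaluation = MTEB(tasks=["RARbCode", "RARbMath"])
# Result files land as <output_folder>/<Task>.json, matching
# all-mpnet-base-v2/revision_id/RARbCode in the naming scheme above.
evaluation.run(model, output_folder="results/all-mpnet-base-v2/revision_id")
```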

@gowitheflow-1998 (Contributor, PR author)

Hi @orionw, can you review this? 🙌

@orionw (Contributor) left a comment


LGTM! Responding to the PR comments:

Is there any issue with duplicates? I am curious how this will work when there are duplicate results with different model revisions, but that may be a PR another time. It looks like you crossed it out, so it's okay now?

> 3. dragon-plus is asymmetric: dragon-plus-query-encoder and dragon-plus-context-encoder are different models (and therefore have different model revisions), so for now I have named the folder dragon-plus with no model revision. Any ideas for storing results of asymmetric models in general?

This seems great, DPR is similar.

@orionw (Contributor) commented on Jul 11, 2024

Thanks for starting on the instruction-naming process! It's a good reminder that we're gonna need a big PR to rename all the instruction models; I'll make an issue.

@gowitheflow-1998 (Contributor, PR author)

Thanks for the review and opening the issue!

> Is there any issue with duplicates? I am curious how this will work when there are duplicate results with different model revisions, but that may be a PR another time. It looks like you crossed it out, so it's okay now?

No duplicates at the moment. I think as long as the result files are under different model revision folders, it'll be fine?
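For example, a hypothetical sanity check along those lines, assuming the results/<model>/<revision>/<Task>.json layout used in this PR:

```python
# Hypothetical sanity check: flag any task whose results appear under more
# than one revision folder of the same model. Assumes the layout
# results/<model>/<revision>/<Task>.json described in this PR.
from collections import defaultdict
from pathlib import Path

seen = defaultdict(list)
for task_file in Path("results").glob("*/*/*.json"):
    model, revision = task_file.parts[-3], task_file.parts[-2]
    seen[(model, task_file.stem)].append(revision)

for (model, task), revisions in sorted(seen.items()):
    if len(revisions) > 1:
        print(f"{model}/{task} duplicated across revisions: {revisions}")
```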

@orionw merged commit 9b6b7ba into main on Jul 11, 2024. 1 check passed.
@orionw (Contributor) commented on Jul 11, 2024

perfect, thanks!!

@Muennighoff (Contributor)

> It's a good reminder that we're gonna need a big PR to rename all the instruction models; I'll make an issue.

Why do we need to rename them all? I thought the naming would be:

- Default way the model is intended to be used: model_name
- If the model is used in a non-default way: with instruction, add -instruct; without instruction, add -noinstruct

But if you prefer not leaving the default names as is, we can also consider renaming.

On that note: why does this PR have bge-base-en-v1.5-instruct? Isn't using the retrieval instruction the default way for bge-base, such that the existing results in bge-base-en-v1.5 are already with instruction? Sorry, maybe I am missing something.

@orionw (Contributor) commented on Jul 12, 2024

Ah thought we were on the same page, sorry. Let me make a comment on the other issue to keep things in one place.

@gowitheflow-1998 (Contributor, PR author)

> On that note: why does this PR have bge-base-en-v1.5-instruct? Isn't using the retrieval instruction the default way for bge-base, such that the existing results in bge-base-en-v1.5 are already with instruction? Sorry, maybe I am missing something.

I think, as we discussed in the issue, the BGE implementation uses the same fixed instruction for all retrieval tasks, so it is effectively a non-instruction version for tasks that need to take in actual task-specific and example-specific instructions?
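For context, a minimal sketch of BGE's default retrieval usage, where one fixed instruction is prepended to every query regardless of the task; the instruction string is taken from the BGE model card, and the query/document pair is made up.

```python
# Sketch of BGE's default retrieval usage: one fixed instruction is
# prepended to every query, regardless of the task; documents are
# encoded without any instruction.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("BAAI/bge-base-en-v1.5")
instruction = "Represent this sentence for searching relevant passages: "

query_emb = model.encode(instruction + "What is the derivative of x^2?")
doc_emb = model.encode("The derivative of x^2 is 2x.")  # no instruction
print(util.cos_sim(query_emb, doc_emb))
```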

@Muennighoff (Contributor)

> > On that note: why does this PR have bge-base-en-v1.5-instruct? Isn't using the retrieval instruction the default way for bge-base, such that the existing results in bge-base-en-v1.5 are already with instruction? Sorry, maybe I am missing something.
>
> I think, as we discussed in the issue, the BGE implementation uses the same fixed instruction for all retrieval tasks, so it is effectively a non-instruction version for tasks that need to take in actual task-specific and example-specific instructions?

Oh yes, makes sense, sorry for missing that!
