About second iteration and new_instruction #9

Open
Git-L1 opened this issue Mar 30, 2024 · 1 comment

Git-L1 commented Mar 30, 2024

Hi, thanks for releasing your code. I have a few small questions.
The code you provided covers one iteration, so what is the input dataset for the second iteration? Is it the train_pool that has been replaced in the diagram, and where is cache_pool reflected in the code?
Also, does the new_instruction generated in the diagram refer to the merger of new_hard and new_easy?
Looking forward to your reply.
Thanks!
[Attached image: QQ图片20240330212127]

@YJiangcm
Owner

Thanks for your interest in our work. Let me illustrate it using an example:

In the first iteration, the train_pool and cache_pool are both the 52,000 Alpaca instructions, and we generate 3,000 new_hard and 3,000 new_easy instructions (their combination is regarded as new_instruction).

Therefore, in the second iteration, the train_pool is exactly the 6,000 new_instruction entries. The cache_pool is the instructions from the previous iteration plus the new_instruction, which is 52,000 + 6,000 = 58,000 instructions in total.
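
The pool bookkeeping described above can be sketched roughly as follows. This is only an illustration with plain Python lists: the function name `next_iteration_pools` and the placeholder instruction strings are assumptions made for the example and are not taken from the repository code.

```python
def next_iteration_pools(cache_pool, new_hard, new_easy):
    """Return (train_pool, cache_pool) for the next iteration.

    Hypothetical helper: it only mirrors the counts described in this thread.
    """
    # new_instruction is the combination of new_hard and new_easy.
    new_instruction = new_hard + new_easy
    # The next train_pool is exactly the newly generated instructions.
    train_pool = list(new_instruction)
    # The next cache_pool keeps everything seen so far plus the new ones.
    next_cache_pool = cache_pool + new_instruction
    return train_pool, next_cache_pool


# First iteration: both pools start as the 52,000 Alpaca instructions
# (placeholder strings stand in for the real data).
alpaca = [f"alpaca_{i}" for i in range(52_000)]
train_pool, cache_pool = list(alpaca), list(alpaca)

# Placeholders for the 3,000 new_hard and 3,000 new_easy instructions.
new_hard = [f"hard_{i}" for i in range(3_000)]
new_easy = [f"easy_{i}" for i in range(3_000)]

train_pool, cache_pool = next_iteration_pools(cache_pool, new_hard, new_easy)
print(len(train_pool), len(cache_pool))  # 6000 58000
```

Under these assumptions, applying the same update again in later iterations keeps the train_pool at the size of the latest batch while the cache_pool grows by that batch each time.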
