[💡SUG] Allow benchmark_filename to accept only train/test split for sequential recommendation #2075

Open
tduricic opened this issue Aug 21, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@tduricic

tduricic commented Aug 21, 2024

Is your feature request related to a problem? Please describe.
When using the benchmark_filename parameter for sequential recommendation, it requires specifying train, validation, and test sets in a list format (e.g., ['part1', 'part2', 'part3']). However, after hyperparameter tuning, I want to provide only a train/test split (e.g., ['part1', 'part2']) without needing a validation set, so I can train the final model on the combined train/validation data and evaluate it on the test set.

Describe the solution you'd like
I would like to be able to give only a train/test split in benchmark_filename (e.g., ['part1', 'part2']), where part1 is used for training and part2 for testing, bypassing the need for a validation set after hyperparameter tuning.

Describe alternatives you've considered
I considered manually combining the train and validation sets outside RecBole, or modifying the validation requirements in the code, but both are inefficient and error-prone.

Additional context
This feature would simplify the workflow for sequential recommendation by allowing direct training on a train/test split, making it easier to utilize the full dataset for final model training after hyperparameter optimization.

@tduricic tduricic added the enhancement New feature or request label Aug 21, 2024
@Fotiligner
Collaborator

Thanks for your attention to RecBole!
We will add this feature in an upcoming update. Until then, you can manually merge the train and validation data files and use the merged file as the 'part1' file.
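The manual merge suggested above could be sketched roughly as follows. This is only an illustration, not RecBole code: it assumes the benchmark splits are tab-separated atomic files that share an identical header line, and the function name `merge_atomic_files` and all file names are hypothetical.

```python
from pathlib import Path


def merge_atomic_files(train_path, valid_path, out_path):
    """Concatenate two tab-separated files that share the same header line.

    Hypothetical helper, not part of RecBole. Returns the number of data
    rows written to out_path.
    """
    train_lines = Path(train_path).read_text(encoding="utf-8").splitlines()
    valid_lines = Path(valid_path).read_text(encoding="utf-8").splitlines()
    if not train_lines or not valid_lines:
        raise ValueError("both input files must contain a header line")

    header = train_lines[0]
    if valid_lines[0] != header:
        raise ValueError("header mismatch between the two files")

    # Keep the header once, then all training rows followed by all
    # validation rows; blank lines are dropped.
    rows = [line for line in train_lines[1:] + valid_lines[1:] if line.strip()]
    Path(out_path).write_text("\n".join([header] + rows) + "\n", encoding="utf-8")
    return len(rows)


# Example call with assumed file names (adjust to your own dataset layout):
# merge_atomic_files("mydata.train.inter", "mydata.valid.inter", "mydata.part1.inter")
```

Whether plain concatenation is appropriate depends on how the original splits were generated, so the merged file is worth inspecting before training on it.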

@tduricic
Author

Thank you for your quick response!
Could you please elaborate on this solution a bit more? It is still not quite clear to me what the dataset splits should look like in that case, and especially how the train/validation/test procedure would work. Also, what if I want to put the model into production and just train it on the full dataset without validation and testing?
