Set a seed of generator with an option for more randomness when training several models with different seeds #10486

Merged: 5 commits, Dec 19, 2022

developer0hye
Contributor

@developer0hye developer0hye commented Dec 13, 2022

It is related to #9545.

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Enhancing reproducibility of YOLOv5 training runs with a new seeding option.

📊 Key Changes

  • Added a seed parameter to data loader functions to set the random seed for shuffling data.
  • Updated the random seed generation logic in create_dataloader by incorporating the new seed value.
  • Modified the train.py script to pass the seed parameter from the command-line options to the data loaders.
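As a rough illustration of the changes listed above, the sketch below passes a `--seed` command-line value into a shuffling helper. It is not the PR's code: the standard library's `random.Random` stands in for the data loader's generator so the example runs without PyTorch, and `BASE_SEED`, `make_shuffle_order`, `parse_opt`, and the `rank` argument are hypothetical names chosen for this example.

```python
import argparse
import random

# Hypothetical base constant; the value used in the actual PR is not shown in this thread.
BASE_SEED = 1234


def make_shuffle_order(num_samples, seed=0, rank=0):
    """Return a shuffled list of sample indices for one epoch."""
    # Combine the base constant, the user-supplied seed, and the process rank
    # (rank is an assumption here, relevant only to distributed training).
    rng = random.Random(BASE_SEED + seed + rank)
    order = list(range(num_samples))
    rng.shuffle(order)
    return order


def parse_opt(argv=None):
    """Parse a --seed option, defaulting to 0 when it is not given."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--seed", type=int, default=0, help="global training seed")
    return parser.parse_args(argv)


# Pass the command-line seed through to the shuffling helper.
opt = parse_opt(["--seed", "7"])
order = make_shuffle_order(10, seed=opt.seed)
print(order)
```

Because the default is `seed=0`, callers that never pass a seed keep getting a single fixed shuffle order in this sketch.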

🎯 Purpose & Impact

  • Reproducibility: The addition of a seed parameter allows for more reproducible training runs, as the data shuffle order can be kept consistent across different runs, given the same seed value.
  • Customization: Users have more control over their training process, as they can set the seed manually if desired.
  • Flexibility: This change is backward compatible, meaning it will not affect existing workflows unless users opt to set their own seed.
  • Consistency: Data loading, especially when shuffling, can now be consistent across distributed training setups with the same seed, aiding in debugging and result verification.
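One way to check the reproducibility claim is to record the shuffle order over a few epochs and compare runs. The sketch below again uses `random.Random` as a stand-in for the loader's generator. `epoch_orders` is a hypothetical helper and not part of YOLOv5.

```python
import random


def epoch_orders(num_samples, seed, epochs=3):
    """Simulate the shuffle order seen over several epochs of one run."""
    rng = random.Random(seed)  # one generator per run, reused across epochs
    orders = []
    for _ in range(epochs):
        order = list(range(num_samples))
        rng.shuffle(order)
        orders.append(order)
    return orders


run_a = epoch_orders(50, seed=0)
run_b = epoch_orders(50, seed=0)  # same seed as run_a
run_c = epoch_orders(50, seed=1)  # different seed

print("same seed, same orders:", run_a == run_b)
print("different seed, same orders:", run_a == run_c)
```

Comparing recorded orders like this is a cheap sanity check before attributing a metric difference between two runs to the seed.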

@developer0hye developer0hye changed the title from "Set a seed of generator with an option for more randomness" to "Set a seed of generator with an option for more randomness when training several models with different seeds" Dec 13, 2022
@glenn-jocher
Member

@AyushExel do you think we should change dataset shuffling on --seed changes? Right now I don't think this is explicitly enforced, so different --seed values will likely still see the same dataset generator seeds.

@AyushExel
Contributor

@glenn-jocher yes we should if the deterministic behaviour is still preserved

@glenn-jocher glenn-jocher merged commit 10e93d2 into ultralytics:master Dec 19, 2022
@glenn-jocher
Member

@developer0hye PR is merged. Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐

@glenn-jocher glenn-jocher self-assigned this Dec 19, 2022