add support of seed_everything #2654
…tils.seed.seed_everything
pplantinga
left a comment
Looks like a good PR that better follows PyTorch recommendations about randomness. I have a few minor comments but it is good enough that it could be merged now.
docs/experiment.md
However, due to the differences in how GPU and CPU executions work, results may not be fully reproducible even with identical seeds, especially when training models. This issue primarily affects training experiments.

On the other hand, when preparing data using data preparation scripts, the output of these scripts is independent of the global seeds. This ensures that you will get identical outputs on different setups, even if different seeds are used.
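One common way to make a preparation step independent of the global seed is to give it a private random generator. The sketch below illustrates that idea only; `split_files_sketch`, its parameters, and the fixed `local_seed` are hypothetical and are not taken from the project's data preparation scripts.

```python
import random


def split_files_sketch(files, valid_fraction=0.1, local_seed=0):
    """Hypothetical train/valid split that ignores the global seed."""
    # A private generator: calls to random.seed() elsewhere do not affect it.
    rng = random.Random(local_seed)
    # Sort first so the result does not depend on the input order either.
    shuffled = sorted(files)
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]


# The global seed changes between calls, but the split stays the same.
files = [f"utt_{i}.wav" for i in range(20)]
random.seed(1)
first = split_files_sketch(files)
random.seed(999)
second = split_files_sketch(files)
```

Here `first` and `second` are identical because the function never reads from the global generator.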
Maybe we could expand this to explain important details about distributed experiments: different seeds will be set on different machines, which will affect things like augmentations but not things like initial model parameters or data loaders.
Done. Let me know what you think
pplantinga
left a comment
LGTM!
What does this PR do?
This PR adds support for a new function called `seed_everything`, which tries to maximize reproducibility.
When using two processes, it prints:
While for one:
In fact, we offset the seed by the rank of the current process.
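The rank offset described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the function added by this PR: the name `seed_everything_sketch`, its signature, and the use of the `RANK` environment variable (as set by launchers such as `torchrun`) are all assumptions made for the example.

```python
import os
import random


def seed_everything_sketch(seed=0, rank=None):
    """Hypothetical sketch: seed the RNGs with a per-process offset."""
    if rank is None:
        # Assumption: the launcher exports RANK; default to 0 otherwise.
        rank = int(os.environ.get("RANK", 0))
    # Each process gets its own seed, offset by its rank.
    effective_seed = seed + rank
    random.seed(effective_seed)
    try:
        import numpy as np
        np.random.seed(effective_seed)
    except ImportError:
        pass  # NumPy is optional in this sketch.
    try:
        import torch
        torch.manual_seed(effective_seed)  # seeds PyTorch's default generator
    except ImportError:
        pass  # PyTorch is optional in this sketch.
    return effective_seed
```

With a base seed of 1234, rank 0 would be seeded with 1234 and rank 1 with 1235, so random operations such as augmentations differ between processes while each process stays reproducible across runs.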