
Conversation


@Adel-Moumen Adel-Moumen commented Aug 30, 2024

What does this PR do?

This PR adds support for a new function called seed_everything, which tries to maximize reproducibility.

When using two processes, it prints:

INFO:speechbrain.utils.seed:[rank: 1] Setting seed to 3403
INFO:speechbrain.utils.seed:[rank: 0] Setting seed to 3402

While for one:

INFO:speechbrain.utils.seed:[rank: 0] Setting seed to 3402

In effect, we offset the seed by the rank of the current process.
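The rank-offset idea can be sketched as follows. This is a minimal stdlib-only illustration of the behaviour described above, not the actual SpeechBrain implementation (which also seeds NumPy and PyTorch); the use of the RANK environment variable is an assumption based on common torch.distributed launchers.

```python
import os
import random


def seed_everything(seed: int = 0) -> int:
    """Seed the global RNGs, offsetting the seed by the process rank so
    that each distributed worker draws a different random stream."""
    # Distributed launchers typically export RANK; default to 0 for a
    # single-process run.
    rank = int(os.environ.get("RANK", "0"))
    seed = seed + rank
    random.seed(seed)
    # The real helper would also seed the other RNGs here, e.g.
    # np.random.seed(seed) and torch.manual_seed(seed).
    return seed
```

With a base seed of 3402 this reproduces the log above: rank 0 gets 3402 and rank 1 gets 3403.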

Before submitting
  • Did you read the contributor guideline?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Does your code adhere to project-specific code style and conventions?

PR review

Reviewer checklist
  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified
  • Confirm that the changes adhere to compatibility requirements (e.g., Python version, platform)
  • Review the self-review checklist to ensure the code is ready for review

@Adel-Moumen Adel-Moumen marked this pull request as ready for review August 30, 2024 12:13
@Adel-Moumen Adel-Moumen requested a review from asumagic August 30, 2024 12:13
@Adel-Moumen Adel-Moumen self-assigned this Aug 30, 2024
@Adel-Moumen Adel-Moumen requested a review from TParcollet August 30, 2024 12:13
@Adel-Moumen Adel-Moumen added this to the v1.0.2 milestone Aug 30, 2024

@pplantinga pplantinga left a comment

Looks like a good PR that better follows PyTorch recommendations about randomness. I have a few minor comments but it is good enough that it could be merged now.


However, due to differences in how GPU and CPU execution works, results may not be fully reproducible even with identical seeds; this primarily affects training experiments.

On the other hand, the output of the data preparation scripts is independent of the global seeds. This ensures that you will get identical outputs on different setups, even if different seeds are used.

Maybe we could expand this to explain important details about distributed experiments: different seeds will be set on different machines, which will affect things like augmentations, but not things like initial model parameters or data loaders.
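The split described above can be illustrated with a toy stdlib-only sketch (not SpeechBrain code; the function names and base seed are hypothetical): draws that should differ across ranks use the rank-offset seed, while model initialization uses the shared base seed on every rank.

```python
import random

BASE_SEED = 3402  # hypothetical base seed shared by all ranks


def init_params(n: int) -> list:
    """Model parameters come from the base seed, so every rank starts
    from identical weights."""
    rng = random.Random(BASE_SEED)
    return [rng.random() for _ in range(n)]


def augmentation_draws(rank: int, n: int) -> list:
    """Augmentation randomness uses the rank-offset seed, so each rank
    perturbs its shard of the data differently."""
    rng = random.Random(BASE_SEED + rank)
    return [rng.random() for _ in range(n)]
```

Here init_params is identical on every rank, while augmentation_draws differs between rank 0 and rank 1.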

Collaborator Author

Done. Let me know what you think


@pplantinga pplantinga left a comment


LGTM!

@pplantinga pplantinga merged commit eb13b9e into speechbrain:develop Sep 12, 2024