
Conversation

@geliAI geliAI commented Aug 12, 2022

Create a schedule with a learning rate that decreases linearly from the initial lr set in the optimizer to 0, after a warmup period during which it increases linearly from 0 to the initial lr set in the optimizer.
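
For illustration, here is a minimal sketch of that schedule expressed as a multiplier applied to the initial lr. The function and parameter names (warmup_steps, total_steps) are hypothetical and not necessarily those used in this PR.

# Hypothetical sketch of the linear warmup + linear decay described above.
# Names are illustrative; the PR's actual implementation may differ.
def linear_warmup_decay_factor(current_step, warmup_steps, total_steps):
    """Multiplier applied to the initial lr at a given optimizer step."""
    if current_step < warmup_steps:
        # Linear increase from 0 to 1 during warmup.
        return current_step / max(1, warmup_steps)
    # Linear decrease from 1 to 0 after warmup.
    return max(0.0, (total_steps - current_step) / max(1, total_steps - warmup_steps))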

@geliAI geliAI requested a review from TParcollet August 12, 2022 14:35
@geliAI geliAI requested a review from popcornell August 16, 2022 17:50
return self.value_at_epoch[old_index], self.value_at_epoch[index]


class LinearWarmupScheduler:
Collaborator

Hi and thanks! Is this resumable? I see a "current_step"; shouldn't the scheduler also be saved in case the experiment is resumed? This can easily be done with hooks (see the other schedulers with state). What do you think?
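
For reference, checkpoint hooks on a SpeechBrain scheduler typically follow the pattern below. This is a hedged sketch, not the code from this PR: it assumes the decorators from speechbrain.utils.checkpoints, and the constructor arguments and method signatures are illustrative and may differ from the final implementation.

# Hypothetical sketch of checkpoint hooks on a scheduler, loosely following
# the pattern of other SpeechBrain schedulers; not the actual code from this PR.
import torch
from speechbrain.utils.checkpoints import (
    register_checkpoint_hooks,
    mark_as_saver,
    mark_as_loader,
)

@register_checkpoint_hooks
class LinearWarmupScheduler:
    def __init__(self, initial_value, num_warmup_steps, num_training_steps):
        # Illustrative attributes; the real class may store different state.
        self.initial_value = initial_value
        self.num_warmup_steps = num_warmup_steps
        self.num_training_steps = num_training_steps
        self.current_step = 0

    @mark_as_saver
    def save(self, path):
        # Persist the counter needed to resume the schedule mid-training.
        torch.save({"current_step": self.current_step}, path)

    @mark_as_loader
    def load(self, path, end_of_epoch, device=None):
        # Restore the counter saved above when the experiment is resumed.
        data = torch.load(path)
        self.current_step = data["current_step"]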

Collaborator

I agree. On a side note, StepScheduler also does not have hooks; we should fix that in a separate PR.

Collaborator Author

Hi, this is a very good question. TBH, I am not very familiar with the concept of hooks, but I will take a look at how the other schedulers are implemented.

Collaborator Author

@geliAI geliAI Aug 17, 2022

I have added the checkpoint hooks. Please take a look.

@geliAI geliAI requested a review from TParcollet August 17, 2022 19:36
@TParcollet

Huge thanks!

@TParcollet TParcollet merged commit 73789b5 into speechbrain:develop Aug 17, 2022
@danpovey

I notice the design is quite different from PyTorch's native schedulers, which have a step() function along with load_state_dict() and state_dict() functions.

We also ended up changing the interface a bit, as I wanted something where you could step on both minibatches and epochs. [In our case it's not part of a unified interface, though; for now, for flexibility during early development, our approach is to put most of the complexity in local scripts rather than in any central place.]
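
For comparison, the PyTorch-native interface mentioned above looks roughly like this, using torch.optim.lr_scheduler.LambdaLR purely as a stand-in example rather than any scheduler from this PR.

# Sketch of the PyTorch-native scheduler interface referenced above.
import torch

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda step: 0.95 ** step
)

optimizer.step()                  # one optimization step on a minibatch
scheduler.step()                  # advance the learning-rate schedule
state = scheduler.state_dict()    # serializable state for checkpointing
scheduler.load_state_dict(state)  # restore the schedule when resuming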

@TParcollet

Agreed. Torch schedulers are a bit rigid, although you can use them natively with SB as well. As you may have seen, we follow the opposite direction for now: more central places and less complexity in local scripts. I guess it's a balance between how much maintenance a coordinated team can provide (central) and how much you want to rely on the community for it (local scripts). At least, this is a personal opinion; I find it hard to maintain recipes properly as they tend to grow way too rapidly in number :p

@danpovey

Hm yes, for now we are aiming to get the best possible WER with reasonable latency before we add lots of recipes; at a later time we might consider centralizing things a bit. I figure if people really need recipes that work for a specific dataset, they can always get them from speechbrain or ESPNet.

@TParcollet

The numbers you get with Transducers are really impressive; I really hope we soon obtain enough resources to put someone on this full-time. The last intern who tried did not succeed, but he had other things to do as well (the PR where he tried your nice pruned transducer loss).

@geliAI geliAI deleted the develop_linear_warmup_LR branch August 18, 2022 15:06