@sangeet2020
Contributor

This pull request contains the following additions, modifications, and fixes:

  • [Addition] A new wav2vec model has been fine-tuned on the CommonVoice German dataset. For this, I added a new hyperparameter YAML file, and the README has been updated accordingly. This was discussed with @mravanelli; the model is to be put into huggingface/speechbrain later.
  • [Fix] In the file speechbrain/processing/features.py, a bug has been fixed as proposed in issue #1489 ("Device error when setting deltas=True in speechbrain.lobes.features.Fbank").
  • [Minor modification] The progress bar has been color-coded (green, magenta, cyan).

Thank You

@Adel-Moumen Adel-Moumen self-assigned this Sep 5, 2022
Collaborator

@Adel-Moumen Adel-Moumen left a comment

Huge thanks for the PR!! Always great to have new languages in SpeechBrain 😁

Could you please update the README and add German too? (maybe you can add there the time taken per epoch)

If you agree, I will create a new model card on hugging face with this model when everything is ready. Then, you will have to make a pull request to add your model etc... 🤗

About what you did for the colouration in core.py, we will need more discussion on it. Could you please undo the core change with colour and create a separate pull request on that?

Your code failed on some tests (pre-commit and test), please run the pre-commit and tests as described here 🙂

Make sure to add your recipe in tests/recipes.csv to fix the test issue
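
As a rough illustration of the check implied above, the following sketch tests whether a hyperparameter file is listed in a recipes CSV. The column name `Hparam_file`, the sample rows, and the file paths are assumptions made for this example and may not match the actual layout of tests/recipes.csv.

```python
import csv
import io


def recipe_is_listed(csv_text, hparam_file, column="Hparam_file"):
    """Return True if any row lists `hparam_file` in the given column.

    The column name is an assumption; adjust it to the real CSV header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # `row.get(column) or ""` guards against missing or empty cells.
    return any((row.get(column) or "").strip() == hparam_file for row in reader)


# Hypothetical CSV content, for illustration only.
sample_csv = """Task,Dataset,Hparam_file
ASR,CommonVoice,hparams/train_fr_with_wav2vec.yaml
ASR,CommonVoice,hparams/train_de_with_wav2vec.yaml
"""

print(recipe_is_listed(sample_csv, "hparams/train_de_with_wav2vec.yaml"))
```

In practice, the text would be read from the CSV file in the repository rather than from an inline string.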

Last but not least, could you please make sure that your .yaml is in the same order as the other .yaml files (e.g., French, Italian...)? For instance, you defined wav2vec2_hub on line 104, while in the other YAML files it is defined on line 17. Could you please make sure that everything is similar?
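
One possible way to spot such ordering differences is to compare the order of top-level keys in two YAML files. The sketch below uses only the standard library and a simple regular expression rather than a YAML parser, so it assumes that top-level keys start at column 0; the sample contents and the function names are hypothetical.

```python
import re

# Matches a key that starts at column 0, e.g. "wav2vec2_hub: ...".
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_]\w*)\s*:")


def top_level_keys(yaml_text):
    """Return the top-level keys in the order they appear."""
    keys = []
    for line in yaml_text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            keys.append(match.group(1))
    return keys


def first_order_mismatch(reference_text, candidate_text):
    """Return the first (reference_key, candidate_key) pair that differs in
    relative order among the keys both files share, or None if they agree."""
    ref = top_level_keys(reference_text)
    cand = top_level_keys(candidate_text)
    shared = set(ref) & set(cand)
    ref_shared = [k for k in ref if k in shared]
    cand_shared = [k for k in cand if k in shared]
    for ref_key, cand_key in zip(ref_shared, cand_shared):
        if ref_key != cand_key:
            return (ref_key, cand_key)
    return None


# Hypothetical file contents, for illustration only.
reference_yaml = "seed: 1234\nwav2vec2_hub: some/model\nlr: 0.1\n"
candidate_yaml = "seed: 1234\nlr: 0.1\nwav2vec2_hub: some/model\n"

print(first_order_mismatch(reference_yaml, candidate_yaml))
```

Keys that exist in only one of the files are ignored, so the check reports ordering differences only and not missing entries.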

@sangeet2020
Contributor Author

sangeet2020 commented Sep 8, 2022

  • Readme has been updated
  • Coloration in core.py has been reverted.
  • German hyper-params .yaml file has been fixed as per suggestions.
  • Pre-commit and tests - successful

@sangeet2020
Contributor Author

@Adel-Moumen Changes are ready to be reviewed.
Thanks

Collaborator

@Adel-Moumen Adel-Moumen left a comment

Hello,

Thanks for the changes!

There are some minor edits that I would like you to make, please. I will also create your HuggingFace card for this model as soon as possible; you will then need to update the link for the model as well.

Many thanks for your excellent work!

@sangeet2020
Contributor Author

sangeet2020 commented Sep 22, 2022

Apologies @Moumeneb1, I had to close this PR and re-open it because it got messed up when I pulled the latest changes today.
Thank You

@sangeet2020 sangeet2020 reopened this Sep 22, 2022
@Adel-Moumen
Collaborator

> Apologies @Moumeneb1 , I had to close this PR, and re-open it because it got messed up when I pulled the latest changes today. Thank You

Hello, you pinged the wrong reviewer Ahah.

One last thing that needs to be done is the upload of your model to the HuggingFace hub. For now, we will need to wait a little longer. Unfortunately, the process takes some time.

@sangeet2020
Contributor Author

Sure thing.

@sangeet2020
Contributor Author

@Adel-Moumen I was wondering if there are any further updates? Thanks

@Adel-Moumen
Collaborator

Hello,

It will be ready next week! :)

@Adel-Moumen
Collaborator

Adel-Moumen commented Oct 25, 2022

Hello @sangeet2020,

Could you please resolve the conflict with tests/recipes.csv? Afterwards, we will merge this PR.

I did not finish the upload on the HF model hub. I struggled to host your model on the hub due to the tokenizer used.

If you agree, I can add you to the HF SpeechBrain organisation and then you will have to create a PR on the model card to add everything necessary.

Let me know what you think! Thanks. :-)

@mravanelli
Collaborator

@sangeet2020, could you please resolve the conflict? Also, it would be great if @anautsch could take a final look before merging it.

@sangeet2020
Contributor Author

Re-opening the PR, fixing conflicts with tests/recipes.csv

@sangeet2020 sangeet2020 reopened this Nov 2, 2022
@sangeet2020
Contributor Author

Now, I can host the new German CommonVoice model on HF. You can add me to the SpeechBrain organization.
Here is my username: sangeet2020

And apologies for the delay, bit busy with a new recipe for Microsoft DNS. Should be on SpeechBrain soon. :)

@Adel-Moumen
Collaborator

Hello,

I added you to the HF organization. You can create a private repository on the organization, and start working on it! Let me know if you need more help. (you can reach me directly on slack)

Please fix the recipes :)

@sangeet2020
Contributor Author

@Adel-Moumen

  • Model uploaded to HF as a private repo.
  • recipes.csv fixed

Other details sent on Slack.

Thank You

Collaborator

@Adel-Moumen Adel-Moumen left a comment

Hello,

Thanks a lot! I tried your model and it's working perfectly.

Could you please update the README and recipes.csv with the model link? Then we will merge the PR! :)

Collaborator

@Adel-Moumen Adel-Moumen left a comment

everything looks great! thanks!

@Adel-Moumen Adel-Moumen merged commit 11c70de into speechbrain:develop Nov 4, 2022
@sangeet2020
Contributor Author

Thank you very much, SpeechBrain team!!

@TParcollet
Collaborator

Thanks for contributing.

@sangeet2020
Contributor Author

@Adel-Moumen, please check the model in SpeechBrain's Google Drive link. You uploaded the model directory named CKPT+2022-08-16+00-27-35+00, which happens to be the checkpoint for epoch 31 (you can check this in the log.txt file). Could you please upload the checkpoint for epoch 45 from here (GDrive LINK)? The checkpoint folder for epoch 45 is CKPT+2022-08-19+23-10-33+00.
Please let me know if there is any confusion.
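
To avoid mixing up checkpoint folders like the two named above, a small helper can order them by the timestamp embedded in the folder name. This is a minimal sketch that assumes the name follows the pattern `CKPT+<date>+<time>+<suffix>`, as the two names in this thread suggest; the trailing field is ignored here because its meaning is not stated in the thread, and the function names are hypothetical.

```python
from datetime import datetime


def checkpoint_time(folder_name):
    """Parse the timestamp from a name like 'CKPT+2022-08-16+00-27-35+00'."""
    prefix, date_part, time_part, _suffix = folder_name.split("+")
    if prefix != "CKPT":
        raise ValueError(f"Unexpected checkpoint folder name: {folder_name}")
    return datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H-%M-%S")


def newest_checkpoint(folder_names):
    """Return the folder name with the latest embedded timestamp."""
    return max(folder_names, key=checkpoint_time)


folders = [
    "CKPT+2022-08-16+00-27-35+00",  # epoch 31, according to the comment above
    "CKPT+2022-08-19+23-10-33+00",  # epoch 45, according to the comment above
]
print(newest_checkpoint(folders))
```

The newest folder is not necessarily the one for the desired epoch, so the log.txt file remains the reference for mapping folders to epochs.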

Thank You

@Adel-Moumen
Collaborator

Hi, please accept me on the gdrive thanks.

@sangeet2020
Contributor Author

> Hi, please accept me on the gdrive thanks.

Editing access granted.

@Adel-Moumen
Collaborator

Thanks.

I will edit the folder tomorrow!

@Adel-Moumen
Collaborator

Hello @sangeet2020,

Is it better like that? :-)

@sangeet2020
Contributor Author

Yes, this looks alright now :). Thank you very much.

5 participants