Downsampling #1888
Conversation
TParcollet
left a comment
Thanks! See my comments.
recipes/LibriSpeech/ASR/CTC/hparams/downsampled/train_hf_wavlm_average_downsampling.yaml
Quick fix.
from speechbrain.utils.distributed import run_on_main
from hyperpyyaml import load_hyperpyyaml
from pathlib import Path
from pyctcdecode import build_ctcdecoder
It should be optional, no? Right now it is mandatory to pip install pyctcdecode in order to use the CTC wav2vec recipe...
Yes, it should be optional; I will move the import later.
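As a minimal sketch (not the actual change in this PR), one common way to make such a dependency optional is to defer the import until decoding is requested; the helper name below is illustrative:

```python
# Hypothetical sketch: defer the pyctcdecode import so the recipe only
# requires the package when beam-search decoding is actually used.
def build_decoder(labels, kenlm_path=None):
    try:
        from pyctcdecode import build_ctcdecoder
    except ImportError as err:
        raise ImportError(
            "pyctcdecode is required for beam-search decoding: "
            "pip install pyctcdecode"
        ) from err
    return build_ctcdecoder(labels, kenlm_model_path=kenlm_path)
```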
TParcollet
left a comment
LGTM
Code for the best technique in the paper "Fine-tuning Strategies for Faster Inference using Speech Self-Supervised Models: A Comparative Study" (https://arxiv.org/abs/2303.06740), allowing for sequence downsampling during fine-tuning of SSL models. This leads to lower inference times with only small performance drops.
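To illustrate the general idea behind the average-downsampling variant (a sketch of the concept, not the exact SpeechBrain implementation; the module and parameter names are made up), consecutive frame-level SSL features can be averaged so that the layers and the CTC head that follow operate on a shorter sequence:

```python
import torch

# Hypothetical sketch of average downsampling: shorten the SSL feature
# sequence by averaging every `factor` consecutive frames.
class AverageDownsampler(torch.nn.Module):
    def __init__(self, factor: int = 2):
        super().__init__()
        self.pool = torch.nn.AvgPool1d(kernel_size=factor, stride=factor)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, time, features) -> average-pool along the time axis
        return self.pool(feats.transpose(1, 2)).transpose(1, 2)

# Usage: the downsampled sequence is roughly `factor` times shorter in time,
# which reduces the cost of everything downstream (including decoding).
ssl_out = torch.randn(4, 200, 768)            # e.g. WavLM frame-level features
shorter = AverageDownsampler(factor=2)(ssl_out)
print(shorter.shape)                          # torch.Size([4, 100, 768])
```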