-
Notifications
You must be signed in to change notification settings - Fork 1.6k
SLU Media recipe #1172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SLU Media recipe #1172
Conversation
|
Thanks @GaelleLaperriere !!! We will discuss that directly at the lab, but I will certainly ask for a few changes ;-) |
TParcollet
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @GaelleLaperriere thank you so much for your work. Comments are a bit short, but we have so many things to do on SB ... Once the changes are done, I will test the recipes!
|
Hi @GaelleLaperriere and @TParcollet, at this stage we are with this PR? I'm wondering just because I would like to figure out if we want to include it in the upcoming new version of speechbrain. |
|
Hey @mravanelli unsure that this will be possible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello,
Many thanks for the PR. You are doing a really great job!
This review will only be on the data prepare file. When the other files will be updated, I will then review everything.
First, could you please remove the .csv files ? I have uploaded on our Google Drive: https://drive.google.com/drive/u/1/folders/1z2zFZp3c0NYLFaUhhghhBakGcFdXVRyf the files so that you can put the links where to download the external data in the README.md as well as in the docstring of the prepare functions when it's needed
Please fix the pre-commit/testing as well.
|
Hi @GaelleLaperriere after running the recipe testing for MEDIA python -c 'from tests.utils.recipe_tests import run_recipe_tests; print("TEST FAILED!") if not(run_recipe_tests(filters_fields=["Dataset"], filters=[["MEDIA"]], do_checks=False, run_opts="--device=cuda")) else print("TEST PASSED")'all tests failed with When I put empty strings (it seems only relevant to the dataset preparation script, which this test skips), it gets stuck here: Please add an |
|
Got curious, after using this for recipe testing config I got these logs for the three recipes tests: The last one is a consequence of the first, or? lab_enc_file = hparams["save_folder"] + "/labelencoder.txt"
label_encoder.load_or_create(
path=lab_enc_file,which should create the The earlier # 2. Define audio pipeline:
@sb.utils.data_pipeline.takes("wav", "start_seg", "end_seg")
@sb.utils.data_pipeline.provides("sig")
def audio_pipeline(wav, start_seg, end_seg):Needs to have In the config above, the train CSV dummy is: and it does not have these two fields. Please take a look at other sample annotations, which you could use to satisfy you dataio pipeline, we collect them here: If none is there, there are several options:
|
|
Hello, Tank you for your help testing the recipe. I solved the first issue. I just changed the tags "start_seg" and "end_seg" for "start" and "stop" as done in the testing samples. Please keep me updated when you run the tests again. Many thanks. |
|
Hello @GaelleLaperriere,
Welcome; no worries.
Ok. Works. Adding to the test samples here would've been no big deal. All tests are passing now, your solution works !
Yes.
After running tests with the above mentioned python -c 'from tests.utils.recipe_tests import run_recipe_tests; print("TEST FAILED!") if not(run_recipe_tests(filters_fields=["Dataset"], filters=[["MEDIA"]], do_checks=False, run_opts="--device=cuda")) else print("TEST PASSED")'fails with: Please:
lmk if the above |
|
Placeholders are now overridden in MEDIA.csv with "Null". I don't know if it will work this way. I don't know if Adel told you about the error I get when running |
Thank you @GaelleLaperriere !
It's good you write about it here (others might help it). You merged the latest develop on Feb 17; this one has all testing tools required. The flag is used here: speechbrain/tests/utils/recipe_tests.py Lines 502 to 505 in e8cd583
Same in your repo/PR. The log that it isn't found, is weird, since it is a run_opts: speechbrain/speechbrain/core.py Lines 455 to 460 in e8cd583
What might have happened: your python environment is not operating from your local SpeechBrain repo, which has the change. As for now, these testing tools are not available via the pip version of SpeechBrain. We will release a new version for PyPI soon after merging your PR. On my end, all fail now with You still defined note: I reduced the epoch number from 10 to 2—it's just a speed-up; 10 we need to obtain somewhat stable debug resutls for comparing with (we do this only for a few core recipes, since - it takes time yet the objective: is does the code crash -> yes/no). It's an fyi; you already suffered enough through this PR... please let me know if testing works on your end, too, now. |
Thank you, it was the case. Now the tests work fine. 👍 |
anautsch
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Concerns have been addressed.
Hello there,
I added in this new recipe:
I still need to make the README.md. An ASR recipe is coming.
The dataset Media is free for academic purposes, but must be requested from ELRA in order to retrieve it.
Thanks in advance.
EDIT
Tasks TODO: