refactor: recipe testing CSVs #1600
Conversation
Status: checks are running through and the necessary fixes were deployed. Continuing with part 2. Remaining issues, in brief:

- 41: the https://huggingface.co/speechbrain/vad-crdnn-libriparty snippet needs to be revised.
- 16+33: the mp3 download issues are known (and, from what I remember, caused by dependencies; a later task).
- 14+19+42: the PyTorch bugs share similar stack traces, e.g. SB pretrained models available via HF use speech_augmentation (might be necessary, or not).

Note: tests run on CPU w/ internet access. Tests that can download data run on CPU first and then move to an offline GPU node.
Note: to install all extra requirements declared in recipes:

```
find recipes | grep extra | xargs cat | sort -u | grep -v \# | xargs -I {} pip install {}
```

Tokenizer recipe testing (w/o data preparation) is functional, for all recipes w/ Tokenizers:

```
python -c 'from speechbrain.utils.recipe_tests import run_recipe_tests; print("TEST FAILED!") if not(run_recipe_tests(filters_fields=["Task"], filters=["Tokenizer"])) else print("TEST PASSED")'
```
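For reference, a rough Python equivalent of that shell one-liner; this is a sketch (the `collect_extras` helper is hypothetical, not part of SpeechBrain), and real extra-requirements files may need additional filtering:

```python
from pathlib import Path


def collect_extras(root: str) -> list[str]:
    """Gather unique, non-comment lines from files whose name contains
    'extra' under `root` -- mirroring the shell pipeline
    `find | grep extra | xargs cat | sort -u | grep -v #` above."""
    reqs = set()
    for path in Path(root).rglob("*extra*"):
        if path.is_file():
            for line in path.read_text().splitlines():
                line = line.strip()
                if line and not line.startswith("#"):
                    reqs.add(line)
    return sorted(reqs)
```

Each returned entry would then be handed to `pip install`, which the shell version does via `xargs -I {} pip install {}`.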
AISHELL-1 recipe tests pass:

```
python -c 'from speechbrain.utils.recipe_tests import run_recipe_tests; print("TEST FAILED!") if not(run_recipe_tests(filters_fields=["Dataset"], filters=["AISHELL-1"], download_only=True)) else print("TEST PASSED")'
# then offline
TRANSFORMERS_OFFLINE=1 python -c 'from speechbrain.utils.recipe_tests import run_recipe_tests; print("TEST FAILED!") if not(run_recipe_tests(filters_fields=["Dataset"], filters=["AISHELL-1"])) else print("TEST PASSED")'
```

Note: these tests are w/o data preparation & w/o performance checks.
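The download-then-offline pattern can also be driven from Python. A minimal sketch (the `offline_env` helper is hypothetical, not part of SpeechBrain) that toggles the same `TRANSFORMERS_OFFLINE` variable the shell commands use:

```python
import os
from contextlib import contextmanager


@contextmanager
def offline_env():
    """Temporarily set TRANSFORMERS_OFFLINE=1 for the offline phase,
    restoring the previous value afterwards."""
    old = os.environ.get("TRANSFORMERS_OFFLINE")
    os.environ["TRANSFORMERS_OFFLINE"] = "1"
    try:
        yield
    finally:
        if old is None:
            os.environ.pop("TRANSFORMERS_OFFLINE", None)
        else:
            os.environ["TRANSFORMERS_OFFLINE"] = old


# phase 1 (online node):  run_recipe_tests(..., download_only=True)
# phase 2 (offline node):
#     with offline_env():
#         run_recipe_tests(...)
```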
Aishell1Mix recipes "work" - there is one change I did not commit. Since Aishell1Mix links to LibriMix, which in turn invokes its logger, that logger fails to be called when the module is imported. My solution was to add to the end of the above train file:

```python
if __name__ == "__main__":
    ...
else:
    logger = logging.getLogger(__name__)
```

so in any case a logger is created. I do not dare to push this change to this PR - hence "works". Edit: I put the logger in LibriMix at the top of the recipe (as is done in other recipes).

Note: I dropped this line and moved it to the yaml files, which now have a different point of entry:

```yaml
num_spks: 2
dataset_folder: !PLACEHOLDER
data_folder: !ref <dataset_folder>/aishell1mix/Aishell1Mix<num_spks>
```

This is a hotfix solution on my side: now both variables serve a unique purpose (and are not changed in the middle to be then repurposed).
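As a self-contained sketch of that logger fallback (the logger name below is hypothetical): whether the file runs as a script or is imported, a module-level logger ends up defined either way:

```python
import logging

# If run as a script, configure logging and name the logger explicitly;
# if imported (as Aishell1Mix imports the LibriMix module), fall back to
# a logger derived from the module name -- so a logger always exists.
if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("librimix_prepare")
else:
    logger = logging.getLogger(__name__)

logger.debug("logger available in both execution modes")
```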
That's for when we are about to approach the next release.

Not part of this PR. Getting to some other things first; I can certainly take a look after this PR. @mravanelli @TParcollet What's your take on URLs that are identified as being down?
Adel-Moumen
left a comment
Thank you for this PR @anautsch! You did impressive work.
Please add a header in tests/__init__.py.
I left a comment on each thing that should be improved imo.
I will now try the PR in practice; I'll get back in touch if I face any trouble.
```diff
@@ -1 +1 @@
transformers==4.13
```
Why?
Move all extras to the latest dependencies.
It's a testing-convenience issue; otherwise each recipe will need its own env. These tests should be done at every release version. For the next major release, I'd like to either lift that freezing or freeze to the latest version if possible.
I get that just leaving it there is convenient for the end user; it's not for maintenance through testing. On the other hand, rewriting the testing scripts to change envs per recipe - why not (either)...
```diff
@@ -1 +1 @@
transformers==4.13
```
Why?
same
```diff
  source: !ref <whisper_hub>
  freeze: !ref <freeze_whisper>
- save_path: !ref <save_folder>/whisper_checkpoint
+ save_path: !ref <whisper_folder>/whisper_checkpoint
```
This means it will create a subfolder `whisper_checkpoint` inside a folder that is already the whisper checkpoint folder. Better to put only `!ref <whisper_folder>` than `!ref <whisper_folder>/whisper_checkpoint`.
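A quick illustration of the duplication, assuming (as the comment suggests) that `whisper_folder` already points at the checkpoint directory; the concrete paths are hypothetical:

```python
from pathlib import Path

save_folder = Path("results/save")
# assumption: whisper_folder is already the checkpoint directory
whisper_folder = save_folder / "whisper_checkpoint"

# the reviewed YAML appends the subfolder name again, nesting it:
nested = whisper_folder / "whisper_checkpoint"
assert nested == Path("results/save/whisper_checkpoint/whisper_checkpoint")

# using whisper_folder directly avoids the duplication:
assert whisper_folder == Path("results/save/whisper_checkpoint")
```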
```python
print(f"\tsame: {results[repo]['same']}")
...
def test_performance(
```
same
tests/utils/refactoring_checks.py
Outdated
```python
hparams_file, run_opts, overrides = speechbrain.parse_arguments(
    sys.argv[1:]
)
# speechbrain.utils.distributed.ddp_init_group(run_opts)
```
Why is this line commented?
Good point. I think it's because I use this for single-GPU only; I'm working on another PR that does DDP testing now. I left it in as an indicator, I think - but surely it can be dropped. The aim here was to test with the fewest parameters possible & in debug mode.
tests/consistency/README.md
Outdated
Link to the HuggingFace repository containing the pre-trained model (e.g., `https://huggingface.co/speechbrain/asr-wav2vec2-librispeech`). If specified, it must be mentioned in the README file.
- test_debug_flags (optional):
  This optional field reports the flags to run recipe tests (see `tests/.run-recipe-tests.sh`). The goal of the recipe tests is to run an experiment with a tiny dataset and 1-make sure it runs 2-make sure it overfits properly.
  For instance, `--data_folder=tests/samples/ASR/ --train_csv=tests/samples/annotation/ASR_train.csv --valid_csv=tests/samples/annotation/ASR_train.csv --test_csv=[tests/samples/annotation/ASR_train.csv] --number_of_epochs=10 --skip_prep=True` will run an experiments with the given train and hparams files using a tiny dataset (ASR_train.csv)
will run an experiments -> will run an experiment
tests/coverage/PR_cycle.md
Outdated
Summary of changes in V.A with comments and tasks (for reviewers):
1. new break in `func_sig` declartion (in the signature of the function)
declartion -> declaration
```diff
@@ -0,0 +1,218 @@
+ # Guiding contributors, reviewers & maintainers through the complexity of SpeechBrain testing.
```
I think it would be better to put this in our documentation (https://speechbrain.readthedocs.io/en/latest/contributing). I don't think it should go in our README; it does not belong there. Too many details.
The comments are addressed and the markdown files are relocated. Conflicts with the latest develop branch are resolved. @TParcollet @Adel-Moumen about the md relocations:
The readthedocs toc is updated (it's in the docs folder of this PR). @ycemsubakan please lmk after your paper deadline if the proposed changes to your recipes are ok. Currently, the LibriSpeech recipes are re-running to see where we are after all these changes (of this PR and on develop). This also addresses moving the un/freeze of ...
I have gone through these recipes. Please see my comment above. REAL-M and WHAMandWHAMR seem to be fine, but please see my comments. For BinauralWSJ0Mix, it seems that the added ...

Thank you @ycemsubakan! I removed the added ...
The tests pass even without the argument added in this PR on line 161 of the REAL-M train.py.

@ycemsubakan neat! @TParcollet @mravanelli how'd you like to proceed with approving changes & merging this one?
TParcollet
left a comment
LGTM
The goal of this PR is to refactor recipe testing. Not much, as most testing needs are covered already.

In particular, the `tests/recipes.csv` file causes PRs to run into merge conflicts when another recipe enters the list. Conflicts for recipes make sense to highlight when multiple contributors collaborate on the same dataset, but not across datasets.

Solution, part I:
- a `RecipeID` field (the ID has no persistency service; spontaneous use lets the ID result from the row number)
- `pytest tests/consistency` works on CPU

Not all datasets are available for all recipes all the time; recipe testing needs to operate regardless. The need is to extend upon `tests/consistency/README.md`:
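To illustrate the "spontaneous" ID (a sketch; the column names below are hypothetical, not the real `tests/recipes.csv` schema): with no persistency service, the ID is simply derived from the row number of the CSV:

```python
import csv
import io

# hypothetical excerpt of a recipes CSV; real column names may differ
csv_text = (
    "Task,Dataset,Script_file\n"
    "ASR,LibriSpeech,train.py\n"
    "Tokenizer,AISHELL-1,train.py\n"
)

rows = list(csv.DictReader(io.StringIO(csv_text)))
# RecipeID derived from the row number: stable only while rows keep order
recipe_ids = {i + 1: row for i, row in enumerate(rows)}
```

Note that inserting a row above an existing one shifts every subsequent ID, which is why the ID carries no persistency guarantee.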
The existing testing uses the `test_debug_flags` flag here via `test_flag[recipe_id]`:

speechbrain/speechbrain/utils/recipe_tests.py, line 314 in 64c8fa9

In `tests/recipes.csv`, the majority of test definitions have `test_debug_flags` unspecified. If this field is specified, `test_debug_checks` is specified also: for a minimal testing set-up, after so-and-so many epochs, is an expected performance criterion reached?

This PR extends this concept: `test_debug_flags` hints towards a sanity check; regardless of data, can the recipe finish w/o breaking? For the concern of this PR, `test_debug_checks` can remain unspecified. This is based on separating dataset preparation from the rest of the scripts (i.e., constructing dataset representations via JSON/CSV metadata); these very JSON &/or CSV files are necessary in dataio pipelines that are then handed to batches. These dataio pipelines can be as diverse as it gets.

To make it short,

```
grep -r utils.data_pipeline.takes recipes | sed 's/.*takes..//' | sed 's/")//' | sed 's|",."|\n|g' | sort -u
```

gives the unique `takes` fields across recipes. Note: `params["input_type"]` is 'clean_wav' and `params["target_type"]` can be 'phones' or 'words' (presently). And

```
grep -r utils.data_pipeline.takes tests/integration | sed 's/.*takes..//' | sed 's/")//' | sed 's|",."|\n|g' | sort -u
```

gives the corresponding fields for the integration tests.

The first and foremost goal is to provide SpeechBrain with the flexibility that research needs. Since this part of testing hints at some form of round-trip testing, some more rigid notions come into play. Mappings help for alignment. The recipe data pipelines should neither be touched nor overwritten (they may re-appear within the recipe-declared Brain class). Here, the data defined by existing integration tests comes in handy, since they need to provide testing data corresponding to the `Task` field defined in `tests/recipes.csv`!

Note: the above is an assumption right now; let's see what is achievable.
Solution, part II:
- `test_debug_flags` goes from 'optional' to 'mandatory' (default: None if there is no `!PLACEHOLDER` in the yaml files)

Note: `HF_repo` & `test_debug_checks` should remain 'optional' (out of scope for this PR).