Conversation
@z-wony z-wony commented Aug 28, 2022

When batch_size is not 1, evaluate_batch() returns a tensor array (in the separation recipes).
This raises an exception in _fit_valid() (#988).
So, this commit fixes the return value to match the API reference of evaluate_batch().

But I expect there are a lot of identical issues in other recipes,
so I'd like to suggest changing update_average() in core.py as well.
If a list of tensors were allowed as an input argument, the solution would be simpler.
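As a rough sketch of that suggestion (the explicit step argument and the exact function shape here are hypothetical, not SpeechBrain's actual Brain.update_average() signature), update_average() could reduce a multi-element loss tensor to a scalar before the finiteness check:

```python
import torch

def update_average(loss, avg_loss, step):
    # Hypothetical variant of update_average(): if evaluate_batch() handed us
    # per-sample losses (batch_size > 1), reduce them to a scalar first so
    # the torch.isfinite() check below is unambiguous.
    if loss.numel() > 1:
        loss = loss.mean()
    if torch.isfinite(loss):
        # Running average over `step` batches, as in core.py.
        avg_loss -= avg_loss / step
        avg_loss += float(loss) / step
    return avg_loss
```

With this change, both a scalar loss and a per-sample loss tensor update the running average without raising.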

z-wony commented Aug 28, 2022

My call stack is below (with batch_size: 2):

speechbrain.core - Exception:
Traceback (most recent call last):
  File "train.py", line 629, in <module>
    separator.fit(
  File "/home/jwkim/github/speechbrain/speechbrain/core.py", line 1154, in fit
    self._fit_valid(valid_set=valid_set, epoch=epoch, enable=enable)
  File "/home/jwkim/github/speechbrain/speechbrain/core.py", line 1057, in _fit_valid
    avg_valid_loss = self.update_average(loss, avg_valid_loss)
  File "/home/jwkim/github/speechbrain/speechbrain/core.py", line 1303, in update_average
    if torch.isfinite(loss):
RuntimeError: Boolean value of Tensor with more than one value is ambiguous

In core.py:

1045     def _fit_valid(self, valid_set, epoch, enable):
1046         # Validation stage
1047         if valid_set is not None:
1048             self.on_stage_start(Stage.VALID, epoch)
1049             self.modules.eval()
1050             avg_valid_loss = 0.0
1051             with torch.no_grad():
1052                 for batch in tqdm(
1053                     valid_set, dynamic_ncols=True, disable=not enable
1054                 ):
1055                     self.step += 1
1056                     loss = self.evaluate_batch(batch, stage=Stage.VALID)
1057                     avg_valid_loss = self.update_average(loss, avg_valid_loss)

evaluate_batch() returns tensor([19.2439, 24.9512], device='cuda:0') as loss,
so `if torch.isfinite(loss):` in update_average() throws the exception.
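A minimal reproduction of that failure, and the scalar reduction this fix applies (the variable names are illustrative, not taken from the recipe):

```python
import torch

# A multi-element loss tensor, as evaluate_batch() returned with batch_size=2.
loss = torch.tensor([19.2439, 24.9512])

# torch.isfinite() on this tensor yields a bool tensor of the same shape;
# using it as a Python truth value is ambiguous and raises RuntimeError.
try:
    if torch.isfinite(loss):
        pass
except RuntimeError as exc:
    print(type(exc).__name__)  # RuntimeError

# Reducing to a scalar (the direction of this fix) restores the scalar
# return value documented for evaluate_batch(), and the check works again.
scalar_loss = loss.mean().detach()
assert torch.isfinite(scalar_loss)
```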

z-wony commented Sep 5, 2022

Dear @mravanelli, could you review this?

@ycemsubakan

@z-wony Sorry for my late reply. I will take a look soon. Thanks for the PR.

@ycemsubakan ycemsubakan self-requested a review September 14, 2022 14:11
z-wony commented Oct 3, 2022

ping? : )

@ycemsubakan

Guys, I'll take a look this week. Very sorry, I am swamped with several things.

@ycemsubakan

Alright, I just tried this branch, it seems this doesn't cause anything else to break. We can merge.

@z-wony Do you know when this started to break the eval loop? Because originally this wasn't causing any issues. Like do you know which commit started to cause this issue? (Just out of curiosity)

@ycemsubakan ycemsubakan merged commit 479fc62 into speechbrain:develop Oct 3, 2022
@z-wony z-wony deleted the fix-batch-eval branch October 3, 2022 14:48
z-wony commented Oct 3, 2022

> Alright, I just tried this branch, it seems this doesn't cause anything else to break. We can merge.
>
> @z-wony Do you know when this started to break the eval loop? Because originally this wasn't causing any issues. Like do you know which commit started to cause this issue? (Just out of curiosity)

@ycemsubakan Thank you for the review, and sorry, I have no idea about your question.
But some recipes use a separate batch_size configuration for evaluation and set it to 1 in their hparams (distinct from train_batch_size).
Thus, the issue is not raised there.

@mravanelli

mravanelli commented Oct 11, 2022 via email
