added emotion and asr #11

salah-zaiem · 2023-11-21T18:12:06Z

No description provided.

mravanelli · 2023-11-29T17:59:11Z

Thank you @salah-zaiem for your valuable contribution! I did a first code inspection and I have a few comments:

SpeechBrain 1.0 Compliance:
Before extending the code to other tasks and models, I suggest making it compliant with SpeechBrain 1.0. A LibriSpeech CTC recipe compliant with SpeechBrain 1.0 can be found here. The conversion is relatively easy, and both @Adel-Moumen and I are available to assist if needed. The changes include:
- speechbrain.lobes.augment.TimeDomainSpecAugment no longer exists (refer to the linked example for the new augmentation).
- Consider removing the from pyctcdecode import build_ctcdecoder option, since we now have our own CTC beam searcher, especially if no LM is in use.
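
To illustrate the pyctcdecode point: when no LM is involved, CTC beam search needs nothing beyond the frame-level log-probabilities. The following is a minimal plain-Python sketch of CTC prefix beam search, not SpeechBrain's beam searcher. The function name `ctc_prefix_beam_search`, its parameters, and the toy inputs are hypothetical and only meant to show the idea.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) over a few scalars."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_prefix_beam_search(log_probs, beam_size=4, blank=0):
    """Hypothetical helper: LM-free CTC prefix beam search.

    log_probs: list of T frames, each a list of V log-probabilities.
    Returns the most probable label sequence as a list of token ids.
    """
    # Each prefix maps to (log P of paths ending in blank,
    #                      log P of paths ending in a non-blank).
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for v, lp in enumerate(frame):
                if v == blank:
                    # A blank keeps the prefix unchanged.
                    b, nb = next_beams[prefix]
                    next_beams[prefix] = (logsumexp(b, p_b + lp, p_nb + lp), nb)
                    continue
                new_prefix = prefix + (v,)
                b, nb = next_beams[new_prefix]
                if prefix and prefix[-1] == v:
                    # Repeated symbol: it extends the prefix only after a blank...
                    next_beams[new_prefix] = (b, logsumexp(nb, p_b + lp))
                    # ...otherwise it collapses into the same prefix.
                    sb, snb = next_beams[prefix]
                    next_beams[prefix] = (sb, logsumexp(snb, p_nb + lp))
                else:
                    next_beams[new_prefix] = (b, logsumexp(nb, p_b + lp, p_nb + lp))
        # Keep only the beam_size most probable prefixes.
        ranked = sorted(
            next_beams.items(), key=lambda kv: logsumexp(*kv[1]), reverse=True
        )
        beams = dict(ranked[:beam_size])
    best_prefix, _ = max(beams.items(), key=lambda kv: logsumexp(*kv[1]))
    return list(best_prefix)


# Toy example: vocabulary is [blank, 1, 2], four frames.
frames = [
    [math.log(p) for p in row]
    for row in ([0.1, 0.8, 0.1], [0.1, 0.8, 0.1], [0.8, 0.1, 0.1], [0.1, 0.1, 0.8])
]
hypothesis = ctc_prefix_beam_search(frames, beam_size=4)
```

In the recipe itself the built-in SpeechBrain searcher would replace this sketch; the point is only that dropping the external decoder loses nothing when no LM is used.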
Please test the code with the latest unstable-v0.6 branch in the SpeechBrain repository (soon to be merged into dev).
What is the purpose of ssl_train.py and ssl.yaml?
I would rename discrete_train.py to train.py.
In benchmarks/DASB/LibriSpeech/hparams/encodec_12.yaml, I propose eliminating the manual definition of csv_folder (similar to the MP3S benchmark, consider storing the CSV files in !ref <output_folder> for consistency).
It appears that the data preparation script is missing for both datasets.
After addressing the above points and cleaning up the code, consider adding the two probing heads as implemented in MP3S.
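
For the csv_folder point above, one possible shape is to derive every CSV path from the output folder, so that no separate folder has to be configured by hand. This is only a sketch: the helper name `csv_paths`, the split names, and the example folder are assumptions, not taken from the PR.

```python
import os


def csv_paths(output_folder, splits=("train", "valid", "test")):
    """Hypothetical helper: place each split's CSV directly under output_folder."""
    return {split: os.path.join(output_folder, f"{split}.csv") for split in splits}


# Example with an assumed output folder name.
paths = csv_paths(os.path.join("results", "encodec_12"))
```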

added emotion and asr

1ee67be

mravanelli self-requested a review November 27, 2023 22:51

mravanelli assigned salah-zaiem Nov 27, 2023

mravanelli added the enhancement New feature or request label Nov 27, 2023

fix linters

272ff8e

salah-zaiem closed this by deleting the head repository Feb 9, 2024
