added emotion and asr #11

Closed
wants to merge 2 commits into from

salah-zaiem
Collaborator

No description provided.

@mravanelli mravanelli self-requested a review November 27, 2023 22:51
@mravanelli mravanelli added the enhancement New feature or request label Nov 27, 2023
@mravanelli
Contributor

Thank you @salah-zaiem for your valuable contribution! I did a first code inspection and I have a few comments:

  1. SpeechBrain 1.0 Compliance:
    Before extending the code to other tasks and models, I suggest making it compliant with SpeechBrain 1.0. A LibriSpeech CTC recipe compliant with SpeechBrain 1.0 can be found here. The conversion is relatively easy, and both @Adel-Moumen and I are available to assist if needed. The changes include:

    • speechbrain.lobes.augment.TimeDomainSpecAugment does not exist in SpeechBrain 1.0 (refer to the linked example for the new augmentation).
    • Consider removing the pyctcdecode option (from pyctcdecode import build_ctcdecoder), since we now have our own CTC beam searcher, especially if no LM is in use.

    Please test the code with the latest unstable-v0.6 branch in the SpeechBrain repository (soon to be merged into dev).

  2. What is the purpose of ssl_train.py and ssl.yaml?

  3. I would rename discrete_train.py to train.py

  4. In benchmarks/DASB/LibriSpeech/hparams/encodec_12.yaml, I propose eliminating the manual definition of csv_folder. (As in the MP3S benchmark, consider storing the CSV files in !ref <output_folder> for consistency.)

  5. It appears that the data preparation script is missing for both datasets.

  6. After addressing the above points and cleaning up the code, consider adding the two probing heads as implemented in MP3S.
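
To illustrate point 4, here is a minimal sketch of how the hparams file might reference its CSV manifests relative to the output folder instead of a hand-written csv_folder. All key names, file names, the seed value, and the results path below are assumptions for illustration and are not taken from the PR:

```yaml
# Hypothetical fragment of an hparams file; every key and path here is illustrative.
seed: 1986
output_folder: !ref results/encodec_12/<seed>
save_folder: !ref <output_folder>/save

# The CSV manifests are resolved under <output_folder>,
# so no separately maintained csv_folder entry is required.
train_csv: !ref <output_folder>/train.csv
valid_csv: !ref <output_folder>/dev-clean.csv
test_csv: !ref <output_folder>/test-clean.csv
```

With this layout, changing output_folder moves the manifests together with the rest of the experiment outputs, which is the consistency the comment asks for.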

@salah-zaiem salah-zaiem closed this by deleting the head repository Feb 9, 2024