Integration with Merlin #52

Open
Mark-Leisten-ajalaco opened this issue Apr 12, 2019 · 1 comment

Comments

@Mark-Leisten-ajalaco

Has anyone looked at integrating the WaveNet vocoder with Merlin (https://github.com/CSTR-Edinburgh/merlin/)? Merlin is an open-source TTS system for acoustic and duration modelling, which uses Ossian or Festival as a front-end. By default it uses the WORLD vocoder and therefore extracts WORLD vocoder features, so an integration of this project with Merlin should be possible. I'm interested to hear whether anyone has tried this and can offer some guidance.

@tuanad121

tuanad121 commented Jun 10, 2019

That's interesting. I think you can replace the WORLD synthesis step with the WaveNet-based waveform generation.
In Merlin's synthesis script (https://github.com/CSTR-Edinburgh/merlin/blob/master/misc/scripts/vocoder/world/synthesis.py), the synthesis part runs from line 120 to the end. The three input files are *.f0, *.sp, and *.bapd, and the data is stored as doubles. The *.bapd file holds band aperiodicity (also called coarse aperiodicity). I'm not sure whether our WaveNet-based synthesis uses coarse aperiodicity or full-band aperiodicity (full-band aperiodicity has fft_size / 2 + 1 dimensions).
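
As a starting point, here is a minimal sketch for inspecting those three files before feeding them to a neural vocoder. It assumes the files are headerless, native-endian doubles with one f0 value per frame, which matches the description above but should be checked against your own Merlin output. The function names (`read_doubles`, `load_world_features`, `aperiodicity_kind`) are made up for this example and are not part of Merlin or this repository.

```python
from array import array


def read_doubles(path):
    """Read a headerless file of native-endian float64 values (assumed layout)."""
    values = array("d")
    with open(path, "rb") as f:
        values.frombytes(f.read())
    return values


def load_world_features(basename):
    """Load <basename>.f0/.sp/.bapd and infer the per-frame dimensions.

    Assumes one f0 value per frame, so the frame count is len(f0).
    """
    f0 = read_doubles(basename + ".f0")
    sp = read_doubles(basename + ".sp")
    bap = read_doubles(basename + ".bapd")

    n_frames = len(f0)
    if n_frames == 0 or len(sp) % n_frames or len(bap) % n_frames:
        raise ValueError("feature files do not share a common frame count")

    return {
        "f0": f0,
        "sp": sp,
        "bap": bap,
        "n_frames": n_frames,
        "sp_dim": len(sp) // n_frames,    # fft_size / 2 + 1 for a full spectrum
        "bap_dim": len(bap) // n_frames,  # much smaller for band aperiodicity
    }


def aperiodicity_kind(sp_dim, bap_dim):
    """Full-band aperiodicity has the same width as the spectral envelope."""
    return "full-band" if bap_dim == sp_dim else "coarse"
```

Calling `load_world_features("path/to/utterance")` and then `aperiodicity_kind(feats["sp_dim"], feats["bap_dim"])` should tell you which aperiodicity representation a given set of files contains, which is the open question above.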
