A hybrid quasi-harmonic/CELP wideband speech coding scheme for unit selection TTS synthesis

Lee, Chang-Heon; Rosec, Olivier; Stylianou, Yannis

doi:10.21437/Interspeech.2011-649

This paper proposes a new wideband speech coding model for efficiently compressing the acoustic inventories of concatenative unit selection text-to-speech (TTS) synthesis systems. To meet the requirements of a TTS synthesizer, such as partial segment decoding and random access, a non-predictive scheme is adopted that combines the adaptive Quasi-Harmonic Model (aQHM) with the innovative codebook (ICB) model. aQHM plays the major role in modeling the pitch harmonic components, and ICB compensates, in a closed-loop manner, for the modeling error of aQHM, which is especially important in transient or unvoiced regions. To further improve coding efficiency, a hybrid coding framework is also proposed. Results on a large French speech database show that the proposed algorithm provides speech quality similar to that of the high-quality AMR-WB codec while also supporting random access.
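The two-stage idea in the abstract (a harmonic model followed by a codebook that compensates for its modeling error, with every frame coded independently of its neighbours) can be illustrated with a minimal sketch. This is not the authors' algorithm: a plain stationary harmonic model fitted by least squares stands in for aQHM, a generic codebook passed in by the caller stands in for the ICB, the search minimizes plain squared error without perceptual weighting, and no parameter quantization is performed. All function names and parameters (`harmonic_basis`, `encode_frame`, `decode_frame`, `n_harm`, and so on) are hypothetical.

```python
import numpy as np

def harmonic_basis(n, f0, fs, n_harm):
    """Cosine and sine columns at the first n_harm multiples of f0 (assumed model)."""
    t = np.arange(n) / fs
    k = np.arange(1, n_harm + 1)
    phase = 2.0 * np.pi * f0 * np.outer(t, k)
    return np.hstack([np.cos(phase), np.sin(phase)])

def encode_frame(frame, f0, fs, n_harm, codebook):
    """Encode one frame on its own, using no state from other frames."""
    basis = harmonic_basis(len(frame), f0, fs, n_harm)
    # Stage 1: least-squares fit of the harmonic part (stand-in for aQHM).
    amps, *_ = np.linalg.lstsq(basis, frame, rcond=None)
    residual = frame - basis @ amps
    # Stage 2: pick the codebook vector that, with its optimal gain,
    # removes the most energy from the harmonic modeling error.
    corr = codebook @ residual
    energy = np.sum(codebook ** 2, axis=1)
    idx = int(np.argmax(corr ** 2 / energy))
    gain = corr[idx] / energy[idx]
    return amps, idx, gain

def decode_frame(amps, idx, gain, n, f0, fs, n_harm, codebook):
    """Rebuild one frame from its own parameters only (random access)."""
    basis = harmonic_basis(n, f0, fs, n_harm)
    return basis @ amps + gain * codebook[idx]
```

Because `decode_frame` needs nothing beyond the parameters of the frame being decoded, any frame of a stored unit can be reconstructed without decoding the frames before it, which is the property a predictive codec does not offer.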
