ISCA Archive Interspeech 2011

A hybrid quasi-harmonic/CELP wideband speech coding scheme for unit selection TTS synthesis

Chang-Heon Lee, Olivier Rosec, Yannis Stylianou

This paper proposes a new wideband speech coding model for efficiently compressing the acoustic inventories of concatenative unit selection text-to-speech (TTS) synthesis systems. To meet the requirements of a TTS synthesizer, such as partial segment decoding and random access, a non-predictive scheme is adopted that combines the adaptive Quasi-Harmonic Model (aQHM) with an innovative codebook (ICB) model. aQHM plays a major role in modeling the pitch harmonic components, and ICB compensates, in a closed-loop manner, for the modeling error of aQHM, which is especially important in transient or unvoiced regions. To further improve coding efficiency, a hybrid coding framework is also proposed. Results on a large French speech database show that the proposed algorithm provides speech quality similar to that of the high-quality AMR-WB codec while also supporting random access.
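The random access requirement mentioned in the abstract can be illustrated with a minimal sketch. It is not the aQHM/ICB coder from the paper: `encode_frame` is a hypothetical toy quantizer standing in for any non-predictive frame coder, and the byte layout and offset table are assumptions made only for this example. The point it shows is that when every frame is coded from its own samples alone, any frame range can be decoded without touching earlier frames.

```python
import struct

def encode_frame(samples):
    """Toy stand-in for a non-predictive coder (assumption, not aQHM/ICB):
    the frame is quantized using only its own samples, so no decoder
    state carries over from one frame to the next."""
    peak = max((abs(s) for s in samples), default=0.0) or 1.0
    codes = bytes(round(s / peak * 127) + 128 for s in samples)
    # Header: float32 peak, uint16 sample count (6 bytes, little-endian).
    return struct.pack("<fH", peak, len(samples)) + codes

def decode_frame(blob):
    """Invert encode_frame using only the bytes of this one frame."""
    peak, n = struct.unpack_from("<fH", blob, 0)
    return [(c - 128) / 127 * peak for c in blob[6:6 + n]]

def pack_inventory(frames):
    """Concatenate coded frames and record the byte offset of each one."""
    offsets, payload = [], bytearray()
    for frame in frames:
        offsets.append(len(payload))
        payload += encode_frame(frame)
    offsets.append(len(payload))  # end marker for the last frame
    return offsets, bytes(payload)

def decode_range(offsets, payload, first, last):
    """Decode frames first..last (inclusive) without reading earlier frames."""
    return [decode_frame(payload[offsets[i]:offsets[i + 1]])
            for i in range(first, last + 1)]

if __name__ == "__main__":
    frames = [[0.0, 0.5, -0.5], [1.0, -1.0], [0.25, 0.25, 0.25, 0.25]]
    offsets, payload = pack_inventory(frames)
    # Jump straight to frames 1 and 2, as a unit selection engine might
    # when it needs only part of a stored segment.
    print(decode_range(offsets, payload, 1, 2))
```

A predictive coder would not allow this directly, because decoding a frame would depend on the state left by the frames before it.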
