ISCA Archive Interspeech 2009
ISCA Archive Interspeech 2009

A study on soft margin estimation of linear regression parameters for speaker adaptation

Shigeki Matsuda, Yu Tsao, Jinyu Li, Satoshi Nakamura, Chin-Hui Lee

We formulate a framework for soft margin estimation-based linear regression (SMELR) and apply it to supervised speaker adaptation. Enhanced separation capability and increased discriminative ability are two key properties in margin-based discriminative training. For the adaptation process to be able to flexibly utilize any amount of data, we also propose a novel interpolation scheme to linearly combine the speaker independent (SI) and speaker adaptive SMELR (SMELR/SA) models. The two proposed SMELR algorithms were evaluated on a Japanese large vocabulary continuous speech recognition task. Both the SMELR and interpolated SI+SMELR/SA techniques showed improved speech adaptation performance in comparison with the well-known maximum likelihood linear regression (MLLR) method. We also found that the interpolation framework works even more effectively than SMELR when the amount of adaptation data is relatively small.

  翻译: