UPM R&D&I Observatory

Research reports

Conference papers:

Year: 2012

Description
To obtain more human-like-sounding human-machine interfaces, we must first give them expressive capabilities, in the form of emotional and stylistic features, so that they can be closely adapted to the intended task. Replicating those features requires more than reproducing the prosodic information of fundamental frequency and speaking rhythm. The additional layer we propose is the modification of the glottal model, for which we use the GlottHMM parameters. This paper analyzes the viability of such an approach by verifying that the expressive nuances are captured by these features, obtaining recognition rates of 95% on speaking styles and 82% on emotional speech. We then evaluate the effect of speaker bias and recording environment on the source modeling in order to quantify possible problems when analyzing multi-speaker databases. Finally, we propose a separation of speaking styles for Spanish based on prosodic features and check its perceptual significance.
International	Yes
Conference name	InterSpeech 2012, 13th Annual Conference of the International Speech Communication Association
Participation type	960
Conference venue	Portland, Oregon
Peer reviewed	Yes
ISBN or ISSN	1990-9772
DOI
Conference start date	09/09/2012
Conference end date	13/09/2012
From page	1
To page	4
Proceedings title	InterSpeech 2012, 13th Annual Conference of the International Speech Communication Association

Participants

Author: Jaime Lorenzo Trueba UPM
Author: Roberto Barra Chicote UPM
Author: Tuomo Raitio Department of Signal Processing and Acoustics, Aalto University, Finland
Author: Nicolas Obin Sound Analysis and Synthesis, IRCAM, Paris, France
Author: Paavo Alku Department of Signal Processing and Acoustics, Aalto University, Finland
Author: Junichi Yamagishi CSTR, University of Edinburgh, United Kingdom
Author: Juan Manuel Montero Martinez UPM