Observatorio de I+D+i UPM


Research reports

Journal articles:

Topic identification techniques applied to dynamic language model adaptation for automatic speech recognition

Year: 2015

Research areas

Electronic and communications technology,
Computer science and information technology

Data

Description
In this paper we present an efficient speech recognition approach for multitopic speech that combines information retrieval techniques with topic-based language modeling. Information retrieval techniques, such as topic identification by means of Latent Semantic Analysis, are used to identify the topic of the recognized transcription of an audio segment. According to the confidence in the identified topics, we propose a dynamic language model (LM) adaptation to improve recognition performance in a two-stage automatic speech recognition (ASR) system. The adaptation scheme is a linear interpolation between a general background LM and a topic-dependent LM. We have studied different approaches for generating the topic-dependent LM and for determining its interpolation weight with respect to the background model. In one approach we use the topic labels given in the training dataset to obtain the topic models. In the other we separate the training documents into topic clusters with the k-means algorithm. To strengthen the adaptation models we also use topic identification techniques to group documents without topic labels from the EUROPARL text database, increasing the amount of data available for training topic-specific language models. We evaluate the proposed system on the Spanish partition of the European Parliament Plenary Sessions (EPPS) database, using a subset with 67 labeled topics. For topic identification, our experiments show a relative reduction in topic identification error of 44.94% compared with the baseline method, the Generalized Vector Model with a classic TF-IDF weighting scheme. For dynamic LM adaptation applied to ASR, we achieve a relative reduction in WER of 13.52% over a single background language model.
International	Yes
ISI JCR	Yes
Journal title	Expert Systems With Applications
ISSN	0957-4174
JCR impact factor	2.24
Impact information	JCR data for 2013
Volume	42
DOI	10.1016/j.eswa.2014.07.035
Issue number
First page	101
Last page	112
Month	January
Ranking	Journal Rank in Category 12/81
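
The description above says the adaptation scheme is a linear interpolation between a background LM and a topic-dependent LM. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes toy unigram models stored as dictionaries, and the function name `interpolate_lm`, the weight name `lam`, and the probabilities are hypothetical. The article works with full language models and derives the weight from topic identification confidence, which this sketch does not reproduce.

```python
from typing import Dict


def interpolate_lm(
    background: Dict[str, float],
    topic: Dict[str, float],
    lam: float,
) -> Dict[str, float]:
    """Return P(w) = (1 - lam) * P_background(w) + lam * P_topic(w).

    Hypothetical helper: both inputs are toy unigram distributions,
    and lam is the interpolation weight given to the topic model.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    vocab = set(background) | set(topic)
    # Words missing from one model contribute probability 0 from that model.
    return {
        w: (1.0 - lam) * background.get(w, 0.0) + lam * topic.get(w, 0.0)
        for w in vocab
    }


# Toy distributions (illustrative values, not taken from the article).
background_lm = {"the": 0.5, "parliament": 0.1, "budget": 0.1, "weather": 0.3}
topic_lm = {"the": 0.4, "parliament": 0.3, "budget": 0.3}

adapted_lm = interpolate_lm(background_lm, topic_lm, lam=0.5)
```

Because both inputs sum to one and the weights sum to one, the interpolated distribution also sums to one. A larger `lam` shifts probability mass toward words favoured by the topic model.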

This activity belongs to the research reports

Participants

Author: Julian David Echeverry Correa UPM
Author: Javier Ferreiros Lopez UPM
Author: Alejandro Coucheiro Limeres UPM
Author: Ricardo de Cordoba Herralde UPM
Author: Juan Manuel Montero Martinez UPM

Related research groups, departments, centres and R&D&I institutes

Creator: Research group: Grupo de Tecnología del Habla
Department: Ingeniería Electrónica