Observatorio de I+D+i UPM


Research reports

Conference papers:

Language Identification based on n-gram Frequency Ranking

Year: 2007

Research areas

Artificial intelligence,
Electronics industry

Data

Description
We present a novel approach to language identification based on a text categorization technique, namely n-gram frequency ranking. We use a parallel phone recognizer, the same as in PPRLM, but instead of a language model we create a ranking of the most frequent n-grams, keeping only a fraction of them. We then compute the distance between the input sentence ranking and each language ranking, based on the difference in relative positions for each n-gram. The objective of this ranking is to model reliably a longer span than PPRLM, namely 5-grams instead of trigrams, because the ranking needs less training data for a reliable estimation. We demonstrate that this approach outperforms PPRLM (6% relative improvement) due to the inclusion of 4-grams and 5-grams in the classifier. We present two alternatives: ranking with absolute values for the number of occurrences and ranking with discriminative values (11% relative improvement).
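The ranking-and-distance idea in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single phone recognizer whose output is a list of phone labels, uses a plain "out-of-place" rank difference as the distance, and charges a fixed penalty for n-grams missing from a language ranking. The function names, the `top_k` cutoff, and the penalty are all hypothetical choices made for illustration, and the discriminative variant is not covered.

```python
from collections import Counter


def ngram_counts(phones, max_n=5):
    """Count every n-gram of order 1..max_n in a sequence of phone labels."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(phones) - n + 1):
            counts[tuple(phones[i:i + n])] += 1
    return counts


def build_ranking(counts, top_k=400):
    """Map the top_k most frequent n-grams to their rank (0 = most frequent)."""
    # Sort by descending count, then by the n-gram itself so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {gram: rank for rank, (gram, _) in enumerate(ordered[:top_k])}


def ranking_distance(sentence_rank, language_rank, top_k=400):
    """Average rank difference between a sentence ranking and a language ranking.

    Assumption: an n-gram absent from the language ranking costs top_k.
    """
    if not sentence_rank:
        return 0.0
    total = 0
    for gram, rank in sentence_rank.items():
        if gram in language_rank:
            total += abs(rank - language_rank[gram])
        else:
            total += top_k
    return total / len(sentence_rank)


def identify(phones, language_ranks, max_n=5, top_k=400):
    """Return the language whose ranking is closest to the input's ranking."""
    sentence_rank = build_ranking(ngram_counts(phones, max_n), top_k)
    return min(
        language_ranks,
        key=lambda lang: ranking_distance(sentence_rank, language_ranks[lang], top_k),
    )


# Toy usage: the "phone" strings below are made-up placeholders, not real data.
training = {
    "lang_a": "a b a b a b c a b".split(),
    "lang_b": "x y x y z x y x y".split(),
}
language_ranks = {
    lang: build_ranking(ngram_counts(seq)) for lang, seq in training.items()
}
guess = identify("a b a b c".split(), language_ranks)
```

In a parallel-recognizer setup like the one described, one such ranking would presumably be kept per recognizer and language, with the distances combined across recognizers. How they are combined is not stated in the description, so the sketch leaves it out.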
International	Yes
Conference name	8th Annual Conference of the International Speech Communication Association (Interspeech 2007)
Participation type	960
Conference venue	Antwerp, Belgium
Peer reviewed	Yes
ISBN or ISSN	ISSN 1990-9772
DOI
Conference start date	27/08/2007
Conference end date	31/08/2007
From page
To page
Proceedings title

This activity belongs to the research reports

Participants

Author: Luis Fernando D'Haro Enriquez UPM
Author: Fernando Fernandez Martinez UPM
Author: Javier Ferreiros Lopez UPM
Author: Javier Macias Guarasa UPM
Author: Ricardo de Cordoba Herralde UPM

Related R&D&I research groups, departments, centers, and institutes

Creator: Research group: Grupo de Tecnología del Habla (Speech Technology Group)
Department: Electronic Engineering