Memorias de investigación
Communications at congresses:
Phone-gram units in RNN-LM for language identification with vocabulary reduction based on neural embeddings
Year: 2016

Research Areas
  • Electrical, Electronic and Automatic Engineering (eil)

Information
Abstract
In this paper we present our results on using Recurrent Neural Network Language Model (RNNLM) scores trained on different phone-gram orders and using different phonetic ASR recognizers. In order to avoid data-sparseness problems and to reduce the vocabulary of all possible n-gram combinations, a K-means clustering procedure was performed using phone vector embeddings as a pre-processing step. We provide further details on the vocabulary-reduction efforts for 2-grams and 3-grams. Additional experiments to optimize the number of classes, batch size, hidden neurons and state unfolding are also presented. We have worked with the KALAKA-3 database for the plenty-closed condition. Thanks to our clustering technique and the combination of high-level phone-grams, our phonotactic system performs more than 10% better than the unigram-based RNNLM system. The obtained RNNLM scores are also calibrated and fused with scores from an acoustic i-vector system and a traditional PPRLM system. This fusion yields additional improvements, showing that the RNNLM scores provide complementary information to the LID system.
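The vocabulary-reduction step described in the abstract can be sketched as follows: phone-gram vectors are built from phone embeddings and clustered with K-means, so each n-gram is replaced by its cluster ID and the RNNLM vocabulary shrinks from all possible combinations to K classes. This is a minimal illustrative sketch, not the paper's implementation; the phone set, embedding size and K are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy phone inventory and random 8-dimensional phone embeddings
# (in the paper these would come from a trained neural model).
phones = ["a", "e", "i", "o", "u", "p", "t", "k", "s", "n"]
emb = {p: rng.normal(size=8) for p in phones}

# Represent each 2-gram by concatenating its two phone embeddings.
bigrams = [p + q for p in phones for q in phones]
X = np.array([np.concatenate([emb[b[0]], emb[b[1]]]) for b in bigrams])

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns a cluster index per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    return labels

K = 16                                       # target number of classes (illustrative)
labels = kmeans(X, K)
# Reduced vocabulary: every 2-gram maps to one of K cluster IDs.
vocab = {bg: int(c) for bg, c in zip(bigrams, labels)}
print(len(bigrams), "bigrams reduced to", len(set(labels)), "classes")
```

The same mapping extends to 3-grams by concatenating three phone embeddings; the RNNLM is then trained over cluster IDs instead of raw phone-grams.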
International
Yes
Congress
IberSpeech 2016
Place
Lisboa - Portugal
Reviewers
Yes
ISBN/ISSN
978-3-319-49169-1
Start Date
23/11/2016
End Date
25/11/2016
From page
109
To page
118
IberSpeech 2016 - Proceedings
Participants

Research Group, Departaments and Institutes related
  • Creator: R&D&i Center or Institute: Centro de I+D+i en Procesado de la Información y Telecomunicaciones
  • Department: Ingeniería Electrónica