Observatorio de I+D+i UPM

Research Reports
Conference papers:
Architecture for Text Normalization using Statistical Machine Translation techniques
Year: 2012
Research Areas
  • Electronic and communications technology
  • Electrical, electronic and automatic engineering (eil)
Information
Abstract
This paper proposes an architecture, based on statistical machine translation, for developing the text normalization module of a text-to-speech conversion system. The main goal is a language-independent text normalization module that is data-driven and flexible enough to handle all the situations that arise in this task. The proposed architecture is composed of three main modules: a tokenizer module that splits the input text into a token graph (tokenization), a phrase-based translation module (token translation), and a post-processing module that removes some tokens. This paper presents initial experiments on numbers and abbreviations. The very good results obtained validate the proposed architecture.
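The three-stage flow named in the abstract (tokenization, token translation, post-processing) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It rests on the following assumptions: the token graph is simplified to a flat token list, a greedy longest-match lookup in a tiny hand-written table stands in for the statistically trained phrase-based translation module, and the names `PHRASE_TABLE`, `DROP`, `tokenize`, `translate`, `post_process` and `normalize` are all hypothetical.

```python
import re

# Hypothetical toy phrase table (assumption): source token sequences -> spoken tokens.
# A real phrase-based system would learn such pairs, with probabilities, from data.
DROP = "<drop>"  # hypothetical marker for tokens the post-processor removes
PHRASE_TABLE = {
    ("Dr", "."): ["doctor"],
    ("2", "1"): ["twenty", "one"],
    ("3",): ["three"],
    (".",): [DROP],
}
MAX_PHRASE_LEN = max(len(key) for key in PHRASE_TABLE)


def tokenize(text):
    """Split text into single digits, letter runs and single punctuation marks.

    Simplification: returns a flat list rather than a token graph.
    """
    return re.findall(r"\d|[^\W\d_]+|[^\w\s]", text)


def translate(tokens):
    """Greedy longest-match lookup; tokens with no table entry pass through."""
    output = []
    i = 0
    while i < len(tokens):
        longest = min(MAX_PHRASE_LEN, len(tokens) - i)
        for n in range(longest, 0, -1):
            phrase = tuple(tokens[i:i + n])
            if phrase in PHRASE_TABLE:
                output.extend(PHRASE_TABLE[phrase])
                i += n
                break
        else:
            # No phrase starts here: copy the token unchanged.
            output.append(tokens[i])
            i += 1
    return output


def post_process(tokens):
    """Remove the marker tokens produced by the translation step."""
    return [tok for tok in tokens if tok != DROP]


def normalize(text):
    """Run the three stages in sequence and join the result with spaces."""
    return " ".join(post_process(translate(tokenize(text))))


print(normalize("Dr. Smith has 21 cats."))
```

Under these assumptions, the abbreviation "Dr." and the digit sequence "2 1" are rewritten through the table, and the sentence-final period is mapped to the marker and then removed in post-processing.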
International
Yes
Congress
IberSPEECH 2012
960
Place
Madrid, Spain
Reviewers
Yes
ISBN/ISSN
84-616-1535-2
Start Date
21/11/2012
End Date
22/11/2012
From page
204
To page
213
VII Jornadas en Tecnología del Habla and III Iberian SLTech Workshop
Participants
  • Author: Veronica Lopez Ludeña (UPM)
  • Author: Ruben San Segundo Hernandez (UPM)
  • Author: Juan Manuel Montero Martinez (UPM)
  • Author: Roberto Barra Chicote (UPM)
  • Author: Jaime Lorenzo Trueba (UPM)
Related Research Groups, Departments and Institutes
  • Creator: Research Group: Grupo de Tecnología del Habla
  • Department: Ingeniería Electrónica