Field | Value |
---|---|
Description | This paper describes the text normalization module of a fully trainable text-to-speech conversion system and its application to number transcription. The main goal is to build a language-independent text normalization module that is based on data rather than on expert rules. The paper proposes a general architecture based on statistical machine translation techniques. The architecture comprises three main modules: a tokenizer that splits the input text into a token graph, a phrase-based translation module that translates the tokens, and a post-processing module that removes some tokens. The architecture has been evaluated on number transcription, an important part of the text normalization problem, in three languages: English, Spanish and Romanian. |
International | Yes |
Conference name | SSW8 2013 - 8th ISCA Speech Synthesis Workshop |
Participation type | 960 |
Conference venue | Barcelona (Spain) |
Peer reviewed | Yes |
ISBN or ISSN | 0000-0000 |
DOI | |
Conference start date | 31/08/2013 |
Conference end date | 02/09/2013 |
From page | 65 |
To page | 69 |
Proceedings title | Proceedings SSW8 2013 - 8th ISCA Speech Synthesis Workshop |