Description
|
|
---|---|
This paper describes the UPM-UC3M system for the Albayzín 2010 evaluation on audio segmentation. The task consists of segmenting a broadcast news audio document into four classes: clean speech, music, speech with background noise, and speech with background music. The UPM-UC3M system is based on Hidden Markov Models (HMMs), with a 3-state HMM for each acoustic class. The number of states and the number of Gaussians per state were tuned for this evaluation. System development focused mainly on feature selection. Two architectures were tested: a one-step system, and a hierarchical system in which different features are used to segment the different audio classes. For both systems, we considered long-term statistics of MFCC (Mel Frequency Cepstral Coefficients), spectral entropy and CHROMA coefficients. The best configuration of the one-step system obtained a 25.3% average error rate and an 18.7% diarization error (using the NIST tool); the hierarchical system obtained a 23.9% average error rate and a 17.9% diarization error. | |
International
|
No |
Conference name
|
VI Jornadas de Tecnología del Habla FALA |
Participation type
|
960 |
Conference location
|
Vigo |
Reviewers
|
Yes |
ISBN or ISSN
|
978-84-8158-510-0 |
DOI
|
|
Conference start date
|
10 November 2010 |
Conference end date
|
12 November 2010 |
From page
|
421 |
To page
|
425 |
Proceedings title
|
VI Jornadas de Tecnología del Habla FALA |
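
The abstract names spectral entropy and long-term feature statistics without giving formulas. As a rough illustration only, the sketch below computes the Shannon entropy of a normalized power spectrum and then the mean and standard deviation of a per-frame feature over sliding windows. The function names, the choice of mean and standard deviation as the statistics, and the window length are all assumptions made for this example; they are not taken from the paper.

```python
import math
from statistics import fmean, pstdev


def spectral_entropy(power_spectrum):
    """Shannon entropy (in bits) of a power spectrum normalized to sum to 1."""
    total = sum(power_spectrum)
    if total <= 0:
        return 0.0  # silent frame: define the entropy as zero
    probs = [p / total for p in power_spectrum if p > 0]
    return -sum(p * math.log2(p) for p in probs)


def long_term_stats(values, window):
    """Mean and standard deviation of a per-frame feature over sliding windows.

    The window length is a hypothetical parameter chosen by the caller.
    """
    stats = []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        stats.append((fmean(chunk), pstdev(chunk)))
    return stats


# A flat (noise-like) spectrum has high entropy; a single peak has none.
flat = spectral_entropy([1.0, 1.0, 1.0, 1.0])
peaky = spectral_entropy([0.0, 8.0, 0.0, 0.0])

# Long-term statistics of a toy per-frame feature track, window of 2 frames.
track_stats = long_term_stats([1.0, 3.0, 1.0, 3.0], window=2)
```

In a segmentation system of the kind described, statistics like these would typically be stacked into the observation vectors fed to the per-class HMMs; how the authors actually combined them is not stated in this record.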