Research reports
Conference papers:
UPM-UC3M system for music and speech segmentation
Year: 2010

Research areas
  • Electronic and communications technology
  • Electrical, electronic and automatic engineering

Data
Description
This paper describes the UPM-UC3M system for the Albayzín 2010 evaluation on Audio Segmentation. The evaluation task consists of segmenting a broadcast news audio document into clean speech, music, speech with background noise and speech with background music. The UPM-UC3M system is based on Hidden Markov Models (HMMs), with a 3-state HMM for every acoustic class. The number of states and the number of Gaussians per state have been tuned for this evaluation. System development has focused mainly on feature selection. In addition, two different architectures have been tested: the first is a one-step system, whereas the second is a hierarchical system in which different features are used for segmenting the different audio classes. For both systems, we have considered long-term statistics of MFCC (Mel Frequency Cepstral Coefficients), spectral entropy and CHROMA coefficients. The best configuration of the one-step system obtained a 25.3% average error rate and an 18.7% diarization error (using the NIST tool); the hierarchical system obtained a 23.9% average error rate and a 17.9% diarization error.
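As an illustration of two of the feature types named in the description, the following is a minimal sketch of spectral entropy for a single frame and of long-term statistics (mean and standard deviation) computed over a sliding window of per-frame values. It is not the authors' implementation: the function names, the use of base-2 entropy, and the window and hop parameters are assumptions made for this example only.

```python
import math
from statistics import mean, pstdev


def spectral_entropy(power_spectrum):
    """Shannon entropy (in bits) of a power spectrum normalised to sum to 1.

    Hypothetical helper: the paper does not specify the exact definition used.
    """
    total = sum(power_spectrum)
    if total <= 0:
        return 0.0
    # Zero-power bins contribute nothing to the entropy, so they are skipped.
    probs = [p / total for p in power_spectrum if p > 0]
    return -sum(p * math.log2(p) for p in probs)


def long_term_stats(values, window, hop):
    """Mean and population standard deviation of a per-frame feature
    over sliding windows of `window` frames, advanced by `hop` frames.

    The window and hop sizes are assumptions, not values from the paper.
    """
    stats = []
    for start in range(0, len(values) - window + 1, hop):
        chunk = values[start:start + window]
        stats.append((mean(chunk), pstdev(chunk)))
    return stats
```

A flat spectrum gives the maximum entropy for its number of bins, while a spectrum concentrated in one bin gives zero, which is why this kind of measure can help separate noise-like from tonal content.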
International
No
Conference name
VI Jornadas de Tecnología del Habla FALA
Participation type
960
Conference venue
Vigo
Peer reviewed
Yes
ISBN or ISSN
978-84-8158-510-0
DOI
Conference start date
10/11/2010
Conference end date
12/11/2010
From page
421
To page
425
Proceedings title
VI Jornadas de Tecnología del Habla FALA

This activity belongs to research reports

Participants

Related research groups, departments, centres and R&D&I institutes
  • Creator: Research group: Grupo de Tecnología del Habla
  • Department: Ingeniería Electrónica