Description
|
|
---|---|
This report proposes a system for the automatic classification of acoustic scenes that uses the stereophonic signal captured by a binaural microphone. The system uses one channel to compute the spectral distribution of energy across auditory-relevant frequency bands. It also derives descriptors of the envelope modulation spectrum (EMS) by applying the discrete cosine transform to the logarithm of the EMS. The two-channel binaural recordings are used to represent the spatial distribution of acoustic sources by means of position-pitch maps, which are then parametrized using the two-dimensional Fourier transform. These three types of features (energy spectrum, EMS, and position-pitch maps) are used as inputs to a standard multilayer perceptron with two hidden layers. | |
International
|
Yes |
Organization
|
DCASE2018 Challenge |
Location
|
Surrey (United Kingdom) |
Pages
|
|
Reference/URL
|
http://dcase.community/documents/challenge2018/technical_reports/DCASE2018_Fraile_84.pdf |
Publication type
|
Technical Report |
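
The two transform steps named in the description (the discrete cosine transform of the log EMS, and the two-dimensional Fourier transform of a position-pitch map) can be illustrated with a minimal NumPy sketch. This is not the report's implementation. The function names, the envelope estimate (rectification plus frame averaging), the envelope rate, and the number of coefficients kept are all assumptions made for illustration, and the position-pitch map is treated as an already computed 2-D array.

```python
import numpy as np

def dct_ii(x):
    """Unnormalized DCT-II of a 1-D array, computed directly from its definition."""
    n = len(x)
    k = np.arange(n)[:, None]  # output (frequency) index
    m = np.arange(n)[None, :]  # input (sample) index
    basis = np.cos(np.pi * (m + 0.5) * k / n)
    return basis @ x

def ems_descriptors(signal, sample_rate, n_coeffs=8, env_rate=100):
    """Hypothetical EMS descriptors: DCT of the log envelope modulation spectrum."""
    # Envelope estimate (assumption): full-wave rectification, then averaging
    # over non-overlapping frames so the envelope is sampled at env_rate Hz.
    hop = sample_rate // env_rate
    n_frames = len(signal) // hop
    envelope = np.abs(signal[:n_frames * hop]).reshape(n_frames, hop).mean(axis=1)
    # Modulation spectrum: power spectrum of the mean-removed envelope.
    ems = np.abs(np.fft.rfft(envelope - envelope.mean())) ** 2
    log_ems = np.log(ems + 1e-12)  # small offset avoids log(0)
    return dct_ii(log_ems)[:n_coeffs]

def map_descriptors(position_pitch_map, n_rows=4, n_cols=4):
    """Low-order 2-D Fourier magnitudes of a map (assumed parametrization)."""
    spectrum = np.abs(np.fft.fft2(position_pitch_map))
    return spectrum[:n_rows, :n_cols].ravel()

# Usage with synthetic data: amplitude-modulated noise and a random stand-in map.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(2 * sr) / sr
sig = (1 + 0.5 * np.sin(2 * np.pi * 4 * t)) * rng.standard_normal(t.size)
features = np.concatenate([
    ems_descriptors(sig, sr),
    map_descriptors(rng.random((32, 64))),
])
print(features.shape)  # (24,)
```

In a full system, a vector like `features` would be concatenated with the band-energy features and passed to the classifier. The band-energy computation, the position-pitch map estimation, and the multilayer perceptron are outside the scope of this sketch.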