Observatorio de I+D+i UPM

Research reports
Descripción de parámetros glóticos basados en el modelado de los pliegues vocales para la detección de patología de la voz (Description of glottal parameters based on vocal fold modeling for the detection of voice pathology)
Research areas
  • Remote sensing
  • Image processing
Voice pathologies have become a social problem of serious concern in recent times. Pollution in cities, smoking, the use of air conditioners, and other factors contribute to it. The problem is especially relevant for professionals who use their voice frequently: announcers, singers, teachers, call center operators, etc. Techniques that can draw conclusions from a voice sample recorded with a microphone are therefore of particular interest for diagnosis, as opposed to invasive techniques that involve exploration with laryngoscopes, fiberscopes, or video endoscopes. The latter are much more uncomfortable for patients because they require introducing instruments through the throat, in surgical procedures. The former techniques have come a long way in a relatively short period of time. Regarding the diagnosis of diseases, over the last fifteen years we have moved from working primarily with parameters extracted from the voice signal (in both the time and frequency domains) and with scales drawn from subjective assessment by experts to doing the same with estimates of the glottal source. The importance of the glottal source lies broadly in the fact that this signal is linked to the state of the speaker's laryngeal function. Unlike the voice signal (phonated speech), the glottal source, if suitably reconstructed using adaptive lattices, may be less influenced by the vocal tract. As is well known, the vocal tract is related to the articulation of the spoken message, and its influence complicates voice pathology detection; in the reconstructed glottal source, this influence has been almost completely removed. The estimates of the glottal source have been obtained through inverse filtering techniques developed by our research group.
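To make the idea of inverse filtering concrete, the following is a minimal sketch only. It does not reproduce the adaptive-lattice method of the research group; it swaps in a generic textbook linear-prediction (LPC) inverse filter, solved with the Levinson-Durbin recursion, and applies it to a synthetic signal. The function names, the model order, and the test signal are all assumptions made for illustration.

```python
# Generic LPC inverse filtering sketch (NOT the thesis's adaptive-lattice method).
# All names and the synthetic signal below are illustrative assumptions.

def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal (autocorrelation method)."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return (a, err): prediction-error filter A(z) = 1 + a[1] z^-1 + ... and its error."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err                      # reflection coefficient
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + k_m * prev[m - k]
        a[m] = k_m
        err *= 1.0 - k_m * k_m
    return a, err

def inverse_filter(x, a):
    """Apply A(z) to x; the output is the prediction residual."""
    p = len(a) - 1
    return [sum(a[k] * x[n - k] for k in range(p + 1) if n - k >= 0)
            for n in range(len(x))]

def synth_voiced(n_samples=400, period=50):
    """Synthetic stand-in for voiced speech: impulse train through a 2-pole resonator."""
    x = []
    for n in range(n_samples):
        e = 1.0 if n % period == 0 else 0.0
        x1 = x[n - 1] if n >= 1 else 0.0
        x2 = x[n - 2] if n >= 2 else 0.0
        x.append(e + 1.6 * x1 - 0.9 * x2)
    return x

signal = synth_voiced()
coeffs, _ = levinson_durbin(autocorrelation(signal, 2), 2)
residual = inverse_filter(signal, coeffs)
signal_energy = sum(v * v for v in signal)
residual_energy = sum(v * v for v in residual)
print(coeffs, residual_energy / signal_energy)
```

In this toy setting the resonator plays the role of the vocal tract and the residual is a rough stand-in for the excitation. Real glottal source estimation involves further steps that this sketch leaves out.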
We have also looked more deeply into the nature of the glottal signal, dissecting it and relating it to the biomechanical parameters of the vocal folds, and obtaining estimates of quantities such as the mass, losses, and elasticity of the cover and body of the vocal fold, among others. The components of the glottal source also give rise to the so-called biometric parameters, which are related to the time-frequency pattern of the signal and constitute a biometric signature of the individual. We will also work with temporal parameters related to the different stages observed in the glottal signal during a phonation cycle. Finally, we will take into consideration classical perturbation and energy parameters. In short, we now have a considerable number of glottal parameters on a multidimensional statistical basis, designed to discriminate people with pathological or dysphonic voices from those who show no pathology. This thesis addresses several issues. First, a careful analysis of these new parameters is required, so we will offer a complete statistical description of them. We will also discuss the distribution of the parameters, considering criteria such as their statistical normality. We will take special care in analyzing the difference between the distributions from healthy subjects and those from pathological subjects. To reach these goals we will use different statistical techniques, such as the generation of descriptive statistics and diagrams, tests for normality, and hypothesis testing, both parametric and nonparametric. The hypothesis tests examine the difference between the groups of healthy subjects and the groups of people with a voice-related disease.
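As an illustration of the nonparametric comparison between healthy and pathological groups mentioned above, the following minimal sketch computes a two-sided Mann-Whitney U test with the normal approximation. The choice of this particular test, the function names, and the sample values are assumptions for illustration; the abstract does not state which tests or data the thesis uses. The variance omits the tie correction, so the p-value is only approximate for small or heavily tied samples.

```python
# Sketch of a nonparametric two-group comparison (Mann-Whitney U, normal approximation).
# The test choice, names, and data are illustrative assumptions, not the thesis's procedure.
from statistics import NormalDist

def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1              # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U, two-sided p-value) using the normal approximation, no tie correction."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    sd_u = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mean_u) / sd_u                # z <= 0 because u is the smaller statistic
    return u, min(1.0, 2 * NormalDist().cdf(z))

# Made-up values of one hypothetical glottal parameter for each group.
healthy = [0.31, 0.35, 0.28, 0.40, 0.33, 0.37, 0.30, 0.36]
pathological = [0.62, 0.71, 0.55, 0.80, 0.67, 0.59, 0.74, 0.66]

u_stat, p_value = mann_whitney_u(healthy, pathological)
print(u_stat, p_value)
```

A rank-based test of this kind is a natural fallback when a normality test rejects the normal model for a parameter; when normality holds, a parametric test such as a t-test would be used instead.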
Thesis type
Sobresaliente cum laude (outstanding, cum laude)
This activity belongs to research reports
  • Author: Carlos Alfredo Lázaro Carrascosa (Universidad Rey Juan Carlos I)
  • Advisor: Pedro Gomez Vilda (UPM)
Related research groups, departments, centers, and R&D&I institutes
  • Creator: Research group: Informática Aplicada al Procesado de Señal e Imagen
  • R&D&I center or institute: Centro de Tecnología Biomédica (CTB)
  • Department: Arquitectura y Tecnología de Sistemas Informáticos