Research reports
Journal articles:
Regularized greedy column subset selection.
Year: 2019

Research areas
  • Computer science and information technology

Data
Description
The Column Subset Selection Problem is a hard combinatorial optimization problem that provides a natural framework for unsupervised feature selection, and there exist efficient algorithms that provide good approximations. The drawback of the problem formulation is that it incorporates no form of regularization, and is therefore very sensitive to noise when presented with scarce data. In this paper we propose a regularized formulation of this problem, and derive a correct greedy algorithm that is similar in efficiency to existing greedy methods for the unregularized problem. We study its adequacy for feature selection and propose suitable formulations. Additionally, we derive a lower bound for the error of the proposed problems. Through various numerical experiments on real and synthetic data, we demonstrate the significantly increased robustness and stability of our method, as well as the improved conditioning of its output, all while remaining efficient for practical use.
International
Yes
ISI JCR
Yes
Journal title
Information Sciences
ISSN
0020-0255
JCR impact factor
5.91
Impact information
Volume
Volume 486
DOI
10.1016/j.ins.2019.02.039
Journal issue
From page
393
To page
418
Month
JUNE
Ranking

This activity belongs to research reports

Participants

Related research groups, departments, centers, and R&D&I institutes
  • Creator: Research group: Grupo de Modelización Matemática y Biocomputación
  • Department: Matemática Aplicada a Las Tecnologías de la Información y Las Comunicaciones
  • Department: Sistemas Informáticos