Description
|
|
---|---|
Subspace clustering is a research field that has been studied intensively over the last two decades. The objective of subspace clustering is to find all lower-dimensional clusters hidden in subspaces of high-dimensional data. Although the majority of existing subspace clustering algorithms adopt heuristic pruning techniques to reduce the search space, the time complexity of such algorithms remains exponential with regard to the highest dimensionality of the hidden subspace clusters. Even with the help of parallelism, these techniques require extremely high computational time in practice. In this paper we propose a novel subspace clustering technique that reduces the exponential time complexity to quadratic via approximation. We also provide a parallel implementation of the proposed algorithm on top of Apache Spark to further accelerate our approach on large data sets. Preliminary experimental results show that our algorithm performs much better, especially in terms of scalability with regard to the dimensionality of the hidden clusters. | |
International
|
Yes |
Conference name
|
ADBIS 2016 (Short Papers and Workshops): New Trends in Databases and Information Systems, BigDap, DCSA, DC |
Type of participation
|
960 |
Conference location
|
Prague, Czech Republic |
Peer reviewed
|
Yes |
ISBN or ISSN
|
978-3-319-44065-1 |
DOI
|
10.1007/978-3-319-44066-8_16 |
Conference start date
|
28/08/2016 |
Conference end date
|
31/08/2016 |
From page
|
147 |
To page
|
154 |
Proceedings title
|
Proceedings ADBIS 2016: New Trends in Databases and Information Systems. Communications in Computer and Information Science 637, Springer 2016. |