Research reports
Conference papers:
A pruning algorithm for mining maximal length frequent itemsets
Year: 2016

Research areas
  • Engineering,
  • Computer science and information technology,
  • Electrical, electronic and automatic control engineering

Data
Description
Association rule mining is one of the most popular exploratory data mining techniques for discovering interesting and previously unknown correlations in datasets. The main goal of association rule algorithms is to find the most frequent sets of variables and then the correlations between the frequent items. Current algorithms for association rule mining are computationally expensive, especially for very large datasets. Moreover, the large number of discovered frequent itemsets hinders the application of these algorithms to many real-world datasets. Longer frequent itemsets are usually more interesting, and finding the set of maximal length itemsets is useful for many applications. We introduce a novel algorithm, called Width-Sort, that efficiently discovers the maximal length frequent itemsets. In Width-Sort, the dataset is partitioned by transaction length in order to exploit the additional information hidden in those lengths. Lemmas are developed to estimate an upper bound for the maximal length of the frequent itemsets, as well as to prune the items that cannot be part of the maximal length frequent itemsets. The efficiency of the algorithm is tested using both simulated and real-world datasets.
International
Yes
Conference name
9th International Conference of the ERCIM. Computational and Methodological Statistics (CMStatistics 2016)
Type of participation
730
Conference venue
Sevilla (Spain)
Reviewers
Yes
ISBN or ISSN
978-9963-2227-1-1
DOI
Conference start date
9 December 2016
Conference end date
11 December 2016
From page
157
To page
157
Proceedings title
9th International Conference of the ERCIM (European Research Consortium for Informatics and Mathematics) Working Group on Computational and Methodological Statistics (CMStatistics 2016)
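
The description mentions an upper bound on the maximal length of frequent itemsets and the pruning of items that cannot belong to them. The record does not state the authors' lemmas, so the following is only a minimal sketch of a standard observation in the same spirit, not the Width-Sort algorithm itself: an itemset of length k can reach a support count of min_count only if at least min_count transactions have length k or more. The function names and the min_count parameter are hypothetical and chosen for illustration.

```python
from collections import Counter


def max_length_upper_bound(transactions, min_count):
    """Upper bound on the length of any itemset with support >= min_count.

    Assumption (not taken from the source): support is an absolute count.
    An itemset of length k is contained only in transactions of length >= k,
    so the min_count-th longest transaction bounds k from above.
    """
    lengths = sorted((len(set(t)) for t in transactions), reverse=True)
    if min_count < 1 or len(lengths) < min_count:
        return 0
    return lengths[min_count - 1]


def prune_items(transactions, min_count, k):
    """Items that could still belong to a frequent itemset of length k.

    An item survives only if it occurs in at least min_count transactions
    whose length is >= k; every other item can be discarded for that k.
    """
    counts = Counter()
    for t in transactions:
        items = set(t)
        if len(items) >= k:
            counts.update(items)
    return {item for item, c in counts.items() if c >= min_count}


if __name__ == "__main__":
    data = [["a", "b", "c", "d"], ["a", "b", "c"], ["a", "b"], ["d"]]
    bound = max_length_upper_bound(data, min_count=2)
    print(bound, sorted(prune_items(data, min_count=2, k=bound)))
```

The bound is not necessarily attained: if no frequent itemset of length k exists, a search built on this sketch would have to lower k and prune again with the smaller value.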

This activity belongs to research reports

Participants

Related research groups, departments, centres and R&D&I institutes
  • Creator: Research group: Estadística computacional y Modelado estocástico
  • Department: Ingeniería de Organización, Administración de Empresas y Estadística