Research reports
Conference papers:
Fuzzy Semantic Labeling of Semi-structured Numerical Datasets
Year: 2018

Research areas
  • Computer science and information technology

Data
Description
SPARQL endpoints provide access to rich sources of data (e.g. knowledge graphs), which can be used to classify other less structured datasets (e.g. CSV files or HTML tables on the Web). We propose an approach to suggest types for the numerical columns of a collection of input files available as CSVs. Our approach is based on the application of the fuzzy c-means clustering technique to numerical data in the input files, using existing SPARQL endpoints to generate training datasets. Our approach has three major advantages: it works directly with live knowledge graphs, it does not require knowledge-graph profiling beforehand, and it avoids tedious and costly manual training to match values with types. We evaluate our approach against manually annotated datasets. The results show that the proposed approach classifies most of the types correctly for our test sets.
International
Yes
Conference name
21st International Conference on Knowledge Engineering and Knowledge Management
Type of participation
960
Conference venue
Nancy, France
Reviewers
Yes
ISBN or ISSN
978-3-030-03666-9
DOI
10.1007/978-3-030-03667-6
Conference start date
12 November 2018
Conference end date
16 November 2018
From page
19
To page
33
Proceedings title
Knowledge Engineering and Knowledge Management. 21st International Conference, EKAW 2018, Nancy, France, November 12-16, 2018, Proceedings
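The description above mentions applying fuzzy c-means clustering to numerical columns, with training data drawn from SPARQL endpoints. The following is only a minimal sketch of that general idea, not the authors' implementation. It assumes each candidate type is summarised by a single center built from the mean and standard deviation of its training values, and it applies only the standard fuzzy c-means membership formula with fixed centers (no iterative center updates). The type labels, the training values, and the function names are hypothetical; in practice the training values would come from a query against a live endpoint.

```python
import math
import statistics


def features(values):
    """Summarise a numeric column as a (mean, population std) point.

    The choice of these two features is an assumption for illustration.
    """
    return (statistics.fmean(values), statistics.pstdev(values))


def memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of `point` in each labelled center.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with fuzzifier m > 1.
    """
    dists = {label: math.dist(point, c) for label, c in centers.items()}
    # A point that coincides with one or more centers belongs fully to them.
    zero = [label for label, d in dists.items() if d == 0.0]
    if zero:
        return {label: (1.0 / len(zero) if label in zero else 0.0)
                for label in dists}
    exp = 2.0 / (m - 1.0)
    return {
        label: 1.0 / sum((d / other) ** exp for other in dists.values())
        for label, d in dists.items()
    }


def suggest_types(column, training, m=2.0):
    """Rank candidate types for `column` by descending membership degree."""
    centers = {label: features(vals) for label, vals in training.items()}
    scores = memberships(features(column), centers, m)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical training values per candidate type (made up for this sketch;
# they stand in for values retrieved from a SPARQL endpoint).
TRAINING = {
    "height_cm": [160.0, 172.0, 181.0, 168.0, 190.0],
    "weight_kg": [55.0, 70.0, 82.0, 64.0, 95.0],
    "age_years": [23.0, 35.0, 41.0, 29.0, 52.0],
}

if __name__ == "__main__":
    # An unlabeled numeric column, as it might appear in an input CSV file.
    for label, score in suggest_types([165.0, 178.0, 185.0, 171.0], TRAINING):
        print(f"{label}: {score:.3f}")
```

Because the membership degrees sum to 1 across the candidate types, the ranked list can be read as a set of graded suggestions rather than a single hard label.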

This activity belongs to research reports

Participants

Related research groups, departments, centres and R&D&I institutes
  • Creator: Research Group: Ontology Engineering Group
  • Department: Artificial Intelligence