Description
|
|
---|---|
Calibration is often overlooked in machine-learning practice, even when accurate estimates of predicted probabilities, and not only discrimination between classes, are critical for decision-making. One reason is the lack of readily available open-source software packages that can easily calculate calibration metrics. To provide one such tool, we have developed a custom modification of the Weka data mining software that implements the calculation of Hosmer-Lemeshow risk groups and the Pearson chi-square statistic comparing estimated and observed frequencies for binary problems. We provide calibration performance estimates for logistic regression (LR), BayesNet, Naïve Bayes, artificial neural network (ANN), support vector machine (SVM), k-nearest neighbors (KNN), decision tree and Repeated Incremental Pruning to Produce Error Reduction (RIPPER) models on six different datasets. Our experiments show that SVMs with RBF kernels exhibit the best results in terms of calibration, while decision trees, RIPPER and KNN are highly unlikely to produce well-calibrated models. | |
International
|
Yes |
Conference name
|
4th International Conference on Integrated Information |
Type of participation
|
960 |
Conference venue
|
Madrid |
Peer reviewed
|
Yes |
ISBN or ISSN
|
9780735412835 |
DOI
|
|
Conference start date
|
05/09/2014 |
Conference end date
|
08/09/2014 |
From page
|
128 |
To page
|
133 |
Proceedings title
|
Volume 1644: International Conference on Integrated Information (IC-ININFO 2014) Proceedings of the 4th International Conference on Integrated Information |
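
The Hosmer-Lemeshow comparison mentioned in the description can be illustrated with a minimal sketch. This is not the authors' Weka modification: the function name `hosmer_lemeshow`, its parameters, and the choice of near-equal-sized groups formed by sorting on predicted probability are assumptions made here for illustration. It sums, over the risk groups, the squared difference between observed and expected positives divided by the binomial variance of the group, which is algebraically the Pearson chi-square over the positive and negative cells of each group.

```python
from typing import List, Sequence, Tuple


def hosmer_lemeshow(
    probs: Sequence[float], labels: Sequence[int], n_groups: int = 10
) -> Tuple[float, int, List[Tuple[int, float, int]]]:
    """Return (chi-square statistic, degrees of freedom, per-group table).

    probs  : predicted probability of the positive class for each instance
    labels : observed class (1 = positive, 0 = negative)
    Each table row is (group size, expected positives, observed positives).
    Hypothetical helper for illustration only.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if n_groups < 3 or len(probs) < n_groups:
        raise ValueError("need at least 3 groups and one instance per group")

    # Sort instances by predicted risk, lowest first.
    pairs = sorted(zip(probs, labels), key=lambda pair: pair[0])
    n = len(pairs)

    statistic = 0.0
    table: List[Tuple[int, float, int]] = []
    for g in range(n_groups):
        # Assumption: near-equal-sized risk groups (deciles when n_groups=10).
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        size = len(group)
        expected = sum(p for p, _ in group)   # expected positives
        observed = sum(y for _, y in group)   # observed positives
        table.append((size, expected, observed))

        mean_p = expected / size
        variance = size * mean_p * (1.0 - mean_p)
        # Groups whose mean probability is exactly 0 or 1 have zero
        # variance and are skipped to avoid division by zero.
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance

    # The statistic is conventionally compared against a chi-square
    # distribution with n_groups - 2 degrees of freedom.
    return statistic, n_groups - 2, table
```

A large statistic relative to the chi-square reference distribution suggests that the predicted probabilities disagree with the observed frequencies, that is, poor calibration. The p-value is left out so that the sketch depends only on the standard library.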