Multi-criteria analysis involving Pareto-optimal misclassification tradeoffs on imbalanced datasets

Abstract

In binary classification, minimizing the false positive rate and the false negative rate are conflicting goals: it is impossible to optimize both simultaneously. This challenge is even more pronounced on imbalanced datasets, where an incorrect choice of the relative relevance of each objective can lead the model to ignore, or poorly learn, the minority class. This work takes into account the existing conflict among the per-class learning losses and uses a deterministic multi-objective optimization method, called MONISE, to create a set of solutions with diverse misclassification tradeoffs among the classes. Since accuracy is no longer a suitable criterion for imbalanced datasets, we had to resort to multiple criteria to report performance: each classifier, whether proposed or a competitor, was selected and reported using the same metrics. We used F1, kappa, and g-mean for a general evaluation of performance, and Fβ scores (F1/16, F1/4, F4, and F16) to emulate a decision maker whose preference shifts from precision to recall; all comparisons were made using a Friedman test with the Finner post-hoc test. However, when we take multiple metrics into account without any prior knowledge, it may become impossible to pinpoint a single best method, since the evaluation criteria may themselves be in conflict. To address this, we again resorted to a Friedman test, now with a non-dominated ranking. With this multi-criteria analysis, we conclude that explicitly considering multiple objectives in the optimization can lead to promising results.
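The abstract's use of Fβ scores to emulate a shifting preference from precision to recall can be illustrated with the standard Fβ definition (a weighted harmonic mean of precision and recall); this is a minimal sketch using illustrative numbers, not the paper's experimental code:

```python
def fbeta(precision, recall, beta):
    """Weighted harmonic mean of precision and recall.

    beta < 1 weights precision more heavily; beta > 1 weights recall.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical classifier with high precision but low recall: small betas
# reward it (score near precision), large betas penalize it (score near recall).
p, r = 0.9, 0.5
for beta in (1 / 16, 1 / 4, 1, 4, 16):
    print(f"F(beta={beta:g}) = {fbeta(p, r, beta):.3f}")
```

For this example the score decreases monotonically from roughly 0.90 (near precision) at β = 1/16 to roughly 0.50 (near recall) at β = 16, which is exactly the precision-to-recall preference sweep the abstract describes.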
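The non-dominated ranking mentioned at the end of the abstract rests on Pareto dominance: a solution is kept only if no other solution is at least as good on every objective and strictly better on at least one. A small sketch of that filter, with hypothetical (false positive rate, false negative rate) pairs where both objectives are minimized:

```python
def dominates(a, b):
    """True if a dominates b: no worse in every objective (minimized)
    and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the non-dominated points."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (FPR, FNR) pairs for five classifiers; the last two are
# each dominated by an earlier pair and are filtered out.
candidates = [(0.10, 0.50), (0.20, 0.40), (0.30, 0.30),
              (0.25, 0.45), (0.15, 0.60)]
print(pareto_front(candidates))
```

The surviving set is the Pareto front of misclassification tradeoffs; ranking methods by membership in (or distance from) such fronts avoids committing to a single scalar metric when the evaluation criteria conflict.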

Publication
2020 International Joint Conference on Neural Networks (IJCNN)