Please use this identifier to cite or link to this item: http://idr.iimranchi.ac.in:8080/xmlui/handle/123456789/1919
Title: Quantifying data imbalance using exponential f-divergence
Authors: Sarkar, Sobhan
Pramanik, Anima
Keywords: Data Imbalance
Imbalance ratio
Imbalance degree
Likelihood ratio
f-Divergence
IIM Ranchi
Issue Date: 8-Jun-2023
Publisher: 2023 Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT & NCON)
Citation: Sobhan Sarkar and Anima Pramanik (March 22-25, 2023). Quantifying data imbalance using Exponential f-Divergence. In Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT & NCON), Phuket, Thailand, 2023, pp. 403-408.
Abstract: In this study, a new measure of imbalance is introduced to compute the extent of imbalance in multi-class data. For binary datasets, the Imbalance Ratio (IR) can be used to measure the amount of imbalance. However, for multi-class datasets, IR takes into account only the frequencies of the most frequent majority class and the least frequent minority class, so it fails to capture any properties of the intermediate classes. The Imbalance Degree (ID) was proposed to overcome the issues of IR by also considering information from the intermediate classes. Nevertheless, it requires choosing a distance metric, and that choice strongly influences the results and can lead to unfavorable outcomes. It also assumes that the number of minority classes determines the extent of the imbalance, without considering their individual contributions, which is not correct. Thus, ID cannot be relied on as a metric when this assumption is violated. Furthermore, another metric, the Likelihood Ratio Imbalance Degree (LRID), was proposed to make the measure independent of the number of minority classes in the data. However, it treats the imbalance as directional and allows both positive and negative values for the individual contributions of classes. In this study, we obtain a more reliable procedure for measuring the extent of imbalance using statistical divergence from balanced class distributions.
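The limitation of IR described in the abstract can be illustrated with a minimal Python sketch. The function names and the two toy label sets are hypothetical and do not come from the paper. The second function uses the Kullback-Leibler divergence from the uniform (balanced) distribution purely as a stand-in for "a statistical divergence from the balanced class distribution"; it is an assumption for illustration and is not the exponential f-divergence measure the authors propose.

```python
from collections import Counter
import math


def imbalance_ratio(labels):
    """IR: count of the most frequent class divided by count of the least frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def kl_from_balanced(labels):
    """KL divergence of the empirical class distribution from the uniform one.

    Stand-in divergence for illustration only (assumption); it is NOT the
    exponential f-divergence measure from the paper.
    """
    counts = Counter(labels)
    n = sum(counts.values())
    k = len(counts)
    # KL(p || u) = sum_i p_i * log(p_i / (1/k)) = sum_i p_i * log(p_i * k)
    return sum((c / n) * math.log((c / n) * k) for c in counts.values())


# Two hypothetical 4-class datasets with identical extreme class counts
# (largest = 100, smallest = 10) but different intermediate classes.
mostly_balanced = ["A"] * 100 + ["B"] * 100 + ["C"] * 100 + ["D"] * 10
mostly_skewed = ["A"] * 100 + ["B"] * 10 + ["C"] * 10 + ["D"] * 10

# IR sees only the extremes, so it cannot tell the two datasets apart.
print(imbalance_ratio(mostly_balanced), imbalance_ratio(mostly_skewed))

# A divergence from the balanced distribution uses every class proportion,
# so it assigns the two datasets different values.
print(kl_from_balanced(mostly_balanced), kl_from_balanced(mostly_skewed))
```

Both datasets have an IR of 10, yet the divergence-based value is larger for the second dataset, where three of the four classes are rare. This is the kind of intermediate-class information that IR discards.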
URI: https://doi.org/10.1109/ECTIDAMTNCON57770.2023.10139691
http://idr.iimranchi.ac.in:8080/xmlui/handle/123456789/1919
Appears in Collections: Conference Presentations / Proceedings
