Commit
DOC: Updates to Macro vs micro-averaging in plot_roc.py (#29845)
Co-authored-by: Xiao Yuan <yuanx749@gmail.com>
Co-authored-by: Lucy Liu <jliu176@gmail.com>
3 people authored and jeremiedbb committed Jan 9, 2025
1 parent ea8a725 commit 1f43fd2
Showing 1 changed file with 20 additions and 4 deletions.
24 changes: 20 additions & 4 deletions examples/model_selection/plot_roc.py
@@ -218,6 +218,12 @@
 # Obtaining the macro-average requires computing the metric independently for
 # each class and then taking the average over them, hence treating all classes
 # equally a priori. We first aggregate the true/false positive rates per class:
+#
+# :math:`TPR=\frac{1}{C}\sum_{c}\frac{TP_c}{TP_c + FN_c}` ;
+#
+# :math:`FPR=\frac{1}{C}\sum_{c}\frac{FP_c}{FP_c + TN_c}` .
+#
+# where `C` is the total number of classes.

 for i in range(n_classes):
     fpr[i], tpr[i], _ = roc_curve(y_onehot_test[:, i], y_score[:, i])
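
The two formulas added in this hunk can be checked by hand on a small case. The following is a minimal sketch, not the code from `plot_roc.py`: it assumes hard class predictions at a single operating point (the example file instead averages whole ROC curves obtained from `roc_curve`), and the helper name `macro_average_rates` and the toy labels are hypothetical.

```python
def macro_average_rates(y_true, y_pred, classes):
    """Return (macro TPR, macro FPR) using one-vs-rest counts per class.

    Assumes every class has at least one positive and one negative sample,
    so that neither denominator is zero.
    """
    tprs, fprs = [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        tprs.append(tp / (tp + fn))  # TP_c / (TP_c + FN_c)
        fprs.append(fp / (fp + tn))  # FP_c / (FP_c + TN_c)
    n_classes = len(classes)  # C in the formulas
    return sum(tprs) / n_classes, sum(fprs) / n_classes


# Hypothetical toy data with three classes.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 2, 2, 2]

macro_tpr, macro_fpr = macro_average_rates(y_true, y_pred, classes=[0, 1, 2])
print(macro_tpr, macro_fpr)
```

Here the per-class TPRs are 0.75, 0.5 and 1.0, so the macro-averaged TPR is 0.75; the per-class FPRs are 0, 1/6 and 1/6, so the macro-averaged FPR is 1/9.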
@@ -441,7 +447,17 @@
 # global performance of a classifier can still be summarized via a given
 # averaging strategy.
 #
-# Micro-averaged OvR ROC is dominated by the more frequent class, since the
-# counts are pooled. The macro-averaged alternative better reflects the
-# statistics of the less frequent classes, and then is more appropriate when
-# performance on all the classes is deemed equally important.
+# When dealing with imbalanced datasets, choosing the appropriate metric based on
+# the business context or problem you are addressing is crucial.
+# It is also essential to select an appropriate averaging method (micro vs. macro)
+# depending on the desired outcome:
+#
+# - Micro-averaging aggregates metrics across all instances, treating each
+# individual instance equally, regardless of its class. This approach is useful
+# when evaluating overall performance, but note that it can be dominated by
+# the majority class in imbalanced datasets.
+#
+# - Macro-averaging calculates metrics for each class independently and then
+# averages them, giving equal weight to each class. This is particularly useful
+# when you want under-represented classes to be considered as important as highly
+# populated classes.
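
To illustrate the difference the new text describes, the sketch below compares a micro-averaged and a macro-averaged true positive rate on an imbalanced two-class toy problem. It is an assumption-laden illustration rather than part of the commit: it uses hard predictions instead of ROC curves, and the function names (`per_class_tp_fn`, `micro_tpr`, `macro_tpr`) and the data are hypothetical.

```python
def per_class_tp_fn(y_true, y_pred, classes):
    """One-vs-rest (TP, FN) counts for each class."""
    counts = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        counts.append((tp, fn))
    return counts


def micro_tpr(y_true, y_pred, classes):
    """Pool the counts over all classes, then compute a single rate."""
    counts = per_class_tp_fn(y_true, y_pred, classes)
    total_tp = sum(tp for tp, _ in counts)
    total_fn = sum(fn for _, fn in counts)
    return total_tp / (total_tp + total_fn)


def macro_tpr(y_true, y_pred, classes):
    """Compute one rate per class, then take the unweighted mean."""
    counts = per_class_tp_fn(y_true, y_pred, classes)
    return sum(tp / (tp + fn) for tp, fn in counts) / len(classes)


# Hypothetical imbalanced data: 90 samples of class 0, 10 of class 1.
# The classifier gets every majority sample right but only 2 of the
# 10 minority samples.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 90 + [0] * 8 + [1] * 2

micro = micro_tpr(y_true, y_pred, classes=[0, 1])
macro = macro_tpr(y_true, y_pred, classes=[0, 1])
print(f"micro TPR = {micro:.2f}, macro TPR = {macro:.2f}")
```

The pooled (micro) rate is 92/100 = 0.92 because the majority class supplies most of the counts, while the macro rate is (1.0 + 0.2)/2 = 0.6, which exposes the poor performance on the minority class.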
