Monday, December 7, 2015

ML Notes : Area Under Curve (AUC) vs Overall Accuracy

http://stats.stackexchange.com/a/69944/73383

The area under the curve (AUC) is equal to the probability that a
classifier will rank a randomly chosen positive instance higher than a
randomly chosen negative instance. It measures the classifier's skill
in ranking a set of patterns according to the degree to which they
belong to the positive class, but without actually assigning patterns
to classes.
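
The probabilistic reading above can be checked directly by counting
pairs. The following is a minimal sketch, not an optimized
implementation: the function name pairwise_auc and the toy scores and
labels are made up for illustration, and ties are counted as half a
win, which is a common convention rather than something stated in the
answer.

```python
from itertools import product


def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties are counted as half a win (an assumed convention).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")

    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores and true labels (1 = positive class).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

auc = pairwise_auc(scores, labels)

# Only the ordering matters: rescaling the scores leaves the AUC unchanged.
auc_rescaled = pairwise_auc([10 * s for s in scores], labels)
```

No threshold appears anywhere in this computation, which is why the
AUC says nothing about how patterns are finally assigned to classes.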

The overall accuracy depends on the ability of the classifier to rank
patterns, but also on its ability to select a threshold in the
ranking: patterns above the threshold are assigned to the positive
class and patterns below it to the negative class.
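
To make the role of the threshold concrete, here is a small sketch
that computes overall accuracy for one fixed set of scores at several
thresholds. The function name accuracy_at, the toy data, and the
choice to treat a score strictly above the threshold as positive are
assumptions for illustration only.

```python
def accuracy_at(scores, labels, threshold):
    """Overall accuracy when scores strictly above `threshold` are called positive."""
    predictions = [1 if s > threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical classifier scores and true labels (1 = positive class).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

# Same scores, same ranking, three different thresholds.
acc_mid = accuracy_at(scores, labels, 0.5)    # one negative is misclassified
acc_high = accuracy_at(scores, labels, 0.95)  # everything is called negative
acc_low = accuracy_at(scores, labels, 0.1)    # everything is called positive
```

The ranking is identical in all three cases; only the cut point
changes, and the accuracy changes with it.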

Thus the classifier with the higher AUROC statistic (the AUC of the
ROC curve), all other things being equal, is likely to also have a
higher overall accuracy, since a good ranking of patterns (which AUROC
measures) benefits both AUROC and overall accuracy. However, if one
classifier ranks patterns well but selects the threshold badly, it can
have a high AUROC but a poor overall accuracy.
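
A toy illustration of that last case, assuming made-up scores that
rank every positive above every negative but all fall below 0.5. The
helper names auc and accuracy and the two thresholds are hypothetical,
chosen only to show the effect.

```python
def auc(scores, labels):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold):
    """Overall accuracy when scores strictly above `threshold` are called positive."""
    hits = sum((s > threshold) == (y == 1) for s, y in zip(scores, labels))
    return hits / len(labels)


# Perfectly ranked scores that all sit below 0.5 (made-up data).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.45, 0.40, 0.35, 0.30, 0.20, 0.15, 0.10, 0.05]

ranking_quality = auc(scores, labels)               # perfect ranking
badly_thresholded = accuracy(scores, labels, 0.5)   # every pattern called negative
well_thresholded = accuracy(scores, labels, 0.25)   # cut between the two groups
```

With the threshold at 0.5 the classifier does no better than always
predicting the negative class on this balanced set, even though its
ranking is perfect; moving the threshold to 0.25 recovers every label.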
