
What are Precision, Recall and F1?

Let's say that you are predicting the daily weather conditions: either sunny or cloudy. In a country like the UK, it's cloudy most of the time, so a predictor that always claimed the weather would be cloudy might be accurate 80% of the time. Yet this prediction would carry very little useful information. Precision, recall and the F1 measure offer alternative ways to think about predictions. In this article we will explain these terms.

Precision and Recall

In the weather example, assume that we make predictions for 5 days. Let's think of sunny as the positive class and cloudy as the negative one. Now consider the set of predictions for the weather. Out of the days predicted as positive/sunny, how many are actually positive? This proportion is known as the precision.

$$ precision = \frac{\mbox{no. correctly predicted positive}}{\mbox{no. predicted positive}}.$$

                    Day 1    Day 2    Day 3    Day 4    Day 5
Actual weather:     Cloudy   Cloudy   Cloudy   Sunny    Cloudy
Predicted weather:  Cloudy   Cloudy   Sunny    Sunny    Cloudy
Correct:            Yes      Yes      No       Yes      Yes

In the example above, we have predicted that two days are sunny (days 3 and 4). There is a mistake on day 3, since the prediction is sunny but the weather was actually cloudy. Since 2 days are predicted as sunny and only 1 of them (day 4) actually was, the precision is 1/2 in this case.
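To make this concrete, here is a minimal Python sketch of the calculation; the function name and the label strings are our own choices for illustration, not from any particular library:

```python
def precision(actual, predicted, positive="Sunny"):
    """Fraction of positive (sunny) predictions that are actually positive."""
    # Actual labels for the days we predicted as positive.
    predicted_positive = [a for a, p in zip(actual, predicted) if p == positive]
    if not predicted_positive:
        return 0.0  # avoid division by zero when nothing is predicted positive
    correct = sum(1 for a in predicted_positive if a == positive)
    return correct / len(predicted_positive)

actual    = ["Cloudy", "Cloudy", "Cloudy", "Sunny", "Cloudy"]
predicted = ["Cloudy", "Cloudy", "Sunny", "Sunny", "Cloudy"]
print(precision(actual, predicted))  # 0.5
```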

In other words, a high precision means that when we predict sunny we are right most of the time. Note that it says nothing about the negative/cloudy instances, or about the sunny days we fail to predict. For instance, if in another example there are 100 sunny days and the predictor only ever predicts sunny once, correctly, the precision is 1 (a perfect score) even though the other 99 sunny days were missed. The recall captures the other side of the set of predictions: out of the days which are actually sunny/positive, how many were correctly predicted?

$$ recall = \frac{\mbox{no. correctly predicted positive}}{\mbox{no. actual positive}}.$$

Consider the example below:

                    Day 1    Day 2    Day 3    Day 4    Day 5
Actual weather:     Sunny    Cloudy   Cloudy   Sunny    Cloudy
Predicted weather:  Cloudy   Cloudy   Cloudy   Sunny    Cloudy
Correct:            No       Yes      Yes      Yes      Yes

In this example, there are 2 days that are actually sunny (days 1 and 4), but only 1 is predicted as such (day 4), making the recall 1/2. A high recall means that when a day is actually sunny, we are likely to have predicted it as sunny.
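Recall can be sketched in the same way, this time dividing by the number of days that are actually sunny (again a hand-rolled helper for illustration, not a library function):

```python
def recall(actual, predicted, positive="Sunny"):
    """Fraction of actually positive (sunny) days that were predicted as such."""
    # Predictions made on the days that are actually positive.
    actual_positive = [p for a, p in zip(actual, predicted) if a == positive]
    if not actual_positive:
        return 0.0  # no positive days at all
    correct = sum(1 for p in actual_positive if p == positive)
    return correct / len(actual_positive)

actual    = ["Sunny", "Cloudy", "Cloudy", "Sunny", "Cloudy"]
predicted = ["Cloudy", "Cloudy", "Cloudy", "Sunny", "Cloudy"]
print(recall(actual, predicted))  # 0.5
```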

F1

The F1 score is simply a way to combine the precision and recall into a single number. Rather than taking the arithmetic mean of precision and recall, we use the harmonic mean, which is given by:

$$
f1 = 2 \frac{precision \cdot recall}{precision + recall}.
$$

The higher the F1, the better the predictions. Unlike the arithmetic mean, the harmonic mean is dragged down sharply when either precision or recall is low, so a predictor cannot score well by excelling at only one of the two. Just to finish off, let's consider the lazy predictor which always claims it is going to be sunny:

                    Day 1    Day 2    Day 3    Day 4    Day 5
Actual weather:     Cloudy   Cloudy   Cloudy   Sunny    Cloudy
Predicted weather:  Sunny    Sunny    Sunny    Sunny    Sunny
Correct:            No       No       No       Yes      No

We can easily see that the precision is 1/5, since only 1 day is correctly predicted as sunny but 5 sunny predictions are made. The recall is 1, since all days that are actually sunny are predicted correctly. This makes the F1 score 2 × (0.2 × 1)/(0.2 + 1) ≈ 0.33: despite the perfect recall, the poor precision keeps the score low.
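We can verify this with a short, self-contained sketch (pure Python for transparency; in practice one would normally reach for scikit-learn's precision_score, recall_score and f1_score):

```python
actual    = ["Cloudy", "Cloudy", "Cloudy", "Sunny", "Cloudy"]
predicted = ["Sunny"] * 5  # the lazy predictor: always sunny

# True positives: days predicted sunny that really were sunny.
tp = sum(1 for a, p in zip(actual, predicted) if a == "Sunny" and p == "Sunny")
precision = tp / sum(1 for p in predicted if p == "Sunny")  # 1/5 = 0.2
recall    = tp / sum(1 for a in actual if a == "Sunny")     # 1/1 = 1.0
f1 = 2 * precision * recall / (precision + recall)
print(f1)  # 0.3333...
```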

For more information (and a slightly more technical description) there is, as always, the Wikipedia page.