# What are Precision, Recall and F1?

Let's say that you are making a prediction for the daily weather conditions: either sunny or cloudy. In a country like the UK, it's cloudy most of the time. A predictor that always claims the weather will be cloudy might therefore be accurate, say, 80% of the time. Yet this prediction carries very little useful information. Alternative ways to think about predictions are *precision*, *recall* and the *F1 measure*. In this article we will explain these terms.

## Precision and Recall

In the weather example, assume that we make predictions for 5 days. Let's think of *sunny* as being the *positive* class, and *cloudy* as being the *negative* one. Now consider the set of predictions for the weather. Out of the days *predicted* as positive/sunny, how many are actually positive? This is known as precision.

$$ precision = \frac{\mbox{no. correctly predicted positive}}{\mbox{no. predicted positive}}.$$

| | Day 1 | Day 2 | Day 3 | Day 4 | Day 5 |
|---|---|---|---|---|---|
| Actual weather | Cloudy | Cloudy | Cloudy | Sunny | Cloudy |
| Predicted weather | Cloudy | Cloudy | Sunny | Sunny | Cloudy |
| Correct | Yes | Yes | No | Yes | Yes |

In the example above, we have predicted that two days are sunny (days 3 and 4). The prediction on day 3 is a mistake, since the prediction is sunny but the weather was actually cloudy. Since 2 days are predicted as sunny and only 1 of those predictions is correct, the precision is 1/2 in this case.
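The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not code from the article; the function name `precision` and the label strings are our own choices:

```python
def precision(actual, predicted, positive="Sunny"):
    """Fraction of positive predictions that are actually positive."""
    predicted_positive = sum(1 for p in predicted if p == positive)
    true_positive = sum(1 for a, p in zip(actual, predicted)
                        if p == positive and a == positive)
    return true_positive / predicted_positive

# The five days from the table above.
actual = ["Cloudy", "Cloudy", "Cloudy", "Sunny", "Cloudy"]
predicted = ["Cloudy", "Cloudy", "Sunny", "Sunny", "Cloudy"]
print(precision(actual, predicted))  # 0.5
```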

In other words, a high precision means that when we predict sunny we are right most of the time. Note that it says nothing about predicting negative/cloudy instances. For instance, if in another example there are 100 sunny days and we predict sunny on just 1 of them (and on no other days), the precision is 1 (a perfect score) even though 99 sunny days were missed. The recall captures the other side of the set of predictions, namely, out of the days which are *actually* sunny/positive, how many were correctly predicted?

$$ recall = \frac{\mbox{no. correctly predicted positive}}{\mbox{no. actual positive}}.$$

Consider the example below:

| | Day 1 | Day 2 | Day 3 | Day 4 | Day 5 |
|---|---|---|---|---|---|
| Actual weather | Sunny | Cloudy | Cloudy | Sunny | Cloudy |
| Predicted weather | Cloudy | Cloudy | Cloudy | Sunny | Cloudy |
| Correct | No | Yes | Yes | Yes | Yes |

In this example, there are 2 days that are actually sunny (days 1 and 4), but only 1 is predicted as such (day 4), making the recall 1/2. A high recall means that when the day is actually sunny, the prediction is likely to be correct.
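Recall can be sketched in the same way as precision, dividing by the number of *actual* positives instead of the number of *predicted* ones. Again, this is an illustrative snippet rather than code from the article:

```python
def recall(actual, predicted, positive="Sunny"):
    """Fraction of actual positives that were predicted as positive."""
    actual_positive = sum(1 for a in actual if a == positive)
    true_positive = sum(1 for a, p in zip(actual, predicted)
                        if p == positive and a == positive)
    return true_positive / actual_positive

# The five days from the table above.
actual = ["Sunny", "Cloudy", "Cloudy", "Sunny", "Cloudy"]
predicted = ["Cloudy", "Cloudy", "Cloudy", "Sunny", "Cloudy"]
print(recall(actual, predicted))  # 0.5
```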

## F1

The F1 score is simply a way to combine the precision and recall. Rather than take a mean of precision and recall, we use the *harmonic mean* which is given by:

$$ F_1 = 2 \, \frac{precision \cdot recall}{precision + recall}. $$

The higher the F1 score, the better the predictions. Just to finish off, let's consider the lazy predictor which always claims it is going to be sunny:

| | Day 1 | Day 2 | Day 3 | Day 4 | Day 5 |
|---|---|---|---|---|---|
| Actual weather | Cloudy | Cloudy | Cloudy | Sunny | Cloudy |
| Predicted weather | Sunny | Sunny | Sunny | Sunny | Sunny |
| Correct | No | No | No | Yes | No |

We can easily see that the precision is 1/5 since 1 day is correctly predicted as sunny, but 5 sunny predictions are made. The recall is 1 since all days that are actually sunny are predicted correctly. This makes the F1 score 2 × (0.2 × 1)/(0.2 + 1) = 0.4/1.2 ≈ 0.33.
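Putting the harmonic mean into code makes it easy to check this arithmetic. A minimal sketch (the function name `f1_score` is our own, not from the article):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Lazy predictor from the table above: precision = 1/5, recall = 1.
print(round(f1_score(0.2, 1.0), 2))  # 0.33
```

Note that because the harmonic mean is dominated by the smaller of the two values, the perfect recall of the lazy predictor cannot compensate for its poor precision.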

For more information (and a slightly more technical description) there is, as always, the Wikipedia page.
