Learn AI Metrics for Beginners: Recall vs Precision

In this blog post, we discuss the two most common AI metrics, precision and recall, alongside a quick Confusion 🤔🤯 Matrix 101. We end with questions you can ask your team to decide which of the two metrics to prioritize.

Why should you care about precision and recall?

In the field of AI, precision and recall are essential measures of prediction performance across various scenarios.

In a perfect world, you want your prediction system to achieve both high precision and high recall. However, that almost never happens. One or the other usually has to be prioritized, though both should be optimized.

The discussion around these metrics hinges on one fundamental question: is it more important to minimize false positives or to minimize false negatives?

Precision aims to reduce false positives, whereas recall aims to reduce false negatives, that is, to catch as many of the actual positives as possible.

Key Terms and Their Examples, AKA the Confusion Matrix

1. True Positive (TP): This occurs when the model correctly predicts 'Yes'. Example: The model correctly identified that a photo contains a dog.

2. True Negative (TN): This occurs when the model correctly predicts 'No'. Example: The model correctly identified that a photo does not contain a dog (spoiler alert: it's a cat).

3. False Positive (FP) - Type 1 Error: This occurs when the model incorrectly predicts 'Yes'. Example: The model incorrectly identified a photo of a cat to be a dog.

4. False Negative (FN) - Type 2 Error: This occurs when the model incorrectly predicts 'No'. Example: The model predicted that a photo does not contain a dog when it actually does.
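To make the four terms concrete, here is a minimal Python sketch that tallies them for a handful of photos. The function name `confusion_counts` and the sample labels are made up for illustration; `True` stands for "the photo contains a dog".

```python
def confusion_counts(actual, predicted):
    """Tally TP, TN, FP, and FN for two lists of booleans (True = 'Yes')."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p and a:
            tp += 1  # predicted 'Yes', and it really is a dog
        elif not p and not a:
            tn += 1  # predicted 'No', and it really is not a dog
        elif p and not a:
            fp += 1  # predicted 'Yes', but it is a cat (Type 1 error)
        else:
            fn += 1  # predicted 'No', but it is a dog (Type 2 error)
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical data: six photos, True = contains a dog.
actual = [True, True, True, False, False, True]
predicted = [True, False, True, True, False, True]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 3, 'TN': 1, 'FP': 1, 'FN': 1}
```

Every prediction lands in exactly one of the four buckets, so the counts always add up to the number of photos.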

Precision: how good your AI is at getting things right when it predicts 'Yes'

It answers the question: "When your model predicts the positive class, how often is it right?" You calculate it by dividing the number of true positives (correct positive predictions) by the total number of positive predictions made (true positives plus false positives).

For example, let's say you've built a model to identify dogs in photos. If your model identified 5 photos as containing dogs, and only 3 of those photos actually contain dogs, your precision is 60%.

Another example: you've built a spam email detector. If, out of 500 emails your model flagged as spam, 450 were actually spam, your precision is 90%.
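The two examples above can be checked with a short Python sketch. The helper name `precision` is our own, and returning 0.0 when the model makes no positive predictions is an assumption (the ratio is undefined in that case).

```python
def precision(true_positives, false_positives):
    """Precision = TP / (TP + FP)."""
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        # Assumption: report 0.0 when nothing was predicted positive.
        return 0.0
    return true_positives / predicted_positives


# Dog example: 5 photos flagged as dogs, 3 are really dogs, so 2 are false positives.
dog_precision = precision(true_positives=3, false_positives=2)

# Spam example: 500 emails flagged, 450 are really spam, so 50 are false positives.
spam_precision = precision(true_positives=450, false_positives=50)

print(f"dog: {dog_precision:.0%}, spam: {spam_precision:.0%}")  # dog: 60%, spam: 90%
```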

Precision is crucial in many other scenarios as well. Here are a few more examples:

  1. Healthcare: An AI model predicting diseases needs high precision to avoid false positives, which could lead to unnecessary stress and harmful treatment for patients.
  2. Finance and Banking: A model for detecting fraudulent transactions requires high precision to prevent false positives that could lead to a customer's account being frozen.
  3. E-commerce: A recommendation system with high precision likely suggests products that interest customers. Low precision could lead to irrelevant recommendations, reducing customer engagement.

In all these cases, high precision is essential to avoid potential negative consequences of false positives (e.g. a cat being identified as a dog).

Recall: how good your model is at identifying all relevant instances

Recall, also known as sensitivity, answers the question: "Of all the actual positive cases, how many did your model catch?" You calculate it by dividing the number of true positives by the total number of actual positives (i.e. the sum of true positives and false negatives).

High recall means the model misses few actual positives; in other words, it produces a small number of false negatives.

For example, suppose you've built a model to identify dogs in images. If your AI correctly identifies dogs in 3 out of 5 images that contain dogs, the recall of your model is 60%. In other words, the model flagged two dog photos as “not a dog”.
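Here is the same dog example as a minimal Python sketch. The helper name `recall` is our own, and returning 0.0 when there are no actual positives is an assumption (the ratio is undefined in that case).

```python
def recall(true_positives, false_negatives):
    """Recall = TP / (TP + FN)."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        # Assumption: report 0.0 when there are no actual positives.
        return 0.0
    return true_positives / actual_positives


# 5 images really contain dogs; the model found 3 and missed 2.
dog_recall = recall(true_positives=3, false_negatives=2)

print(f"recall: {dog_recall:.0%}")  # recall: 60%
```

Note that false positives do not appear anywhere in this formula: a model could flag every photo as a dog and still score 100% recall, which is why recall is read together with precision.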

Recall is crucial in many scenarios beyond this example. Here are a few more:

  1. Meteorology: A weather prediction AI needs high recall to avoid false negatives, which could lead to lack of preparation for unforeseen weather conditions.
  2. Manufacturing: A quality control AI model requires high recall to prevent faulty products going unnoticed.
  3. Customer Support: An AI model classifying customer complaints needs high recall to ensure that all relevant complaints are addressed. Low recall could lead to missed complaints, leading to dissatisfied customers.

In all these scenarios, high recall is crucial to ensure the effectiveness of the models and to prevent potential issues arising from false negatives.

Precision and Recall: A Tug of War

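To fully evaluate the effectiveness of a model, you must examine both precision and recall. Unfortunately, precision and recall are often in tension. That is, improving precision typically reduces recall and vice versa. For example, making a model stricter about when it predicts 'Yes' tends to cut false positives, but it also causes the model to miss more real positives.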
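One way to see the tug of war is to sweep a decision threshold. The Python sketch below assumes a model that outputs a confidence score per photo and predicts 'Yes' when the score is at or above the threshold; the scores, labels, and the helper `precision_recall_at` are all invented for illustration.

```python
# Hypothetical model scores and ground truth (True = contains a dog).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [True, True, False, True, True, False, True, False]


def precision_recall_at(threshold, scores, labels):
    """Predict 'Yes' when score >= threshold, then compute both metrics."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


results = {}
for threshold in (0.85, 0.50, 0.25):
    results[threshold] = precision_recall_at(threshold, scores, labels)
    p, r = results[threshold]
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")

# threshold=0.85  precision=1.00  recall=0.40
# threshold=0.50  precision=0.80  recall=0.80
# threshold=0.25  precision=0.71  recall=1.00
```

On this toy data, lowering the threshold catches more dogs (recall goes up) but lets more cats through (precision goes down). Real models will not always move this cleanly, but the direction of the trade-off is typical.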

When to use precision and recall?

We've prepared these questions to help your product teams determine whether to prioritize precision or recall in your AI and machine learning applications. Use these questions to guide your decision-making process and optimize the performance of your AI systems for the best possible outcomes.

  1. Would customers lose trust if there are false alarms in the AI's predictions?
  2. Is it more important that the AI is right whenever it flags something than that it flags every case? (e.g. when monitoring workplace safety with AI-powered cameras, errors like falsely claiming an employee wasn't wearing a hardhat can cause costly disputes.)
  3. Is precision a requirement to meet legal or industry standards?
  4. Is minimizing false positives important to maintain quality?

    If you answered "yes" to most of questions 1-4, you should prioritize precision.

  5. Would missing true positives result in significant risks or missed opportunities?
  6. Is it crucial to identify as many positive instances as possible?
  7. Could false negatives lead to severe negative outcomes?
  8. Can the business efficiently handle false positives?
  9. Is ensuring comprehensive coverage more important for your application?

    If you answered "yes" to most of questions 5-9, you should prioritize recall.

Remember, the focus on precision or recall depends on your specific project needs. It's often a balance between both based on your application requirements.