What is the difference between precision and recall
Category: Data Science
Short answer
Precision measures how many of the positive predictions were correct (quality). Recall measures how many of the actual positives were found (coverage). High precision means few false positives; high recall means few false negatives. For model evaluation context, see how to evaluate ML model performance.
Definitions
- Precision = True Positives / (True Positives + False Positives)
- Recall = True Positives / (True Positives + False Negatives)
- F1 Score = 2 * (Precision * Recall) / (Precision + Recall)
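The three formulas above can be sketched as one small Python function. The function name and the choice to return 0.0 when a denominator is zero are assumptions made for this illustration; other conventions (raising an error, returning NaN) are equally valid.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, f1) from confusion-matrix counts."""
    # Assumed convention: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1
```

True negatives do not appear in any of the three formulas, which is why they are not a parameter here.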
When to prioritize which
- Prioritize precision when false positives are costly: spam detection (do not mark real emails as spam), confirmatory testing before an invasive treatment (avoid unnecessary procedures)
- Prioritize recall when false negatives are costly: cancer detection (do not miss cases), fraud detection (do not miss fraudulent transactions)
Example
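A model flags 100 patients as having cancer. 80 of them really have it (true positives) and 20 are false alarms (false positives). It also misses 10 real cases (false negatives), so there are 90 actual positives in total.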
- Precision = 80/100 = 80%
- Recall = 80/90 ≈ 89%
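As a rough check, the same numbers can be reproduced by counting outcomes from label lists. The lists below are reconstructed from the example's counts. The 890 true negatives are an assumed figure added only to make the lists complete; they do not affect precision, recall, or F1.

```python
# 1 = cancer, 0 = no cancer.
# 80 true positives, 20 false positives, 10 false negatives from the example,
# plus 890 true negatives (assumed for illustration).
y_true = [1] * 80 + [0] * 20 + [1] * 10 + [0] * 890
y_pred = [1] * 80 + [1] * 20 + [0] * 10 + [0] * 890

# Count each outcome by comparing the actual label with the predicted label.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# precision=0.800 recall=0.889 f1=0.842
```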
Tips
- Always consider both metrics together, not in isolation
- Use precision-recall curves when dealing with imbalanced datasets
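A precision-recall curve is built by sweeping the decision threshold over the model's scores and recomputing both metrics at each cutoff. The sketch below does this in plain Python. The function name and the toy labels and scores are made up for illustration, and a library routine would normally be used in practice.

```python
def precision_recall_points(y_true, scores):
    """Return (threshold, precision, recall) for each distinct score used as a cutoff."""
    total_pos = sum(y_true)
    points = []
    # Sweep from the strictest cutoff to the most lenient one.
    for threshold in sorted(set(scores), reverse=True):
        preds = [1 if s >= threshold else 0 for s in scores]
        tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
        fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
        # tp + fp >= 1 here, because at least one score equals the threshold.
        precision = tp / (tp + fp)
        recall = tp / total_pos if total_pos else 0.0
        points.append((threshold, precision, recall))
    return points


# Toy, imbalanced data (assumed): 3 positives among 10 examples.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

for threshold, precision, recall in precision_recall_points(y_true, scores):
    print(f"threshold={threshold:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Lowering the threshold can only keep recall the same or raise it, while precision may move in either direction. Seeing both across all cutoffs is what makes the curve useful for choosing an operating point.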