Do humans like precision more than recall?

After working in the industry for a while, and applying machine learning to what works well for the business, I have come to the conclusion that humans much prefer precision to recall. First things first: what are precision and recall? The technical definition of precision is (true positives) / (true positives + false positives), and recall is (true positives) / (true positives + false negatives). Those formulas may be hard to digest, but the intuition is that optimizing for precision pushes down the number of false positives, while optimizing for recall pushes down the number of false negatives.
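To make the two formulas concrete, here is a minimal Python sketch. The counts for the "cautious" and "eager" models are made up purely for illustration; they are not from any real project.

```python
def precision(tp, fp):
    """Share of predicted positives that are actually positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual positives that the model managed to find."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts: both models face the same 100 true items.
# The cautious model flags few things and is rarely wrong.
cautious_precision = precision(tp=40, fp=5)
cautious_recall = recall(tp=40, fn=60)

# The eager model flags a lot, finds most items, and is wrong more often.
eager_precision = precision(tp=90, fp=60)
eager_recall = recall(tp=90, fn=10)

print(f"cautious: precision={cautious_precision:.2f}, recall={cautious_recall:.2f}")
print(f"eager:    precision={eager_precision:.2f}, recall={eager_recall:.2f}")
```

With these made-up numbers, the cautious model wins on precision and the eager model wins on recall, which is the trade-off the rest of this post is about.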

To explain the technical terms further: false positives are the cases where you thought something was true, but it turned out to be false. This is what I traditionally think of as mistakes. False negatives are the cases where you thought something was false, but it is actually true. I think of these as surprises. Borrowing from the book Thinking, Fast and Slow, people tend to make up reasons and logic to assure themselves that whatever conclusion they reached is the correct one. Based on that human tendency, false positives may be harder to accept than false negatives. This is because a false positive directly violates our beliefs, assuming we are reasonable people, whereas a false negative just tells us we learned something new, which is much easier to accept.

Given that false positives are harder for human beings to take, we try really hard to minimize the chance of them happening. Before prediction models existed, people had rules for doing things. If a rule helped us solve a problem or find a thing, we kept it. When false positives happened, we fixed the rules to reduce them, but we didn’t really try to find the places where a rule failed to catch the things we wanted, which are the false negatives. In a sense, before big data and fast predictive models came along, regular people without statistical training didn’t care that much about false negatives. I think we are more comfortable finding false positives than false negatives.

The reality is that I find myself building more models that favor precision over recall when I can’t maximize both. Obviously I like to use F1, the harmonic mean of precision and recall, to measure model success. But that’s not always possible given the amount of labeled data I have. In most cases I would rather tell my client that I was not able to find everything they wanted than tell them that some of what I found may be wrong. The traditional rule-based approach tends to produce rules that have high precision but don’t cover much ground, and then combines a bunch of these rules to get the desired results. I find myself doing similar things, just with prediction models. I keep pitching high-recall models, especially for discovery phases, where my clients can find new directions for research, but they have been slow to adopt that approach. Maybe they also have a tougher time telling their customers that they are not 100 percent sure about their findings.
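As a rough illustration of F1 and of combining narrow, high-precision rules, here is a small Python sketch. The item IDs, the two rules, and the `evaluate` helper are all hypothetical; this is a toy setup, not how any of my client models are actually built.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(predicted, actual):
    """Return (precision, recall, F1) for a set of predicted item IDs."""
    tp = len(predicted & actual)   # correctly flagged items
    fp = len(predicted - actual)   # flagged, but not actually wanted
    fn = len(actual - predicted)   # wanted, but never flagged
    p = tp / (tp + fp) if predicted else 0.0
    r = tp / (tp + fn) if actual else 0.0
    return p, r, f1_score(p, r)


# Hypothetical ground truth: items 0..9 are the things the client wants.
actual = set(range(10))

# Two narrow rules, each precise but covering little ground.
rule_a = {0, 1, 2}        # everything it flags is correct
rule_b = {3, 4, 5, 10}    # one false positive (item 10)

# Combining rules means flagging anything that either rule flags.
combined = rule_a | rule_b

p_a, r_a, f1_a = evaluate(rule_a, actual)
p_c, r_c, f1_c = evaluate(combined, actual)

print(f"rule_a:   precision={p_a:.2f}, recall={r_a:.2f}, F1={f1_a:.2f}")
print(f"combined: precision={p_c:.2f}, recall={r_c:.2f}, F1={f1_c:.2f}")
```

In this toy case the union of the two rules doubles recall while giving up only a little precision, which is roughly why stacking narrow rules feels safe: each addition buys coverage without many new mistakes.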

Outside of research, I don’t know if I will ever be able to get my clients to treat recall as a metric they truly care about. Maybe it’s because my industry values precision more. Or maybe the word precision just sounds more positive than recall :). I would love to hear about business examples where people are more open-minded about using recall to solve their problems, and where the people who make the decisions understand its value.