12.02.2009

Base-Rate Neglect

Here’s a probability problem:
At a university, 15% of students are of legal drinking age (21 or older), leaving 85% underage. There is one liquor store near the campus. Willis, the owner, is very good at detecting fake IDs: 90% of the time he correctly stops a student trying to use a fraudulent license. Unfortunately, Willis isn’t perfect. He also mistakenly rejects proper IDs 10% of the time. So he always has an accuracy of 90% when examining students’ identification. Say a student walks in and hands his ID to Willis, and Willis turns it down. What is the likelihood that this student is underage?
Someone suffering from base-rate neglect would say 90%, since Willis’s accuracy is 90%. That person would forget, however, that legal and underage students are not equally common at the university. The true probability that the student is underage must be determined using Bayes’ Theorem:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid \neg A)\,P(\neg A)}$$
Here, A is the event that the student is underage, and B is the event that Willis turns down his license. The probability that the student is underage, given that Willis refuses his business, is a function of Willis’s accuracy and the base probabilities of being of age and being underage. Thus, the base rate is vital to solving the problem.

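Using this formula, the probability that the student is underage, given that Willis has turned down his license, is (0.90 × 0.85) / (0.90 × 0.85 + 0.10 × 0.15) = 0.765 / 0.78, or about 98%. Because most students are underage to begin with, a rejection is even stronger evidence than Willis’s 90% accuracy alone suggests.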
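The calculation can also be checked with a few lines of Python. This is only a minimal sketch: the function name `posterior_underage` and its parameter names are made up for illustration, and it assumes, as the problem implies, that underage students present fake IDs while students of legal age present proper ones.

```python
def posterior_underage(p_underage, p_reject_given_underage, p_reject_given_legal):
    """Return P(underage | rejected) using Bayes' Theorem.

    All names here are hypothetical; the numbers passed in below come
    from the liquor store problem.
    """
    p_legal = 1.0 - p_underage
    # P(B|A) * P(A): the student is underage AND Willis rejects the ID.
    numerator = p_reject_given_underage * p_underage
    # P(B): total probability of a rejection, summed over both groups.
    evidence = numerator + p_reject_given_legal * p_legal
    return numerator / evidence


# 85% of students are underage; Willis rejects 90% of fake IDs
# and 10% of proper IDs.
result = posterior_underage(0.85, 0.90, 0.10)
print(f"P(underage | rejected) = {result:.3f}")  # prints 0.981
```

The denominator is the step that base-rate neglect skips: it weights each kind of rejection by how common that kind of student is.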

The base-rate fallacy arises when we are given two different types of data: general information and case-specific information. We tend to ignore general information (base rates) when we are given more specific data (Willis turned down the student’s ID). We ignore consensus information when we have specific examples. It is important to note that this only matters when the base rates differ among groups (e.g. if we were comparing men and women and 50% of the population was men and 50% women, the base rate would not affect the probabilities). The math behind these Bayesian probabilities is relatively straightforward, yet our brains trick us into narrowing our focus, and we fall for the old base-rate fallacy.
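To see how much the base rate matters, here is a small sketch that holds Willis’s 90% accuracy fixed and varies only the share of underage students. The helper `posterior` and the 15% and 50% base rates are assumptions added for comparison; only the 85% figure comes from the problem.

```python
def posterior(base_rate, accuracy):
    """P(underage | rejected) when the same accuracy applies to both groups.

    `base_rate` is the share of underage students. Hypothetical helper,
    for illustration only.
    """
    hit = accuracy * base_rate                         # underage and rejected
    false_alarm = (1.0 - accuracy) * (1.0 - base_rate) # legal but rejected
    return hit / (hit + false_alarm)


ACCURACY = 0.90
for base_rate in (0.15, 0.50, 0.85):
    print(f"base rate {base_rate:.2f} -> posterior {posterior(base_rate, ACCURACY):.3f}")
# base rate 0.15 -> posterior 0.614
# base rate 0.50 -> posterior 0.900
# base rate 0.85 -> posterior 0.981
```

Only at a 50/50 split does the intuitive answer of 90% match the Bayesian one. Away from that split, the same rejection can mean anything from about 61% to about 98%.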