Here’s a classic business analysis scenario, which I’d like to use to illustrate one of my favourite mathematical curiosities.
Your marketers have sent out a bunch of direct mail to a proportion of your previous customers, and deliberately withheld the letters from the rest of them so that they can act as a control group.
As analyst extraordinaire, you get the job of totting up how many of these customers came back and bought something. If the percentage is higher in the group that received the mail, then the marketers will be very happy to take credit for increasing this year’s revenue.
However, once the results are back, it's not looking great. Aggregating and cross-tabulating, you realise that the people who were not sent the mail were actually a little more likely to return as customers.
| Sent marketing? | Count of previous customers in group | Count of customers returning to store | Success rate |
|---|---|---|---|
| Yes | 300 | 32 | 10.7% |
| No | 300 | 40 | 13.3% |
You go to gently break the bad news to the marketing team, only to find them already in fits of joy. Some other rival to your analytical crown got there first, and showed them that their mailing effort in fact attracted a slightly higher proportion of both men and women to come back to your shop. A universally appealing marketing message – what could be better? Bonuses all round!
Ha, being the perfect specimen of analytical talent that you are, you’ve got to assume that your inferior rival messed up the figures. That’s going to embarrass them, huh?
Let’s take a look at your rival’s scrawlings.
| Gender | Sent marketing? | Count of previous customers in group | Count of customers returning to store | Success rate |
|---|---|---|---|---|
| Female | Yes | 100 | 21 | 21.0% |
| Female | No | 200 | 37 | 18.5% |
| Male | Yes | 200 | 11 | 5.5% |
| Male | No | 100 | 3 | 3.0% |
- So, a total of 100+200 people were sent the letter. That matches your 300. Same for the “not-sent” population.
- 21 + 11 people who were sent the letter returned, that matches your 32.
- 37 + 3 people who were not sent the letter returned, again that matches your 40.
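If you'd rather let the computer do the totting up, here's a minimal sketch of the same reconciliation in plain Python, using the figures from the example (100 women and 200 men mailed, 200 women and 100 men held back, per the narrative below):

```python
# Per-gender figures as (group size, count of returning customers).
sent = {"female": (100, 21), "male": (200, 11)}
not_sent = {"female": (200, 37), "male": (100, 3)}

# Aggregate success rates: the mailed group looks *worse* overall...
sent_rate = sum(r for _, r in sent.values()) / sum(n for n, _ in sent.values())
not_sent_rate = sum(r for _, r in not_sent.values()) / sum(n for n, _ in not_sent.values())
print(f"Sent:     {sent_rate:.1%}")      # 10.7%
print(f"Not sent: {not_sent_rate:.1%}")  # 13.3%

# ...yet within each gender, the mailing wins.
for gender in sent:
    n_s, r_s = sent[gender]
    n_ns, r_ns = not_sent[gender]
    print(f"{gender}: sent {r_s / n_s:.1%} vs not sent {r_ns / n_ns:.1%}")
```

Running this prints the paradox directly: 10.7% vs 13.3% in aggregate, but 21.0% vs 18.5% for women and 5.5% vs 3.0% for men.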
What what what??!
This is an example of “Simpson’s Paradox”, which luckily has nothing to do with Springfield’s nuclear reactor blowing up and rupturing a hole in the very fabric of mathematical logic.
Instead, the disparities in the sample size and propensity of the two gender populations to return as customers are coming into play.
Whether they received the mailing or not, the results show that women were much more likely to return and shop at the store again anyway. Thus, whilst the marketing mail may have had a little effect, the “gender effect” here was much stronger.
Gender could therefore be considered a confounding variable. It could have been controlled for when setting up the experiment, had anyone tested beforehand how strongly gender influenced the rate of customer return.
But apparently no-one knew about that or thought to test the basic demographic hypotheses. As it happened, whatever sample selection method was employed, the group that was sent the mailing contained only half as many women as men.
So, whilst the mailing was marginally successful in increasing the propensity of both men and women to return to the store, the aggregate results of this experiment hid that effect: men – who were intrinsically far less likely than women to return to the store – were over-represented among the people chosen to receive the mailing.
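The arithmetic behind that last point is worth making explicit: each group's overall success rate is just a weighted average of the per-gender rates, weighted by each gender's share of that group. A quick sketch, using the per-gender rates from the example:

```python
# Per-gender success rates (from the example figures).
rate_sent = {"female": 0.21, "male": 0.055}
rate_not_sent = {"female": 0.185, "male": 0.03}

# Gender composition of each group: the mailed group is two-thirds men,
# while the control group is two-thirds women.
share_sent = {"female": 100 / 300, "male": 200 / 300}
share_not_sent = {"female": 200 / 300, "male": 100 / 300}

overall_sent = sum(rate_sent[g] * share_sent[g] for g in rate_sent)
overall_not_sent = sum(rate_not_sent[g] * share_not_sent[g] for g in rate_not_sent)
print(f"Sent: {overall_sent:.1%}, not sent: {overall_not_sent:.1%}")  # 10.7%, 13.3%
```

Because the mailed group's weights lean towards the low-baseline men, its weighted average gets dragged down below the control group's, even though it beats the control within every gender.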