Today, I give my answer to Question #2 in the test presented a few days ago, here.

***

**Question #2: Do the following headlines contradict or can both be true at once?**

When we are presented with just the first statement, that the most successful teams swear and gossip, we are tempted to draw a causal conclusion: swearing and gossiping is the cause of success. This motivates the advice given by the headline writer. (Okay, so the writer did not say out loud "encourage your team members to swear and gossip more"... but is there any other possible interpretation?)

If the cause-effect relationship holds, we may expect the least successful teams not to swear and gossip, which means the truth of the first statement denies the truth of the second, and vice versa. (Even this is not certain since there are other factors affecting team success.)

If we are told both statements are true at the same time, I don't think any of us would conclude that swearing and gossiping determines team success. That's why it's important to know if one statement implies the truth/falsehood of the other.

**Answer: Both statements can be true at once. One does not imply the other.**

More precisely, one may imply the other but often doesn't. As Scott quipped on Twitter, it depends.

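One approach is to lay out the 2x2 grid of all four possible states of being, like this:

| | Successful | Not successful |
|---|---|---|
| Heavy on swearing and gossip | A | B |
| Light on swearing and gossip | C | D |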

Each team is classified as either successful or not (based on some metric of success). Each team is also classified as either heavy or light on the swear-gossip axis. The sum of the four numbers is the total number of teams.

The statement that "the most successful teams swear and gossip" suggests that A is substantially larger than C, or that A is a large proportion of (A+C).

The statement that "the least successful teams swear and gossip" suggests that B is substantially larger than D, or that B is a large proportion of (B+D).

The question is whether the values of A and C constrain the values of B and D in such a way that if the first statement is true, then the second statement must be false. To invalidate this, we can show scenarios under which both statements are true.

Imagine we have 100 teams. There are only two polite teams that don't swear and gossip, split evenly between successful and not successful. Next, we have 50 teams in cell A (successful and swearing) and 48 teams in cell B (not successful and swearing). Now, among the successful teams, 50/51 swear and gossip while among the unsuccessful teams, 48/49 swear and gossip. Both statements are true. The first statement does not deny the second.

We have a whole class of scenarios that work similarly. The split between A and B can be 49/49, 48/50, 40/58, 55/43, etc.
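The arithmetic in this scenario is easy to check. Here is a minimal sketch in Python; the function name `swear_share` is my own invention for illustration, and the cell counts are the ones from the 100-team scenario above.

```python
from fractions import Fraction

def swear_share(swearing, polite):
    """Share of teams in one success group that swear and gossip."""
    return Fraction(swearing, swearing + polite)

# Cells of the 2x2 grid from the 100-team scenario.
A, B = 50, 48  # swearing teams: successful, not successful
C, D = 1, 1    # polite teams: successful, not successful

top_share = swear_share(A, C)     # among successful teams
bottom_share = swear_share(B, D)  # among unsuccessful teams
print(top_share, round(float(top_share), 3))
print(bottom_share, round(float(bottom_share), 3))

# Other splits of the 98 swearing teams mentioned in the text.
for a, b in [(49, 49), (48, 50), (40, 58), (55, 43)]:
    print(a, b, round(float(swear_share(a, C)), 3), round(float(swear_share(b, D)), 3))
```

In every one of these splits, the swearing teams make up the overwhelming majority of both the successful and the unsuccessful group.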

***

If you have been exposed to contingency tables like the one above, you may be tempted to fill in the four numbers by multiplying the probability of success and the probability of swearing and gossiping. When you do this, you're assuming independence.

Independence means that knowing whether team members swear and gossip provides no information on team success. This is the opposite of the study's intention. The researchers must believe that team success is dependent on the level of swearing and gossiping.

To model dependence, it's not sufficient to know just the probability of success and the probability of swearing and gossiping. For each level of swearing and gossiping, we assume a different rate of success.
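The contrast between the two ways of filling in the grid can be sketched in a few lines of Python. This is only an illustration: the function names and all the probabilities below are hypothetical, not numbers from the study.

```python
def independent_table(n_teams, p_success, p_swear):
    """Expected counts (A, B, C, D) when success and swearing are independent."""
    a = n_teams * p_success * p_swear
    b = n_teams * (1 - p_success) * p_swear
    c = n_teams * p_success * (1 - p_swear)
    d = n_teams * (1 - p_success) * (1 - p_swear)
    return a, b, c, d

def dependent_table(n_teams, p_swear, success_if_swear, success_if_polite):
    """Expected counts (A, B, C, D) with a separate success rate per swearing level."""
    a = n_teams * p_swear * success_if_swear
    b = n_teams * p_swear * (1 - success_if_swear)
    c = n_teams * (1 - p_swear) * success_if_polite
    d = n_teams * (1 - p_swear) * (1 - success_if_polite)
    return a, b, c, d

def swear_shares(table):
    """Share of swearing teams among the successful, then among the unsuccessful."""
    a, b, c, d = table
    return a / (a + c), b / (b + d)

# Hypothetical inputs, chosen only for illustration.
indep = independent_table(100, p_success=0.5, p_swear=0.9)
dep = dependent_table(100, p_swear=0.5, success_if_swear=0.7, success_if_polite=0.3)

print(swear_shares(indep))  # the two shares match under independence
print(swear_shares(dep))    # the two shares differ under dependence
```

Under independence, the share of swearing teams is the same in both columns, so both headlines are true or false together. Under dependence, the two shares can pull apart.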

Through this exercise, we learn that three of the four values (A, B, C, D) are free. The only constraint is that they must sum to the total number of teams. In other words, having specified A and C in the first statement, we have only nailed down the sum of B and D but the split between B and D is not constrained. Thus, both statements can be, and often are, true at once.

***

In **Numbersense**, I described a real-life example of this fallacy.

Years ago, the Gates Foundation was putting a lot of money behind the "small schools movement". This was based on some studies showing that the top X schools in district Y have a high proportion of small schools (with small average class size).

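The situation can be shown in the following table:

| | Top X schools (high test scores) | Bottom X schools (low test scores) |
|---|---|---|
| Small class size | A | B |
| Large class size | C | D |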

Notice that this is the same table as above, with "small class size" substituting for "swear and gossip".

The Foundation was shown just the first column, from which they concluded that small class size was the cause of high test scores (A being a large proportion of A+C). The statistician, Howard Wainer, looked at the entire distribution and demonstrated that the bottom X schools also have a high proportion of small schools (B being a large proportion of B+D). This corresponds to the second column.

If shown both columns, you'd not conclude that class size drives test scores. If you only know the first column, you'd be tempted toward that conclusion.

If you only know the second column, you'd be tempted to draw the opposite conclusion that small class size depresses test scores!

That's why data analysts want to see all of the data. That's why biased datasets can be extremely misleading. You can make errors of direction, not just magnitude.

***

There is an underlying statistical reason that makes Wainer's discovery inevitable in many real-world datasets. Extreme values (highest or lowest test scores) are associated with small class sizes, because small samples have higher variability.

Imagine a large school with 10,000 seniors, with an average GPA of 3.75. Imagine taking a random sample of 1,000 students and reporting the sample average GPA. It's not going to be exactly 3.75; it will be off by some small amount. Now take a sample of 500 students, and the average deviation of the sample mean from 3.75 will be larger. Now do 100 students. The smaller the sample size, the more likely the sample average runs further away from 3.75.
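This thought experiment can be run as a quick simulation. The sketch below makes one loud assumption: the 10,000 GPAs are spread evenly between 3.5 and 4.0, which is a made-up distribution picked only because its average sits near 3.75. The helper `average_miss` is likewise my own name.

```python
import random
from statistics import fmean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 GPAs spread evenly between 3.5 and 4.0,
# so the population average is close to 3.75.
population = [random.uniform(3.5, 4.0) for _ in range(10_000)]
population_mean = fmean(population)

def average_miss(sample_size, repeats=500):
    """Average absolute gap between a sample mean and the population mean."""
    gaps = []
    for _ in range(repeats):
        sample = random.sample(population, sample_size)  # without replacement
        gaps.append(abs(fmean(sample) - population_mean))
    return fmean(gaps)

misses = {n: average_miss(n) for n in (1000, 500, 100)}
for n, miss in misses.items():
    print(f"sample of {n:>4}: average miss = {miss:.4f}")
```

The average miss grows as the sample shrinks, which is the higher variability of small samples at work.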

In the business teams example, I would not be surprised to find (if they did not control for team size) that the most successful and the least successful teams are smaller teams.

This is an example of **statistical gravity**: if the dataset consists of teams of varying sizes, the large teams will be found in the middle of the distribution, and the smaller teams will be found towards the tails.
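Statistical gravity shows up even when team size has no effect at all on performance. Here is a rough simulation under invented assumptions: 1,000 teams, half with 5 members and half with 50, and every individual score drawn from the same bell curve. None of these numbers come from the study.

```python
import random
from statistics import fmean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical setup: every individual score comes from the same distribution,
# so team size has no real effect on performance.
teams = []
for i in range(1000):
    size = 5 if i % 2 == 0 else 50
    score = fmean([random.gauss(0, 1) for _ in range(size)])  # team average
    teams.append((score, size))

teams.sort()  # ascending by team score
bottom, middle, top = teams[:100], teams[100:900], teams[900:]

def small_share(group):
    """Fraction of teams in the group that are small (5 members)."""
    return sum(1 for _, size in group if size == 5) / len(group)

print("bottom 10%:", small_share(bottom))
print("middle 80%:", small_share(middle))
print("top 10%:   ", small_share(top))
```

Small teams crowd into both tails while large teams cluster in the middle, even though size plays no causal role in this setup.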
