Consider this paragraph from a FiveThirtyEight article about the small-schools movement (my italics):
Hanushek calculated the economic value of good and bad teachers, combining the “quality” of a teacher — based on student achievement on tests — with the lifetime earnings of an average American entering the workforce. He found that a very high-performing teacher with a class of 20 students could raise her pupils’ average lifetime earnings by as much as $400,000 compared to an average teacher. A very low-performing teacher, by contrast, could have the opposite effect, reducing her students’ lifetime earnings by $400,000.
If I had told you that students who scored higher on achievement tests have higher average lifetime earnings, by as much as $400,000 compared to students who scored average, you would not be surprised, unless you are a skeptic of achievement tests. That is all this evidence shows: achievement test scores predict lifetime earnings.
Now, this is not what the journalist wants you to believe. She said that high-performing teachers "could raise" pupils' average lifetime earnings. Two logical jumps are made in one breath here: the first is the use of student achievement scores as a proxy for "teacher quality"; the second is "causation creep" (i.e., letting a causal interpretation creep into correlational evidence), which is signaled by weasel words like "could" and "may".
The use of proxy measures is the source of many "statistical lies". One tool I use is "proxy unmasking": substitute the actual metric for the proxy metric. In this example, when I see "high-performing teacher," I substitute "high-performing students," since the observed data measured students, not teachers. The sound you hear is air rushing out of the hyperventilated argument.
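To see how the proxy can mislead, here is a minimal simulation sketch. Everything in it is invented for illustration (the ability scale, the earnings formula, the class sizes); it is not Hanushek's model. Teachers in this toy world have zero causal effect: each student's underlying ability drives both their test score and their lifetime earnings. We then label teachers "high-performing" by their class's average score, exactly as the proxy does.

```python
import random

random.seed(0)

# Hypothetical setup: 200 teachers, 20 students each, and NO teacher effect.
NUM_TEACHERS, CLASS_SIZE = 200, 20

classes = []
for _ in range(NUM_TEACHERS):
    abilities = [random.gauss(0, 1) for _ in range(CLASS_SIZE)]
    # Test score and earnings are both driven by the same student ability.
    scores = [a + random.gauss(0, 0.5) for a in abilities]
    earnings = [50_000 + 20_000 * a + random.gauss(0, 10_000) for a in abilities]
    classes.append((sum(scores) / CLASS_SIZE, sum(earnings) / CLASS_SIZE))

# Rank teachers by the proxy: their class's average test score.
classes.sort()
low, high = classes[:NUM_TEACHERS // 2], classes[NUM_TEACHERS // 2:]

def avg(xs):
    return sum(xs) / len(xs)

gap = avg([e for _, e in high]) - avg([e for _, e in low])
print(f"Earnings gap, 'high' vs. 'low' performing teachers: ${gap:,.0f}")
# The gap is substantial even though teachers did nothing: the proxy
# (student scores) and the outcome (earnings) share a common cause.
```

The simulation reports a large earnings gap between "high-performing" and "low-performing" teachers despite the teachers contributing nothing, because the proxy inherits the student-level correlation. This is the correlational pattern that "could raise" language quietly promotes to a causal claim.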