A commentator at the French Open recently complained that the human line judge made a mistake: "Hawkeye's error is 3 cm and the ball was out by 4 cm. So the line judge is wrong to call it in."
***
The commentator got this all wrong. The divergence of opinion should reduce one's confidence in Hawkeye's estimate. Let me explain why.
Hawkeye's goal when it comes to judging line calls can be described, in stylized form, as determining the center of the landing spot of the tennis ball. It's helpful to first look at what happens without a margin of error. From this estimated location, we draw a ball given the diameter of a tennis ball, and figure out if the ball overlaps with the line on the court. If it doesn't overlap, then the computer decides that the ball is outside the line.
The idea of a margin of error of 3 cm is visualized as drawing a circle of radius 3 cm around the estimated location. Now, from any point inside this circle, we draw tennis balls as before; and if none of these balls overlap with the line on the court, then the computer decides that the ball is outside the line. By involving the margin of error, we explicitly embrace the uncertainty of estimation. For sure, fewer out calls will be issued relative to the case when we just use a single location estimate.
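Here is a minimal sketch of that decision rule, reduced to one dimension: the distance from the estimated center to the outer edge of the line. The function names and the ball radius of 3.3 cm are my own assumptions for illustration; this is not Hawkeye's actual algorithm.

```python
# Assumed value: roughly half the diameter of a tennis ball, in cm.
BALL_RADIUS_CM = 3.3


def gap_to_line(center_distance_cm, ball_radius_cm=BALL_RADIUS_CM):
    """Gap between the edge of the drawn ball and the edge of the line.

    center_distance_cm is how far the estimated center lies beyond the
    outer edge of the line. A negative gap means the ball overlaps the line.
    """
    return center_distance_cm - ball_radius_cm


def line_call(center_distance_cm, margin_cm=0.0, ball_radius_cm=BALL_RADIUS_CM):
    """Return "out" only if every center within the margin gives no overlap.

    The candidate center closest to the line sits margin_cm nearer to it,
    so the ball is called out only when the gap exceeds the margin.
    """
    gap = gap_to_line(center_distance_cm, ball_radius_cm)
    return "out" if gap > margin_cm else "in"


# A ball whose edge is estimated to be about 2.5 cm beyond the line.
estimate = BALL_RADIUS_CM + 2.5
print(line_call(estimate))                 # single location estimate
print(line_call(estimate, margin_cm=3.0))  # with a 3 cm margin of error
```

With a single location estimate this ball is called out; once the 3 cm margin is applied, the same estimate is called in, which is why fewer out calls are issued.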
The fundamental problem in statistics is that Hawkeye gets one chance (one sample) to get this right. If Hawkeye used a deterministic process, then given the same inputs (videos, etc.), it would always generate the same estimated ball location. In real-world systems, whether it's because of noise in the system, or some stochastic element in Hawkeye's process, the same inputs lead to different estimates. The margin of error describes how much these estimates vary.
The reported margin of error holds that Hawkeye's estimate is unlikely to be off by more than 3 cm. In other words, that expanded circle of radius 3 cm is expected to capture the "true" center of the ball's landing location.
The word is "unlikely" rather than "impossible". All margins of error come with a confidence level; usually it is 95% confidence. This means there is a 5 percent chance that Hawkeye may be off by more than 3 cm from the true location.
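To see what the 5 percent means in practice, here is a small simulation. It assumes, purely for illustration, that the estimation error along one direction follows a normal distribution, and that the 3 cm margin corresponds to 1.96 standard deviations (the usual 95% convention). Neither assumption comes from Hawkeye's documentation.

```python
import random


def share_beyond_margin(margin_cm=3.0, z=1.96, trials=100_000, seed=0):
    """Fraction of simulated estimates that miss the truth by more than the margin.

    Assumes normally distributed errors with standard deviation margin_cm / z.
    """
    rng = random.Random(seed)
    sigma = margin_cm / z
    misses = sum(abs(rng.gauss(0.0, sigma)) > margin_cm for _ in range(trials))
    return misses / trials


print(share_beyond_margin())  # close to 0.05 under these assumptions
```

Under these assumptions, roughly one estimate in twenty lands more than 3 cm away from the true location.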
Hawkeye's estimate is not error-free as the commentator assumed, even after allowing for the margin of error.
I'm curious about the margin of error associated with humans inspecting ball marks on the clay - I suspect it's small. (The error of judging the balls in flight, by contrast, is certainly much higher.)
***
At tournaments that use Hawkeye, the players are forbidden from challenging calls. Let's subvert the process and exchange the roles of humans and machines.
Assume Hawkeye makes the first call, in or out. If the player disagrees with the call, he or she raises a challenge, and the umpire (and/or line judge) goes to inspect the mark on the clay. Now, the umpire's word is final, no complaints allowed.
As an example, Hawkeye determines that the ball is 2.5 cm outside the line, which is less than 3 cm, thus the machine rules it "in". A player protests. The umpire decides that the mark on the ground is wholly outside the line, and changes the call to out. How will the commentators react?
If their reaction is not colored by a preference for machines over humans, they will say that the machine has made a mistake - and to accord with their current behavior (in reverse), they should then recommend that the tournament remove line-calling machines because they are not accurate.
This is an instance in which reversing the players makes clear one's biases.
***
If we take a Bayesian view of this, we should combine the evidence. In the first step, we have one estimate. Now, if the second estimate conforms with the first, then the evidence becomes stronger. But if the second estimate contradicts the first, then the evidence weakens. This is why I said at the start that the divergence of opinion causes me to lower my confidence in Hawkeye's estimate.
Even more, I believe that the human estimate derived from the mark on the ground is more accurate anyway so I'd give that even more weight.
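A rough sketch of this reasoning with Bayes' rule follows. All the numbers are made up for illustration: a 50/50 prior that the ball is out, a Hawkeye call that is right 90% of the time, and a human reading of the mark that is right 95% of the time. I also assume the two calls are independent given where the ball actually landed.

```python
def update(prior_out, says_out, accuracy):
    """Posterior probability that the ball is out after one more call.

    accuracy is the assumed chance that the caller reports the truth.
    """
    likelihood_if_out = accuracy if says_out else 1 - accuracy
    likelihood_if_in = 1 - accuracy if says_out else accuracy
    numerator = likelihood_if_out * prior_out
    return numerator / (numerator + likelihood_if_in * (1 - prior_out))


# Hypothetical inputs, not measured accuracies.
after_hawkeye = update(0.5, says_out=True, accuracy=0.90)
if_human_agrees = update(after_hawkeye, says_out=True, accuracy=0.95)
if_human_disagrees = update(after_hawkeye, says_out=False, accuracy=0.95)

print(round(after_hawkeye, 3), round(if_human_agrees, 3), round(if_human_disagrees, 3))
```

With these inputs, agreement pushes the probability of "out" from 0.9 to about 0.99, while disagreement pulls it down to about 0.32. Because the human is given the higher accuracy, the contradicting call more than cancels Hawkeye's.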
P.S. Outside of clay courts, the situation is more complicated as there are no ball marks to look at. I'm not against the technology. I'm against the illusion of perfection, and I'm against black-box technology that stifles dissent. Both these issues can be addressed by how technology is applied.