Mark P. and I have had discussions about evidence of efficacy related to Google Maps navigation: it is mighty hard to come up with data that prove that software like Google Maps or Waze actually delivers on the promise of reducing travel time. See this blog post for previous coverage.
These real-time optimization products often leave me with the feeling that they are performing "local" optimization, that is to say, they find the best solution within a highly constrained set of facts. A locally optimal solution is, in general, not globally optimal.
Since everyone has downloaded files from some web server, it's a good example to chew on. We have all seen the pop-up window that traces the estimated remaining time to finish downloading a large file. We also know that the prediction algorithm is so naive as to be laughable. It appears to perform a simple calculation: remaining file size divided by current download speed. This is the locally optimal answer. But internet download speeds are erratic, and so this local optimum is rarely accurate. We have all observed the estimated remaining time jump from 5 minutes to 5 hours, just because the internet speed has momentarily plunged.
The naive calculation is good if it is realistic to assume that the current download speed will hold steady until completion. It is also good if the current download speed happens to be close to the average speed over the entire download. (Neither is a good assumption.) The naive algo is reasonable if the set of facts available to the algorithm is constrained to (a) the current download speed and (b) the remaining file size. If you are not allowed to use anything other than those two values, you might well decide to just divide one by the other.
Constraining these facts, though, is a choice. For example, the algo can be given the recent history of download speeds, rather than having just the current speed to work with.
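To make this concrete, here is a minimal sketch in Python (with invented numbers) contrasting the naive estimate with one that divides by a running average of recent speeds. The smoothing scheme is just an illustration of the idea, not how any particular downloader actually works:

```python
# Minimal sketch (made-up numbers): the naive remaining-time estimate versus
# one based on an exponentially weighted average of recent speeds. The
# smoothing constant is an arbitrary illustration, not any real downloader's.

def naive_eta(remaining_mb: float, current_speed: float) -> float:
    """Locally 'optimal': assume the current speed holds until completion."""
    return remaining_mb / current_speed

def smoothed_eta(remaining_mb: float, speed_history: list[float],
                 alpha: float = 0.2) -> float:
    """Divide by a smoothed speed, which dampens momentary plunges and spikes."""
    avg = speed_history[0]
    for s in speed_history[1:]:
        avg = alpha * s + (1 - alpha) * avg
    return remaining_mb / avg

# Speed (MB/s) momentarily plunges from about 3 to 0.05:
history = [3.0, 3.1, 2.9, 3.0, 0.05]
remaining = 900.0  # MB left to download

print(naive_eta(remaining, history[-1]) / 3600)   # ~5 hours
print(smoothed_eta(remaining, history) / 60)      # ~6 minutes
```

With a bit of memory about recent speeds, the estimate no longer whipsaws every time the connection hiccups.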
***
Mark recently resurfaced this discussion because Google engineers have developed an optimization method for reducing congestion at road intersections (link).
Of course, the first thing I look for is proof. Here's what they say in the blog post:
Today, Green Light is live in over 70 intersections, helping to save fuel and lower emissions for up to 30 million car rides monthly. Early numbers indicate the potential to reduce stops by up to 30% and reduce emissions at intersections by up to 10%
There is no link to any papers or data, so we just have to take their word for it. The program manager elaborated:
We offer each city dedicated reports with tangible impact metrics, such as how many stops drivers saved at an intersection over time.
***
Let's think about how one might come up with evidence to prove the effectiveness of such optimization. The engineers say that the savings come from optimizing the timing of traffic lights to reduce stops at intersections, because stopping and restarting produces extra emissions.
(Having driven around in Europe, I think the Europeans solve the problem completely by favoring roundabouts over traffic lights but that's a different thread.)
First, we need a baseline value for comparison. For any given intersection (one of those 70), the traffic light timing now in place is the optimized one, so we'd need to know the counterfactual, i.e. what the amount of stoppage would have been had this intersection not implemented Google's optimization. To figure that out, we'd need to build a model that predicts the number of stops from a variety of factors, and we'd hope that this model has high accuracy.
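As a sketch of what that might look like, suppose we fit a simple model of stops on pre-intervention data and use it to predict the counterfactual afterwards. Everything below is simulated, and the predictors (traffic volume, rush hour) are hypothetical stand-ins, not anything Google has disclosed:

```python
# A sketch of the counterfactual approach with entirely simulated data and
# hypothetical predictors. Nothing here reflects Google's actual model or data.
import numpy as np

rng = np.random.default_rng(0)

# "Pre-intervention" period: hourly observations at one intersection
n = 500
volume = rng.uniform(200, 1200, n)                 # vehicles per hour
hour = rng.integers(0, 24, n)
rush = ((hour >= 7) & (hour <= 9)).astype(float)
stops = 0.4 * volume + 150 * rush + rng.normal(0, 30, n)

# Fit a simple least-squares model of stops on the pre-period data
X = np.column_stack([np.ones(n), volume, rush])
beta, *_ = np.linalg.lstsq(X, stops, rcond=None)

# "Post-intervention" period: predict what stops would have been without the
# new timing (the counterfactual), then compare against what was observed.
m = 200
volume_post = rng.uniform(200, 1200, m)
hour_post = rng.integers(0, 24, m)
rush_post = ((hour_post >= 7) & (hour_post <= 9)).astype(float)
X_post = np.column_stack([np.ones(m), volume_post, rush_post])
counterfactual = X_post @ beta

# Here the "observed" stops are faked as a 20% reduction, purely to show the math
observed = 0.8 * counterfactual + rng.normal(0, 30, m)

saved = counterfactual.sum() - observed.sum()
print(f"estimated stops saved: {saved:.0f} ({saved / counterfactual.sum():.0%})")
```

The catch, of course, is that any claimed savings is only as credible as the counterfactual model behind it.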
Second, let's extend our focus from a single intersection to a stretch of highway involving a series of intersections. Is Google's algo implemented at every intersection, at some selection of them, or at just one? For the sake of argument, say it's a single intersection, and we measure the change in stops there. It's quite possible that we have merely pushed the congestion to a neighboring intersection, so that what is gained locally is lost elsewhere.
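A toy simulation of two signals in series illustrates the accounting. Raising the first signal's throughput clears its own queue, but when the downstream signal is the binding constraint, the waiting simply reappears there (all numbers invented):

```python
# Toy simulation of two signals in series. All numbers are invented; the point
# is the bookkeeping: raising throughput at A just relocates the waiting to B
# when downstream capacity is the binding constraint.

CYCLES = 60
ARRIVALS_PER_CYCLE = 20   # cars reaching intersection A each signal cycle
CAPACITY_B = 18           # cars intersection B can discharge per cycle

def total_delay(capacity_a: int) -> int:
    """Car-cycles spent waiting at A or B, a rough stand-in for stops/delay."""
    queue_a = queue_b = delay = 0
    for _ in range(CYCLES):
        queue_a += ARRIVALS_PER_CYCLE
        released = min(queue_a, capacity_a)   # cars getting through A this cycle
        queue_a -= released
        queue_b += released
        queue_b -= min(queue_b, CAPACITY_B)   # cars getting through B
        delay += queue_a + queue_b            # everyone still stuck at a red
    return delay

print(total_delay(capacity_a=18))   # baseline: queue builds at A
print(total_delay(capacity_a=25))   # "optimized" A: same total, queue now at B
```

Measured only at intersection A, the second scenario looks like a triumph; measured across both, nothing has changed.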
***
Chapter 1 of Numbers Rule Your World (link) explores the complex world of traffic engineering through the tactic of ramp metering, i.e. controlling the flow of traffic onto highways. A key insight from that field is that congestion should be considered an unavoidable consequence when demand exceeds supply. Many tactics to reduce congestion merely redistribute it.
The ramp-metering program is also notable for a valiant effort to evaluate real-life performance by running a principled experiment.
Redistributing congestion may not reduce overall congestion, but it may impact drivers' experiences. What's better, one long delay at one intersection, or numerous smaller delays? What induces more road rage?
Posted by: Jon Peltier | 08/13/2024 at 12:59 PM
JP: It's quite complex, isn't it? Is it redistributing within one driver's trip, or between drivers? The way the algo is described, it tries to reduce stop-and-go, so I think they are going for one long delay rather than numerous shorter delays, in your example.
Posted by: Kaiser | 08/13/2024 at 01:03 PM