In the first chapter of my first book, Numbers Rule Your World (link), I explored the concept of variability using a pair of examples, one of which was Disney's FastPass virtual reservation system. Truly grasping the ins and outs of variability is one of the most important objectives for a budding statistician (or data scientist). In the discussion, I highlighted the work of Len Testa, whose website, TouringPlans.com, provides custom, computer-optimized itineraries for saving time in Disney's theme parks. Testa's team does exemplary work in applying mathematical models to solve a practical problem. I'm glad to present an interview with Testa today.
Len Testa holds a Master's degree in Computer Science from North Carolina A&T State University. He is the co-author of the Unofficial Guides to Walt Disney World, Disneyland, and Britain's Best Days Out, as well as a contributor to other books. His grad school research on the Time-Dependent Traveling Salesman Problem forms the basis of the Unofficial Guide's itinerary software.
KF: The problem of visiting a sequence of destinations in the fewest steps has a long history, and lots of people have worked on it. The most famous version, the Traveling Salesman Problem, even makes it into the mainstream press. However, most of this work is highly theoretical. To me, your touring plans are a shining example of applied work that makes a lot of people happier. How does your work differ from the others?
LT: A lot of times when you're trying to analyze data to solve a particular problem, you can approach it either from the perspective of "management" - the people controlling the process - or from the side of the "consumer."
The thing we try to model is optimal movement through a theme park. That is, if you're a customer and you want to ride 10 attractions, in what order should you visit them to minimize your wait in line?
The first time we approached this problem, we tried to figure out all of the things you need to know if you're running a theme park: ride speed, how many vehicles to have on the track, when to schedule entertainment to draw people to other parts of the park, and so on. It was complicated.
Then we looked at it from the point of view of the consumer. Consumers, it turns out, have a lot less information about how a theme park is run. About the only thing they really have is the posted wait time at every attraction. But it turns out that the posted wait time is really a synthesis of all the small decisions a theme park manager makes, so that's all you need. It's also a lot simpler to model.
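To make the consumer-side idea concrete, here is a minimal sketch in Python. It is not the TouringPlans software: it assumes only three attractions, made-up posted wait times per 30-minute block, and fixed ride and walking times, and it simply tries every visiting order. All names and numbers (POSTED_WAITS, RIDE_MINUTES, WALK_MINUTES, and so on) are hypothetical.

```python
from itertools import permutations

# Hypothetical posted wait times in minutes, one entry per 30-minute
# block after park opening. These numbers are invented for illustration.
POSTED_WAITS = {
    "Coaster":   [15, 40, 60, 60, 45],
    "Boat Ride": [10, 10, 25, 30, 20],
    "Dark Ride": [30, 25, 15, 10, 10],
}
BLOCK_MINUTES = 30  # assumed granularity of the posted waits
RIDE_MINUTES = 10   # assumed duration of every ride
WALK_MINUTES = 10   # assumed walk between any two attractions


def posted_wait(attraction, minute):
    """Posted wait at `attraction` for a guest arriving at `minute`."""
    waits = POSTED_WAITS[attraction]
    block = min(minute // BLOCK_MINUTES, len(waits) - 1)
    return waits[block]


def total_wait(order):
    """Total minutes spent in line when visiting attractions in `order`."""
    clock = 0
    waited = 0
    for attraction in order:
        wait = posted_wait(attraction, clock)
        waited += wait
        # Time moves on while we wait, ride, and walk to the next stop.
        clock += wait + RIDE_MINUTES + WALK_MINUTES
    return waited


def best_order(attractions):
    """Brute force: try every order and keep the one with the least waiting."""
    return min(permutations(attractions), key=total_wait)


plan = best_order(list(POSTED_WAITS))
print(plan, total_wait(plan))
```

Trying every order is only feasible for a handful of attractions, since the number of orders grows factorially. The point of the sketch is that the posted wait time, looked up at the moment you arrive, is the only input the model needs.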
KF: That is a really great answer. I hope all the budding data analysts out there are listening. Simplicity is a thing of beauty.
KF: What is your pet peeve with published data interpretations?
LT: Lack of context, especially around economic or political analyses. Yeah, an $860 billion stimulus package sounded like a lot of money in 2008. But in relation to a $15 trillion economy, it's what – 6%? But all of the discourse was on the raw number, not its size relative to the economy.
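The arithmetic behind that "what, 6%?" is easy to check. A small Python sketch, using the two figures exactly as quoted in the answer (they are the speaker's round numbers, not verified statistics), with a hypothetical helper name:

```python
def share_of_economy(amount, economy):
    """Return `amount` as a percentage of `economy`."""
    return 100.0 * amount / economy


stimulus = 860e9  # $860 billion, as quoted in the interview
economy = 15e12   # $15 trillion, as quoted in the interview

pct = share_of_economy(stimulus, economy)
print(f"{pct:.1f}% of the economy")
```

Putting the raw number next to its denominator is the context the answer is asking for: the ratio comes out a little under 6%.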
KF: Do you have other tips for doing great applied data work?
LT: I find reasoning by analogy to be a powerful way to understand and explain things. For example, if you're putting off a flight to Europe because you're afraid of a plane crash, which is a 1-in-500,000 chance, then why would you ever drive to work, where the odds of dying are an order of magnitude worse? So a lot of it is an "If I'm willing to accept X then I should be willing to accept Y" type of thing.
Another helpful thing is being able to apply Bayes' Theorem, especially when you're trying to make a business case for something. I remember one time we were trying to get funding to re-do some computer system (at another job), and we calculated a probability of 80% that if we made these changes, we'd succeed in reducing customer problems and lowering future operating costs. Some people looked at that as a 1-in-5 chance of failure. I pointed out that we were making decisions every day with a lot less than a documented 80% chance of success.
KF: Which sources do you turn to for reliable data analysis?
LT: For economic analyses, Paul Krugman and the Financial Times. And Nate Silver.
Closer to home, I like the mix of statisticians we have now at TouringPlans.com. They have different styles they use to approach problems, so it's useful to hear two different views. And when they agree, you know you've got a decent shot at being right.
KF: Statisticians disagree? They don't know the truth. Readers: you heard it on this blog first! Len, thank you so much for your time.