The first thing one learns (or should learn) about statistics is that "not all data is information." That's the very first thing I tell my class each semester. This message is doubly resonant in this age of "Big Data".
I was reading a post on Dell at Felix Salmon's blog, a post written by Ryan McCarthy or Ben Walsh. It cited BusinessWeek's Roben Farzad: "When it comes to putting a price on Dell, [he] points out one wild card: the company’s 3,449 patents (with another 1,660 pending)."
This is not the first time I have read about the potential value of the massive store of patents owned by some tech company. Large numbers are used to reinforce the point. This type of argument preys on people's tendency to extrapolate linearly: if one patent is worth something, 3,449 must be worth 3,449 times as much. (The original BusinessWeek piece goes on to describe how a consultancy tried to evaluate Dell's patents. Amusingly, the article contains no dollar amounts related to Dell, only the large number of patents.)
The value of patents might be modeled as a highly skewed distribution: something like an exponential with a very short half-life, or more likely one with an even heavier tail. That is to say, the vast majority of patents will have a value of zero or close to it, but there is a long tail consisting of things like Google's algorithm, the CAPTCHA idea, the coffee cup sleeve and so on, which have generated huge profits for the inventors.
So, even if you own only one patent, its value could dwarf the combined value of a portfolio of 3,500. If we look back at Dell's portfolio ten years from now, it is all but certain that only a few of the 3,500 will have provided a windfall while most of the others will have proved to be duds. It's about the quality of the patents, and not just the quantity.
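To make the point concrete, here is a rough simulation. It is a sketch under made-up assumptions, not an estimate of what real patents are worth: the function names, the portfolio size of 3,500, the seed, and the two distributions (an exponential with mean 1, and a shifted Pareto with shape 1.1 standing in for "heavy-tailed") are all my own choices for illustration. It draws a value for each patent and asks how much of the portfolio's total value sits in the 35 most valuable patents, the top 1%.

```python
import random


def top_share(values, top_k):
    """Fraction of the total value held by the top_k most valuable items."""
    ranked = sorted(values, reverse=True)
    return sum(ranked[:top_k]) / sum(ranked)


def simulate(n_patents=3500, top_k=35, seed=0):
    """Return the top_k share of value under two hypothetical value models.

    Both models are illustrative assumptions, not fitted to patent data.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable

    # Light tail: exponential values with mean 1.
    exponential = [rng.expovariate(1.0) for _ in range(n_patents)]

    # Heavy tail: Pareto with shape 1.1, shifted down by 1 so that
    # values start at zero and most of them stay small.
    heavy = [rng.paretovariate(1.1) - 1.0 for _ in range(n_patents)]

    return top_share(exponential, top_k), top_share(heavy, top_k)


if __name__ == "__main__":
    expo_share, heavy_share = simulate()
    print(f"Top 1% share of value, exponential model:  {expo_share:.1%}")
    print(f"Top 1% share of value, heavy-tailed model: {heavy_share:.1%}")
```

Under the heavy-tailed model, a handful of patents should account for a far larger share of the total than under the exponential model, which is why a raw count like 3,449 says so little about what a portfolio is worth.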
I'm sure you have come across these moments when you want to scream: your large numbers don't impress me. Feel free to share your stories.