I recommend the piece "Is AI Riding a One-Trick Pony?" by James Somers (link). Taken literally, the question's answer is yes. The bigger question is whether this pony is taking us to the promised land or merely to a holding spot. As Somers puts it, "AI today is deep learning, and deep learning is backprop." Backpropagation, the algorithm at the core of deep learning, was invented 30 years ago in Geoff Hinton's lab. The key to its recent ascent is the availability of much larger data sets and much faster computation. This rebirth of AI is a triumph of engineering rather than an intellectual advance.
Here are several other provocative quotes from the article:
- maybe we're not actually at the beginning of a revolution. Maybe we're at the end of one.
- there's a sort of reality distortion field that Hinton creates, an air of certainty and enthusiasm, that gives you the feeling there's nothing that vectors can't do... it's only when you leave the room that you remember: these "deep learning" systems are still pretty dumb
- a deep neural net that recognizes images can be totally stymied when you change a single pixel, or add visual noise that's imperceptible to a human
- to get a deep-learning system to recognize a hot dog, you might have to feed it 40 million pictures of hot dogs. To get [a two-year-old baby] to recognize a hot dog, you show her a hot dog
- the latest sweep of AI has been less science than engineering, even tinkering
- we're still largely in the dark about how those systems work, or whether they could ever add up to something as powerful as the human mind
One of the hot subjects of the moment is the claim that these neural networks can learn "representations." This is really a half-truth, and much more work is needed to prove such an audacious claim. Somers writes the following passage, which is similar to the stories about representational learning being passed around at meetings: "so-called vector arithmetic makes it possible to... subtract the vector for 'France' from the vector for 'Paris,' add the vector for 'Italy,' and end up in the neighborhood of 'Rome.' It works without anyone telling the network explicitly that Rome is to Italy as Paris is to France."
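The arithmetic Somers describes can be sketched with a toy example. The 3-dimensional vectors below are entirely made up for illustration; real word embeddings (e.g. word2vec) have hundreds of dimensions and are learned from text, not written by hand.

```python
import math

# Hypothetical word vectors -- the numbers are invented so that the
# "capital-of" relationship works out, which is the point being illustrated.
vectors = {
    "Paris":  [0.9, 0.1, 0.8],
    "France": [0.1, 0.1, 0.9],
    "Italy":  [0.1, 0.9, 0.1],
    "Rome":   [0.9, 0.85, 0.1],
    "Berlin": [0.9, 0.2, 0.2],
}

def cosine(a, b):
    """Cosine similarity: how close two vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Paris - France + Italy, computed component by component.
query = [p - f + i for p, f, i in zip(vectors["Paris"],
                                      vectors["France"],
                                      vectors["Italy"])]

# Find the word nearest the query vector, excluding the inputs themselves.
candidates = {w: v for w, v in vectors.items()
              if w not in ("Paris", "France", "Italy")}
best = max(candidates, key=lambda w: cosine(query, candidates[w]))
print(best)  # → Rome
```

Note that the network (or here, the toy data) only delivers a vector; it is the human who decides that the nearest word, "Rome", answers the analogy.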
This vector system is a representation of the raw data in latent dimensions. The general concept of latent variables has long been central to statistics and psychometrics. Think of any IQ or personality test: you are asked a series of questions, and your answer choices constitute the raw data. But the researcher isn't really interested in the responses to specific questions. The questions are designed to measure latent (hidden) dimensions such as your analytical ability or your extroversion. Each question sheds some light on some subset of latent dimensions. Your score on, say, extroversion is a weighted average of your raw answers to various questions. Using your answers, the researcher can place you in the universe defined by those latent dimensions. In the example concerning country capitals, words are being mapped onto a universe defined by latent dimensions in just the same way.
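A minimal sketch of that weighted average, assuming a made-up four-question extroversion scale. The answers (on a 1-to-5 scale) and the weights, called loadings in psychometrics, are hypothetical numbers chosen for illustration.

```python
# Raw responses to four questions, each on a 1-5 scale.
answers = [5, 4, 2, 5]

# Hypothetical loadings: how strongly each question taps the
# latent "extroversion" dimension.
loadings = [0.8, 0.6, 0.3, 0.7]

# The latent score is a weighted average of the raw answers.
extroversion = (sum(w * a for w, a in zip(loadings, answers))
                / sum(loadings))
print(round(extroversion, 3))  # → 4.375
```

A word-embedding model does something analogous at scale: each coordinate of a word's vector is a learned weighted combination of raw co-occurrence data, placing the word in a latent space.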
The interpretation of such dimensions has always involved human beings, and that is the case here as well. The neural network itself has no idea what Rome, Paris, France, or Italy is; each is just another word as far as the network is concerned. It's the human interpreter who imposes meaning on the network's output, and it's the human who selects those word pairs because he or she knows the correct relationship between them.
Read Somers's article here.