AI Snake Oil (Part 3): Evaluation

In the last post, I discussed training data: mostly that you ought to have it or a way of getting it. If someone pitches you an idea without even a reasonable vision of what the training data would be, they’ve got a lot less credibility. In other words, if you can’t even envision training data for a given task, then the task itself may be impractical.

A ridiculous image to get the idea of “AI evaluation” into your head. (No offense to the creators of this actual robot; the giant red letter “F” is not an actual evaluation of your robot. I just needed an image with a CC license. I hope it ping-ponged well.)

Next, let’s talk about evaluation with respect to application development, namely, if someone pitches an AI application idea to you:

Question: Do they have an evaluation procedure built into their application development process?

Arguably, evaluation is more important than training data. I chose to discuss training data in the first post because thinking in terms of training data gives you intuitions about what’s possible. It eliminates the infinite, but still leaves you with dreams. Evaluation is where your dreams are torn to shreds, whether or not you have training data.

Fundamentally, I want to cover three things here: why we evaluate, how we evaluate, and how we score the results. Understanding these three things is essential to understanding what makes a suitable evaluation; a crappy evaluation sows false confidence, which is worse than no evaluation at all.
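To make "false confidence" concrete, here is a minimal sketch of one way an evaluation can mislead: scoring a system on the same examples it was built from. This is an illustrative assumption about what a weak evaluation can look like, not the procedure the post goes on to describe. The sentiment data, the `memorizer` model, and the `accuracy` helper are all hypothetical.

```python
# Hypothetical toy data, invented for illustration only.
TRAIN = [
    ("great movie", "pos"),
    ("terrible plot", "neg"),
    ("loved it", "pos"),
    ("boring mess", "neg"),
]
HELD_OUT = [
    ("great plot", "pos"),
    ("terrible movie", "neg"),
    ("loved the cast", "pos"),
    ("boring it was", "neg"),
]

# A "model" that only memorizes its training examples.
memory = dict(TRAIN)


def memorizer(text):
    """Return the memorized label, or guess "pos" for unseen text."""
    return memory.get(text, "pos")


def accuracy(predict, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for text, label in examples if predict(text) == label)
    return correct / len(examples)


train_score = accuracy(memorizer, TRAIN)        # scored on data it has seen
held_out_score = accuracy(memorizer, HELD_OUT)  # scored on data it has not seen
```

Scored on its own training examples, the memorizer looks perfect; on unseen examples it is only guessing. An evaluation that reports just the first number is the kind that sows false confidence.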


AI Snake Oil (Part 2): Training Data

First in this series, I want to address the simplest and most important question to ask about a machine learning start-up or application:

Question: Is there existing training data? If not, how do they plan on getting it?

To sufficiently understand the answers to this question, you have to understand what training data is and, from there, what tasks or ideas would be extremely difficult to capture within training data. I’ll be addressing those in this post.

Most useful AI applications require training data: examples of the phenomenon they’re trying to replicate with the computer. If some start-up or group proposes a solution to a problem and they don’t have training data, you should be much more skeptical of their proposed solution; it’s now drifting into the magical, the expensive, or both.

I like to think of training data as artificial intelligence’s dirty secret. It never gets mentioned in the press, but it is the topic of Day 1 of any Machine Learning class and forms the theoretical basis for what you learn the rest of the semester. Techniques that use training data are often called statistical methods, since they gather statistics about the data they’re provided in order to make predictions; this is in contrast to the rule-driven methods that preceded them.
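As a rough sketch of what "gathering statistics about the data" can mean, the toy classifier below counts how often each word appears under each label and predicts the label whose counts best match a new text. Everything in it (the spam/ham examples, the `train` and `predict` functions) is a hypothetical illustration, far simpler than what a real system would use.

```python
from collections import Counter

# Hypothetical toy training data: (text, label) pairs invented for illustration.
TRAINING_DATA = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting notes for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train(examples):
    """Gather statistics: count how often each word appears under each label."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.split())
    return counts


def predict(counts, text):
    """Pick the label whose word counts overlap most with the text."""
    words = text.split()
    scores = {
        label: sum(word_counts[word] for word in words)
        for label, word_counts in counts.items()
    }
    return max(scores, key=scores.get)


model = train(TRAINING_DATA)
prediction = predict(model, "free prize monday")
```

Note that nobody wrote a rule saying "free means spam"; the behavior comes entirely from the examples. Take away the training data and there is nothing left to count, which is why the question of where the data comes from matters so much.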


AI Snake Oil (Part 1): Golden Lunar Toilet

A lot of over-hyped AI claims are being thrown around right now. Leveraging this hype, some individuals make promises they can’t keep, no matter how dedicated or talented they are as developers. Steve Jobs may have had a so-called “reality distortion field,” but that never spawned a conscious AI, and neither will these people.

What I do want to describe is how to tell if someone is trying to sell you AI snake oil: bullshit claims about what they can actually achieve in a realistic time and budget. Sure, with infinite resources, I could build you a gold toilet on the moon, but no one has that kind of cash lying around. Shit needs to get done, and the time and materials for doing so are finite.

Anything is possible. The only limit is yourself.
Anything is possible. I will make this happen for $412 billion. Please provide it in gold bullion so I can melt it down into the toilet of my own secret Swiss bank account.

If you’re approached by someone trying to sell you artificial intelligence-related software, or you read a piece in the popular press about what profession AI will uncannily crush in the next year, these are the questions you should ask. Depending on the answers, you can determine whether they’re bluffing or whether they’ve done their homework and are worth taking seriously.

I was originally going to make this one post, but it’s grown too large to fit into one. In this series, each post is centered around a question you should ask when someone wants to do something in the real world with natural language processing, machine learning, or other AI components. These questions are:

Each post will detail what you should expect for an answer. As I write, I might add to or revise some of these questions, so don’t consider this list definitive quite yet.

All said and done, there are some really great things happening in AI right now; it’s part of why I chose to invest six years of my life getting involved in computational linguistics as a field. However, with any big wave of technology, there’s also a big wave of exploitation. When people exploit the gap in knowledge between researchers and the public with hyperbole, it comes back to hurt those of us who work so hard to actually make shit that works. I hope that these posts can help non-researchers think more critically about AI and give researchers a way to inform the public without dragging them through the equivalent graduate-level coursework.