There are two realities when it comes to artificial intelligence. In one, the future is so bright that you have to put on welding goggles just to look at it. AI is a backbone technology that is as necessary for global human activities as electricity and the Internet. But in the other reality, winter is coming.
An ‘AI winter’ is a period in which nothing can grow. That means nobody hires, nobody recruits and nobody funds anything. But this approaching barren season is special: it won’t affect the entire industry.
In fact, most experts won’t even notice. Google, OpenAI, DeepMind, Nvidia, Meta, IBM and any university that does legitimate research have nothing to worry about. Startups with a clear, useful purpose will be fine, barring the usual market problems.
The only people who need to worry about the coming chill are those trying to do what we’ve come to call “black box alchemy.”
Black box alchemy
I shudder to call any AI effort “alchemy” because at least the idea of turning one metal into another has some scientific merit.
I’m talking about the wildly popular research vein where researchers build crappy little prediction models and then make up fake problems for the AI to solve better than humans.
If you write it all down in one sentence, it sounds like it should be obvious that it’s a scam. But I’m here to tell you that black box alchemy currently represents a great deal of academic research, and that’s a bad thing.
Black box alchemy is what happens when AI researchers take something an AI is good at (like returning relevant results when you search for something on Google) and try to use the same principles to do something impossible. Because the work takes place in a black box that we can’t see inside, the AI can’t explain why it’s coming up with its results, and the researchers get to pretend they’re doing science without having to show any work.
It’s a scam that plays out across myriad paradigms, ranging from predictive policing and recidivism algorithms to bullshit pop facial recognition systems that supposedly detect everything from a person’s politics to whether they’re likely to become a terrorist.
The part that cannot be emphasized enough is that this particular scam is perpetuated in academia. It doesn’t matter if you plan on going to a community college or to Stanford: black box alchemy is everywhere.
Here’s how the scam works: Researchers come up with a scheme that allows them to develop an AI model that’s “more accurate” for a given task than humans.
This is, literally, the hardest part. You can’t choose a simple task like looking at pictures and deciding whether there is a cat or a dog in them. Humans will beat the AI 100 times out of 100 on this task. We are very good at distinguishing cats from dogs.
And you cannot choose a task that is too complicated. For example, it makes no sense to train a prediction model to determine which 1930s patents are most relevant to modern thermodynamic applications. The number of people who could win at that game is too small to matter.
You have to pick a task that the average person thinks can be observed, measured and reported through the scientific method, but that actually can’t be.
Once you’ve done that, the rest is easy.
My favorite example of black box alchemy is the Stanford Gaydar paper. It’s a masterpiece of bullshit AI.
Researchers trained a rudimentary computer vision system on a database of human faces. The faces were labeled with self-reported tags that indicated whether the person depicted was gay or straight.
Over time, they were able to reach superhuman levels of accuracy. According to the researchers, the AI was better than humans at telling which faces belonged to gay people, and no one knows why.
Here’s the truth: no human being can tell if another human being is gay. We can guess. Sometimes we guess right, other times we guess wrong. This is not science.
Science requires observation and measurement. If there is nothing to observe or measure, we cannot do science.
Homosexuality is not a ground truth. There is no scientific standard for homosexuality.
Here’s what I mean: are you gay if you experience same-sex attraction or only if you act on it? Can you be a gay virgin? Can you have a queer experience and stay straight? How many gay thoughts does it take to qualify as gay, and who gets to decide that?
The simple reality is that human sexuality is not a ground truth you can plot on a chart. No one can determine if someone else is gay. People have a right to stay in the closet, deny their own experiential sexuality, and decide how much “gay” or “straight” they need in their own lives to determine their own labels.
There is no scientific test for being gay. And that means the Stanford team can’t train AI to detect homosexuality; it can only train an AI to try and beat humans in a discrimination game that has no positive real-world use case.
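A toy sketch makes the problem concrete. The Python below runs on purely synthetic data and is not the Stanford team’s method: `make_dataset`, `shortcut_model`, `random_guess`, the "photo source" nuisance feature and its 80% agreement rate are all hypothetical, invented for illustration. It shows how a model can look impressive against self-reported labels simply by keying on something that happens to correlate with them.

```python
import random


def make_dataset(n, seed=0):
    """Build synthetic (source, label) pairs.

    label  -- a self-reported tag (hypothetical, not a measured ground truth)
    source -- a nuisance feature, e.g. which site a photo came from, that
              agrees with the label 80% of the time (an assumed rate)
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.random() < 0.5
        source = label if rng.random() < 0.8 else not label
        rows.append((source, label))
    return rows


def accuracy(predict, rows):
    """Fraction of rows where the prediction matches the self-reported label."""
    return sum(predict(source) == label for source, label in rows) / len(rows)


def shortcut_model(source):
    # Looks only at the nuisance feature; it knows nothing about the person.
    return source


_coin = random.Random(1)


def random_guess(source):
    # Stand-in for an uninformed guesser (an assumption): a fair coin flip.
    return _coin.random() < 0.5


rows = make_dataset(10_000)
model_acc = accuracy(shortcut_model, rows)
guess_acc = accuracy(random_guess, rows)
print(f"shortcut model: {model_acc:.2f}, random guess: {guess_acc:.2f}")
```

The shortcut model lands close to the 80% agreement rate built into the data, well ahead of the coin flip, yet all it has "detected" is where the photo came from. Beating a guesser on labels like these says nothing about whether the thing being labeled can be measured at all.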
The Stanford gaydar paper is just one of thousands of examples of black box alchemy out there. It shouldn’t surprise anyone that this line of research is so popular: it’s the low-hanging fruit of ML research.
Twenty years ago, the number of high school graduates interested in machine learning was a drop in the ocean compared to the number of teens entering college this year to earn a degree in AI.
And that’s both good and bad. The good thing is that there are more brilliant AI/ML researchers in the world today than ever – and that number will only continue to grow.
The bad thing is that every AI classroom on the planet is littered with students who don’t understand the difference between a Magic 8-Ball and a prediction model — and even fewer understand why the former is more useful for predicting human outcomes.
And that brings us to the three things every student, researcher, professor, and AI developer can do to make the entire field of AI/ML better for everyone.
- Don’t do black box alchemy. The first question to ask before embarking on an AI project related to prediction is: will it impact human outcomes? If the only way to measure the effectiveness of your project is to compare it to human accuracy, chances are you’re not doing great work.
- Don’t create new models for the sole purpose of exceeding the benchmarks of previous models just because you can’t afford to curate useful databases.
- Do not train models on data that you cannot guarantee to be accurate and diverse.
I’d like to end this article with those three tidbits as a sort of smug mic drop, but this isn’t that kind of moment.
The fact is that a large proportion of students will likely struggle to do anything new in AI/ML that does not break all three of these rules. And that’s because black box alchemy is easy, building custom databases is damn near impossible for anyone without the resources of big tech, and only a handful of universities and companies can afford to train large-parameter models.
We’re stuck in a place where the vast majority of students and aspiring developers don’t have access to the resources needed to go beyond looking for “cool” ways to use open source algorithms.
The only way to get through this era and into a more productive one is for the next generation of developers to buck current trends and carve their own path away from the status quo, much like the current group of pioneering AI developers did in their time.