Not Magic: How Neural Networks and Their Developers Work

Neural networks don’t just recognize text, images, and speech—they also help diagnose diseases and search for natural resources. But how does this actually work?

Three Key Facts About Artificial Intelligence

  • Machine learning is already part of our daily lives. It's not some futuristic technology like the flying cars we still haven't seen. We participate in machine learning every day: we're either the subjects of this learning or we supply the data for it.
  • There are no “magical black boxes.” There’s no artificial intelligence where you just throw something in and it calculates everything for you. The most important thing is high-quality data for training. All architectures and algorithms are well-known; the secret to any cool new application is always in the data.
  • Machine learning advances thanks to the open community. We support open source, just like Google and other companies that develop their technology in the open.

From Heuristics to Learning

A quick primer: AI is a large field, and machine learning is one part of it. There are many algorithms, but the most interesting are neural networks. Deep learning means neural networks with many layers, and that's the kind we work with.

Why don't the old algorithms suffice, and why do we need machine learning? Sure, doctors recognize cancer better than neural networks, but reliably only at stage four, when it's often too late. To detect disease at stage one, you need algorithms. The same goes for natural resources: oil used to gush from the ground, but that's no longer the case, and new deposits are getting harder to find.

All our previous knowledge is based on heuristic algorithms. For example, if a person has a certain family history, we might suspect a tumor is a certain type. But if we don’t have that information, we do nothing. That’s heuristics.
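
To make the contrast concrete, here is a minimal sketch of what a heuristic looks like in code. The rule and the field names are hypothetical, invented purely for illustration:

```python
# A hypothetical heuristic: a hand-written rule that fires on one known
# pattern and stays silent otherwise. The field names are invented.
def assess_tumor(patient: dict) -> str:
    if patient.get("family_history") == "known_tumor_type":
        return "suspect that tumor type"
    return "no conclusion"  # no matching rule: the heuristic does nothing

print(assess_tumor({"family_history": "known_tumor_type"}))  # suspect that tumor type
print(assess_tumor({}))                                      # no conclusion
```

A machine-learning system replaces the hand-written `if` with a function learned from data, which is exactly why the data matters so much.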

Most professional software in various fields is still built on heuristics. Developers are trying to switch to machine learning, but it's hard, because you need data.

For example, Pornhub has great neural network algorithms, but also uses heuristics. The site has sections like “Popular” (by views), “Best” (by likes), and “Hottest.” The “Hottest” section isn’t based on views or hashtags—it’s videos people watch last before leaving the site, meaning they evoke the most emotion.

When and why did neural networks appear? They were first written about in 1959, but the number of publications only started to rise sharply in 2009. For 50 years, nothing happened: there wasn’t enough computing power or modern graphics accelerators. Now, there are about 50 new publications on neural network achievements every day, and there’s no turning back.

The key point: neural networks aren’t magic. When people hear I work in data science, they pitch startup ideas like: “Just grab all the data from Facebook, throw it into a neural network, and predict everything.” But it doesn’t work that way. There’s always a specific data type and a clear task.

  • There’s no such thing as “recognition” in a mathematical sense—that’s just how people describe it. Complex tasks are always broken down into simpler subtasks.

Picture a digitized image of the handwritten number 9, 28 by 28 pixels.

The first layer of the neural network is the input: it "sees" 784 pixels in various shades of gray (28 × 28 = 784). The last layer is the output: several categories, and we ask the network to assign the input to one of them. In between are the hidden layers.

The hidden layers implement a function we don't define with heuristics: during training, they learn to map the input pixels to a class with a certain probability.
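
As a rough illustration, here is a minimal sketch of such a network in PyTorch. The layer sizes are illustrative choices, not anything prescribed above:

```python
import torch
import torch.nn as nn

# A minimal sketch of the network described above: 784 grayscale pixels in,
# 10 digit classes out, hidden layers in between. Hidden sizes are arbitrary.
model = nn.Sequential(
    nn.Flatten(),            # 28x28 image -> 784-dimensional vector
    nn.Linear(784, 128),     # first hidden layer
    nn.ReLU(),
    nn.Linear(128, 64),      # second hidden layer
    nn.ReLU(),
    nn.Linear(64, 10),       # output: one score per digit class
)

image = torch.rand(1, 28, 28)                # stand-in for the handwritten "9"
probabilities = model(image).softmax(dim=1)  # scores turned into class probabilities
print(probabilities.argmax().item())         # the predicted digit
```

Untrained, this network guesses at random; training is the process of adjusting its weights until the mapping from pixels to classes emerges.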

How Neural Networks Work with Images

  • Classification. You can train a neural network to classify images, like recognizing dog breeds (see the sketch after this list). But you need millions of images, and the data type must match your real use case: if you train a network to find dogs and then show it cupcakes, it'll still look for dogs, with amusing results.
  • Detection. This is a different task: finding an object of a certain class in an image. For example, uploading a photo of a beach and asking the network to find people and kites. Such algorithms are being beta-tested by the "Liza Alert" search-and-rescue team, which uses drones to take many photos during searches. The algorithm filters out irrelevant images, but humans still validate the results.
  • Segmentation (single-class and multi-class). Used, for example, in self-driving cars. The network assigns objects to classes: cars, sidewalks, buildings, people—each with clear boundaries.
  • Generation. Generative networks take essentially nothing as input, just random noise, and output an object of a given class. The hidden layers learn to turn that "nothing" into something specific. For example, here are two faces, both generated by a neural network: the network looks at millions of photos and learns, through many iterations, what a face should look like.
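
For the classification task above, here is a minimal sketch using a pretrained ImageNet model from torchvision (assuming a recent torchvision; the file name is hypothetical, and a real dog-breed classifier would be fine-tuned on breed-labeled photos):

```python
import torch
from torchvision import models, transforms
from PIL import Image

# A pretrained ImageNet classifier as a stand-in; the weights are
# downloaded on first use.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

# Standard ImageNet preprocessing: resize, crop, normalize.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = preprocess(Image.open("dog.jpg")).unsqueeze(0)  # hypothetical input file
with torch.no_grad():
    scores = model(image)
print(scores.argmax().item())  # index of the predicted ImageNet class
```

Feed this model a cupcake and it will still answer with one of the classes it was trained on, which is exactly the mismatch described in the first bullet.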

If we can generate an image, we can also make it move like a specific person, that is, generate video. For example, a recent viral video showed Obama saying "Trump is an idiot." Obama never said that: a neural network was trained to "match" Obama's face, so when an actor spoke, the actor's facial movements were mapped onto Obama's face. Another example is Ctrl Shift Face, which makes impressive deepfakes of celebrities. Neural networks aren't perfect yet, but they get better every year, and soon it will be impossible to tell a real person from a network-generated one in video. Face ID will no longer protect us from fraud.

How Neural Networks Work with Text

Text carries no meaning for a network: words are turned into "vectors," on which you can do arithmetic like "king minus man plus woman equals queen."

But since neural networks learn from human-created texts, oddities arise. For example: “Doctor minus man plus woman equals nurse.” In the network’s view, female doctors don’t exist.
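
Both analogies can be reproduced with off-the-shelf word vectors. This sketch assumes gensim and its downloadable GloVe vectors, which are one choice among many:

```python
import gensim.downloader

# Pretrained GloVe vectors (downloaded on first use; ~130 MB).
vectors = gensim.downloader.load("glove-wiki-gigaword-100")

# "king" - "man" + "woman" ~= "queen"
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

# The same arithmetic exposes biases absorbed from human-written text:
print(vectors.most_similar(positive=["doctor", "woman"], negative=["man"], topn=1))
```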

  • Machine translation. Old translators used heuristics: these words mean this, they can only be translated and arranged in a certain way. The results were often nonsense. Today, Google Translate uses neural networks, and its translations are much more natural.
  • Text generation. Six months ago, a neural network appeared that, given a topic and a few keywords, can write an essay (a sketch of this kind of generation follows this list). It works well, but it doesn't fact-check or consider ethics. The authors didn't release the code or training data, saying the world isn't ready for this technology and it could be misused.
  • Speech recognition and generation. The same as with images: there’s a sound, and the signal is digitized. That’s how “Alice” and Siri work. When you type text into Google Translate, it translates, forms a sound wave from the letters, and plays it back—generating speech.
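
To get a feel for the text-generation task, here is a minimal sketch using a small publicly available GPT-2 model via the Hugging Face transformers library (an assumption; the essay-writing system mentioned above was not public at the time). The prompt is made up:

```python
from transformers import pipeline

# A small public GPT-2 as a stand-in generator; weights download on first use.
generator = pipeline("text-generation", model="gpt2")

prompt = "Neural networks are not magic because"
result = generator(prompt, max_new_tokens=40, num_return_sequences=1)
print(result[0]["generated_text"])  # fluent continuation, no fact-checking
```

The output reads fluently, but nothing in the model checks whether any of it is true, which is precisely the concern raised above.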

Reinforcement Learning

The game “Arkanoid” is a simple example of reinforcement learning:

  • There’s an agent—the thing you control, which can change its behavior (the paddle at the bottom).
  • There’s an environment, described by various modules (everything around the paddle).
  • There's a reward: when the agent drops the ball, it loses reward.

When the network scores points, we tell it it's doing well. The network then searches for actions that maximize its reward. At first, it just stands still. We say, "Bad." It moves one pixel. "Bad." It keeps trying random moves. Training a neural network this way is a long and expensive process.
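
Here is a minimal sketch of that agent-environment-reward loop, using Gymnasium's CartPole as an assumed stand-in for Arkanoid, with a purely random policy, i.e. the "tries random moves" stage:

```python
import gymnasium as gym

# Agent, environment, reward: the three pieces from the list above.
env = gym.make("CartPole-v1")
observation, info = env.reset(seed=0)

total_reward = 0.0
for _ in range(200):
    action = env.action_space.sample()  # random policy: no learning yet
    observation, reward, terminated, truncated, info = env.step(action)
    total_reward += reward              # the signal the agent learns to maximize
    if terminated or truncated:         # dropped the ball; start over
        observation, info = env.reset()

print(f"reward collected by random play: {total_reward}")
env.close()
```

A reinforcement learning algorithm replaces the random `sample()` with a policy that is gradually adjusted to collect more reward.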

Another example is the game Go. In May 2014, people said computers wouldn’t learn Go anytime soon. But the next year, a neural network beat the European champion. In March 2016, AlphaGo beat the world champion, and the next version beat the previous one 100:0, making completely unpredictable moves. It had no restrictions except the game rules.

Why spend so much money teaching computers to play games and investing in eSports? Because training robots to move and interact in the real world is even more expensive: if your algorithm crashes a multimillion-dollar drone, that's a big loss. But you can practice against human players, in Dota for example, at no risk.

Open Source

How are machine learning applications implemented? Bold claims online that some company wrote an app that “recognized everything” aren’t true. There are market leaders who develop tools and release them as open source so everyone can write code, suggest changes, and move the field forward. There are “good guys” who also share some code. But there are also “bad guys” who don’t develop their own algorithms, use what the “good guys” wrote, make their own “Frankensteins,” and try to sell them.

Examples of Data Science in the Oil Industry

  • Finding new deposits. To determine whether there's oil underground, specialists set off explosions and record the signals to see how vibrations travel through the earth. Surface waves distort the picture, so the result needs to be "cleaned." Seismic experts do this in special programs, picking new filter combinations each time. We can train a neural network to do the same. But the network may remove not just surface noise but useful signal as well, so we add a condition: clean only the part of the signal the experts actually work with. This is called an "attention network" (a sketch of the idea follows this list).
  • Describing core samples by lithology type. This is a segmentation task. There are photos of core samples (rocks from a well). A specialist must manually identify the layers, which takes weeks or months. A trained neural network can do it in under an hour. The more we train it, the better it gets.
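
Here is a minimal sketch of the "clean only where the experts work" condition from the first bullet, in PyTorch. The shapes, the mask, and the choice of loss are all illustrative assumptions, not the actual production system:

```python
import torch
import torch.nn.functional as F

# Reconstruction loss weighted by a region-of-interest mask: the network
# is only penalized inside the region the experts actually pick over.
def masked_loss(denoised: torch.Tensor,
                target: torch.Tensor,
                roi_mask: torch.Tensor) -> torch.Tensor:
    per_pixel = F.mse_loss(denoised, target, reduction="none")
    return (per_pixel * roi_mask).sum() / roi_mask.sum().clamp(min=1)

# Toy example: a 64x64 seismic section with a hypothetical region of interest.
denoised = torch.rand(1, 64, 64)   # the network's cleaned output
target = torch.rand(1, 64, 64)     # the experts' reference cleaning
roi_mask = torch.zeros(1, 64, 64)
roi_mask[:, 16:48, :] = 1.0        # the band the experts work with
print(masked_loss(denoised, target, roi_mask).item())
```

Outside the mask the network is free to leave the signal untouched, so it cannot be rewarded for deleting useful data there.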

“Better Than a Human”

Experts often ask how a neural network’s work can be compared to human experience: “Ivan Petrovich has been with us since 1964—he’s smelled every core sample!” Of course, but he did the same as the network: took the core, checked the textbook, watched how others did it, and tried to find patterns. The neural network just works much faster and can “live” Ivan Petrovich’s experience 500 times a day. Still, people don’t trust the technology, so we have to break tasks into small steps so experts can validate each one and believe the network works.

Claims that a neural network works “better than a human” are usually baseless, because there’s always someone “dumber” than the network. If you ask me to “recognize oil,” and I say, “Well, somewhere here,” you’ll say, “Aha, our system works better than you.” In reality, to evaluate a neural network’s effectiveness, you need to compare it to a group of top experts in the field.

Accuracy claims are also questionable. If you have ten people, one with lung cancer, and say they’re all healthy, you’re 90% accurate. You missed one out of ten, but the result is useless. Any news about revolutionary developments is false if there’s no open code or description of how it was done.
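
The ten-patients arithmetic is easy to check; plain Python is enough:

```python
# The example from the text: calling everyone healthy scores 90% accuracy
# but finds zero of the actual cancer cases.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]   # one sick patient out of ten
y_pred = [0] * 10                          # "they're all healthy"

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred)) / sum(y_true)

print(f"accuracy: {accuracy:.0%}")  # 90% -- looks impressive
print(f"recall:   {recall:.0%}")    # 0%  -- the result is useless
```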

Data must be high-quality. You can't just throw uncleaned, randomly collected data into a neural network and get something useful. What do "bad data" look like? To detect cancer, you need many high-resolution CT scans to build a 3D cube of the organ. Then a doctor can spot a suspicious mass. We asked specialists to label many such images to train the network. The problem: one doctor says the cancer is here, another sees two masses, a third thinks something else entirely. You can't train a network on this, because they've marked completely different tissues as cancer. Train on such data and the network will see cancer everywhere.

Problems with Neural Networks

  • With the dataset. A Chinese traffic-violation system once fined a woman for jaywalking, but "she" was just an ad on the side of a bus crossing the intersection. The network was trained on the wrong dataset: it needed objects in context to distinguish real pedestrians from ads on buses.
  • Another example: a lung cancer detection competition. One group released a dataset of 1,000 images labeled by three experts, keeping only the cases where all three agreed; that dataset could be used for training. Another company announced it had used hundreds of thousands of X-rays, but only 20% were of sick patients, the very ones we care about: without enough of them, the network can't learn to detect disease. On top of that, those 20% mixed several disease categories, and since the images were 2D rather than 3D, nothing useful could be done with the dataset.
  • It’s important to include real information in the dataset. Otherwise, you’ll end up fining people who are just stickers on buses.
  • With implementation. Neural networks don’t know what to suggest when there’s no information or when to stop. For example, if you create a new email account, the first ads you see will be random. If you searched for a couch and bought one, you’ll still see couch ads for a while, because the network doesn’t know you already bought it. A chatbot that “fell in love” with Hitler just mimicked what people did. Remember: you create content every day, and it can be used against you.
  • With reality. In Florence, there’s an artist who puts funny stickers on road signs to brighten people’s days. But such signs probably aren’t in the training data for self-driving cars. If you release a car into that world, it might hit a few pedestrians and stop.

So, for neural networks to work well, we shouldn’t just hype them up—we should study math and use what’s available in open source.
