What is machine learning? A quick guide to basic concepts
Machine learning does exactly what it says on the tin. It is a method by which a computer program can “automatically learn and improve from experience without being explicitly programmed.”
Now, what does that mean? Let's break down a few concepts.
Predictive modeling and learning
Consider how a living creature learns something. Say there’s a mail carrier who always brings a pocketful of dog treats on his rounds. Whenever he comes to a house with a dog, he drops one of those treats into the mail slot along with the mail. The dog inside the house recognizes the scent of the mailman, and knows that he comes to the house around 2 p.m. every day.
After a few days of receiving treats in the mail at the same time, the dog begins to understand the pattern: 2 p.m. = mailman = treats. The dog adjusts its behavior accordingly, getting excited and sitting near the door at the same time each day, then barking like crazy when the mailman gets there.
Naturally, there are anomalies in the dataset. Sometimes a different mailman comes who doesn’t bring treats, or sometimes the mailman is running behind schedule. And there’s no mail on Sundays.
The dog is not deterred by exceptions to the pattern, however, because the predictive model it has created is correct more often than not. If the dog were very intelligent, it might know to check that the correct mailman was coming before getting excited, and recognize that nobody comes every seventh day, making this predictive model even more accurate.
In broad strokes, a computer program using machine learning follows the same method. It analyzes data and searches for an underlying pattern or trend to develop a predictive model that learns from the data it's fed.
Any machine learning program needs a “training dataset”: example data that shows it what kind of information to expect, and from which it can begin picking out the patterns the programmer is looking for.
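To make the dog's "predictive model" concrete, here is a minimal sketch in Python. The observations are invented for illustration, and the names `train`, `predict`, and `accuracy` are hypothetical helpers rather than part of any library. For brevity, the sketch scores each strategy on the same days it learned from, which a real project would avoid.

```python
from collections import defaultdict

# Hypothetical observations, one per day: (regular_carrier_on_duty, treat_arrived)
observations = [
    (True, True), (True, True), (False, False), (True, True), (True, True),
    (True, False), (False, False), (True, True), (False, False), (False, False),
]

def train(rows):
    """Estimate how often a treat arrives for each kind of carrier."""
    counts = defaultdict(lambda: [0, 0])  # carrier flag -> [treats, days]
    for regular_carrier, treat in rows:
        counts[regular_carrier][0] += int(treat)
        counts[regular_carrier][1] += 1
    return {flag: treats / days for flag, (treats, days) in counts.items()}

def predict(model, regular_carrier):
    """Expect a treat when one arrived on at least half of the similar days."""
    return model.get(regular_carrier, 0.0) >= 0.5

def accuracy(rows, predictor):
    """Share of days on which the predictor matched what actually happened."""
    hits = sum(predictor(regular) == treat for regular, treat in rows)
    return hits / len(rows)

model = train(observations)
naive = accuracy(observations, lambda regular: True)  # always get excited
informed = accuracy(observations, lambda regular: predict(model, regular))
print(f"always expect a treat: {naive:.0%}, check the carrier first: {informed:.0%}")
```

On this made-up data, the dog that checks which carrier is coming is right more often than the dog that always gets excited, which mirrors the "very intelligent dog" described above.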
The difference between a dog and a computer program is, of course, the volume and complexity of the input.
Machine learning algorithms can process massive amounts of data and predict outcomes and patterns based on that information. Over time, the predictive model becomes more accurate as the program adjusts itself to new data, with no programmer rewriting its rules by hand.
There are three broad categories of algorithms, which are defined by what kind of training datasets they are given: supervised, unsupervised, and semi-supervised.
Each of these approaches has advantages and disadvantages, depending on what the program is intended to accomplish.
Supervised machine learning
Overview: Supervised machine learning algorithms are trained on datasets where a given input leads to a specific output according to a mapping function.
How it works: According to Jason Brownlee, “the goal is to approximate the mapping function so well that when you have new input data (X) that you can predict the output variables (Y) for that data.”
In this instance, the programmer is like a teacher who gives their student a quiz. The teacher knows the correct answers, and grades the student each time they are quizzed. The student keeps taking quizzes until they pass consistently.
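The quiz analogy can be sketched with a perceptron, one of the simplest supervised learners. This is just one possible illustration, not the only way supervised training works. The quiz data (a logical AND) and the names `answer` and `study` are assumptions made up for this example.

```python
# Hypothetical quiz: inputs (x1, x2) paired with the teacher's answer key (logical AND).
quiz = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

def answer(weights, bias, x):
    """The student's current guess for input x."""
    total = weights[0] * x[0] + weights[1] * x[1] + bias
    return 1 if total > 0 else 0

def study(quiz, max_rounds=100):
    """Retake the quiz, adjusting after each wrong answer, until every answer is right."""
    weights, bias = [0, 0], 0
    for round_number in range(1, max_rounds + 1):
        mistakes = 0
        for x, correct in quiz:
            error = correct - answer(weights, bias, x)  # the teacher's grade
            if error != 0:
                mistakes += 1
                # Perceptron update: nudge the weights toward the correct answer.
                weights[0] += error * x[0]
                weights[1] += error * x[1]
                bias += error
        if mistakes == 0:  # passed the whole quiz
            return weights, bias, round_number
    raise RuntimeError("student did not pass within max_rounds")

weights, bias, rounds = study(quiz)
print(f"passed after {rounds} rounds with weights={weights}, bias={bias}")
```

The "grade" here is simply the difference between the correct answer and the student's guess. The loop stops once a full pass through the quiz produces no mistakes.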
Supervised problems can be grouped into two types:
- Classification problems sort each input into one of a set of categories. An example of a classification problem is a spam filter for your email. The program reads the emails, and classifies them as spam or not-spam based on their content.
- Regression problems, on the other hand, return a numeric value rather than a category. For example, a program that calculates how many gallons of gas a car requires on a road trip given the distance and model of the car would require a regression algorithm.
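Both problem types can be sketched in a few lines of plain Python. The emails, trip figures, and function names below are all invented for illustration. The classifier is a deliberately crude word-count scorer, and the regression is an ordinary least-squares line fit that uses distance only and ignores the car's model.

```python
from collections import Counter

# --- Classification: label each email as spam or not-spam ---
labeled_emails = [  # hypothetical training data
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting notes for monday", "not-spam"),
    ("lunch on monday with the team", "not-spam"),
]

def train_classifier(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "not-spam": Counter()}
    for text, label in examples:
        word_counts[label].update(text.split())
    return word_counts

def classify(word_counts, text):
    """Label text as spam if its words were seen more often in spam."""
    score = sum(
        word_counts["spam"][word] - word_counts["not-spam"][word]
        for word in text.split()
    )
    return "spam" if score > 0 else "not-spam"

# --- Regression: predict gallons of gas from trip distance ---
trips = [(100, 4.1), (200, 7.9), (300, 12.2), (400, 15.8)]  # hypothetical (miles, gallons)

def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

word_counts = train_classifier(labeled_emails)
slope, intercept = fit_line(trips)
print(classify(word_counts, "claim your free prize"))
print(f"estimated gallons for 250 miles: {slope * 250 + intercept:.1f}")
```

The classifier returns a category, while the regression returns a number. That difference in output is the whole distinction between the two problem types.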
Unsupervised machine learning
Overview: Unsupervised machine learning has no correct output for the given input. Unlike supervised machine learning, there is no expected answer, and there is no teacher, just the program plugging away at the data on its own.
The goal of this type of machine learning is to analyze the data as a whole, and discover facts about the underlying structure.
How it works: When an unsupervised algorithm analyzes data, it is usually for one of two purposes:
- In a clustering problem, the goal is to find particular groups within a dataset. Segmenting your customers by age and income is an example of a clustering problem, as the program can show you which age and income groups are most common.
- An association problem is more focused on finding rules or patterns that govern a dataset. When you analyze a customer’s flow through your website, checking which links they are most drawn to, that’s association rule learning.
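Here is a rough sketch of both purposes, using made-up customer and website data. The clustering half is a bare-bones k-means with hand-picked starting centroids. The association half computes the "confidence" of a single rule, meaning how often visitors who saw one page also saw another. Every name and number is an assumption for illustration.

```python
# --- Clustering: group customers by (age, income in thousands) ---
customers = [(22, 28), (25, 31), (24, 30), (51, 82), (55, 90), (53, 85)]  # hypothetical

def k_means(points, centroids, rounds=10):
    """Repeatedly assign points to the nearest centroid, then recenter each centroid."""
    for _ in range(rounds):
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# --- Association: which pages tend to be visited together? ---
sessions = [  # hypothetical sets of pages seen in each visit
    {"home", "pricing", "signup"},
    {"home", "blog"},
    {"home", "pricing", "signup"},
    {"home", "pricing"},
    {"blog"},
]

def rule_confidence(sessions, if_page, then_page):
    """Of the sessions containing if_page, the share that also contain then_page."""
    matching = [pages for pages in sessions if if_page in pages]
    if not matching:
        return 0.0
    return sum(then_page in pages for pages in matching) / len(matching)

centroids, clusters = k_means(customers, centroids=[customers[0], customers[-1]])
print("cluster sizes:", [len(cluster) for cluster in clusters])
print("home -> pricing confidence:", rule_confidence(sessions, "home", "pricing"))
```

Note that neither half is given a "right answer" to aim for. The groups and the rule strengths come entirely from the structure of the data.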
Semi-supervised machine learning
Overview: Semi-supervised machine learning is, unsurprisingly, a combination of the first two types.
How it works: Techniques of supervised and unsupervised machine learning can be used in the same problem.
For example, one could group a dataset using an unsupervised algorithm and feed the resulting groups to a supervised algorithm as labels.
Semi-supervised machine learning doesn’t have any defined subcategories, but is most useful when your dataset is a mix of labeled and unlabeled data points.
Real-world problems, like classifying a collection of physical photographs, may be best solved by semi-supervised machine learning.
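One common way to combine labeled and unlabeled data is self-training, also called pseudo-labeling, sketched below. It is only one of several semi-supervised techniques. The single numeric "feature" per photo, the labels, and the function names are all hypothetical.

```python
# Hypothetical data: one numeric feature per photo; only two photos are labeled by hand.
labeled = [(1.0, "cat"), (9.0, "dog")]
unlabeled = [1.5, 2.0, 2.5, 8.0, 8.5, 7.5]

def class_centers(examples):
    """Average feature value for each label."""
    sums = {}
    for value, label in examples:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + value, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}

def nearest_label(centers, value):
    """Label whose center lies closest to the value."""
    return min(centers, key=lambda label: abs(centers[label] - value))

def self_train(labeled, unlabeled, rounds=3):
    """Guess labels for the unlabeled points, then retrain on the real and guessed labels."""
    examples = list(labeled)
    for _ in range(rounds):
        centers = class_centers(examples)
        pseudo_labeled = [(value, nearest_label(centers, value)) for value in unlabeled]
        examples = list(labeled) + pseudo_labeled
    return class_centers(examples)

centers = self_train(labeled, unlabeled)
print(centers)
print(nearest_label(centers, 3.0))
```

The two hand-labeled photos give the model a starting point, and the unlabeled photos then refine its idea of where each class sits.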
Machine learning use cases
The possible applications and advantages of this technology are numerous.
Very generally, we need machine learning if we want to accomplish a task that requires human-like adaptability, or that is too large for people to handle by hand. It also allows us to create an analytical model that is free of human bias, at least in theory.
Tasks that humans can learn to do automatically—such as understanding spoken words, judging road conditions, and recognizing people in a photograph—don't come easily to a typical computer program because it would need to learn from experiences as a human does.
Machine learning is designed to mimic human intelligence within set parameters. Every iteration helps the program improve its accuracy and ability to perform whatever task it's meant to do.
Human brains are marvelous data processors, but with limits. A human being could never do what a search engine does, for example, because there’s more information on the internet than a person can process.
A machine learning program can accomplish a task that most humans could do, such as searching a web page for keywords, but do it on a scale that only computers can handle.