Machine learning is a branch of artificial intelligence where algorithms and statistical models are used to identify patterns in data and make predictions without explicit programming. These algorithms are optimized through trial and error and feedback, meaning machines learn by experience and increased exposure to data much the same way humans do. Machine learning is applied in a range of industries and applications, including fraud detection, healthcare forecasting and natural language processing.
4 Types of Machine Learning
- Supervised learning
- Unsupervised learning
- Semi-supervised learning
- Reinforcement learning
Machine learning algorithms fall into four main types: supervised learning, unsupervised learning, semi-supervised learning and reinforcement learning.
Here’s what to know about each type and a few ways they are used.
1. Supervised Learning
Supervised learning involves training a model on labeled training data. It’s one of the most popular forms of machine learning and can train models to accomplish tasks in classification, regression or forecasting.
Supervised learning requires a significant amount of human intervention because it depends on labeled data sets. Data must be divided into features (the input data) and labels (the output data).
Features describe individual, measurable units of data, such as height, salary, colors or animal breeds.
Labels are used to group data by specific characteristics and are often assigned manually by humans to help explain the context of certain data to the machine. For example, a data label indicates whether or not there’s a dog in a picture, or if the word “hello” is spoken in an audio clip. This teaches a machine what elements it needs to recognize, plus how to identify labeled elements from raw data in the future.
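As a minimal sketch of the feature and label split described above, the snippet below separates a handful of records into inputs and outputs. The records, the field names (height_cm, weight_kg, is_dog) and their values are hypothetical and only serve as illustration.

```python
# Hypothetical records: each row holds two features and one label.
records = [
    {"height_cm": 30, "weight_kg": 4, "is_dog": False},
    {"height_cm": 60, "weight_kg": 30, "is_dog": True},
    {"height_cm": 25, "weight_kg": 3, "is_dog": False},
]

FEATURE_NAMES = ["height_cm", "weight_kg"]
LABEL_NAME = "is_dog"

# Features (inputs) become rows of numbers; labels (outputs) become a parallel list.
features = [[row[name] for name in FEATURE_NAMES] for row in records]
labels = [row[LABEL_NAME] for row in records]
```

A supervised algorithm would then be trained on the pairs formed by features[i] and labels[i].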
With supervised learning, labeled input and output data is repeatedly fed into the system during training, and the model adjusts itself to narrow the gap between its predictions and the known labels. This generally helps predictions increase in accuracy as more labeled data is processed. Humans also provide feedback on the accuracy of the machine learning algorithm during this process, which helps it to learn over time.
Supervised learning examples:
- Recommender systems and recommendation engines
- Inbox spam detection
- Stock and housing market value prediction
Supervised learning, like each of these machine learning types, serves as an umbrella for specific algorithms and statistical methods, including the ones below.
Classification
Used to further categorize data, classification algorithms are a great tool to sort, and even hide, that data. (If you use Gmail or any large email client, you may notice that some emails are automatically redirected to a spam or promotions folder, essentially hiding those emails from view.)
A few popular classification algorithms used to sort data include K-nearest neighbor (KNN), naive Bayes classifier algorithms, support vector machine (SVM) algorithms, decision trees and random forest models.
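To make one of these concrete, here is a small sketch of K-nearest neighbor classification using only the Python standard library. The function name knn_predict, the two email features and the training points are assumptions made up for this example, not a reference implementation.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical emails described by (number of links, number of exclamation marks).
train_points = [(0, 0), (1, 0), (0, 1), (8, 6), (9, 7), (7, 9)]
train_labels = ["inbox", "inbox", "inbox", "spam", "spam", "spam"]

prediction_a = knn_predict(train_points, train_labels, (8, 8))
prediction_b = knn_predict(train_points, train_labels, (1, 1))
```

With this toy data, an email at (8, 8) sits among the spam examples and one at (1, 1) sits among the inbox examples, so each query takes the label of its closest neighbors.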
Regression
Regression algorithms are frequently used tools for forecasting trends. These algorithms identify relationships between an outcome (the dependent variable) and one or more independent variables to make predictions. Linear regression algorithms are the most widely used, but other commonly used regression algorithms include ridge regression and lasso regression. Logistic regression, despite its name, is typically used for classification tasks because it predicts the probability that an input belongs to a category.
In simple linear regression, a feature acts as the x variable, while a label acts as the y variable.
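The sketch below fits a simple linear regression with the standard least-squares formulas, where the slope is the covariance of x and y divided by the variance of x. The function name and the housing numbers are hypothetical and chosen so the relationship is easy to follow.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: floor area in square meters (feature, x)
# and price in thousands (label, y).
areas = [50, 60, 70, 80, 90]
prices = [150, 180, 210, 240, 270]

slope, intercept = fit_simple_linear_regression(areas, prices)
predicted_price = slope * 100 + intercept  # prediction for a 100 square meter home
```

Because the toy prices are exactly three times the area, the fitted line has a slope of 3 and an intercept of 0. Real data would be noisier and the fit only approximate.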
2. Unsupervised Learning
With unsupervised learning, raw data that’s neither labeled nor tagged is processed by the system, meaning less work for humans. Unsupervised learning algorithms discover patterns or anomalies in large, unstructured data sets that may otherwise go undetected by humans. This makes it applicable for accomplishing tasks related to clustering or dimensionality reduction.
Unsupervised learning algorithms work by analyzing available data and grouping information based on similarities and differences, thus creating relationships between data points.
Unsupervised learning examples:
- Automated customer and audience segmentation
- Computer vision
- Breach and anomaly detection
These two types of unsupervised learning methods are among the most common:
Clustering
Clustering algorithms are the most widely used example of unsupervised machine learning. These algorithms focus on similarities within raw data, and then group that information accordingly. More simply, these algorithms provide structure to raw data. Clustering algorithms are often used with marketing data to garner customer (or potential customer) insights, as well as for fraud detection.
Some clustering algorithms include hierarchical clustering and k-means clustering.
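As an illustration, k-means clustering can be sketched in a few lines: assign every point to its nearest centroid, move each centroid to the mean of its points, and repeat. The version below is a simplified sketch that assumes the first k points are used as starting centroids (practical implementations usually choose them more carefully), and the customer data is hypothetical.

```python
import math

def k_means(points, k, iterations=10):
    """Cluster points into k groups; returns (centroids, assignments)."""
    # Assumption for simplicity: the first k points are the starting centroids.
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments

# Hypothetical customers described by (visits per month, average basket size).
customers = [(1, 2), (9, 8), (2, 1), (1, 1), (8, 9), (9, 9)]
centroids, assignments = k_means(customers, k=2)
```

No labels are supplied here. The two groups emerge only from how close the points are to one another.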
Dimensionality Reduction
Dimensionality reduction is the process of reducing the number of features within a data set while preserving important properties of the data. This is done to reduce processing time, storage space and complexity, and to lower the risk of overfitting in a machine learning model.
The two main methods for applying dimensionality reduction are feature selection and feature extraction. Feature selection involves selecting a subset of relevant features from the original feature set to use as input into a model, which helps simplify the model and can improve the accuracy of outputs. Feature extraction involves deriving new, more informative features by transforming or combining the original ones, which cuts through redundant data while retaining the information that most improves output.
Popular dimensionality reduction algorithms include principal component analysis (PCA), non-negative matrix factorization (NMF), linear discriminant analysis (LDA) and generalized discriminant analysis (GDA).
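Feature selection, the simpler of the two methods described above, can be sketched with a variance filter: a feature that barely changes across the data set carries little information and can be dropped. This is only one of many possible selection criteria, and the function name, column names and values below are hypothetical.

```python
from statistics import pvariance

def select_features_by_variance(rows, feature_names, threshold):
    """Keep only the features whose variance across rows exceeds threshold."""
    columns = list(zip(*rows))
    kept = [i for i, column in enumerate(columns) if pvariance(column) > threshold]
    kept_names = [feature_names[i] for i in kept]
    reduced_rows = [[row[i] for i in kept] for row in rows]
    return kept_names, reduced_rows

# Hypothetical data set: the "country_code" column is constant, so it carries no signal.
feature_names = ["age", "country_code", "salary_k"]
rows = [
    [25, 1, 40],
    [32, 1, 55],
    [47, 1, 90],
    [51, 1, 72],
]

kept_names, reduced_rows = select_features_by_variance(rows, feature_names, threshold=0.0)
```

Here the constant column is removed and the data set shrinks from three features to two without losing any variation.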
3. Semi-Supervised Learning
Semi-supervised learning offers a balanced mix of both supervised and unsupervised learning. With semi-supervised learning, a hybrid approach is taken as small amounts of labeled data are processed alongside larger chunks of raw data. This strategy essentially gives algorithms a head start when it comes to identifying relevant patterns and making accurate predictions when compared with unsupervised learning algorithms, without the time, effort and cost associated with more labor-intensive supervised learning algorithms.
Because semi-supervised learning uses labeled data and unlabeled data, it often relies on modified supervised and unsupervised algorithms trained for both data types.
Semi-supervised learning examples:
- Fraud detection
- Speech recognition
- Text document classification
Here are a couple of algorithms that fall under semi-supervised learning:
Self-Training
Self-training algorithms use a pre-existing, supervised classifier model, known as a pseudo-labeler, that’s trained on a small portion of labeled data in a set. The pseudo-labeler is then used to make predictions on the remainder of the dataset, which is unlabeled. Labels produced from this process are called pseudo-labels, and the most confident ones are added back into the labeled dataset. The model repeats these steps until all data samples are labeled or no remaining predictions are confident enough to add, with the aim of improving its accuracy over time.
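The loop described above can be sketched as follows. The pseudo-labeler here is a deliberately tiny stand-in (it stores one average value per class and needs at least two classes), and the confidence measure, the min_confidence threshold and the data are all assumptions for illustration rather than a standard recipe.

```python
def train_centroids(values, labels):
    """Fit a tiny classifier: one mean value (centroid) per class."""
    centroids = {}
    for label in set(labels):
        members = [v for v, l in zip(values, labels) if l == label]
        centroids[label] = sum(members) / len(members)
    return centroids

def predict_with_confidence(centroids, value):
    """Return (label, confidence); confidence is the gap between the two nearest centroids."""
    ranked = sorted(centroids, key=lambda label: abs(value - centroids[label]))
    best, runner_up = ranked[0], ranked[1]
    confidence = abs(value - centroids[runner_up]) - abs(value - centroids[best])
    return best, confidence

def self_train(labeled_values, labels, unlabeled_values, min_confidence=2.0):
    """Repeatedly pseudo-label confident samples and retrain on the grown set."""
    values, labels, pool = list(labeled_values), list(labels), list(unlabeled_values)
    while pool:
        centroids = train_centroids(values, labels)  # retrain the pseudo-labeler
        predictions = [(v, *predict_with_confidence(centroids, v)) for v in pool]
        confident = [(v, label) for v, label, conf in predictions if conf >= min_confidence]
        if not confident:
            break  # nothing left that the model is sure about
        for v, label in confident:
            values.append(v)       # the pseudo-labeled sample joins the labeled set
            labels.append(label)
            pool.remove(v)
    return values, labels, pool

# Hypothetical one-dimensional data with two classes.
final_values, final_labels, still_unlabeled = self_train(
    [1.0, 2.0, 9.0, 10.0], ["low", "low", "high", "high"], [3.0, 8.0, 5.4, 1.5]
)
```

In this toy run, three of the four unlabeled values receive pseudo-labels, while 5.4 sits almost halfway between the classes and stays unlabeled because no prediction for it is confident enough.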
Label Propagation
Label propagation algorithms assign labels to unlabeled observations by propagating, or allocating, labels through a dataset over time, usually across a graph that connects related data points. These datasets tend to start with a small section already having labels, and assign labels based on direct connections between these data points in the graph. Label propagation can be used to quickly identify communities, detect abnormal behavior or accelerate marketing campaigns. For example, if one customer on a graph likes a certain product, a customer directly connected to them may also like it.
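A simplified sketch of this idea: start from a few labeled nodes and let each unlabeled node adopt the most common label among its already-labeled neighbors, round after round. Full label propagation methods typically work with weighted, probabilistic updates, so treat this majority-vote variant, along with the hypothetical customer graph and label names, as an illustration only.

```python
from collections import Counter

def propagate_labels(graph, seed_labels, max_rounds=10):
    """Spread seed labels through graph; unlabeled nodes adopt their neighbors' majority label."""
    labels = dict(seed_labels)
    for _ in range(max_rounds):
        updates = {}
        for node, neighbors in graph.items():
            if node in labels:
                continue  # seeds and already-labeled nodes stay fixed
            neighbor_labels = [labels[n] for n in neighbors if n in labels]
            if neighbor_labels:
                updates[node] = Counter(neighbor_labels).most_common(1)[0][0]
        if not updates:
            break  # no unlabeled node has a labeled neighbor left
        labels.update(updates)
    return labels

# Hypothetical customer graph: an edge means two customers are directly connected.
graph = {
    "ana": ["ben", "cy"],
    "ben": ["ana", "cy", "dee"],
    "cy": ["ana", "ben", "dee"],
    "dee": ["ben", "cy", "eli"],
    "eli": ["dee", "fay"],
    "fay": ["eli"],
}
seed_labels = {"ana": "likes_product", "fay": "dislikes_product"}
labels = propagate_labels(graph, seed_labels)
```

Starting from only two labeled customers, every customer in the graph ends up with a label inherited through their direct connections.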
4. Reinforcement Learning
With reinforcement learning, AI-powered software programs, commonly referred to as intelligent agents, respond to their surrounding environment and make decisions independently to achieve a desired outcome. In physical systems, these agents may be outfitted with sensors. (Think simulations, computer games and the real world.)
Intelligent agents train themselves by being rewarded for desired behaviors and penalized for undesired ones. By perceiving and interacting with their environment, these agents learn through trial and error, gradually shifting toward the behavior that earns the most reward over time.
Reinforcement learning examples:
- Robotics
- Self-driving cars
- Helping machines acquire specific skills and behaviors
These are some of the algorithms that fall under reinforcement learning:
Q-Learning
Q-learning is a reinforcement learning algorithm that does not require a model of the intelligent agent’s environment. Q-learning algorithms iteratively calculate the value of actions based on rewards resulting from those actions, which improves outcomes and behaviors over time.
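The iterative value update can be sketched with a small table of Q-values. The environment below is a hypothetical five-state corridor with a reward at the right end, and the step function only plays the role of a simulator; the learning code never inspects its rules. The learning rate alpha, discount gamma and episode count are assumed values picked for this toy problem.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal
ACTIONS = ["left", "right"]

def step(state, action):
    """Hypothetical environment: move along the line, reward 1.0 on reaching the goal."""
    next_state = max(0, state - 1) if action == "left" else min(N_STATES - 1, state + 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state == N_STATES - 1
    return next_state, reward, done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, max_steps=100, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Explore with random actions; Q-learning is off-policy, so it can
            # still learn the values of the greedy policy from this behavior.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Core update: nudge Q(s, a) toward reward + discounted best future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q

q_table = q_learning()
# Greedy policy: in each non-goal state, pick the action with the highest Q-value.
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)}
```

After training, the greedy policy read from the table moves right in every state, which is the shortest route to the reward.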
Deep Reinforcement Learning
Used in the development of self-driving cars, video games and robots, deep reinforcement learning combines deep learning (machine learning based on artificial neural networks) with reinforcement learning, where the agent’s actions, or responses to its environment, are either rewarded or punished. Deep reinforcement learning requires vast amounts of data and computing power.
Frequently Asked Questions
What is machine learning?
Machine learning is a subfield of artificial intelligence (AI) where systems learn from experiences and optimize processes through exposure to data, all without explicit programming.
What are the four types of machine learning?
- Supervised learning
- Unsupervised learning
- Semi-supervised learning
- Reinforcement learning
Is ChatGPT machine learning?
ChatGPT can be considered a machine learning-based chatbot since it is built on GPT (generative pre-trained transformer) architecture, a type of neural network and deep learning model. ChatGPT uses these machine learning processes to understand and generate human-like conversations.