While artificial intelligence, machine learning and deep learning are trending tech terms that we hear everywhere these days, there are major misconceptions about what these words actually mean. Many companies claim to incorporate some kind of artificial intelligence (AI) in their applications or services, but what does that mean in practice?
Broadly, AI describes a machine that mimics cognitive functions humans associate with human minds, such as learning and problem solving. At its most elementary, AI can merely be a programmed rule that tells the machine how to behave in certain situations. In other words, artificial intelligence can be nothing more than several if-else statements.
An if-else statement is a simple rule programmed by a human. Consider a robot moving on a road. A programmed rule for that robot could be:
    if something_is_in_the_way:
        stop_moving()
    else:
        continue_moving()
So, when we talk about artificial intelligence, it's more worthwhile to consider two more specific subfields of AI: machine learning and deep learning.
Machine Learning vs. Deep Learning
Now that we better understand what AI actually means, we can take a closer look at machine learning and deep learning and draw a clear distinction between the two.
AI vs. Machine Learning vs. Deep Learning
- Artificial Intelligence: a program that can sense, reason, act and adapt.
- Machine Learning: algorithms whose performance improves as they are exposed to more data over time.
- Deep Learning: subset of machine learning in which multilayered neural networks learn from vast amounts of data.
Machine Learning Is Not New Technology
What is machine learning? We can think of machine learning as a series of algorithms that analyze data, learn from it and make informed decisions based on those learned insights.
Machine learning can automate a wide variety of tasks. It affects virtually every industry, from malware detection in IT security, to weather forecasting, to stockbrokers looking for optimal trades. Machine learning requires complex math and a lot of coding to achieve the desired results. It also incorporates classical algorithms for tasks such as clustering, regression and classification. We have to train these algorithms on large amounts of data. In general, the more data you provide to your algorithm, the better your model gets.
Machine learning is a relatively old field and incorporates methods and algorithms that have been around for decades, some of them since the 1960s. These classic algorithms include the Naïve Bayes classifier and support vector machines (SVMs), both of which are often used in data classification. In addition to classification, there are also cluster analysis algorithms such as k-means and tree-based clustering. To reduce the dimensionality of data and gain more insight into its nature, machine learning uses methods such as principal component analysis and t-SNE.
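To make one of these classic algorithms concrete, here is a minimal sketch of k-means on one-dimensional data. The function name `k_means_1d`, the starting centroids and the sample points are all made up for illustration; real projects would normally rely on a tested library implementation.

```python
def k_means_1d(points, centroids, iterations=10):
    """Cluster 1-D points around the given starting centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # Nothing moved, so the clustering has settled.
        centroids = new_centroids
    return centroids, clusters


# Illustrative data with two obvious groups.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids, clusters)
```

No labels are involved here: the algorithm groups the points purely by how close they are to each other, which is what distinguishes clustering from classification.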
The training component of a machine learning model means the model tries to optimize along a certain dimension. In other words, machine learning models try to minimize the error between their predictions and the actual ground truth values.
For this we must define an error function (also called a loss function or an objective function), because the model has an objective. This objective could be to classify data into different categories (e.g. cat and dog pictures) or to predict the expected price of a stock in the near future. When someone says they are working with a machine learning algorithm, you can get to the gist of its value by asking: what’s the objective function?
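As an illustration of what such an error function can look like, the sketch below uses mean squared error, one common choice for numeric predictions. The article does not prescribe a particular loss, so treat this as an assumed example; the sample values are invented.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and ground truth."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)


# Invented ground truth values and two sets of guesses.
targets = [3.0, 5.0, 7.0]
good_guess = [3.0, 5.0, 8.0]
poor_guess = [0.0, 0.0, 0.0]

good_loss = mean_squared_error(good_guess, targets)
poor_loss = mean_squared_error(poor_guess, targets)
print(good_loss, poor_loss)
```

The closer the predictions are to the ground truth, the smaller the number the function returns, which gives training a single quantity to push down.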
How Machine Learning Works: How Do We Minimize Error?
We can compare the model’s prediction with the ground truth value and adjust the parameters of the model so that next time the error between these two values is smaller. This process is repeated many times, often millions, until the parameters that determine the model’s predictions are so good that the difference between the predictions and the ground truth labels is as small as possible.
In short, machine learning models are optimization algorithms. If you tune them right, they minimize error by guessing and guessing and guessing again.
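The loop of predicting, comparing and adjusting can be sketched with gradient descent on a one-variable linear model. This is only one of several optimization methods, and the function name `train_linear_model`, the learning rate, the step count and the data are assumptions chosen for the example.

```python
def train_linear_model(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by repeatedly nudging w and b to reduce squared error."""
    w, b = 0.0, 0.0  # Start from an arbitrary guess.
    n = len(xs)
    for _ in range(steps):
        # Compare the current predictions with the ground truth.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Adjust the parameters in the direction that shrinks the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Invented data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train_linear_model(xs, ys)
print(w, b)
```

After enough steps the learned `w` and `b` land very close to the slope and intercept that generated the data, even though the model started from zero.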
Deep Learning vs. Machine Learning: The Next Big Thing
Unlike machine learning, deep learning is a young subfield of artificial intelligence based on artificial neural networks.
Since deep learning algorithms also require data in order to learn and solve problems, we can also call it a subfield of machine learning. The terms machine learning and deep learning are often treated as synonymous. However, these systems have different capabilities.
Deep learning uses a multi-layered structure of algorithms called a neural network.
Artificial neural networks have unique capabilities that enable deep learning models to solve tasks that traditional machine learning models struggle with.
Many recent advances in artificial intelligence are due to deep learning. Without it, self-driving cars, chatbots and personal assistants like Alexa and Siri would be far less capable, and services such as Google Translate and Netflix's recommendations would be much cruder.
We can even go so far as to say that the new industrial revolution is driven by artificial neural networks and deep learning. It is the closest approach to true machine intelligence we have so far, in part because deep learning has two major advantages over traditional machine learning.
Why Is Deep Learning Better Than Machine Learning?
No Feature Extraction
The first advantage of deep learning over traditional machine learning is that manual feature extraction becomes unnecessary.
Long before we used deep learning, traditional machine learning methods (decision trees, SVMs, Naïve Bayes classifiers and logistic regression) were most popular. These are otherwise known as flat algorithms. In this context “flat” means these algorithms cannot typically be applied directly to raw data (such as .csv files, images or text). Instead we require a preprocessing step called feature extraction.
In feature extraction we provide an abstract representation of the raw data that classic machine learning algorithms can use to perform a task (e.g. the classification of the data into several categories or classes). Feature extraction is usually complicated and requires detailed knowledge of the problem domain. This step must be adapted, tested and refined over several iterations for optimal results.
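As a rough sketch of this workflow, the example below hand-crafts two features from tiny grayscale images and passes them to a simple nearest-centroid classifier. The feature choices, the function names `extract_features` and `nearest_centroid`, and the toy images are all hypothetical; they only stand in for the domain-specific work a real project would require.

```python
def extract_features(image):
    """Hand-crafted features: mean brightness and horizontal edge strength."""
    pixels = [p for row in image for p in row]
    mean_brightness = sum(pixels) / len(pixels)
    # Average absolute difference between horizontally adjacent pixels.
    diffs = [abs(row[i + 1] - row[i]) for row in image for i in range(len(row) - 1)]
    edge_strength = sum(diffs) / len(diffs)
    return [mean_brightness, edge_strength]


def nearest_centroid(features, centroids):
    """A flat classifier: pick the label whose reference features are closest."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(features, centroids[label]))


# Toy 2x3 "images" with pixel values between 0 and 1.
flat_image = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
striped_image = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]

# One reference example per class, described by its extracted features.
centroids = {
    "flat": extract_features(flat_image),
    "striped": extract_features(striped_image),
}

new_image = [[0.1, 0.9, 0.2], [0.8, 0.1, 0.9]]
label = nearest_centroid(extract_features(new_image), centroids)
print(label)
```

The classifier never sees the pixels, only the two numbers a human decided were relevant. If those features were poorly chosen, no amount of training would fix the result.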
Deep learning models are built on artificial neural networks, which don’t require manual feature extraction. The layers are able to learn an implicit representation of the raw data on their own.
A deep learning model produces an abstract, compressed representation of the raw data over several layers of an artificial neural network. We then use this compressed representation of the input data to produce the result. The result can be, for example, the classification of the input data into different classes.
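The flow from raw input to compressed representation to class scores can be sketched as a forward pass through a tiny two-layer network. The weights below are hypothetical, hand-picked values; a real network would learn them during training, and the layer sizes are chosen only to keep the example short.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs plus a bias per unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def relu(values):
    """Keep positive activations and zero out negative ones."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical weights: 4 inputs -> 2 hidden units -> 2 classes.
hidden_weights = [[1.0, -1.0, 0.5, 0.0], [0.0, 0.5, -1.0, 1.0]]
hidden_biases = [0.0, 0.0]
output_weights = [[1.0, -1.0], [-1.0, 1.0]]
output_biases = [0.0, 0.0]

raw_input = [1.0, 0.0, 2.0, 0.0]
# The hidden layer compresses four raw values into two.
hidden = relu(dense(raw_input, hidden_weights, hidden_biases))
# The output layer turns that representation into class probabilities.
probabilities = softmax(dense(hidden, output_weights, output_biases))
print(hidden, probabilities)
```

Here the hidden layer plays the role of the compressed representation, and the final layer reads the classification off it.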
During the training process, the neural network optimizes this step to obtain the best possible abstract representation of the input data. Deep learning models require little to no manual effort to perform and optimize the feature extraction process. In other words, feature extraction is built into the process that takes place within an artificial neural network without human input.
If we want to use a machine learning model to determine whether a particular image shows a car, we humans first need to identify the distinguishing features of a car (shape, size, windows, wheels, etc.), extract these features and give them to the algorithm as input data. The machine learning algorithm then classifies the image. That is, in machine learning, a programmer must intervene directly in the classification process.
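With a deep learning model, this manual step is unnecessary: the network learns the distinguishing features of a car from the raw images on its own. The same applies to most other tasks you’ll tackle with neural networks. Give the raw data to the neural network and let the model do the rest.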
Deep Learning for Big Data
The other major advantage of deep learning, and a key part of understanding why it’s becoming so popular, is that it’s powered by massive amounts of data. The era of big data will provide many opportunities for new innovations in deep learning.
Deep learning models tend to become more accurate as the amount of training data increases, whereas traditional machine learning models such as SVMs and Naïve Bayes classifiers tend to stop improving after a saturation point.
Deep learning models scale better with a larger amount of data. To paraphrase Andrew Ng, former chief scientist of the Chinese search engine Baidu, co-founder of Coursera and a founder of the Google Brain project: if a deep learning algorithm is a rocket engine, data is the fuel.