What Is a Decision Tree?

A decision tree is a supervised machine learning algorithm that creates a series of sequential decisions to reach a specific result. 

Written by Anthony Corbo
Published on Jan. 03, 2023
REVIEWED BY
Rahul Agarwal | Jan 06, 2023

Decision trees combine multiple data points and weigh degrees of uncertainty to determine the best approach to making complex decisions. This process allows companies to create product roadmaps, choose between suppliers, reduce churn, determine areas to cut costs and more.


 

What Is a Decision Tree Used For?

We typically use decision trees to map out the available options and their likely outcomes so that we can make better-informed decisions.

Decision trees allow us to break down information into multiple variables to arrive at a single best decision for a problem.
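To make the idea of sequential decisions concrete, here is a minimal sketch of a tree stored as nested dictionaries and walked one variable at a time. The churn scenario, the variable names (`monthly_logins`, `support_tickets`) and the thresholds are all hypothetical and chosen only for illustration. A real tree would learn its splits from data.

```python
# Hypothetical churn-risk tree: each internal node tests one variable,
# each leaf (a plain string) holds the final decision.
# Variable names and thresholds are invented for illustration.
TREE = {
    "feature": "monthly_logins",
    "threshold": 5,
    "below": {
        "feature": "support_tickets",
        "threshold": 3,
        "below": "medium risk",
        "at_or_above": "high risk",
    },
    "at_or_above": "low risk",
}


def decide(node, record):
    """Follow one branch per node until a leaf is reached."""
    while isinstance(node, dict):
        value = record[node["feature"]]
        branch = "below" if value < node["threshold"] else "at_or_above"
        node = node[branch]
    return node


customer = {"monthly_logins": 2, "support_tickets": 4}
print(decide(TREE, customer))  # high risk
```

Each record follows exactly one path from the root to a leaf, which is what makes a single tree easy to read and explain.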

Decision Tree Components

  • A decision tree starts with a single node, or “decision,” that connects two or more distinct arcs, called decision branches, that present potential options.
  • An event sequence comes next and is represented as a circular “chance node” that points out potential events that may result from a decision.
  • Finally, we call the costs and benefits associated with each branch of a decision tree “consequences.” The endpoint of a tree is represented by a triangle, or bar, known as a terminal.

To be effective, a decision tree must lay out every possibility clearly and in a structured manner. It must also present enough alternatives for data scientists to compare options, make collaborative decisions and optimize business growth.
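The components above can be sketched in a few lines. This example assumes the branches are compared by expected value (the probability-weighted average of each branch's consequences), which is one common criterion rather than the only one. The supplier names, probabilities and dollar amounts are invented for illustration.

```python
# Hypothetical supplier choice. Each decision branch leads to a chance node:
# a list of (probability, consequence) pairs, where the consequence is a
# net payoff in dollars. All figures are invented for illustration.
BRANCHES = {
    "supplier_a": [(0.9, 40_000), (0.1, -20_000)],
    "supplier_b": [(0.6, 70_000), (0.4, -30_000)],
}


def expected_value(chance_node):
    """Probability-weighted average of the consequences at a chance node."""
    return sum(probability * payoff for probability, payoff in chance_node)


def best_branch(branches):
    """Return the decision branch with the highest expected value."""
    return max(branches, key=lambda name: expected_value(branches[name]))


for name, chance_node in BRANCHES.items():
    print(name, round(expected_value(chance_node)))
print("choose:", best_branch(BRANCHES))
```

With these made-up numbers, the safer supplier wins even though the other offers a larger best-case payoff, because its downside is both smaller and less likely.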


 

Decision Trees vs. Random Forest: What’s the Difference?

Random forest algorithms differ from decision trees in that they combine the decisions of several trees in order to reach a final majority decision.

Decision trees incorporate multiple variables to determine potential outcomes that ultimately allow us to make a single, best decision. Random forest algorithms go a step further and do not rely on a single tree. Instead, they build many decision trees, each from a randomized sample of the data, and base the final decision on the majority opinion of those trees. A random forest is essentially the outputs of multiple decision trees weighed against each other to present a single outcome. That said, random forest doesn't necessarily determine the best solution, but instead introduces more diversity to create a smoother prediction based on the most likely outcome.
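The majority-vote step can be illustrated with a small sketch. To keep it short, the three trees below are hand-written functions rather than trees trained on random samples of data, so this shows only the voting, not how a real random forest is built. The variable names and rules are hypothetical.

```python
from collections import Counter


# Three hypothetical hand-written trees, each looking at different variables.
def tree_usage(record):
    return "churn" if record["monthly_logins"] < 5 else "stay"


def tree_support(record):
    return "churn" if record["support_tickets"] >= 3 else "stay"


def tree_billing(record):
    if record["late_payments"] > 1:
        return "churn"
    return "churn" if record["plan"] == "trial" else "stay"


FOREST = [tree_usage, tree_support, tree_billing]


def forest_predict(forest, record):
    """Collect one vote per tree and return the most common answer."""
    votes = Counter(tree(record) for tree in forest)
    return votes.most_common(1)[0][0]


customer = {
    "monthly_logins": 2,
    "support_tickets": 1,
    "late_payments": 0,
    "plan": "annual",
}
print(forest_predict(FOREST, customer))  # stay
```

Here one tree votes "churn" and two vote "stay," so the forest returns "stay." Using an odd number of trees avoids ties in a two-class problem.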

When to Use Decision Tree Over Random Forest

Random forest is best when multiple pieces of data come from a complex data set and must be analyzed to generate a final output. We effectively sacrifice easy interpretability to determine the most recurring output when we weigh virtually limitless inputs against each other. Decision trees are best used when working with simpler data sets due to easier interpretability and simpler model training.

 

What Are the Disadvantages of a Decision Tree?

The main disadvantages of decision trees lie in their tendency to quickly become complicated and overloaded with information.

Decision trees are used to determine logical solutions to complex problems but are ineffective unless they contain all possible outcomes of a decision. Accordingly, decision trees have a tendency to become loaded with several branches containing many variables, often branching off into separate outcomes entirely. This can lead to an overwhelming amount of data and more confusion than clarity when making decisions.

Decision trees may also lead to issues when making decisions with qualitative variables, which aren’t numerical in value but rather fit into categories. Numbers may be assigned to qualitative variables for data analysis, but qualitative data still has the potential to create a staggering number of branches or may present unclear decision possibilities entirely.
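A quick back-of-the-envelope calculation shows how fast qualitative variables can multiply branches. This sketch assumes the simplest case, where every variable is used once on each path and splits into one branch per category. The variables and categories are hypothetical.

```python
from math import prod

# Hypothetical qualitative variables and the categories each can take.
CATEGORIES = {
    "region": ["north", "south", "east", "west"],
    "plan": ["trial", "monthly", "annual"],
    "industry": ["retail", "finance", "health", "tech", "other"],
}


def max_leaves(categories):
    """Upper bound on endpoints when each variable splits once per category."""
    return prod(len(values) for values in categories.values())


print(max_leaves(CATEGORIES))  # 60
```

Three modest variables already allow up to 60 endpoints, and adding a fourth variable with ten categories would raise that ceiling to 600.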
