A decision tree is a simple representation for classifying examples. It’s a form of supervised machine learning where we repeatedly split the data according to certain parameters.

Components of Decision Tree Classification

  • Nodes: test for the value of a certain attribute
  • Edges/Branches: correspond to the outcome of a test and connect to the next node or leaf
  • Leaf nodes: terminal nodes that predict the outcome and represent class labels or class distribution

To understand the concept of decision trees, consider the following example. Let’s say you want to predict whether a person is fit or unfit, given their age, eating habits and physical activity. The decision nodes are questions like “What’s the person’s age?,” “Does the person exercise?,” and “Does the person eat a lot of pizza?”

The leaves represent outcomes like fit or unfit.
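A tree like this can be written out as ordinary conditional logic. The age threshold and branch order below are invented for illustration, not taken from any real model:

```python
# A hypothetical, hard-coded version of the fit/unfit tree.
# Each `if` is a decision node, each branch an edge, and each
# returned label a leaf node.
def predict_fitness(age: int, exercises: bool, eats_pizza: bool) -> str:
    if age < 30:                      # decision node: "What's the age?"
        if eats_pizza:                # decision node: "Eats a lot of pizza?"
            return "unfit"            # leaf
        return "fit"                  # leaf
    else:
        if exercises:                 # decision node: "Does the person exercise?"
            return "fit"              # leaf
        return "unfit"                # leaf

print(predict_fitness(age=25, exercises=False, eats_pizza=False))  # fit
print(predict_fitness(age=45, exercises=True, eats_pizza=True))    # fit
```

A learned tree does exactly this at prediction time; training is the process of discovering which questions and thresholds to ask.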

2 Main Types of Decision Trees

  1. Classification Trees
  2. Regression Trees


1. Classification Trees (Yes/No Types)

What we’ve seen above is an example of a classification tree where the outcome was a variable like “fit” or “unfit.” Here the decision variable is categorical/discrete.

We build this kind of tree through a process known as binary recursive partitioning. This iterative process means we split the data into partitions and then split it up further on each of the branches.

Example of classification tree
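As a concrete sketch, scikit-learn’s `DecisionTreeClassifier` performs this kind of binary recursive partitioning for us (assuming scikit-learn is installed; the toy fitness data below is made up for illustration):

```python
# Fitting a classification tree on a tiny, made-up fitness dataset.
from sklearn.tree import DecisionTreeClassifier

# Features: [age, exercises (1/0), eats a lot of pizza (1/0)]
X = [
    [25, 1, 0],
    [30, 0, 1],
    [45, 1, 0],
    [50, 0, 1],
]
y = ["fit", "unfit", "fit", "unfit"]  # categorical/discrete target

clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X, y)

print(clf.predict([[28, 1, 0]]))  # classify a new, unseen example
```

Each internal node the learner creates is a binary test on one feature; the data is partitioned at that node and each partition is split again until the stopping criteria are met.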


2. Regression Trees (Continuous Data Types)

Regression trees are decision trees wherein the target variable contains continuous values or real numbers (e.g., the price of a house, or a patient’s length of stay in a hospital).

Example of a regression tree
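A minimal regression-tree sketch along the same lines (the house sizes and prices below are made up for illustration):

```python
# Fitting a regression tree: the target is a continuous value (a price),
# and each leaf predicts the mean target of the training samples in it.
from sklearn.tree import DecisionTreeRegressor

X = [[50], [60], [80], [100], [120]]               # house size in square meters
y = [150_000, 180_000, 240_000, 310_000, 380_000]  # price (continuous target)

reg = DecisionTreeRegressor(max_depth=2, random_state=0)
reg.fit(X, y)

print(reg.predict([[90]]))  # predicted price for a 90 m^2 house
```

The structure is identical to a classification tree; only the leaf values differ, holding real numbers instead of class labels.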



How to Create a Decision Tree

To create a decision tree, we break down a set of training examples into smaller and smaller subsets; this process incrementally develops the associated tree. At the end of the learning process, the algorithm returns a decision tree covering the training set.

The key is to use decision trees to partition the data space into clustered (or dense) regions and empty (or sparse) regions.

In decision tree classification, we classify a new example by submitting it to a series of tests that determine the example’s class label. These tests are organized in a hierarchical structure called a decision tree. Decision trees follow the divide and conquer algorithm.



Divide and Conquer

We build decision trees using a heuristic called recursive partitioning. This approach is also commonly known as divide and conquer because it splits the data into subsets, which then split repeatedly into even smaller subsets, and so on and so forth. The process stops when the algorithm determines the data within the subsets are sufficiently homogenous or have met another stopping criterion.

Advantages of Classification with Decision Trees

  • Inexpensive to construct
  • Extremely fast at classifying unknown records
  • Easy to interpret for small-sized trees
  • Accuracy comparable to other classification techniques for many simple data sets
  • Exclude unimportant features


Basic Divide and Conquer Algorithm

  1. Select a test for the root node. Create a branch for each possible outcome of the test.

  2. Split instances into subsets, one for each branch extending from the node.

  3. Repeat recursively for each branch, using only instances that reach the branch.

  4. Stop recursion for a branch if all its instances have the same class.

Disadvantages of Classification with Decision Trees

  • Easy to overfit
  • Decision boundaries are restricted to being parallel to attribute axes
  • Decision tree models are often biased toward splits on features having a large number of levels
  • Small changes in the training data can result in large changes to decision logic
  • Large trees can be difficult to interpret and the decisions they make may seem counter-intuitive


Decision Tree Classifier

Using the decision algorithm, we start at the tree root and split the data on the feature that results in the largest information gain (IG), i.e., the greatest reduction in uncertainty about the final decision.

In an iterative process, we can then repeat this splitting procedure at each child node until the leaves are pure. This means that the samples at each leaf node all belong to the same class.

In practice, we may set a limit on the tree’s depth to prevent overfitting. We compromise on purity here somewhat as the final leaves may still have some impurity.
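The information-gain calculation behind each split can be written directly from the standard entropy formula (the function names here are our own):

```python
# Entropy and information gain for evaluating candidate splits.
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child subsets."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted

labels = ["fit", "fit", "unfit", "unfit"]
split = [["fit", "fit"], ["unfit", "unfit"]]  # a perfectly pure split
print(information_gain(labels, split))  # 1.0 bit: uncertainty fully removed
```

At each node, the learner evaluates this quantity for every candidate split and keeps the one with the largest gain; a pure split drives the children’s entropy, and thus the remaining uncertainty, to zero.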

Classifying whether an insect is a Grasshopper or a Katydid based on antenna length and abdomen length.



Real-World Applications of Decision Trees

  1. Biomedical Engineering: Decision trees identify features used in implantable devices.

  2. Financial analysis: They measure customer satisfaction with a product or service.

  3. Astronomy: Decision trees are used to classify galaxies.

  4. System Control: Decision trees have found their application in modern air conditioning and temperature controllers.

  5. Manufacturing and production: Decision trees aid in quality control, semiconductor manufacturing, and more.

  6. Healthcare: They help doctors diagnose patients in cardiology, psychiatry, and more.

  7. Physics: Decision trees are used for particle detection.
