Python Tree Implementation: A Guide

Trees are non-linear data structures that store data hierarchically and are made up of nodes connected by edges. Here’s how to implement it in Python using bigtree.

Written by Kay Jan Wong
Published on May. 10, 2024
Python Tree Implementation: A Guide
Image: Shutterstock / Built In
Brand Studio Logo

Python has built-in data structures for lists, arrays and dictionaries, but not for tree-like data structures. In LeetCode, questions for “Trees” are limited to binary search trees, and its implementation doesn’t have many functionalities. 

The bigtree Python package can construct and export trees to and from Python lists, dictionaries and Pandas DataFrames, integrating seamlessly with existing Python workflows.

Tree-like data structures can be used to show hierarchical relationships, such as family trees and organizational charts.

How to Implement Trees in Python

A tree is a data structure used to show a hierarchical relationship, such as an organizational chart. It can be created in Python using the bigtree package, 

This article will introduce basic tree concepts, how to construct trees with the bigtree Python package, tree traversal, search, modification and export methods. We’ll finish with ways to use trees for to-do list implementation, and extend tree implementation to trie and directed acyclic graph data structures.

Table of contents:

  • Tree basics and terminologies
  • Bigtree setup
  • Constructing trees
  • Tree traversal algorithms
  • Tree search methods
  • Tree modification methods
  • Exporting trees
  • Using to-do list with bigtree
  • Extending to Trie
  • Directed acyclic graph (DAG)

 

What Are Trees in Python?

Trees are non-linear data structures that store data hierarchically and are made up of nodes connected by edges. For example, in a family tree, a node would represent a person, and an edge would represent the relationship between two nodes.

Example of a tree in Python
Fig. 1: Example of a tree | Image: Kay Jan Wong

There are a few terminologies that extend to these components:

  • Root: A node that doesn’t have any parent, and the entire tree originates from it. In Fig. 1, the root is node a.
  • Leaf: Nodes that don’t have any child. In Fig. 1, leaf nodes are nodes c, d, and e.
  • Parent node: Immediate predecessor of a node. In Fig. 1, node a is the parent node of nodes b and c.
  • Child node: Immediate successor of a node. In Fig. 1, Node b is the child node of Node a.
  • Ancestors: All predecessors of a node. In Fig. 1, nodes a and b are ancestors of node d.
  • Descendants: All successors of a node. In Fig. 1, nodes d and e are descendants of node b.
  • Siblings: Nodes that have the same parent. In Fig. 1, node d and e are siblings.
  • Left sibling: Sibling to the left of the node. In Fig. 1, node d is the left sibling of node e.
  • Right sibling: Sibling to the right of the node. In Fig. 1, node e is the right sibling of node d.
  • Depth: Length of the path from node to root. In Fig. 1, the depth of node b is 2, and Node d is 3.
  • Height/Max depth: Maximum depth of root to a leaf node. In Fig. 1, the height of the tree is 3.

 

How to Setup Bigtree in Python

Bigtree is easy to set up: simply run the following command on Terminal.

$ pip install bigtree

If you want to export trees to image, run the following command on Terminal instead.

$ pip install 'bigtree[image]'

Without further ado, let’s dive into implementing trees!

More on PythonSeaborn Pairplot: A Guide

 

How to Create Trees in Python

To construct trees, we must first define the nodes and link the nodes by specifying the parent and children of the node.

For example, to construct a family tree:

from bigtree import Node

root = Node("a", age=90)
b = Node("b", age=65, parent=root)
c = Node("c", age=60, parent=root)

root.children
# (Node(/a/b, age=65), Node(/a/c, age=60))

root.depth, b.depth
# (1, 2)

root.max_depth
# 2

root.show(attr_list=["age"])
"""
a [age=90]
├── b [age=65]
└── c [age=60]
"""

root.hshow()
"""
     ┌─ b
─ a ─┤
     └─ c
"""

In the example above, we define Node b and c to be children of Node a with three lines of codes. We can also add attributes, such as the age attribute to nodes. To view the tree structure, we can use the show or hshow method.

We can also query the root, leaves, parent, children, ancestors, descendants, siblings, left_sibling, right_sibling, depth and max_depth of the nodes, as covered in the previous section.

The above method to define every node and edge can be manual and tedious. There are alternative ways to construct trees with lists, dictionaries and Pandas DataFrame.

If there are no node attributes, the simplest way to construct a tree would be using Python lists and the list_to_tree method.

from bigtree import list_to_tree

path_list = ["a/b/d", "a/b/e", "a/c"]

root = list_to_tree(path_list)

root.show()
"""
a
├── b
│   ├── d
│   └── e
└── c
"""

root.hshow()
"""
           ┌─ d
     ┌─ b ─┤
─ a ─┤     └─ e
     └─ c
"""

If there are node attributes, it’s best to construct a tree with a dictionary or Pandas DataFrame, using dict_to_tree and dataframe_to_tree methods respectively.

from bigtree import dict_to_tree

path_dict = {
    "a": {"age": 90},
    "a/b": {"age": 65},
    "a/c": {"age": 60},
    "a/b/d": {"age": 40},
    "a/b/e": {"age": 35},
}
root = dict_to_tree(path_dict)

root.show(attr_list=["age"])
"""
a [age=90]
├── b [age=65]
│   ├── d [age=40]
│   └── e [age=35]
└── c [age=60]
"""
import pandas as pd
from bigtree import dataframe_to_tree

path_data = pd.DataFrame(
    [
        ["a", 90],
        ["a/b", 65],
        ["a/c", 60],
        ["a/b/d", 40],
        ["a/b/e", 35],
    ],
    columns=["PATH", "age"],
)
root = dataframe_to_tree(path_data)

root.show(attr_list=["age"])
"""
a [age=90]
├── b [age=65]
│   ├── d [age=40]
│   └── e [age=35]
└── c [age=60]
"""

For more node attributes and methods and other data structures to tree methods, do refer to the bigtree Documentation.

 

Tree Traversal Algorithms in Python

There are two types of tree traversal, depth-first search (DFS) and breadth-first search (BFS),

  • Depth-first search starts at the root and explores each branch to its leaf node before moving to the next branch.
  • Breadth-first search starts at the root and explores every child node, and recursively does so for every node

Pre-Order Traversal (DFS, NLR)

Pre-order traversal is a DFS method that performs three steps recursively:

  1. Visit the current node (N).
  2. Recursively traversal the current node’s left subtree (L).
  3. Recursively traverse the current node’s right subtree (R).
Example of a tree in python
Fig. 2: Example of a tree in Python. | Image: Kay Jan Wong

For pre-order traversal, it will traverse the tree in Fig. 2 in the order:

['a', 'b', 'd', 'e', 'g', 'h', 'c', 'f']

Post-Order Traversal (DFS, LRN)

Post-order traversal is a depth-first search (DFS) method that performs three steps recursively:

  1. Recursively traversal the current node’s left subtree (L).
  2. Recursively traverse the current node’s right subtree (R)
  3. Visit the current node (N).
Example of a tree in python
Fig. 3: Example of a tree. | Image: Kay Jan Wong

For post-order traversal, it will traverse the tree in Fig. 3 in the order:

['d', 'g', 'h', 'e', 'b', 'f', 'c', 'a']

Level-Order Traversal (BFS)

Level-order traversal is a breadth-first search method.

For level-order traversal, it will traverse the tree in Fig. 3 in the order:

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

Level-Order Group Traversal (BFS)

Level-order group traversal is similar to level-order traversal, with the difference being that each level will be returned as a nested list; list[idx] denotes the items in depth idx + 1.

For level-order group traversal, it will traverse the tree in Fig. 3 in the order:

[['a'], ['b', 'c'], ['d', 'e', 'f'], ['g', 'h']]

There are other traversal algorithms available on bigtree, namely:

  • In-order traversal (DFS, LNR): Applicable to binary trees and not generic trees.
  • Zig-zag traversal (BFS): Similar to level-order traversal, but in a zig-zag manner across different levels.
  • Zig-zag group traversal (BFS): Similar to level-order group traversal, but in a zig-zag manner across different levels.

 

Python Tree Search Methods

We can use tree search methods to get one or multiple nodes that fulfill certain criteria, with methods find for one node and findall for multiple nodes.

from bigtree import Node, find, findall

root = Node("a", age=90)
b = Node("b", age=65, parent=root)
c = Node("c", age=60, parent=root)
d = Node("d", age=40, parent=b)

root.show(attr_list=["age"])
"""
a [age=90]
├── b [age=65]
│   └── d [age=40]
└── c [age=60]
"""

find(root, lambda node: node.age == 65)
# Node(/a/b, age=65)

findall(root, lambda node: node.age >= 60)
# (Node(/a, age=90), Node(/a/b, age=65), Node(/a/c, age=60))

For generic search methods without defining a lambda function, there are built-in methods:

  • find_attr and find_attrs: Find one/multiple nodes by attribute.
  • find_name and find_names: Find one/multiple nodes by name.
  • find_path and find_paths: Find one/multiple nodes by full or partial paths.
  • find_full_path: Find one node by their full path.
  • find_relative_path: Find nodes matching relative path, supports unix folder expression and wildcards.
  • find_child and find_children: Find one/multiple children nodes by custom condition, removes the need for searching the whole tree.
  • find_child_by_name: Find one child node by their name.

 

Tree Modification Methods in Python

bigtree supports cases where nodes must be shifted or copied from a location to a destination. For instance, we can shift and reorder nodes in a to-do list implementation.

from bigtree import Node, shift_nodes

root = Node("List")
groceries = Node("Groceries", parent=root)
urgent = Node("Urgent", parent=root)
groceries_milk = Node("Milk", parent=groceries)

shift_nodes(
    root,
    from_paths=["List/Groceries/Milk"],
    to_paths=["List/Urgent/Milk"],
)

root.show()
# List
# ├── Groceries
# └── Urgent
#     └── Milk

There are also other tree modifications methods, such as:

  • copy_nodes: Make a copy of the node from location to destination, node will exist in two locations.
  • shift_and_replace_nodes: Shift node from location to replace node in destination.
  • copy_nodes_from_tree_to_tree: Make a copy of the node from one tree to another tree. The node will exist in two trees.
  • copy_and_replace_nodes_from_tree_to_tree: Make a copy of the node from one tree and replace node in another tree.

 

Exporting Trees in Python

As mentioned at the start of the article, bigtree integrates seamlessly with Python dictionaries and Pandas DataFrame. Trees can be exported to a dictionary, nested dictionary, Pandas DataFrame and more formats.

Printing Tree to Console

Given a tree, we can print the tree to the console using print_tree with the ability to specify the attributes to print and the style of the tree. More customizations are also available in the bigtree Documentation.

from bigtree import print_tree

print_tree(root, attr_list=["age"], style="ascii")
"""
a [age=90]
|-- b [age=65]
|   |-- d [age=40]
|   +-- e [age=35]
|       |-- g [age=10]
|       +-- h [age=6]
+-- c [age=60]
    +-- f [age=38]
"""

For a generator method, you can use the yield_tree method instead.

Export Tree to Dictionary

Given a tree, we can export the tree to a dictionary using tree_to_dict with the ability to store all attributes with names as-is or map tree attributes to custom attribute names using a dictionary.

from bigtree import tree_to_dict

tree_to_dict(root, all_attrs=True)
# {
#     '/a': {'name': 'a', 'age': 90},
#     '/a/b': {'name': 'b', 'age': 65},
#     '/a/b/d': {'name': 'd', 'age': 40},
#     '/a/b/e': {'name': 'e', 'age': 35},
#     '/a/b/e/g': {'name': 'g', 'age': 10},
#     '/a/b/e/h': {'name': 'h', 'age': 6},
#     '/a/c': {'name': 'c', 'age': 60},
#     '/a/c/f': {'name': 'f', 'age': 38}
# }

tree_to_dict(root, attr_dict={"age": "AGE"})
# {
#     '/a': {'name': 'a', 'AGE': 90},
#     '/a/b': {'name': 'b', 'AGE': 65},
#     '/a/b/d': {'name': 'd', 'AGE': 40},
#     '/a/b/e': {'name': 'e', 'AGE': 35},
#     '/a/b/e/g': {'name': 'g', 'AGE': 10},
#     '/a/b/e/h': {'name': 'h', 'AGE': 6},
#     '/a/c': {'name': 'c', 'AGE': 60},
#     '/a/c/f': {'name': 'f', 'AGE': 38}
# }

The original tree can be reconstructed back using the dict_to_tree method.

Export Tree to DataFrame

Given a tree, we can export the tree to a DataFrame using tree_to_dataframe with the ability to store all attributes as columns with names as-is, or map tree attributes to custom column names using a dictionary.

from bigtree import tree_to_dataframe

tree_to_dataframe(root, all_attrs=True)
#        path name  age
# 0        /a    a   90
# 1      /a/b    b   65
# 2    /a/b/d    d   40
# 3    /a/b/e    e   35
# 4  /a/b/e/g    g   10
# 5  /a/b/e/h    h    6
# 6      /a/c    c   60
# 7    /a/c/f    f   38

The original tree can be reconstructed back using the dataframe_to_tree method.

Export Tree to Image

Given a tree, we can export the tree to an image or other graphics or files using tree_to_dot. This uses pydot under the hood, which uses Dot language and can be interfaced with Graphviz.

from bigtree import tree_to_dot

graph = tree_to_dot(root)
graph.write_png("tree.png")
graph.write_dot("tree.dot")
graph.to_string()

In the code snippet above, graph is of pydot.Dot data type, which has built-in implementation to write to dot, PNG, SVG file formats, and more. The output is similar to Fig. 3.

There are other export methods available, including:

  • tree_to_nested_dict: Export to nested dictionary, original tree can be reconstructed back using nested_dict_to_tree.
  • tree_to_newick: Export to Newick notation, original tree can be reconstructed back using newick_to_tree.
  • tree_to_mermaid: Export to mermaid Markdown text
  • tree_to_pillow: Export to Pillow image.

 

Using To-Do List With bigtree

If at this point you’re still wondering what you can do with bigtree, bigtree comes with a built-in to-do list workflow with the ability to import and export from a JSON file.

This to-do list implementation has three levels — app name, list name and item name. You can add lists to app or add items to list. For example:

from bigtree import AppToDo

app = AppToDo("To-Do App")
app.add_item(item_name="Homework 1", list_name="School")
app.add_item(item_name=["Milk", "Bread"], list_name="Groceries", description="Urgent")
app.add_item(item_name="Cook")

app.show()
# To Do App
# ├── School
# │   └── Homework 1
# ├── Groceries
# │   ├── Milk [description=Urgent]
# │   └── Bread [description=Urgent]
# └── General
#     └── Cook

app.save("list.json")
app2 = AppToDo.load("list.json")

In the example above:

  • App name refers to To-Do App.
  • List name refers to School, Groceries, and General.
  • Item name refers to Homework 1, Milk, Bread and Cook.

 

Extending to Trie in Python

Trie is a type of k-ary search tree used for storing and searching a specific key from a set, derived from the word reTRIEval. Trie can be used to sort a collection of strings alphabetically or search if a prefix is present for a string.

Example of a tree in python
Fig. 4: Example of Trie. | Image: Kay Jan Wong

To extend bigtree with Trie, we can add the leading symbol ^ and trailing symbol $ to each word, and use tree search methods to find a specific word or subword with find_path. A Trie can be constructed as such:

from bigtree import list_to_tree, find_path

bag_of_words = ["ant", "and", "angry", "buoy", "buoyant"]
list_of_paths = ["/".join(["^"] + list(x) + ["$"])  for x in bag_of_words]

list_of_paths
# [
#     "^/a/n/t/$",
#     "^/a/n/d/$",
#     "^/a/n/g/r/y/$",
#     "^/b/o/y/$",
#     "^/b/o/y/a/n/t/$"
# ]

trie = list_to_tree(list_of_paths)
find_path(trie, "/".join(list("^boy$")))

The code snippet above results in the image Fig. 4 if exported using the tree_to_dot method.

A tutorial on how to implement trees in Python. | Video: Computerphile

More on PythonRandom Forest Regression in Python Explained

 

Directed Acyclic Graph (DAG)

Directed acyclic graph (DAG) is a graph data structure where each node can have more than one parent. A tree is considered a restricted form of a graph. This results in the following differences:s

  • Root: There is no concept of root in DAG since a node can have multiple parents.
  • Depth: There is no concept of depth in DAG since there is no root node.
  • Height/Max depth: There is no concept of height in DAG since there is no root node.

DAGs are best used to represent workflows as workflows have a certain order (directed), and do not repeat indefinitely, i.e. it doesn’t not have loops (acyclic).

Similar to trees, DAGs in bigtree can be constructed manually, from Python lists, dictionaries or Pandas DataFrames with methods list_to_dag, dict_to_dag, and dataframe_to_dag respectively.

from bigtree import DAGNode, dag_to_dot

a = DAGNode("Ingest Data from Source A")
b = DAGNode("Ingest Data from Source B")
c = DAGNode("Clean data", parents=[a, b])
d = DAGNode("Feature Engineering", parents=[c])

graph = dag_to_dot(a, node_colour="gold")
graph.write_png("dag.png")

The above code snippet results in the following image:

Example of DAG
Fig. 5: Example of DAG. | Image: Kay Jan Wong

With that, I hope you have gained a better understanding of tree structures and how to implement them using the bigtree Python package.

Hiring Now
IG - tasty
Fintech • News + Entertainment • Software • Financial Services
SHARE