In Python, garbage collection is an automated memory management process in which an object is released when it’s no longer in use. If you’ve ever wondered how that works or had similar questions while learning Python, you’re in the right place.
What Is Garbage Collection in Python?
Garbage collection in Python is an automated memory management process that deletes objects when they are no longer in use. Python uses two mechanisms: reference counting and generational garbage collection. Reference counting deletes an object as soon as its reference count reaches zero, and generational garbage collection finds and breaks reference cycles so that unreachable objects can be deleted.
Throughout the article, I’ll seek to answer some common questions you may have about memory management and garbage collection in Python, including:
- How is memory managed in Python?
- What is garbage collection?
- Which algorithms are used for memory management?
- What is a cyclical reference?
- How are Python objects stored in memory?
I’ll start with the fundamentals.
How Memory Management Works in Python
Python is a dynamically typed language. We don’t declare the type of a variable when we assign a value to it; Python determines the type at runtime from the object being assigned. Statically typed languages such as C, C++ and Java require variables to be declared with a type before values are assigned to them.
We simply assign an object to a name, and Python detects the type of the object.
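A minimal sketch of this behavior follows. The name value and the sample values are illustrative assumptions, not taken from the original example.

```python
# The type belongs to the object, not to the name, so the same name
# can be bound to objects of different types over time.
value = 42
type_as_int = type(value)
print(type_as_int)

value = "forty-two"
type_as_str = type(value)
print(type_as_str)
```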
What Are Python Objects?
Python objects have three things: a type, a value and a reference count. When we assign an object to a name, Python automatically detects the object’s type. The value is set when the object is created. The reference count is the number of references, such as names, pointing to that object.
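These three properties can be inspected with a short sketch like the one below. It assumes CPython, where sys.getrefcount() is available; the reported count includes a temporary reference held by the call itself, so the sketch compares two readings instead of relying on an absolute number. The names numbers and alias are illustrative.

```python
import sys

numbers = [10, 20, 30]

print(type(numbers))   # the object's type
print(numbers)         # the object's value

# Reading the count before and after binding a second name shows the
# count growing; the absolute values depend on the interpreter.
count_before = sys.getrefcount(numbers)
alias = numbers
count_after = sys.getrefcount(numbers)
print(count_after - count_before)
```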
How Are Python Objects Stored in Memory?
In C, C++ and Java we have variables and objects. Python has names, not variables. Python objects live in memory and are reached through names and references. A name is just a label for an object, so one object can have many names. A reference is a pointer to an object, and every name holds one.
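The idea that a name is only a label can be sketched as follows. The names first, second and clone are illustrative.

```python
first = [1, 2, 3]
second = first            # a second label for the same list object

second.append(4)          # mutate the object through one name...
print(first)              # ...and the change is visible through the other
print(first is second)    # both names refer to one object

clone = list(first)       # a separate object with an equal value
print(clone is first)
```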
How Garbage Collection in Python Works
Garbage collection is the process of releasing memory when the object is no longer in use. This system destroys the unused object and reuses its memory slot for new objects. You can imagine this as a recycling system in computers.
Python has an automated garbage collection process. It has an algorithm to deallocate objects that are no longer needed. Python has two ways to delete the unused objects from the memory.
2 Methods of Garbage Collection in Python
There are two main ways that Python collects garbage: reference counting and generational garbage collection. Let’s look at each one:
1. Reference Counting
With reference counting, every object stores a count of the references to it, and Python updates that count each time a reference is created or removed.
In the example, we assign c to 50. Even though c is a new name, the object is the same, so its reference count increases by one. Because every object has its own ID, we can print the IDs of the objects to see whether they are the same or different.
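A minimal sketch of such an example follows, assuming the names a, b and c used in this walk-through; the exact assignments are an assumption. Note that b = a is guaranteed to share one object, while c = 50 shares it on CPython only because small integers are cached.

```python
a = 50
b = a      # b is bound to the very same object as a
c = 50     # on CPython, small integers are cached, so this is that object too

print(id(a), id(b), id(c))
print(a is b)
```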
When we change the value of the name a, it refers to a different object. Now a points to 60, while b and c still point to 50.
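This rebinding step can be sketched as below, again assuming the names a, b and c. The variable old_id is a hypothetical helper used only to compare identities.

```python
a = 50
b = a
c = 50
old_id = id(a)

a = 60     # rebinds the name a; the object 50 itself is not modified

print(id(a) != old_id)   # a now refers to a different object
print(id(b) == old_id)   # b still refers to the original object
print(b, c)
```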
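When we set a to None, a refers to the None object, which already exists as a singleton and is not created by the assignment. The integer object 60 loses its reference from a. Once an object has no references left, it is deleted immediately. (CPython keeps small integers such as 60 cached, so the interpreter itself still holds references to them, but the principle is the same for ordinary objects.)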
We assign b to a Boolean object. The previous integer object is not deleted because c still refers to it.
Now, we delete c. This decreases the reference count of the object that c referred to by one.
As this walk-through shows, the del statement doesn’t delete objects; it removes the name, and with it one reference to the object. When the reference count reaches zero, the object is deleted from memory.
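The difference between deleting a name and deleting an object can be observed with a weak reference, which watches an object without increasing its reference count. The sketch below assumes CPython, where an object is freed as soon as its count reaches zero. Node is a hypothetical placeholder class, used because plain integers can’t be weakly referenced.

```python
import weakref


class Node:
    """Hypothetical placeholder class for illustration."""


first = Node()
second = first                # two names, one object
probe = weakref.ref(first)    # observes the object without keeping it alive

del first                     # removes one name; 'second' still refers to the object
alive_after_first_del = probe() is not None
print(alive_after_first_del)

del second                    # the last reference is gone
gone_after_second_del = probe() is None
print(gone_after_second_del)
```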
Advantages and Disadvantages of Reference Counting
Garbage collection by reference counting has advantages and disadvantages. It’s simple, objects are freed as soon as they become unused, and programmers don’t have to delete objects manually. However, it has a cost: every object must store its reference count in memory, and the count must be updated every time a reference is created or removed, which takes extra memory and processing time.
Everything looks OK, but there’s still a problem.
The most important issue with reference counting is that it doesn’t work for cyclical references. A cyclical reference is a situation in which an object refers to itself, directly or through other objects. The simplest cyclical reference is a list appended to itself.
Reference counting alone can’t destroy objects with cyclic references: each object in the cycle is still referenced from within the cycle, so its count never reaches zero and it can’t be deleted.
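A self-referencing list can be sketched as follows, assuming CPython. Box is a hypothetical list subclass, introduced only because plain lists can’t be weakly referenced; the weak reference lets us check whether the object still exists. Automatic collection is switched off during the sketch so that only the explicit gc.collect() call can clean up the cycle.

```python
import gc
import weakref


class Box(list):
    """Hypothetical list subclass that supports weak references."""


was_enabled = gc.isenabled()
gc.disable()                  # keep automatic collection out of the way

items = Box()
items.append(items)           # the list now contains a reference to itself
probe = weakref.ref(items)

del items                     # no names are left, but the cycle keeps the count above zero
alive_after_del = probe() is not None
print(alive_after_del)

gc.collect()                  # the cycle detector finds and frees the object
gone_after_collect = probe() is None
print(gone_after_collect)

if was_enabled:
    gc.enable()
```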
The solution to this problem is the second garbage collection method.
2. Generational Garbage Collection
Generational garbage collection is a type of tracing garbage collection. It can find and break reference cycles, so unused objects are deleted even if they refer to themselves.
Python keeps track of the container objects in memory, meaning the objects that can hold references to other objects. Three lists are created when a program runs: Generation 0, 1 and 2.
Newly created objects are put in the Generation 0 list. When a collection runs, reference cycles are detected and a list of objects to be discarded is built: if an object has no outside references, it’s discarded. Objects that survive are moved to the Generation 1 list. The same steps are applied to Generation 1, and its survivors are moved to Generation 2. Younger generations are collected more often than older ones. Objects in Generation 2 stay there until they become unreachable and are collected, or until the program ends.
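The standard gc module exposes some of this bookkeeping. The sketch below assumes CPython; the actual numbers vary between versions and runs, so none are shown. The names thresholds, counts and found are illustrative.

```python
import gc

# Collection thresholds, one value per generation.
thresholds = gc.get_threshold()
print(thresholds)

# Current collection counts, one value per generation.
counts = gc.get_count()
print(counts)

# Run a full collection by hand; the return value is the number of
# unreachable objects the collector found.
found = gc.collect()
print(found)
```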
Importance of Garbage Collection in Python
Python is a high-level language, and we don’t have to manage memory manually. Python’s garbage collection frees memory that is no longer needed. It is implemented in two ways: reference counting and generational garbage collection. When the reference count of an object reaches zero, reference counting cleans up the object immediately.
If you have a cycle, the reference count doesn’t reach zero. You have to wait for the generational garbage collection algorithm to run and clean the object. While a programmer doesn’t have to think about garbage collection in Python, it can be useful to understand what is happening under the hood.