How to Write Pythonic Code

Pythonic describes code that doesn’t just get the syntax right but uses the language in the way it’s intended to be used. Here’s how to optimize your Python code.

Written by Xiaoxu Gao

Published on Dec. 06, 2023


Every programming language has its own idioms defined by its users. In the Python community, Pythonic describes code that doesn’t just get the syntax right but uses the language in the way it’s intended to be used. It improves the overall code quality from a maintainability, readability and efficiency perspective. It also creates a pattern of code that allows the entire development team to focus on the true essence of the problem rather than deciphering the code. For a library to be Pythonic is to make it natural for a Python developer to use in their codebase. Remember, code is read more often than it’s written.

What Is Pythonic Code?

Pythonic is a term used in the Python community for code that not only uses the correct syntax but is also clean, readable and easy to maintain. Pythonic code has a number of advantages, including letting the development team focus on the problem at hand rather than on deciphering existing code.

But what does Pythonic actually mean? It sounds like a vague concept. How am I going to crack a Python interview by showing “authentic” Python code? In this article, we’ll examine eight widely promoted Pythonic features that will bring your code to the next level. They are primarily for Python beginners who want to quickly improve their skills, but there are a couple of tips for intermediates and experts, too. I’ll also touch on tips for writing a Pythonic library or framework and share some good, free resources for your own self-learning. Here’s what to expect:

Pythonic code and The Zen of Python
What Is PEP8 in Python?
Value Swapping and Multiple Assignment
Passing multiple arguments (*args and **kwargs)
Comprehension
Underscores
Context manager
Generator
Namespace and scope
Mutable default argument
How to write a Pythonic library
Other free resources

Pythonic Code and the Zen of Python

This article wouldn’t be complete if I didn’t start with “The Zen of Python.” You can find it at any given time by typing import this. It’s a summary of 19 guiding principles for writing Python code. I would consider it a mindset rather than an actual syntax guideline. Nevertheless, the philosophy in this poem has influenced tons of Python developers globally.

The Zen of Python poem. | Screenshot: Xiaoxu Gao

The examples I’m going to show you later follow this philosophy. Please read it through. I will convey some of the core concepts to you so you are ready for the examples.


3 Pythonic Tips to Know

There are three important characteristics of Pythonic code: simplicity, clarity and readability. I put these three characteristics in the same bucket because, taken together, they mean writing simple and clean code that everybody understands. You can interpret the poem in many different ways. “Flat is better than nested” means don’t have too many sub-categories (modules/packages) in your project. “Sparse is better than dense” means don’t cram too many function calls into one line of code; the 79-character line limit in PEP8 will force you to break the line anyway. With that in mind, below are some additional tips for writing Pythonic code.

1. Breaking the Rules Can Lead to More Pythonic Code

Remember, it’s OK to break rules.

Python is less strict than languages like Java in terms of structure. You can write purely procedural code, like a script, or object-oriented code, like in Java, or mix both. The point is you don’t have to put your code into shoes that aren’t right for you. Adhering to rules too strictly can result in overly abstract, boilerplate-heavy code.

2. Pay Attention to Error Handling

Errors should never pass silently. It’s better to fail fast and catch the error than to silence it and let the program continue. Bugs become harder to debug the farther they surface from where they originated, so raise the exception now instead of later.
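As a minimal sketch of the fail-fast idea, compare a function that swallows bad input with one that raises immediately. The function names and the price-parsing scenario are hypothetical, chosen only for illustration.

```python
def parse_price_silently(raw):
    """Anti-pattern: hides the error and returns a misleading default."""
    try:
        return float(raw)
    except ValueError:
        return 0.0  # the caller cannot tell "free" from "broken input"


def parse_price(raw):
    """Fail fast: surface the problem where it happens, with context."""
    try:
        return float(raw)
    except ValueError as exc:
        raise ValueError(f"invalid price: {raw!r}") from exc


print(parse_price_silently("12,50"))  # 0.0, and the bug travels onward

try:
    parse_price("12,50")
except ValueError as err:
    print(err)  # invalid price: '12,50'
```

The second version costs one extra line but points straight at the bad input instead of letting a wrong value flow into later calculations.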

3. Prioritize the Obvious Approach in Python

Although the Zen says there should be one obvious way to do it, I feel it’s really hard to achieve this in Python. Python is considered a flexible programming language that’s supported by a large community. This means people can come up with new ideas on top of the existing solutions every day. However, the main message it tries to send is that it isn’t worth the effort to learn every possible way. The community has already made some efforts to standardize the formats, which I will talk about in a second.

Tips on how to write Pythonic code. | Video: Next Day Video

What Is PEP8 in Python?

Python is a flexible language without too many restrictions on the formatting. That’s how PEP8 came into the picture. You are welcome to write Python code any way you want as long as it’s valid. However, using a consistent style makes your code easier to read and maintain. PEP8 provides a rich list of items that are worth checking out.

Some well-known linters like Flake8 and Pylint can spot the issues before you push the code, thus saving review time for your co-workers. Formatters like Black can even automatically fix formatting issues. A common practice is to integrate these tools into your IDE (e.g. VS Code) and CI/CD pipeline.

6 Python Features to Know to Write Pythonic Code

Python contains a number of features that allow you to write simpler code. These include:

Value swapping and multiple assignment.
Passing multiple arguments.
Comprehension.
Underscores.
Context manager.
Generators.

1. Value Swapping and Multiple Assignment

You’ve probably seen this question before: “How to swap two bottles of water?” The answer is by getting the third empty bottle. In most languages, you need an extra variable to swap the values.

In Python, life is easier. You can swap two values like this:

a = 1
b = 2
a, b = b, a

It looks like magic. The line a, b = b, a is an assignment with an expression on the right side and two variables on the left. The expression b, a on the right side is actually a tuple. Don’t believe me? Try this out in a terminal:

>>> 1,2
(1, 2)

Parentheses are not really necessary in a tuple.

Python supports multiple assignment, meaning there can be multiple variables on the left side, each assigned the corresponding value in the tuple. This is also called an unpacking assignment. Unpacking works with a list as well:

fruits = ["apple", "banana"]
f1,f2 = fruits

The outcome would be f1="apple", f2="banana".

By doing so, you can easily, elegantly and naturally assign variables without boilerplate code.
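Unpacking goes a bit further than two-element swaps. The sketch below, with made-up sample data, shows starred unpacking, nested unpacking and unpacking inside a for loop.

```python
# Unpack the first item and collect the rest into a list.
first, *rest = ["apple", "banana", "cherry", "date"]
print(first)  # apple
print(rest)   # ['banana', 'cherry', 'date']

# Unpack a nested structure in one statement.
name, (lat, lon) = ("Amsterdam", (52.37, 4.90))
print(name, lat, lon)

# Unpack key-value pairs while looping.
prices = {"apple": 0.5, "banana": 0.25}
for fruit, price in prices.items():
    print(f"{fruit}: {price}")
```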

2. Passing Multiple Arguments (*args and **kwargs)

Related to the previous point, Python allows you to pass a variable number of arguments to a function without defining each of them in the function signature. An example could be a function that sums up a few numbers when you don’t know in advance how many there will be.

A naive approach is to create a list variable as the input of the function.

def calculate(values):
    total = 0
    for val in values:
        total += val
    return total
calculate([1, 2, 3, 4])  # 10

However, in Python, you can have an interface without providing a list.

def calculate(*values):
    total = 0
    for val in values:
        total += val
    return total
calculate(1, 2, 3, 4)  # 10
calculate(*[1, 2, 3, 4])  # this works too

Inside the function, values is the tuple (1, 2, 3, 4), which is iterable, so the logic inside the function can remain the same.

Similar to *args, **kwargs accepts named arguments and packs them into a dictionary of key-value pairs. This is useful when you have a bunch of optional arguments that each have their own meaning. In this example, a house can be composed of different types of rooms. If you don’t like having too many arguments, you can always provide a dictionary instead.

def build_house(**kwargs):
    # kwargs is a regular dict, so iterate over its items
    for room, num in kwargs.items():
        ...
build_house(bedroom=2, kitchen=1, bathroom=1, garden=1)
build_house(bedroom=2, kitchen=1, bathroom=2, storage_room=1)

Another interesting thing about unpacking is that you can easily merge two lists or two dictionaries.

first = [1,2,3]
second = [4,5,6]
result = [*first, *second] 
# [1,2,3,4,5,6]
first = {"k1":"v1"}
second = {"k2":"v2"}
result = {**first, **second}
# {"k1":"v1", "k2":"v2"}

3. Comprehension

Comprehension is cool. That was my first impression of it. Comprehension is used to create data structures in a single instruction instead of multiple operations. A classic example is to convert a for loop into one line of code.

result = []
for i in range(10):
    result.append(i**2)
# use list comprehension
result = [i**2 for i in range(10)]

A comprehension usually performs better because it does fewer operations: there’s no need to look up and call .append() for every item. In complex functions, a comprehension can clearly reduce the number of lines and make the code easier for readers to understand. Another approach is to use map with a lambda expression. The same result can be produced like this:

result = list(map(lambda x: x**2, range(10)))

But, don’t force your code to be a one-liner if it creates convoluted expressions. The book Clean Code in Python offers a good example of this. The collect_account_ids_from_arns function receives a list of values and then parses, matches and finally adds them into collected_account_ids.

This is the naive solution with a for loop.

def collect_account_ids_from_arns(arns):
    collected_account_ids = set()
    for arn in arns:
        matched = re.match(ARN_REGEX, arn)
        if matched is not None:
            account_id = matched.groupdict()["account_id"]
            collected_account_ids.add(account_id)
    return collected_account_ids

This is the version with comprehension:

def collect_account_ids_from_arns(arns):
    matched_arns = filter(None, (re.match(ARN_REGEX, arn) for arn in arns))
    return {m.groupdict()["account_id"] for m in matched_arns}

Another, even more compact version is using the walrus operator. This example pushes the code to an actual one-liner. But this isn’t necessarily better than the second approach.

def collect_account_ids_from_arns(arns):
    return { matched.groupdict()["account_id"] for arn in arns if (matched := re.match(ARN_REGEX, arn)) is not None }

Comprehensions can simplify the code and improve performance, but readability must be taken into account as well.
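To try the loop and comprehension versions side by side, here is a self-contained sketch. The ARN_REGEX pattern and the sample ARNs are assumptions made for this example, since the article does not show them; the two functions are renamed so both can live in one file.

```python
import re

# Hypothetical pattern for this sketch; the article does not define ARN_REGEX.
ARN_REGEX = re.compile(r"arn:aws:[a-z0-9\-]*:[a-z0-9\-]*:(?P<account_id>\d+):.*")


def collect_with_loop(arns):
    collected = set()
    for arn in arns:
        matched = re.match(ARN_REGEX, arn)
        if matched is not None:
            collected.add(matched.groupdict()["account_id"])
    return collected


def collect_with_comprehension(arns):
    # filter(None, ...) drops the None results of failed matches.
    matched_arns = filter(None, (re.match(ARN_REGEX, arn) for arn in arns))
    return {m.groupdict()["account_id"] for m in matched_arns}


sample = [
    "arn:aws:iam::123456789012:user/alice",
    "arn:aws:s3:eu-west-1:210987654321:bucket/logs",
    "not-an-arn",
]
print(sorted(collect_with_loop(sample)))           # ['123456789012', '210987654321']
print(sorted(collect_with_comprehension(sample)))  # ['123456789012', '210987654321']
```

Both versions return the same set; which one reads better is a judgment call for you and your team.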

4. Underscores

There is more than one way of using underscore in Python. Each type represents different characteristics of the attribute.

By default, all the attributes of an object are public. There is no private keyword that prevents you from accessing an attribute. Python uses an underscore in front of the function name (e.g. def _build() ) to delimit the interface of an object. Attributes starting with underscore should be respected as private and not be called externally. Private methods/attributes of a class are intended to be called only internally. If the class gets too many internal methods, it could be a sign that this class breaks the single responsibility principle. Perhaps you want to extract some of the responsibilities to other classes.
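A small sketch of the single-underscore convention follows. The House class and its methods are hypothetical; the point is only that the underscore marks a helper as internal while Python itself does not block access.

```python
class House:
    def __init__(self, rooms):
        self.rooms = rooms  # public attribute

    def describe(self):  # public interface
        return f"House with {self._count_rooms()} rooms"

    def _count_rooms(self):  # internal helper, private by convention only
        return sum(self.rooms.values())


house = House({"bedroom": 2, "kitchen": 1})
print(house.describe())      # House with 3 rooms
print(house._count_rooms())  # 3, still callable: the underscore is a signal, not a lock
```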

Another Pythonic feature with underscore is magic methods. Magic methods are surrounded by double underscores like __init__. Fun fact, according to The Original Hacker’s Dictionary, magic means: “A feature not generally publicized which allows something otherwise impossible.”

The Python community adopted this term after the Ruby community. Magic methods give users access to the core features of the language to create rich and powerful objects. Mastering magic methods lets the clients of your classes write cleaner code. Sounds abstract? Let’s look at an example:

class House:
    def __init__(self, area):
        self.area = area
    def __gt__(self, other):
        return self.area > other.area
house1 = House(120)
house2 = House(100)

By overriding the magic method __gt__, the client who uses the class House can compare two houses with house1 > house2, instead of something like house1.size() > house2.size().

Another example is to change the representation of a class. If you print house1, you will get a Python object with an ID.

print(house1)
# <__main__.House object at 0x10181f430>

With the magic method __repr__, the output of print becomes more self-explanatory. Magic methods hide implementation details from the client and give developers the power to change an object’s default behavior.

def __repr__(self) -> str:
    return f"This house has {self.area} square meters."
print(house1)
# This house has 120 square meters.
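Putting the two magic methods into one class gives a complete, runnable version of the House example. This is a sketch that only combines the snippets above; the last line adds one extra comparison to show the reflected lookup.

```python
class House:
    def __init__(self, area):
        self.area = area

    def __gt__(self, other):
        # Called for house1 > house2.
        return self.area > other.area

    def __repr__(self) -> str:
        # print() uses this because the class defines no __str__.
        return f"This house has {self.area} square meters."


house1 = House(120)
house2 = House(100)
print(house1 > house2)  # True
print(house1)           # This house has 120 square meters.
print(house2 < house1)  # True, Python falls back to the reflected __gt__
```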

Although underscores are very common, you shouldn’t define attributes with leading double underscores or invent your own double-underscore (magic) method names. It’s not Pythonic and will just confuse your peers.

5. Context Manager

A context manager is a useful feature for situations where you want to run things before and after certain actions. Resource management is a good use case: you want to make sure files or connections are closed after processing.

In Python, you can use two approaches to allocate and release resources:

Use a try ... finally block.
Use the with construct.

For example, I want to open a file, read the content and then close it. This is what it looks like using try ... finally. The finally clause guarantees that the resource is closed properly no matter what happens.

f = open("data.txt","r") 
try:
  text = f.read()
finally:
  f.close()

Nonetheless, you can make it more Pythonic using the with statement. As you can see, a lot of boilerplate code is eliminated. When you use the with statement, you enter a context manager, which means the file will be closed when the block is finished, even if an exception occurred.

with open("data.txt", "r") as f:
  text = f.read()

How does that happen? Any context manager consists of two magic methods: __enter__ and __exit__. The with statement calls __enter__, and whatever it returns is assigned to the variable after as. When the block finishes, whether normally or because of an exception, Python calls __exit__, in which the resource is closed.
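The call order can be made visible with a tiny context manager that records what happens. Tracer is a hypothetical class written for this sketch, not part of any library.

```python
class Tracer:
    """Records the order in which the context manager hooks run."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self  # bound to the name after `as`

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append(f"exit (exception: {exc_type is not None})")
        return False  # do not suppress exceptions


with Tracer() as tracer:
    tracer.events.append("body")
print(tracer.events)  # ['enter', 'body', 'exit (exception: False)']

failing = Tracer()
try:
    with failing:
        raise RuntimeError("boom")
except RuntimeError:
    pass
print(failing.events)  # ['enter', 'exit (exception: True)']
```

In the second run, __exit__ still executes and the RuntimeError reaches the caller because __exit__ returns False.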

Let’s say I want to create a database handler for the backup. In general, we are free to implement a context manager with our own logic. The database should go offline before the backup and restart after the backup. Below are three different ways to implement a context manager.

1. Create a Context Manager Class.

In this example, nothing needs to be returned from the __enter__ method, and this is OK. The __exit__ method receives any exception raised inside the block. You can decide how to handle the exception. If you do nothing, the exception will be raised to the caller after the resource is properly closed. Or you can handle exceptions in __exit__ based on the exception type. But the general rule is not to silently swallow errors. Another general tip is don’t return True from __exit__ unless you know what you are doing. Returning True suppresses the exception, so it won’t be raised to the caller.

def stop_db():
  pass  # stop database
def start_db():
  pass  # start database
def backup_db():
  pass  # backup database
class DatabaseHandler:
  def __enter__(self):
    stop_db()
  def __exit__(self, exc_type, ex_value, ex_traceback):
    start_db()
with DatabaseHandler():
  backup_db()

2. Use `contextmanager` Decorator.

You don’t have to create a class each time. Imagine you want to turn existing functions into context managers without refactoring the code too much. In that case, you can make use of the contextmanager decorator. Decorators are a topic of their own, but what this one essentially does is turn a generator function into a context manager. Everything before the yield becomes part of __enter__, the yielded value becomes the variable after as, and everything after the yield becomes part of __exit__. In this example, nothing needs to be yielded. In general, if you just need a context manager function without preserving too much state, this is a better approach.

import contextlib
@contextlib.contextmanager
def db_handler():
  try:
    stop_db()
    yield
  finally:
    start_db()
with db_handler():
  backup_db()
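The following sketch shows the same pattern with a yielded value, so you can see which part runs on entry, what gets bound after as, and which part runs on exit. The log list and the "orders" name are stand-ins invented for this example; no real database is involved.

```python
import contextlib

log = []  # stands in for real database operations in this sketch


@contextlib.contextmanager
def db_handler(name):
    log.append(f"stop {name}")  # runs on entry (the __enter__ part)
    try:
        yield name  # value bound to the name after `as`
    finally:
        log.append(f"start {name}")  # runs on exit, even after an exception


with db_handler("orders") as db:
    log.append(f"backup {db}")

print(log)  # ['stop orders', 'backup orders', 'start orders']
```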

3. Create a Decorator Class Based on `contextlib.ContextDecorator`.

The third option is to create a decorator class. Instead of using the with statement, which you still can, you use it as a decorator on top of the function. This has the advantage that you can reuse it as many times as you want by simply applying the decorators to other functions.

import contextlib
class db_handler_decorator(contextlib.ContextDecorator):
  def __enter__(self):
    stop_db()
  def __exit__(self, exc_type, ex_value, ex_traceback):
    start_db()
@db_handler_decorator()
def db_backup():
  pass  # backup process
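Here is a runnable sketch of the decorator class used both ways, as a decorator and in a with statement. As before, the log list replaces real database calls and is purely an assumption for illustration.

```python
import contextlib

log = []  # stands in for real database operations in this sketch


class db_handler_decorator(contextlib.ContextDecorator):
    def __enter__(self):
        log.append("stop db")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        log.append("start db")
        return False  # let exceptions propagate


@db_handler_decorator()
def db_backup():
    log.append("backup")


db_backup()  # used as a decorator
with db_handler_decorator():  # still usable with `with`
    log.append("manual backup")

print(log)
# ['stop db', 'backup', 'start db', 'stop db', 'manual backup', 'start db']
```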

In general, you should at least understand the context manager’s working principle, even if you are a beginner. As an intermediate or expert, get your hands dirty with it and try to create a few context managers from scratch to discover the nitty-gritty details.

6. Generator

In the previous item, I touched upon a concept called a generator, which is also a distinctive feature of Python. A generator is an iterator: it implements __next__(), so you can pass it to next(). The special thing is that you can iterate over it only once, because it doesn’t store all its values in memory.

Generator is implemented as a function, but instead of using return like a regular function, it uses yield.

def generator():
  for i in range(10):
    yield i**2
print(generator)
# <function generator at 0x109663d90>
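Calling the function is what produces the generator object. The short sketch below uses a smaller range than the example above and shows that a generator can be consumed only once.

```python
def generator():
    for i in range(3):
        yield i**2


gen = generator()  # calling the function returns a generator object
print(next(gen))   # 0
print(list(gen))   # [1, 4], the remaining values
print(list(gen))   # [], the generator is exhausted
print(list(generator()))  # [0, 1, 4], a fresh generator starts over
```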

You will see this being used a lot in asyncio, because coroutines build on the same suspend-and-resume mechanism as generators. One of its advantages is reducing memory usage, which could have a huge impact on big data sets. Let’s say I want to do some calculations for 1 million records.

This is how you’d do it before knowing yield. The problem is you have to store the result of all 1 million records in memory.

def calculate(size):
  result = []
  for i in range(size):
    result.append(i**2)
  return result
for val in calculate(1_000_000):
  print(val)

This is an alternative using yield. Each result is calculated only when it is requested, which saves a lot of memory.

def calculate(size):
  for i in range(size):
    yield i**2
for val in calculate(1_000_000):
  print(val)

Generator is also the secret behind lazy evaluation.
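Lazy evaluation is easiest to see with a generator that never ends. In this sketch, squares is a hypothetical infinite generator and itertools.islice pulls only the values that are actually needed.

```python
import itertools


def squares():
    """Infinite generator: values are computed only when requested."""
    i = 0
    while True:
        yield i**2
        i += 1


first_five = list(itertools.islice(squares(), 5))
print(first_five)  # [0, 1, 4, 9, 16]

# A generator expression is the lazy counterpart of a list comprehension:
# sum() consumes one value at a time without building a full list.
total = sum(i**2 for i in range(1_000_000))
```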

Namespace and Scope in Python

The last line of the Zen of Python is about namespaces, so let’s talk about namespace and scope in Python. A namespace is a system that makes sure all the names (attributes, functions, classes, modules) within it are unique. Namespaces are implemented as dictionaries in Python, where the keys are names and the values are the objects themselves.

Generally speaking, there are four types of namespaces in Python: built-in, global, enclosing and local, ordered by hierarchy. The lookup order is called the LEGB rule. The interpreter first searches for the name in local, then enclosing, then global and finally built-in, meaning a name at a lower level (e.g. local) shadows the same name at a higher level (e.g. enclosing).

LEGB rule hierarchy illustration. | Image: Xiaoxu Gao

How does this affect our coding? Most of the time, if you just follow the LEGB rule, you don’t have to do anything special. Here is an example. What is the output?

val = 1
def outer():
  val = 2

  def inner():
    val = 3
    print(val)

  inner()
outer()
print(val)

According to the LEGB rule, a name at a lower level shadows the same name at a higher level. Inside inner(), val is 3, so calling outer() prints 3. (outer() itself has no return statement, so it returns None.) However, print(val) at the module level prints 1, because there you are outside the functions and accessing the global val = 1.

But if you want to modify a global value from a lower level, you can do so with the global keyword. Add global val in the scope where you want to change the global value.

val = 1
def outer():
  val = 2

  def inner():
    global val
    val = 3
    print(val)

  inner()
outer()  # prints 3
print(val)  # 3

It’s only a declaration, so syntax like global val = 3 is not correct. An alternative is globals()["val"] = 3.
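The enclosing level has its own keyword, nonlocal, which the example above does not use. As a short sketch: it rebinds the name in the nearest enclosing function scope and leaves the global untouched.

```python
def outer():
    val = 2

    def inner():
        nonlocal val  # rebind the name in the enclosing scope
        val = 3

    inner()
    return val


val = 1
print(outer())  # 3, inner() changed outer's val
print(val)      # 1, the global val is untouched
```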

Mutable Default Argument in Python

Last but not least, I want to show you a Pythonic caveat which you might think is a bug, but is actually a feature. Despite the fact that it’s confusing, it’s still a Pythonic feature that everyone must get along with.

Consider the following example. The function add_to_shopping_cart adds food to shopping_cart. shopping_cart is by default an empty list if it isn’t provided. In this example, when calling the function twice without providing shopping_cart, you would expect two lists with one element each.

def add_to_shopping_cart(food, shopping_cart=[]):
  shopping_cart.append(food)
  return shopping_cart
print(add_to_shopping_cart("egg"))
# ['egg']
print(add_to_shopping_cart("milk"))
# ['egg', 'milk']

But this is what actually happens. The explanation is that the default value of shopping_cart is created only once, when the function is defined (when the def statement runs), not each time the function is called. From that point on, the Python interpreter uses the same list object every time the function is called without that argument. This means that whenever the list is changed, the change carries over to the next call instead of the list being recreated as the default value.
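You can observe the shared default directly: Python stores default values on the function object in its __defaults__ attribute. The sketch below reuses the function from the example and inspects that attribute before and after the calls.

```python
def add_to_shopping_cart(food, shopping_cart=[]):
    shopping_cart.append(food)
    return shopping_cart


print(add_to_shopping_cart.__defaults__)  # ([],)
first = add_to_shopping_cart("egg")
second = add_to_shopping_cart("milk")
print(add_to_shopping_cart.__defaults__)  # (['egg', 'milk'],)
print(first is second)  # True, both names point to the same list
```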

The fix is simple: use None as the default sentinel value and assign the actual default value [] in the body of the function. Because the assignment happens in the function’s local scope on each call, a new list is created every time shopping_cart is None.

def add_to_shopping_cart(food, shopping_cart=None):
  # "is None" keeps an empty list passed in by the caller
  if shopping_cart is None:
    shopping_cart = []
  shopping_cart.append(food)
  return shopping_cart
print(add_to_shopping_cart("egg"))
# ['egg']
print(add_to_shopping_cart("milk"))
# ['milk']

My rule of thumb is do not mutate mutable default arguments unless you know what you are doing.

A tutorial on Python aesthetics. | Video: Next Day Video


How to Write a Pythonic Library

Everything discussed so far is about individual Python features. When it comes to writing a Python library or framework, we should also think about how to design a Pythonic API. Besides following common Python idioms, an interface meant to be used by others is in general smaller and more lightweight than in other languages. A library is considered not Pythonic if it reinvents the wheel too much. Thinking about “only one way to do it,” it’s preferable to depend on an existing third-party package in your library rather than reimplementing it.

Another general tip is, don’t write boilerplate code just for the sake of following design patterns like Java. An example is how to write a singleton in Python.
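One lightweight way to get singleton-like behavior without Java-style class machinery is to cache a factory function. This is only a sketch of one option, not necessarily what the author had in mind; Settings and get_settings are hypothetical names. Another common approach is simply to create one instance at module level and import it wherever it is needed.

```python
import functools


class Settings:
    """Hypothetical configuration object that should exist only once."""

    def __init__(self):
        self.debug = False


@functools.lru_cache(maxsize=None)
def get_settings():
    # The cache returns the same instance on every call after the first.
    return Settings()


a = get_settings()
b = get_settings()
print(a is b)  # True
```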

There is a lot more to say about each item, and there are many other items not included in this article. I hope this can inspire you to revisit your Python code.
