Python Pathlib Is Better Than the OS Module for Handling Files. Here’s How to Use It.

Pathlib is more convenient and straightforward.

Written by Artturi Jalli
Published on Sep. 16, 2022
Image: Shutterstock / Built In

In Python, the built-in os module has traditionally been used for handling files. For instance, programmers use it to edit files in bulk or to work on files whose names match a specific pattern.

But there is an alternative that can make things a lot easier for you: pathlib, which is also part of Python’s standard library. With pathlib, you can do all the basic file handling tasks that you did before, plus a few things the os module doesn’t offer. The key difference is that pathlib is more intuitive and easier to use, because all the useful file handling methods belong to the Path objects.

What Is Python Pathlib?

The Python pathlib module provides an easier method to interact with the filesystem no matter what the operating system is. It allows a more intuitive, more pythonic way to interface with file paths (the name of a file including any of its directories and subdirectories).

In the os module, paths are regular strings. This means you need to find separate functions, sometimes in different modules, to perform actions on the paths. That is inconvenient and time-consuming, and it makes the code less readable and harder to maintain.

This guide teaches you how to use the Pathlib module in Python. You’ll see a bunch of examples that compare the os module and the Pathlib side by side.

 

The Problems With the OS Module

The os module is a popular Python library. But the path-handling capabilities are somewhat awkward, as you’ll figure out in this guide.

Here are some key reasons why you might want to avoid using the os module altogether:

3 Reasons Not to Use the OS Module for Handling Files in Python

  1. The OS module is big.
  2. OS paths are raw strings.
  3. The OS module doesn’t allow for pattern matching.

 

1. The OS Module Is Big

The os module is a big Python library. It has the path submodule for manipulating paths, but the system-level operations you perform on those paths live in other parts of the module. Even worse, some operations have to be imported from other modules, such as shutil.

You will need to find separate methods for performing these file handling tasks, too:

  • To create a folder, you need os.makedirs.
  • To list the content in a folder, you need os.listdir.
  • To rename files, you need os.rename, and to delete them, you need os.remove.

Finding these separate functions is possible, but it takes some extra effort. It also makes the code more fragmented, because you have to pick utilities from here and there. A better option would be to have a single path object that contains all the methods.
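As a rough sketch of what that single-object workflow could look like, the snippet below runs the same create, list, rename, and delete steps twice: once with os functions and once through Path objects. The temporary directory, the folder names, and the file names (notes.txt, renamed.txt) are made up for illustration.

```python
import os
import tempfile
from pathlib import Path


def os_workflow(base):
    """Create, list, rename and delete using separate os functions."""
    folder = os.path.join(base, "os_demo")
    os.makedirs(folder)
    # Create an empty file so there is something to list.
    open(os.path.join(folder, "notes.txt"), "w").close()
    listed = os.listdir(folder)
    os.rename(os.path.join(folder, "notes.txt"),
              os.path.join(folder, "renamed.txt"))
    os.remove(os.path.join(folder, "renamed.txt"))
    return listed


def pathlib_workflow(base):
    """The same steps, all reached through Path objects."""
    folder = Path(base) / "pathlib_demo"
    folder.mkdir()
    notes = folder / "notes.txt"
    notes.touch()  # create an empty file
    listed = [entry.name for entry in folder.iterdir()]
    target = folder / "renamed.txt"
    notes.rename(target)
    target.unlink()  # delete the renamed file
    return listed


with tempfile.TemporaryDirectory() as tmp:
    print(os_workflow(tmp))       # ['notes.txt']
    print(pathlib_workflow(tmp))  # ['notes.txt']
```

Both functions do the same work. In the pathlib version, every step is a method call on a path rather than a function you have to locate somewhere in os.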

 

2. OS Paths Are Raw Strings

The os module uses string values to represent paths, which is quite limiting. Because the paths are plain strings, you don’t get direct access to properties and metadata. Moreover, a string has no methods for filesystem operations, so every operation means passing the path to a separate function.

As an example, if you want to find if a path exists, you can do:

os.path.exists("/my/example/path")

This works for sure. But it would be much easier to have a path object via which this information could be directly accessible.

For example:

pathObj.exists()

 

3. The OS Module Doesn’t Allow for Pattern Matching

The os module on its own has no function for finding file paths that match a pattern. For instance, if you wanted to find all files named info.txt in a folder structure, you would have to walk the directory tree yourself and compare every file name.

To find all the files that match a pattern, you need to combine the os module with another module called glob. This is inconvenient, isn’t it? For sure you can get the job done with the two modules. But it would be much neater to do it with a single module.

So, there is a lot of inconvenience when it comes to using the os module. But the good news is that there is another file-management module that fixes the shortcomings. This is called the pathlib module.
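For instance, the info.txt search described above might look like this with pathlib alone. This is a minimal sketch: the folder layout is invented for the demo, and find_info_files is a hypothetical helper name, not part of pathlib.

```python
import tempfile
from pathlib import Path


def find_info_files(root):
    """Return every path named info.txt under root, at any depth."""
    # rglob() searches the whole directory tree recursively.
    return sorted(Path(root).rglob("info.txt"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical folder layout, created only for this demo.
    (root / "a" / "b").mkdir(parents=True)
    (root / "info.txt").touch()
    (root / "a" / "b" / "info.txt").touch()
    (root / "a" / "other.txt").touch()

    for match in find_info_files(root):
        # Print each match relative to the search root.
        print(match.relative_to(root))
```

The search finds the two info.txt files and skips other.txt. For a search limited to a single folder, Path.glob() takes the same kind of pattern without recursing.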


 

What Is Pathlib in Python?

Pathlib is a native Python library for handling files and paths on your operating system. It offers a bunch of path methods and attributes that make handling files more convenient than using the os module.

 

Pathlib vs OS

The key difference between the pathlib and os modules is the way in which they represent paths.

  • The os module represents paths as strings with which you cannot do much.
  • The pathlib module represents paths as special objects with useful methods and attributes.

Let’s inspect the current directory paths in both pathlib and os:

from pathlib import Path
import os

pathlib_cwd = Path.cwd()
os_cwd = os.getcwd()

print(type(pathlib_cwd))
print(type(os_cwd))

Output:

<class 'pathlib.PosixPath'>

<class 'str'>

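Here you can see that the type of the pathlib path is pathlib.PosixPath and the os path is str (or string). This output comes from a Unix-like system such as Linux or macOS; on Windows, Path.cwd() returns a pathlib.WindowsPath instead.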

  • The pathlib.PosixPath comes with a whole bunch of useful methods and attributes.
  • But the path string is nothing but a regular string with which you cannot do much when it comes to handling files.

All in all, the pathlib module fixes the problems of the os module stated earlier. Now, let’s have a look at some of the key features of the pathlib module.
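To make the string-versus-object difference concrete, here is a small sketch that builds the same path both ways and then reads its components. The path project/data/info.txt is hypothetical and does not need to exist on disk.

```python
import os
from pathlib import Path

# Hypothetical path, used only for illustration.
os_path = os.path.join("project", "data", "info.txt")
lib_path = Path("project") / "data" / "info.txt"

# With a string, each question needs a separate os.path function.
print(os.path.basename(os_path))  # info.txt
print(os.path.dirname(os_path))

# With a Path, the same information is available as attributes.
print(lib_path.name)              # info.txt
print(lib_path.parent)
```

The / operator joins path segments using the right separator for the current operating system, which is the job os.path.join() does for strings.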


 

Key Features of the Python Pathlib and OS Modules Side by Side

Pathlib vs. OS Module: Table of Contents

  1. Show the current directory.
  2. Check if a file exists.
  3. Create a directory.
  4. Create an existing directory.
  5. Show directory contents.

 

1. Show the Current Directory

Let’s start by learning how to check the current directory using os and pathlib modules.

 

OS Module

To get the current directory using the os module:

  1. Import the os module.
  2. Call the os.getcwd() function. (CWD stands for the current working directory.)

import os

print(os.getcwd())

 

Pathlib

To see the current directory path using the pathlib module:

  1. Import the Path class from the pathlib module.
  2. Call the cwd() method on the Path class.

from pathlib import Path

print(Path.cwd())

 

2. Check If a File Exists

Checking if a file exists is one of the basic functionalities when it comes to handling files. While both the os and pathlib modules make this task easy, pathlib makes it more convenient.

 

OS Module

To check if a file path exists using the os module, you need to call the os.path.exists() function.

For example:

os.path.exists('example/info.txt')

 

Pathlib

In pathlib, checking the existence of a file is more convenient. You can directly call the exists() method on the Path object instead of invoking a separate function.

As an example:

Path('example/info.txt').exists()

 

3. Create a Directory

Let’s create sample directories by using the os and pathlib modules.

 

OS Module

To create a new directory using the os module, call the os.mkdir() function. Pass the path of the new folder as the argument.

For example:

os.mkdir('example_dir')

 

Pathlib

To create a new directory using the pathlib module, create a Path object with the location of the new folder as a string argument. Then call the mkdir() method on the Path object.

For example:

Path('example_dir').mkdir()

Judging by these examples alone, it’s hard to say which approach is cleaner or easier. But when you try to create a directory that already exists, you’ll notice the difference.

 

4. Create an Existing Directory

To create a new directory using the os module, you have to check that a directory with that path doesn’t already exist:

if not os.path.exists('example_dir'):
    os.mkdir('example_dir')

If you don’t do the check, os.mkdir() raises a FileExistsError, which crashes your program unless you handle it.

But with the pathlib module, all you need to do is pass exist_ok=True as an argument to the mkdir() method.

Path('example_dir').mkdir(exist_ok=True)

When you set this flag to True, mkdir() no longer raises a FileExistsError if the directory is already there. This saves you one line of code. Besides, you only need to call one method that belongs to the Path object. Convenient, isn’t it?
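Here is a short, self-contained sketch of that behavior. It works inside a temporary directory so it leaves nothing behind, and ensure_dir is a hypothetical helper name chosen for this example.

```python
import tempfile
from pathlib import Path


def ensure_dir(path):
    """Create the directory if needed and return it as a Path."""
    directory = Path(path)
    directory.mkdir(exist_ok=True)
    return directory


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example_dir"

    ensure_dir(target)  # creates the directory
    ensure_dir(target)  # second call changes nothing and raises nothing

    try:
        target.mkdir()  # without exist_ok, the same call raises
    except FileExistsError:
        print("example_dir already exists")
```

Note that exist_ok=True only helps when the existing path is a directory. If a regular file already has that name, mkdir() still raises FileExistsError.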

 

5. Show Directory Contents

Showing directory contents is another task where pathlib does a better job than the os module.

 

OS Module

To check the contents of a directory using the os module, you need to call a separate os.listdir() function with the directory path as an argument.

For instance:

os.listdir('example_data')

 

Pathlib

But to check the contents of a directory by using pathlib, you only need to call the iterdir() method of the Path object.

For example:

Path('example_data').iterdir()

Notice that this method returns an iterator object. But you can easily convert it to a list with the list() function:

list(Path('example_data').iterdir())

In case you’re wondering, the iterator is useful for directories with a huge number of entries. You can loop over the paths one at a time and stop whenever you like, instead of building a full list of Path objects up front.
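As an illustration, the sketch below consumes the iterator directly and stops after a fixed number of entries. The helper name first_n_entries and the file names are assumptions made for this example.

```python
import tempfile
from itertools import islice
from pathlib import Path


def first_n_entries(directory, n):
    """Return the names of at most n entries, then stop iterating."""
    return [entry.name for entry in islice(Path(directory).iterdir(), n)]


with tempfile.TemporaryDirectory() as tmp:
    # Create five empty files to iterate over.
    for i in range(5):
        (Path(tmp) / f"file_{i}.txt").touch()

    # Only two entries are requested, so only two names come back.
    print(len(first_n_entries(tmp, 2)))  # 2
```

The order in which iterdir() yields entries is arbitrary, so sort the result if your code depends on a particular order.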


 

Useful Pathlib Methods and Attributes

Last but not least, let’s list some useful methods and attributes that belong to the pathlib Path objects.

  • .exists(). This method checks if the path exists on the filesystem. You already saw an example of using this method earlier!
  • .is_dir(). This method checks if the path represents a directory.
  • .is_file(). This method checks if the path represents a file.
  • .is_absolute(). This method checks if the path is an absolute path.
  • .chmod(). Change the file mode and permissions.
  • .is_mount(). Checks if the path is a mount point.
  • .suffix. Get the file extension (such as .jpeg, .png, or .pdf).
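A quick way to try several of these at once is sketched below. The describe helper and the file name photo.png are hypothetical, and the example uses a temporary directory so that the checks run against a real file.

```python
import tempfile
from pathlib import Path


def describe(path):
    """Collect a few Path checks into a dictionary."""
    path = Path(path)
    return {
        "exists": path.exists(),
        "is_dir": path.is_dir(),
        "is_file": path.is_file(),
        "is_absolute": path.is_absolute(),
        "suffix": path.suffix,
    }


with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "photo.png"  # hypothetical file name
    image.touch()                    # create an empty file

    print(describe(image))  # a file with the suffix '.png'
    print(describe(tmp))    # a directory with an empty suffix
```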