File and directory operations are basic skills for software engineers. This means more than copying a file into another folder in Windows File Explorer; it means understanding how to automate batch operations with code.
What Is a Directory in Python?
A directory is a digital structure used to store and organize files on a computer. Python modules can be used to manipulate computer directories and the files they store.
This is a big topic. Today we will dive into one specific problem: How to list all file names under a specific directory. In Python, a directory contains a group of files and subdirectories.
I’ll introduce five ways to list and access files in a Python directory. Each of these methods is used in different scenarios.
How to List Files in a Python Directory
- Use os.listdir() to print all files.
- Use os.walk() to access files deeper in a directory tree.
- Use the glob module to search by wildcard pattern.
- Use the pathlib module to generate all path file names.
- Use the os.scandir() function to return a generator.
1. Use os.listdir() to Print All Files
One way to list files in a Python directory is to use the os.listdir() function, which is from Python’s os module:
>>> import os
>>> os.listdir()
The above code will print the names of all files and directories under the current path. If you would like to print the results based on another path, just give the os.listdir() function an argument:
>>> os.listdir(myPath)
If you only want to print all files, the os.path.isfile() function will give you a hand:
>>> import os
>>> files = [f for f in os.listdir() if os.path.isfile(f)]
For directories, there is also a function named os.path.isdir():
import os
dirs = [f for f in os.listdir() if os.path.isdir(f)]
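One detail worth illustrating: os.listdir() returns bare names, so when you list a path other than the current directory, each name has to be joined back onto that path before os.path.isfile() or os.path.isdir() can check it. The following is a minimal sketch of that idea. The helper name split_entries and the temporary demo directory are assumptions made for illustration, not part of the original examples.

```python
import os
import tempfile

def split_entries(path):
    """Return (files, dirs) as sorted name lists for the given directory."""
    files, dirs = [], []
    for name in os.listdir(path):
        # listdir() gives names only, so rebuild the full path before testing
        full = os.path.join(path, name)
        if os.path.isfile(full):
            files.append(name)
        elif os.path.isdir(full):
            dirs.append(name)
    return sorted(files), sorted(dirs)

# Demo on a throwaway directory so the result is predictable
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.txt"), "w").close()
    os.mkdir(os.path.join(tmp, "sub"))
    demo_files, demo_dirs = split_entries(tmp)
    print(demo_files, demo_dirs)
```

Without the os.path.join() step, the checks would be made relative to the current working directory and would usually return False for every entry.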
It’s simple and useful, but what if it returns a large list? Or what if you only need a specific type of file? Fortunately, Python provides you with plenty of options for more complex scenarios.
2. Use os.walk() to Access Files Deeper in a Directory Tree
You can also list files in a Python directory using os.walk(), another function from the os module.
As its name implies, it can “walk” through a directory tree. When you call os.walk(), it will return a generator. Every time you call next() on it, it yields the next directory in the tree, starting at the top and descending into each subdirectory before moving on to that subdirectory's siblings. Each result is a tuple that includes three items: (dirpath, dirnames, filenames).
For example, if you want to get the names of all folders that sit two levels below the starting directory (the subfolders of its immediate subfolders), your code will be as follows:
from os import walk
root = "/Users/yang"
folders = []
for (dirpath, dirnames, filenames) in walk(root):
    if dirpath == root:
        continue  # top level: let walk() descend into its subfolders
    folders.extend(dirnames)  # dirpath is an immediate subfolder of root
    dirnames.clear()  # prune in place so walk() goes no deeper
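A more common use of os.walk() is collecting every file in a tree, whatever its depth. The sketch below shows one way to do that by joining each dirpath with the names in filenames. The helper name all_files and the temporary directory layout are illustrative assumptions.

```python
import os
import tempfile

def all_files(root):
    """Collect the full path of every file under root, at any depth."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # filenames holds bare names; dirpath is the folder they live in
            paths.append(os.path.join(dirpath, name))
    return paths

# Build a small three-level tree to walk
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b"))
    for rel in ("top.txt", os.path.join("a", "mid.txt"), os.path.join("a", "b", "deep.txt")):
        open(os.path.join(tmp, rel), "w").close()
    walked = {os.path.relpath(p, tmp) for p in all_files(tmp)}
    print(sorted(walked))
```

Because os.walk() is lazy, the loop can also stop early with break if you only need the first match.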
3. Use the glob Module to Search by Wildcard Patterns
Instead of retrieving the names of all files, sometimes you might want to get the names of a specific type of file. Since the glob module supports Unix shell-style wildcard patterns such as * and ? (these are not full regular expressions), it will be your friend for this type of operation:
>>> import glob
>>> glob.glob("/sys/*.log")
The above code will list the paths of the files directly under /sys whose names end with “.log”.
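To make the pattern syntax more concrete, here is a small sketch that compares three glob patterns: * for any run of characters, ? for exactly one character, and ** with recursive=True to include subdirectories. The file names and the temporary directory are made up for the demonstration. glob.escape() is used on the base path as a precaution in case it contains wildcard characters.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    for rel in ("app.log", "app1.log", "notes.txt", os.path.join("sub", "deep.log")):
        open(os.path.join(tmp, rel), "w").close()

    base = glob.escape(tmp)  # treat the directory path itself literally

    # "*" matches any run of characters within one path component
    top_logs = sorted(os.path.basename(p) for p in glob.glob(os.path.join(base, "*.log")))
    # "?" matches exactly one character
    one_char = sorted(os.path.basename(p) for p in glob.glob(os.path.join(base, "app?.log")))
    # "**" with recursive=True also matches files in subdirectories
    every_log = sorted(
        os.path.basename(p)
        for p in glob.glob(os.path.join(base, "**", "*.log"), recursive=True)
    )
    print(top_logs, one_char, every_log)
```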
4. Use the Pathlib Module to Generate All Path File Names
Since Python 3.4, there is a module called pathlib, which is helpful as well. With the help of list comprehension tricks, we can use one line of code to generate all file names of the current path:
import pathlib
files = [f for f in pathlib.Path().iterdir() if f.is_file()]
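Path objects also come with a glob() method, so there’s no need to import the glob module explicitly at the top of your Python file. The pattern is interpreted relative to the path it is called on, so put the directory in Path() and only the wildcard part in glob():
import pathlib
files = [f for f in pathlib.Path("/sys").glob("*.log")]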
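Path objects carry more than the name, which is often the reason to prefer pathlib. The sketch below contrasts glob() (direct children only) with rglob() (all levels) and reads the name and suffix attributes. The directory layout is a hypothetical one created in a temporary folder.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "sub").mkdir()
    (root / "a.log").touch()
    (root / "sub" / "b.log").touch()
    (root / "c.txt").touch()

    # glob() only looks at direct children for this pattern
    direct = sorted(p.name for p in root.glob("*.log"))
    # rglob() applies the same pattern at every level of the tree
    nested = sorted(p.name for p in root.rglob("*.log"))
    # each entry is a Path, so attributes like suffix are available
    suffixes = sorted({p.suffix for p in root.iterdir() if p.is_file()})
    print(direct, nested, suffixes)
```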
5. Use the os.scandir() Function to Return a Generator
The classic os.listdir() function is intuitive but not efficient for large directories that contain a huge number of files. Therefore, Python 3.5 introduced a similar function, os.scandir().
>>> import os
>>> a = os.scandir()
>>> next(a)
<DirEntry 'test1.py'>
>>> next(a)
<DirEntry 'test2.py'>
Yes, you probably guessed it. This function returns an iterator of os.DirEntry objects, which you can consume like a generator, instead of a list of all names. You can fetch entries as you need them, so it’s more efficient in situations where you don’t need all of the names at once.
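In scripts, os.scandir() is usually wrapped in a with statement (supported since Python 3.6) so the underlying iterator is closed promptly. Each DirEntry also offers is_file() and is_dir(), so no separate os.path call is needed. This is a minimal sketch; the helper name file_names and the demo files are assumptions for illustration.

```python
import os
import tempfile

def file_names(path):
    """Return the sorted names of regular files directly inside path."""
    # The with statement closes the scandir iterator when the block ends
    with os.scandir(path) as entries:
        return sorted(entry.name for entry in entries if entry.is_file())

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "pkg"))
    for name in ("test1.py", "test2.py"):
        open(os.path.join(tmp, name), "w").close()
    scanned = file_names(tmp)
    print(scanned)
```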
Frequently Asked Questions
How do you access files in Python?
The Python open() function can be used to access and open text or binary files on a computer. This function also lets you specify whether you want to read a file, append new data into an existing file or write data into a brand new file.
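As a brief illustration of those three modes, the sketch below writes a new file with "w", appends to it with "a" and reads it back with "r". The file name notes.txt and the temporary directory are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")

    # "w" creates the file (or truncates an existing one)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("first line\n")

    # "a" adds to the end without touching existing content
    with open(path, "a", encoding="utf-8") as fh:
        fh.write("second line\n")

    # "r" opens the file for reading
    with open(path, "r", encoding="utf-8") as fh:
        lines = fh.read().splitlines()

    print(lines)
```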
How do I get a list of files in a directory in Python?
A list of files can be retrieved from a directory using the following Python functions and modules:
- os.listdir() - to print all files and directories under the current path (file location)
- os.walk() - to access files from a directory layer by layer
- glob module - to search files by wildcard pattern
- pathlib module - to generate all file names of the current path
- os.scandir() - to return a Python generator of files
What is a directory in Python?
A directory is a digital structure that stores and organizes files, other directories or a combination of both on a computer. Directories follow a hierarchical structure known as a directory tree, which starts at the root directory and branches into lower levels of files and/or subdirectories. Python can be used to directly manage and manipulate directories present on a computer system.
How to get the list of folders in Python?
The os.listdir() function in Python can be used to get a list of directories (also known as folders) and files within a directory. The os.walk() function can also be used to retrieve folders and their files level by level in a directory tree.
How do I see all directories in Python?
The os.listdir() function in Python can be used to see all entries, both directories and files, in a specified path or, when called without an argument, in the current working directory. To keep only the directories, filter the results with os.path.isdir().