File and directory operations are basic skills for software engineers. This isn’t just about copying a file into another folder in Windows File Explorer; it’s about understanding how to automate batch operations in code.
How to List Files in a Python Directory
- Use os.listdir() to print all files.
- Use os.walk() to access files deeper in a directory tree.
- Use the glob module to search by wildcard pattern.
- Use the pathlib module to generate all path file names.
- Use the os.scandir() function to return a generator.
This is a big topic. Today we will dive into one specific problem: How to list all file names under a specific directory. In Python, a directory contains a group of files and subdirectories.
I’ll introduce five ways to list and access files in a Python directory. Each of these methods is suited to a different scenario.
5 Methods to List Files in a Python Directory
1. Use os.listdir() to Print All Files
One way to list files in a Python directory is to use the os.listdir() function, which comes from Python’s os module:
>>> import os
>>> os.listdir()
The above code will print the names of all files and directories under the current path. If you would like to list the contents of another path, pass that path to os.listdir() as an argument.
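As a minimal sketch of passing a path to os.listdir(), the helper below returns the entries of any directory. The name list_entries is hypothetical and chosen only for illustration. The result is sorted because os.listdir() returns names in arbitrary order.

```python
import os


def list_entries(path="."):
    """Return the sorted names of all files and directories in path."""
    # os.listdir() gives names only (no directory prefix), in arbitrary order.
    return sorted(os.listdir(path))
```

Calling list_entries("/some/dir") returns names only, without the directory prefix, so join them with the path before opening them.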
If you only want the files, os.path.isfile() will give you a hand:
>>> import os
>>> files = [f for f in os.listdir() if os.path.isfile(f)]
For directories, there is also a function named os.path.isdir():
import os
files = [f for f in os.listdir() if os.path.isdir(f)]
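One caveat worth sketching: os.path.isfile(f) and os.path.isdir(f) resolve a bare name against the current working directory, so the list comprehensions above only behave as expected when listing the current path. The helper below (split_entries is a hypothetical name) joins each name with the directory first, so it works for any path.

```python
import os


def split_entries(path="."):
    """Return (files, dirs): sorted names of files and subdirectories in path."""
    files, dirs = [], []
    for name in os.listdir(path):
        # Join with path first; a bare name would be checked against
        # the current working directory instead of path.
        full = os.path.join(path, name)
        if os.path.isfile(full):
            files.append(name)
        elif os.path.isdir(full):
            dirs.append(name)
    return sorted(files), sorted(dirs)
```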
It’s simple and useful, but what if it returns a large list? Or what if you only need a specific type of file? Fortunately, Python provides you with plenty of options for more complex scenarios.
2. Use os.walk() to Access Files Deeper in a Directory Tree
You can also list files in a Python directory using os.walk(), another function from the os module.
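As its name implies, it can “walk” through a directory tree. When you call os.walk(), it returns a generator. Every time you call next() on it, it yields the result for one more directory in the tree, starting with the root and then moving down into its subdirectories (one directory at a time, not one whole layer at a time). Each result is a tuple that includes three items: (dirpath, dirnames, filenames).
For example, if you count the entries directly under the root as the first layer, the folders in the second layer are the ones inside the root’s immediate subdirectories. To collect their names, your code will be as follows:
from os import walk
from os.path import relpath, sep

root = "/Users/yang"
f = []
for dirpath, dirnames, filenames in walk(root):
    # Depth below root: 0 for root itself, 1 for its immediate subdirectories.
    rel = relpath(dirpath, root)
    depth = 0 if rel == "." else rel.count(sep) + 1
    if depth == 1:
        f.extend(dirnames)  # folders in the second layer
        dirnames.clear()    # stop walk() from descending any deeper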
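A common use of os.walk() is collecting every file in a tree, however deep. The sketch below assumes you want full paths; all_files is a hypothetical helper name. It relies on the fact that filenames holds bare names, so each one is joined with its dirpath.

```python
import os


def all_files(root):
    """Return the sorted full paths of every file under root, at any depth."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # filenames are bare names; prepend the directory they live in.
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)
```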
3. Use the Glob Module to Search by Wildcard Patterns
Instead of retrieving the names of all files, sometimes you might want only the names of a specific type of file. The glob module matches paths against Unix shell-style wildcard patterns such as * and ? (these are not full regular expressions), so it will be your friend for this type of operation:
>>> import glob
>>> glob.glob("/sys/*.log")
The above code will list the paths of the files ending with .log in the /sys directory.
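The glob module can also search subdirectories: with recursive=True, the special pattern ** matches any number of nested directories, including none. The following is a minimal sketch; find_by_pattern and its parameters are hypothetical names chosen for illustration.

```python
import glob
import os


def find_by_pattern(root, pattern="*.log", recursive=False):
    """Return sorted paths under root whose file name matches pattern."""
    if recursive:
        # "**" only spans nested directories when recursive=True is passed.
        full_pattern = os.path.join(root, "**", pattern)
    else:
        full_pattern = os.path.join(root, pattern)
    return sorted(glob.glob(full_pattern, recursive=recursive))
```

By default, these patterns do not match names that begin with a dot, so hidden files are skipped.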
4. Use the Pathlib Module to Generate All Path File Names
import pathlib
files = [f for f in pathlib.Path().iterdir() if f.is_file()]
Path objects come with a glob() method as well, so there’s no need to import the glob module explicitly at the top of your Python file.
import pathlib
# Path.glob() only accepts relative patterns, so the directory goes in Path().
files = [f for f in pathlib.Path("/sys").glob("*.log")]
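For a recursive search, Path objects also offer rglob(), which behaves like glob() with ** placed in front of the pattern. The sketch below is one way to use it; log_files is a hypothetical helper, and returning paths relative to the root is an assumption made to keep the output short.

```python
from pathlib import Path


def log_files(root, pattern="*.log"):
    """Return sorted root-relative paths of files under root matching pattern."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()  # e.g. "sub/b.log"
        for p in root.rglob(pattern)    # searches root and all subdirectories
        if p.is_file()                  # skip directories that match the pattern
    )
```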
5. Use the os.scandir() Function to Return a Generator
The os.listdir() function is intuitive but not efficient for large directories that contain a huge number of files. Therefore, Python 3.5 introduced a similar new function, os.scandir():
>>> import os
>>> a = os.scandir()
>>> next(a)
<DirEntry 'test1.py'>
>>> next(a)
<DirEntry 'test2.py'>
Yes, you probably guessed it. This function returns an iterator of DirEntry objects instead of a list of all names, so you can fetch entries as you need them. It’s more efficient in situations where you don’t need to get all of the names at once.
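To illustrate stopping early, the sketch below takes only the first few file names from os.scandir() instead of building the full list. The helper name first_files and its limit parameter are hypothetical. The with statement (supported by os.scandir() since Python 3.6) closes the iterator once the block ends.

```python
import os
from itertools import islice


def first_files(path=".", limit=5):
    """Return the names of up to `limit` files in path, in arbitrary order."""
    with os.scandir(path) as entries:
        # Lazy pipeline: entries are read only until `limit` files are found.
        files = (entry.name for entry in entries if entry.is_file())
        return list(islice(files, limit))
```

Each DirEntry also exposes name, path, is_file() and is_dir(), so filtering does not need a separate os.path call per entry.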