Whether we are searching for a specific file or building a directory tree, we need to access the file system programmatically and find out which directories contain which files and folders. With that listing in hand, we can perform file management operations.
Here are five ways to print a list of files in a directory and its subdirectories in Python:
- Using os.walk()
- Using the pathlib module
- Using os.scandir()
- Using glob.glob()
- Using os.listdir()
Here is the directory “output”, whose files we will print:
Using os.walk()
The os.walk() function recursively walks a directory tree, yielding a (dirpath, dirnames, filenames) tuple at each level, starting from the root directory. That built-in traversal covers every subdirectory for us.
import os

for dirpath, dirnames, filenames in os.walk('output'):
    for filename in filenames:
        print(os.path.join(dirpath, filename))

# output/data_17-05-28.csv
# output/file_20250421-164734.txt
# output/session_20250421-175530_ffcb6d.log
# output/data_2025-04-21_08-36-32_EDT.py
# output/sensor_data_20250421_17.json
# output/log_20250421-171139-810.log
The main advantage of this approach is that we don’t need any extra libraries or hand-written recursion: os.walk() is recursive by default, which keeps the code short and reasonably efficient.
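Because os.walk() yields the dirnames list before it descends (with the default topdown=True), you can also prune that list in place to skip folders you don’t care about. Here is a minimal sketch of that idea; the '.git' and '__pycache__' names are just assumed examples of directories to exclude:

import os

# Example directory names to skip; adjust to your own project
SKIP = {'.git', '__pycache__'}

for dirpath, dirnames, filenames in os.walk('output'):
    # Editing dirnames in place tells os.walk() not to descend into those folders
    dirnames[:] = [d for d in dirnames if d not in SKIP]
    for filename in filenames:
        print(os.path.join(dirpath, filename))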
Using the “pathlib” module
If you’re using Python 3.4 or later and prefer an object-oriented (OO) style, the pathlib module is the natural choice.
Pathlib is a modern, object-oriented approach to managing filesystem paths, and its pathlib.Path.rglob() method traverses the directory recursively, yielding Path objects that match a pattern.
Let’s take the example of a directory that contains some files plus a subdirectory with files of its own.
Here is the layout of the “parent” directory:
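parent/
├── file1.txt
├── report.csv
├── app.log
└── child/
    └── next.txt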
from pathlib import Path

def list_files(directory):
    path = Path(directory)
    # rglob pattern '*' matches all files and directories
    for file_path in path.rglob('*'):
        if file_path.is_file():  # Check if it's a file
            print(file_path)

list_files("parent")

# Output:
# parent/file1.txt
# parent/report.csv
# parent/app.log
# parent/child/next.txt
From the output, you can see that we walked through the directory and its subdirectory, printing every file to the console.
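If you only need a specific file type, you can pass a narrower pattern to rglob() instead of filtering afterwards. A small sketch, assuming the same “parent” directory and that only the .txt files are wanted:

from pathlib import Path

# '*.txt' restricts the recursive search to text files
for file_path in Path("parent").rglob('*.txt'):
    print(file_path)

# parent/file1.txt
# parent/child/next.txt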
Using os.scandir()
The os.scandir() function is similar to os.listdir(), but it returns directory entries along with file attribute information, which can make it more efficient.
Like os.listdir(), though, it doesn’t traverse subdirectories on its own, so we recurse manually.
import os

def list_files(directory):
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                print(entry.path)
            elif entry.is_dir():
                # scandir doesn't recurse on its own, so descend manually
                list_files(entry.path)

list_files("output")

# output/data_17-05-28.csv
# output/file_20250421-164734.txt
# output/session_20250421-175530_ffcb6d.log
# output/data_2025-04-21_08-36-32_EDT.py
# output/sensor_data_20250421_17.json
# output/log_20250421-171139-810.log
When you need maximum speed on huge directories, os.scandir() is the most helpful method: it is significantly faster than an os.listdir() + os.stat() loop. The trade-off is more boilerplate code to handle the recursion and errors yourself.
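The speed comes from the fact that each entry carries cached attribute information: entry.is_file() can usually answer without an extra system call, and entry.stat() caches its result. Below is a rough sketch of what that extra boilerplate can look like, with a file-size printout and a basic PermissionError guard; the error handling shown is just one possible choice:

import os

def list_files_with_sizes(directory):
    try:
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_file():
                    # entry.stat() is cached on the entry, so attribute lookups stay cheap
                    print(entry.path, entry.stat().st_size, "bytes")
                elif entry.is_dir():
                    list_files_with_sizes(entry.path)
    except PermissionError:
        # Skip directories we are not allowed to read
        pass

list_files_with_sizes("output")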
Using glob.glob()
The glob.glob() function finds pathnames matching a Unix shell-style wildcard pattern, and the ** wildcard (with recursive=True) matches files in all subdirectories.
If you prefer pattern-based searching in a simple one-liner, this should be your go-to approach.
import glob
import os

for file in glob.glob('parent/**/*', recursive=True):
    if os.path.isfile(file):
        print(file)

# parent/file1.txt
# parent/report.csv
# parent/app.log
# parent/child/next.txt
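The pattern can also be narrowed to a specific extension, and glob.iglob() returns an iterator instead of building the whole list in memory. A small sketch, again assuming the same “parent” directory and that only .txt files are wanted:

import glob

# '**/*.txt' recursively matches only .txt files
for file in glob.iglob('parent/**/*.txt', recursive=True):
    print(file)

# parent/file1.txt
# parent/child/next.txt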
Using os.listdir()
The os.listdir() function returns a list of names (files and directories) in the single directory given by the input path.
It doesn’t traverse subdirectories; for that, you need to add the recursion yourself.
For this section, let’s use a different directory for the example. Here is a screenshot of the “newdir” directory:
import os

def list_files(startpath):
    for name in os.listdir(startpath):
        full_path = os.path.join(startpath, name)
        if os.path.isfile(full_path):
            print(full_path)
        elif os.path.isdir(full_path):
            # os.listdir() doesn't recurse, so descend into subdirectories ourselves
            list_files(full_path)

list_files("newdir")

# Output: newdir/filename.txt
This approach is old-school, and I cannot recommend it for production due to its verbosity and inefficiency.
That’s it!