Here are five ways to print a list of the files in a directory and its subdirectories in Python:
- Using os.walk()
- Using pathlib module
- Using os.listdir()
- Using os.scandir()
- Using glob.glob()
Here is the directory “dir”, whose files we will print:
Method 1: Using os.walk()
The os.walk() function is a generator that yields a tuple of (dirpath, dirnames, filenames) for each directory in the tree, including the root directory. It’s helpful for traversing through all subdirectories.
import os
for dirpath, dirnames, filenames in os.walk('dir'):
    for filename in filenames:
        # Join the directory path and the bare file name into a full path
        print(os.path.join(dirpath, filename))
Output
dir/file.txt
dir/main.txt
dir/data.txt
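If you need the paths as data rather than printed output, the same loop can fill a list. The sketch below is illustrative only: the helper name collect_files and its skip_dirs parameter are assumptions, not part of the os module. It also shows that, with the default top-down traversal, editing dirnames in place keeps os.walk() from descending into the removed directories.

```python
import os


def collect_files(root, skip_dirs=()):
    """Return a sorted list of file paths under root (hypothetical helper)."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk() not to descend into
        # the removed directories (topdown=True is the default).
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for filename in filenames:
            found.append(os.path.join(dirpath, filename))
    return sorted(found)
```

For example, collect_files("dir", skip_dirs=(".git",)) would return every file path under dir except those inside a directory named .git.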
Method 2: Using pathlib module
The pathlib module offers a modern, object-oriented approach to managing filesystem paths, and we can use it to print the path of every file under a specified directory.
from pathlib import Path
def list_files(directory):
    path = Path(directory)
    # rglob pattern '*' matches all files and directories
    for file_path in path.rglob('*'):
        if file_path.is_file():  # Check if it's a file
            print(file_path)
list_files("dir")
Output
dir/file.txt
dir/main.txt
dir/data.txt
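Because rglob() yields Path objects, filtering on path attributes is straightforward. As a minimal sketch (the function name files_with_suffix and its default suffix are assumptions chosen for illustration), the following returns only files with a given extension, expressed relative to the starting directory:

```python
from pathlib import Path


def files_with_suffix(directory, suffix=".txt"):
    """Return sorted paths, relative to directory, of files ending in suffix."""
    root = Path(directory)
    return sorted(
        p.relative_to(root)      # drop the leading directory part
        for p in root.rglob("*")  # walk the whole tree
        if p.is_file() and p.suffix == suffix
    )
```

Calling files_with_suffix("dir") would give a list of Path objects that can be printed or processed further.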
Method 3: Using os.listdir()
The os.listdir() function returns a list of the names of the entries in the directory given by path.
It doesn’t traverse subdirectories, so you need additional logic, such as a recursive call, to handle them.
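import os
def list_files(startpath):
    # os.listdir() returns bare names, so join each one with its parent
    for name in os.listdir(startpath):
        path = os.path.join(startpath, name)
        if os.path.isdir(path):
            list_files(path)  # Recurse into the subdirectory
        else:
            print(path)
list_files("dir")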
Output
dir/file.txt
dir/main.txt
dir/data.txt
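The additional logic for subdirectories does not have to be recursive. One possible alternative, sketched here with a hypothetical generator named iter_files, keeps an explicit list of directories still to visit, which avoids deep recursion on heavily nested trees:

```python
import os


def iter_files(startpath):
    """Yield file paths under startpath using os.listdir() and a stack."""
    pending = [startpath]  # directories that still need to be listed
    while pending:
        current = pending.pop()
        for name in sorted(os.listdir(current)):
            path = os.path.join(current, name)
            if os.path.isdir(path):
                pending.append(path)  # visit this subdirectory later
            else:
                yield path
```

Since this is a generator, a loop such as for path in iter_files("dir"): print(path) produces paths one at a time instead of building the full list first.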
Method 4: Using os.scandir()
The os.scandir() function is similar to os.listdir(), but it returns an iterator of os.DirEntry objects that carry file type and attribute information along with each name, which can make it more efficient.
Like os.listdir(), it doesn’t traverse subdirectories on its own, so the example below calls itself for each subdirectory.
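import os
def list_files(directory):
    # The with statement closes the scandir iterator when the loop ends
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                print(entry.path)
            elif entry.is_dir():
                list_files(entry.path)  # Recurse into the subdirectory
list_files("dir")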
Output
dir/file.txt
dir/main.txt
dir/data.txt
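To illustrate the attribute information that each entry carries, here is a rough sketch that yields every file together with its size in bytes. The name file_sizes is an assumption for this example. Symbolic links are deliberately not followed, so linked files and directories are skipped.

```python
import os


def file_sizes(directory):
    """Yield (path, size_in_bytes) for each regular file under directory."""
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                # Recurse into real subdirectories only
                yield from file_sizes(entry.path)
            elif entry.is_file(follow_symlinks=False):
                # stat() comes from the DirEntry, no separate os.stat(path) call
                yield entry.path, entry.stat(follow_symlinks=False).st_size
```

A call such as dict(file_sizes("dir")) would map each file path to its size.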
Method 5: Using glob.glob()
The glob.glob() function returns a list of pathnames matching a specified pattern.
By using the ** pattern together with recursive=True, you can match files in all subdirectories.
import glob
import os
for file in glob.glob('dir/**/*', recursive=True):
    # The pattern also matches directories, so keep only regular files
    if os.path.isfile(file):
        print(file)
Output
dir/file.txt
dir/main.txt
dir/data.txt
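The pattern can also narrow the results to particular file names. The following is a minimal sketch, assuming a hypothetical helper called find_by_pattern; it uses glob.iglob(), the iterator form of glob.glob(), and glob.escape() so that special characters in the directory name are treated literally. Note that, by default, a pattern such as *.txt does not match names that begin with a dot.

```python
import glob
import os


def find_by_pattern(directory, pattern="*.txt"):
    """Return sorted file paths under directory whose names match pattern."""
    # '**' with recursive=True matches the directory itself and all
    # of its subdirectories, at any depth.
    search = os.path.join(glob.escape(directory), "**", pattern)
    return sorted(
        path
        for path in glob.iglob(search, recursive=True)
        if os.path.isfile(path)
    )
```

For instance, find_by_pattern("dir", "*.py") would list only the Python files under dir.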
Conclusion
Each of these methods has specific use cases. Choose the method that best fits your needs based on efficiency, simplicity, and the specific requirements of your task.
- os.walk() is a straightforward way to traverse directories and subdirectories.
- The pathlib module offers a readable, modern, object-oriented approach.
- os.listdir() and os.scandir() are more manual but offer fine control.
- glob.glob() is helpful for pattern matching in file paths.