AppDividend

Python os.walk() Method: How to Traverse a Directory Tree


Python's os module provides functions for interacting with the operating system. It is part of the standard library and includes functions for creating and removing directories, listing a directory's contents, and changing or identifying the current working directory.

To use the os module, you need to import it first. os.walk() is one of the module's most frequently used functions, and it is the subject of this article.

Python os.walk() Method

To traverse a directory tree in Python, use the os.walk() function. It generates the file names in a directory tree by walking the tree either top-down or bottom-up. For each directory in the tree rooted at the directory top (including top itself), it yields a 3-tuple (dirpath, dirnames, filenames). The examples in this article unpack that tuple into the variables root, dirs, and files:

  1. root (dirpath): A string holding the path to the directory currently being visited. On the first iteration it is the starting path you passed to os.walk().
  2. dirs (dirnames): A list of the names of the subdirectories directly inside root, excluding '.' and '..'.
  3. files (filenames): A list of the names of the non-directory files directly inside root. The names carry no path components.
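Because the names in files carry no path, a full path has to be built by joining each name with root. A minimal sketch of that step follows; list_file_paths is a hypothetical helper name, and the small temporary tree exists only for the demonstration.

```python
import os
import tempfile


def list_file_paths(top):
    """Return the full path of every file below top (hypothetical helper)."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            # filenames holds bare names, so join each one with dirpath
            paths.append(os.path.join(dirpath, name))
    return paths


# Demo on a throwaway tree containing a.txt and sub/b.txt
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        open(os.path.join(tmp, rel), "w").close()
    # Report paths relative to the temporary root so they are easy to read
    found = sorted(os.path.relpath(p, tmp) for p in list_file_paths(tmp))
    print(found)
```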

Syntax

os.walk(top, topdown=True, onerror=None, followlinks=False)

Parameters

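  1. top − The path of the directory at which the walk starts. A 3-tuple (dirpath, dirnames, filenames) is generated for top and for every directory below it.

  2. topdown − This is an optional argument. If it is True or not specified, the tuple for a directory is generated before the tuples for its subdirectories (top-down). If it is set to False, the tuple for a directory is generated after the tuples for all of its subdirectories (bottom-up).

  3. onerror − This is an optional argument. If given, it should be a function that accepts one argument, an OSError instance. It is called when a directory cannot be listed; it can report the error and let the walk continue, or raise the exception to abort the walk.

  4. followlinks − This is an optional argument. If set to True, the walk also visits directories pointed to by symlinks, on systems that support them. This can lead to infinite recursion if a link points to a parent directory of itself.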
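The onerror parameter is easiest to understand with a walk that is guaranteed to fail. The sketch below walks a directory that does not exist; remember_error and the directory name are assumptions made for the illustration. Without onerror, the same walk would simply yield nothing and stay silent.

```python
import os
import tempfile

errors = []


def remember_error(exc):
    # exc is an OSError instance; exc.filename is the path that failed
    errors.append(exc.filename)


with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "does-not-exist")
    # The handler records the failure, then the walk ends without yielding
    visited = list(os.walk(missing, onerror=remember_error))

print(visited)
print(errors)
```

To abort instead of continuing, the handler can re-raise the exception it receives.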

Return Value

It returns a generator that yields one 3-tuple (dirpath, dirnames, filenames) for each directory visited.
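Since the result is a generator, directories are read only as the loop asks for them. One consequence, sketched here on a throwaway directory, is that calling next() once gives the contents of the top directory alone, without walking the rest of the tree.

```python
import os
import tempfile
import types

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "a.txt"), "w").close()

    walker = os.walk(tmp)
    is_generator = isinstance(walker, types.GeneratorType)

    # The first tuple always describes the starting directory itself
    dirpath, dirnames, filenames = next(walker)

    # Stop here; the subdirectory is never visited
    walker.close()

print(is_generator, dirnames, filenames)
```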

Example

import os
if __name__ == "__main__":
    for (root, dirs, files) in os.walk('/Users/krunal/Desktop/code/pyt/database', topdown=True):
        print("The root is: ")
        print(root)
        print("The directories are: ")
        print(dirs)
        print("The files are: ")
        print(files)
        print('--------------------------------')

Output

The root is:
/Users/krunal/Desktop/code/pyt/database
The directories are:
['.vscode']
The files are:
['shows.csv', 'Netflix.csv', 'marketing.csv', 'new_file.json', 
'data.json', 'Netflix', 'shows.db', 'app.py', 'purchase.csv', 'final.zip', 'sales.csv']
--------------------------------
The root is:
/Users/krunal/Desktop/code/pyt/database/.vscode
The directories are:
[]
The files are:
['settings.json']
--------------------------------

By default, os.walk() traverses the tree in top-down order: it yields a directory to you for processing first and only then descends into that directory's subdirectories.

When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again.
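As an illustration of pruning, the sketch below skips a .vscode directory like the one in the output above. walk_skipping is a hypothetical helper, and the temporary tree stands in for a real project. The slice assignment matters: rebinding dirnames to a new list would leave the list that walk() holds unchanged.

```python
import os
import tempfile


def walk_skipping(top, skip):
    """Yield (dirpath, filenames), never descending into names listed in skip."""
    for dirpath, dirnames, filenames in os.walk(top, topdown=True):
        # Edit the list in place; sorting also fixes the order of the visit
        dirnames[:] = sorted(d for d in dirnames if d not in skip)
        yield dirpath, filenames


with tempfile.TemporaryDirectory() as tmp:
    for sub in (".vscode", "src"):
        os.mkdir(os.path.join(tmp, sub))
        open(os.path.join(tmp, sub, "file.txt"), "w").close()
    visited = [os.path.relpath(d, tmp) for d, _ in walk_skipping(tmp, {".vscode"})]

print(visited)
```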

Modifying dirnames when topdown is False is ineffective because in bottom-up mode, the directories in dirnames are generated before dirpath itself is generated.
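Bottom-up order is useful when a directory can only be processed after its children. A minimal sketch, assuming every file is readable and no file is a broken symlink: tree_sizes (a hypothetical helper) totals the bytes under each directory, reusing the totals already computed for its subdirectories.

```python
import os
import tempfile


def tree_sizes(top):
    """Map each directory under top to the bytes stored in it and below it."""
    totals = {}
    for dirpath, dirnames, filenames in os.walk(top, topdown=False):
        size = sum(os.path.getsize(os.path.join(dirpath, f)) for f in filenames)
        # Subdirectories were yielded earlier, so their totals already exist.
        # Symlinked directories are not walked by default and count as 0.
        size += sum(totals.get(os.path.join(dirpath, d), 0) for d in dirnames)
        totals[dirpath] = size
    return totals


with tempfile.TemporaryDirectory() as tmp:
    sub = os.path.join(tmp, "sub")
    os.mkdir(sub)
    with open(os.path.join(tmp, "a.txt"), "wb") as fh:
        fh.write(b"abc")      # 3 bytes in the top directory
    with open(os.path.join(sub, "b.txt"), "wb") as fh:
        fh.write(b"hello")    # 5 bytes in the subdirectory
    sizes = tree_sizes(tmp)
    top_total, sub_total = sizes[tmp], sizes[sub]

print(top_total, sub_total)
```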

If you set topdown to False, the same tree is reported in the opposite order, as the following example shows.

import os
if __name__ == "__main__":
    for (root, dirs, files) in os.walk('/Users/krunal/Desktop/code/pyt/database', topdown=False):
        print("The root is: ")
        print(root)
        print("The directories are: ")
        print(dirs)
        print("The files are: ")
        print(files)
        print('--------------------------------')

Output

The root is:
/Users/krunal/Desktop/code/pyt/database/.vscode
The directories are:
[]
The files are:
['settings.json']
--------------------------------
The root is:
/Users/krunal/Desktop/code/pyt/database
The directories are:
['.vscode']
The files are:
['shows.csv', 'Netflix.csv', 'marketing.csv', 'new_file.json', 'data.json', 
'Netflix', 'shows.db', 'app.py', 'purchase.csv', 'final.zip', 'sales.csv']
--------------------------------

As the output shows, with topdown set to False the tuple for a subdirectory is generated before the tuple for its parent directory. By default, errors raised while listing a directory are ignored. If the optional parameter onerror is specified, it should be a function; it will be called with one argument, an OSError instance.

The function can report the error and let the walk continue, or raise the exception to abort the walk. Note that the path that caused the error is available as the filename attribute of the exception object. That is it for the Python os.walk() function tutorial.
