Pandas DataFrame filter() Method

The Pandas DataFrame filter() method subsets the rows or columns of a DataFrame according to their labels. It looks only at index and column names, never at the values stored in the DataFrame.

Syntax

DataFrame.filter(items=None, like=None, regex=None, axis=None)

Parameters

Name Description
items A list of labels to keep. Only rows or columns with exactly these labels are returned; labels that do not exist on the chosen axis are ignored.
like A string. Keeps rows or columns whose label contains this string as a substring.
regex A regular expression string. Keeps rows or columns whose label contains a match for the pattern.
axis The axis to filter on: the index (0 or 'index') or the columns (1 or 'columns'). When it is None, a DataFrame is filtered by columns.
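
In pandas, items, like, and regex are mutually exclusive, so each call should pass exactly one of them. The following is a minimal sketch of what happens otherwise, assuming a recent pandas release; the small DataFrame is illustrative only.

```python
import pandas as pd

# Illustrative data, not taken from the examples in this article.
df = pd.DataFrame({'A1': [1, 2], 'B1': [3, 4], 'C2': [5, 6]})

# Exactly one selector: works. axis is omitted, so columns are filtered.
one_selector = df.filter(like='1')

# Two selectors in the same call are rejected with a TypeError.
try:
    df.filter(items=['A1'], like='1')
    combined_error = None
except TypeError as exc:
    combined_error = exc

# No selector at all is rejected as well.
try:
    df.filter()
    empty_error = None
except TypeError as exc:
    empty_error = exc

print(list(one_selector.columns))          # ['A1', 'B1']
print(type(combined_error).__name__)       # TypeError
print(type(empty_error).__name__)          # TypeError
```

To combine criteria, one option is to chain calls, for example df.filter(like='1').filter(regex='^A').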

Return value

This method returns a new DataFrame containing only the rows or columns whose labels match the filtering criteria. The original DataFrame is not modified.

Important points

  1. Filtering axis: You can filter by rows (axis=0) or by columns (axis=1). For a DataFrame, columns are the default.
  2. One criterion per call: items, like, and regex are mutually exclusive; passing more than one of them raises a TypeError.
  3. Regex support: regex keeps every label in which the pattern is found anywhere, so use ^ and $ when the whole label must match.
  4. Labels only: The method never inspects the values in the DataFrame. To filter on contents, use boolean indexing instead.
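
The difference between selecting by label and selecting by value can be sketched as follows. This is only an illustration: the boolean-indexing line is standard pandas and is shown for contrast, it is not part of filter().

```python
import pandas as pd

df = pd.DataFrame({'A1': range(1, 6), 'B1': range(11, 16), 'C2': range(21, 26)})

# filter() matches on labels: keep the column named 'A1'.
by_label = df.filter(items=['A1'])

# Boolean indexing matches on values: keep rows where A1 is greater than 3.
by_value = df[df['A1'] > 3]

print(by_label.shape)   # (5, 1)
print(by_value.shape)   # (2, 3)
print(df.shape)         # (5, 3), the original is unchanged
```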

Example 1: Filtering columns by name

import pandas as pd

df = pd.DataFrame({'A1': range(1, 6), 'B1': range(11, 16), 'C2': range(21, 26)})

filtered_df = df.filter(items=['A1', 'C2'])

print(filtered_df)

Output

   A1  C2
0   1  21
1   2  22
2   3  23
3   4  24
4   5  25

In this code, we created a new DataFrame filtered_df by selecting only the columns 'A1' and 'C2' from the original DataFrame df.
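
One detail worth knowing about items is that labels missing from the DataFrame are skipped silently rather than raising an error. A small sketch, where 'Z9' is a hypothetical label chosen because it does not exist in df:

```python
import pandas as pd

df = pd.DataFrame({'A1': range(1, 6), 'B1': range(11, 16), 'C2': range(21, 26)})

# 'Z9' is a hypothetical label that is not a column of df; filter() ignores it.
subset = df.filter(items=['A1', 'Z9'])
print(list(subset.columns))   # ['A1']

# Plain bracket selection is stricter: the same list raises a KeyError.
try:
    df[['A1', 'Z9']]
    strict_failed = False
except KeyError:
    strict_failed = True
print(strict_failed)          # True
```

This makes filter() convenient when a column may or may not be present, but it also means a misspelled label goes unnoticed.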

Example 2: Filtering columns using “like”

import pandas as pd

df = pd.DataFrame({'A1': range(1, 6), 'B1': range(11, 16), 'C2': range(21, 26)})

filtered_df = df.filter(like='1')

print(filtered_df)

Output

   A1  B1
0   1  11
1   2  12
2   3  13
3   4  14
4   5  15

In this code, we filtered columns from the DataFrame df to create filtered_df, which includes only those columns whose names contain the substring '1', namely A1 and B1.
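
The like parameter is a plain, case-sensitive substring test on each label. The sketch below illustrates this with made-up column names (price_usd, Price_eur, and qty are assumptions for the example, not part of the original data):

```python
import pandas as pd

# Hypothetical column names, used only to show case sensitivity.
df = pd.DataFrame({'price_usd': [1.0], 'Price_eur': [2.0], 'qty': [3]})

# Lowercase 'price' does not match 'Price_eur'.
lower = df.filter(like='price')
print(list(lower.columns))      # ['price_usd']

# The substring may appear anywhere in the label.
anywhere = df.filter(like='_')
print(list(anywhere.columns))   # ['price_usd', 'Price_eur']
```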

Example 3: Filtering columns with regular expressions

import pandas as pd

df = pd.DataFrame({'A1': range(1, 6), 'B1': range(11, 16), 'C2': range(21, 26)})

filtered_df = df.filter(regex='[A-C]1')

print(filtered_df)

Output

   A1  B1
0   1  11
1   2  12
2   3  13
3   4  14
4   5  15

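We created filtered_df by keeping the columns of df whose names contain a match for the regular expression [A-C]1, that is, an 'A', 'B', or 'C' immediately followed by '1'. Here that selects A1 and B1. The pattern is searched for anywhere in the name, so it is not anchored to the start or end of the label.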
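
Because the pattern is searched for anywhere in the label, an unanchored regex can match more than intended. The sketch below adds a hypothetical column 'XA10' to show the effect of the ^ and $ anchors:

```python
import pandas as pd

# 'XA10' is a hypothetical extra column that merely contains 'A1'.
df = pd.DataFrame({'A1': [1], 'B1': [2], 'C2': [3], 'XA10': [4]})

# Unanchored: any label containing A1, B1, or C1 somewhere is kept.
unanchored = df.filter(regex='[A-C]1')
print(list(unanchored.columns))   # ['A1', 'B1', 'XA10']

# Anchored: the whole label must be one letter A-C followed by 1.
anchored = df.filter(regex='^[A-C]1$')
print(list(anchored.columns))     # ['A1', 'B1']

# Anchoring only the end keeps labels that finish with '2'.
ends_with_2 = df.filter(regex='2$')
print(list(ends_with_2.columns))  # ['C2']
```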

Example 4: Filtering rows by index labels

import pandas as pd

df = pd.DataFrame({'A1': [1, 2, 3], 'B1': [4, 5, 6]},
                  index=['row1', 'row2', 'row3'])

filtered_df = df.filter(items=['row1', 'row3'], axis=0)

print(filtered_df)

Output

      A1  B1
row1   1   4
row3   3   6

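We created filtered_df by selecting rows 'row1' and 'row3' from the DataFrame df. Passing axis=0 tells filter() to match the items against the index labels instead of the column names.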
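
The like and regex parameters work on rows in the same way once axis=0 (or axis='index') is given. A brief sketch reusing the DataFrame from this example:

```python
import pandas as pd

df = pd.DataFrame({'A1': [1, 2, 3], 'B1': [4, 5, 6]},
                  index=['row1', 'row2', 'row3'])

# Keep rows whose index label contains the substring '3'.
by_substring = df.filter(like='3', axis=0)
print(list(by_substring.index))   # ['row3']

# Keep rows whose index label ends in 1 or 2.
by_pattern = df.filter(regex='[12]$', axis='index')
print(list(by_pattern.index))     # ['row1', 'row2']
```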
