Pandas isnull() and notnull() Methods

Large, real-world datasets often contain null, NaN, or otherwise missing values, and you need to handle them before you can build an accurate machine-learning model. The first step is to find those values, and Pandas provides the isnull() method for that.

Pandas DataFrame isnull()

The Pandas DataFrame isnull() method detects missing values in a DataFrame. It is an alias of the isna() method.

Syntax

DataFrame.isnull()

Parameters

None.

Return value

The isnull() method returns a DataFrame of Boolean values with the same shape as the original. Each cell is True where the original value is missing (None or NaN) and False otherwise.
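To make the return value concrete, here is a minimal sketch on a small hand-made DataFrame. The column names and values are hypothetical and are not taken from books.csv. Note that an empty string is an ordinary value, so it is not flagged as missing.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only to illustrate the return value
df = pd.DataFrame({
    "title": ["Dune", None, "Emma"],
    "rating": [4.2, np.nan, 3.9],
    "notes": ["", "reprint", None],  # "" is an empty string, not a missing value
})

# Boolean DataFrame with the same shape as df: True where a value is missing
mask = df.isnull()
print(mask)
```

Only the None and NaN cells come back as True; the empty string in the notes column stays False.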

Pictorial Representation

[Image: visualization of the Pandas isnull() method]

Example

We are using Kaggle's books.csv dataset. To load the dataset in Python, use the pandas.read_csv() function.

import pandas as pd

data = pd.read_csv('./DataSets/books.csv')

print(data.head())

Output

[Image: first rows of Kaggle's books.csv dataset]

To check for missing values, we will now use the isnull() method.

missing_values = data.isnull()

print(missing_values)

It will return this output.

[Image: output of the isnull() method on the books.csv dataset]

You can see that the isnull() method detected the null value and marked it as True. Non-null values are marked as False.

To count the null values in each column, chain isnull() with the .sum() method. Each True counts as 1, so the result is the number of missing values per column.

missing_values = data.isnull().sum()

print(missing_values)

Output

[Image: count of null values in each column]
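The same chaining idea extends to a few related summaries. The sketch below uses a hypothetical toy DataFrame rather than books.csv and shows the per-column count, the overall total, the fraction of missing values per column, and the rows that contain at least one missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only to illustrate the counting patterns
df = pd.DataFrame({
    "title": ["Dune", None, "Emma", "Ulysses"],
    "rating": [4.2, np.nan, 3.9, np.nan],
})

# Number of missing values in each column
per_column = df.isnull().sum()

# Total number of missing values in the whole DataFrame
total = int(per_column.sum())

# Fraction of missing values in each column (mean of the Boolean mask)
share = df.isnull().mean()

# Rows that contain at least one missing value
rows_with_missing = df[df.isnull().any(axis=1)]

print(per_column)
print(total)
print(share)
print(rows_with_missing)
```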

The next step is usually either to replace the null values, for example with the column mean for numeric columns, or to remove the rows that contain them. Which option is appropriate depends on the context of your project.
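Both options can be sketched with fillna() and dropna(). The DataFrame below is hypothetical toy data, and filling with the mean is only one possible strategy; it applies to numeric columns only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only to illustrate the two options
df = pd.DataFrame({
    "title": ["Dune", "Emma", None, "Ulysses"],
    "rating": [4.0, np.nan, 3.0, 5.0],
})

# Option 1: replace missing ratings with the column mean.
# mean() skips NaN by default, so it averages 4.0, 3.0 and 5.0.
filled = df.copy()
filled["rating"] = filled["rating"].fillna(filled["rating"].mean())

# Option 2: drop every row that has at least one missing value
dropped = df.dropna()

print(filled)
print(dropped)
```

In this sketch the missing title is left untouched by option 1, because a mean has no meaning for a text column. Option 2 removes that row along with the row that has a missing rating.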

Pandas DataFrame notnull()

The Pandas DataFrame notnull() method detects non-missing values in a DataFrame. It is the inverse of the isnull() method and an alias of the notna() method.

Syntax

DataFrame.notnull()

Parameters

None.

Return value

It returns a DataFrame of Boolean values with the same shape as the original: False for NaN or null values and True for non-null values.
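A short sketch, again on hypothetical toy data, shows that notnull() is the element-wise negation of isnull(), and that the Boolean result of a single column can be used to filter rows.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only to illustrate notnull()
df = pd.DataFrame({
    "title": ["Dune", "Emma", None],
    "rating": [4.2, np.nan, 3.9],
})

# True where a value is present, False where it is missing
present = df.notnull()

# notnull() is the element-wise negation of isnull()
same_as_negation = present.equals(~df.isnull())

# Keep only the rows that have a rating
rated = df[df["rating"].notnull()]

print(present)
print(same_as_negation)
print(rated)
```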

Visualization

[Image: visualization of the Pandas notnull() method]

Example

We will use the same books.csv dataset.

To check for non-missing values in Pandas DataFrame, we will now use the notnull() method.

import pandas as pd

data = pd.read_csv('./DataSets/books.csv')

non_missing_values = data.notnull()

print(non_missing_values)

Output

[Image: output of the notnull() method on the books.csv dataset]

You can see that it returns True for all the non-null values and False for null or NaN values.

You can chain the .sum() method to count the non-null values in each column.

non_missing_values = data.notnull().sum()

print(non_missing_values)

It gives this output:

[Image: output of data.notnull().sum()]
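As a cross-check, the per-column result of notnull().sum() should match DataFrame.count(), which also counts non-missing cells. Adding the isnull() counts should give the number of rows. The sketch below uses hypothetical toy data rather than books.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only to illustrate the relationship
df = pd.DataFrame({
    "title": ["Dune", None, "Emma", "Ulysses"],
    "rating": [4.2, np.nan, 3.9, np.nan],
})

non_missing = df.notnull().sum()  # non-null values per column
missing = df.isnull().sum()       # null values per column
counts = df.count()               # built-in count of non-null values per column

print(non_missing)
print(counts)
print(non_missing + missing)      # every column adds up to len(df)
```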

I hope this tutorial has helped you understand Pandas’s isnull() and notnull() methods.
