Numpy.nan (np.nan) in Python

numpy.nan (np.nan) is a special floating-point constant that holds the IEEE 754 "Not a Number" (NaN) value. It is commonly used to mark a missing or undefined numerical value in an array, and its type is the built-in Python float.

Syntax

import numpy as np

np.nan

Scalar np.nan

import numpy as np

nan_value = np.nan

print(nan_value)
# Output: nan

print(f"Type: {type(nan_value)}")
# Output: Type: <class 'float'>

print(f"Data type: {np.dtype(type(nan_value))}")
# Output: Data type: float64

NumPy array with a NaN value

Let’s create an array that contains a NaN value.

import numpy as np

array_with_nan = np.array([1, 0, np.nan, 3])

print(array_with_nan)

# Output: [ 1.  0. nan  3.]

The key takeaway is that all the integers have been converted to floats. NaN is a floating-point value, so NumPy promotes the whole array to a floating-point dtype (float64 here) in order to store it.
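To make the promotion concrete, here is a minimal sketch comparing the dtype of an integer array with and without NaN, and showing what happens if you try to store NaN in an integer array. The variable names are illustrative only.

```python
import numpy as np

int_array = np.array([1, 0, 3])
float_array = np.array([1, 0, np.nan, 3])

print(int_array.dtype.kind)   # 'i' means a signed integer dtype
print(float_array.dtype)      # float64

# An integer array has no way to represent NaN, so the assignment fails.
try:
    int_array[0] = np.nan
    assignment_failed = False
except ValueError as error:
    assignment_failed = True
    print(f"ValueError: {error}")

# Converting to float first makes room for NaN.
converted = int_array.astype(float)
converted[0] = np.nan
print(converted)
```

If an array must be able to hold NaN later, creating it as float from the start avoids this error.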

Comparing np.nan values

You can compare two NaN values with the double equal (==) operator, but the result may surprise you.

import numpy as np

print(np.nan == np.nan)

# Output: False

You can see that two NaN values never compare equal. Under IEEE 754, NaN is unordered: it is not equal to, less than, or greater than any value, including itself. Intuitively, since neither value is known, there is no way to say they are the same.

Here are some other comparisons:

import numpy as np

print(np.nan != np.nan)
# Output: True

print(np.nan < 1)
# Output: False

print(np.nan > 1)
# Output: False

print(np.nan <= np.nan)
# Output: False

print(np.nan >= np.nan)
# Output: False

Checking for NaN values

To check for NaN values, use the np.isnan() function. It works on scalars and, element-wise, on arrays.

np.isnan() returns True for a NaN value and False otherwise.

import numpy as np

value = np.nan
value2 = 1

print(np.isnan(value))
# Output: True

print(np.isnan(value2))
# Output: False

For any value other than NaN, it returns False, as shown in the above program.
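On an array, np.isnan() returns a boolean mask, which is handy for counting or filtering NaN values. Here is a small sketch; the readings array is hypothetical data.

```python
import numpy as np

readings = np.array([2.5, np.nan, 7.0, np.nan, 1.5])  # hypothetical data

# Element-wise check: True where the value is NaN.
nan_mask = np.isnan(readings)
print(nan_mask)
# Output: [False  True False  True False]

# Count the NaN values and check whether any exist.
nan_count = int(nan_mask.sum())
has_nan = bool(nan_mask.any())
print(nan_count, has_nan)
# Output: 2 True

# Keep only the valid (non-NaN) values by inverting the mask.
valid_readings = readings[~nan_mask]
print(valid_readings.tolist())
# Output: [2.5, 7.0, 1.5]
```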

Replacing NaN Values

Using the np.nan_to_num() function, we can replace NaN values with a value of our choice, passed through the nan parameter.

import numpy as np

arr = np.array([9, 11, np.nan, 19, np.nan])

print("Original array:", arr)
# Output: Original array: [ 9. 11. nan 19. nan]

clean_array = np.nan_to_num(arr, nan=0.0)

print("Array after replacing NaN with 0.0:", clean_array)
# Output: Array after replacing NaN with 0.0: [ 9. 11.  0. 19.  0.]

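In this code, we replaced NaN values with 0.0. That is a reasonable choice before an addition, because 0.0 does not change the sum and keeps the result from being NaN. It can distort other statistics such as the mean, so choose the replacement value with the later computation in mind. Note that np.nan_to_num() also replaces positive and negative infinity with very large finite numbers by default.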
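Zero is not the only possible replacement. As one alternative, the sketch below fills NaN values with the mean of the valid values, using np.nanmean() and np.where(). This is just one imputation strategy among many, not a recommendation for every dataset.

```python
import numpy as np

arr = np.array([9, 11, np.nan, 19, np.nan])

# Mean of the non-NaN values: (9 + 11 + 19) / 3
fill_value = np.nanmean(arr)
print(fill_value)
# Output: 13.0

# Where the mask is True take fill_value, otherwise keep the original value.
filled = np.where(np.isnan(arr), fill_value, arr)
print(filled.tolist())
# Output: [9.0, 11.0, 13.0, 19.0, 13.0]
```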

NaN Propagation in Arithmetic Operations

What happens when an operation reduces a whole array to a single value, such as a sum or a mean, and the input contains NaN? If any element is NaN, NumPy's standard reductions (sum, mean, max, min) return NaN, because any arithmetic involving NaN yields NaN.

This behavior is called propagation: a NaN anywhere in the input carries through to the output.

import numpy as np

# Arithmetic operations with NaN
arr = np.array([11, 12, np.nan, 14, 15])

print(f"Sum: {np.sum(arr)}")         
# Output: Sum: nan

print(f"Mean: {np.mean(arr)}")       
# Output: Mean: nan

print(f"Max: {np.max(arr)}")         
# Output: Max: nan

print(f"Min: {np.min(arr)}")         
# Output: Min: nan
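If you would rather skip the NaN values than let them propagate, NumPy provides NaN-aware counterparts of these reductions. A minimal sketch on the same array:

```python
import numpy as np

arr = np.array([11, 12, np.nan, 14, 15])

# Each nan* function ignores NaN entries and reduces over the rest.
nan_sum = np.nansum(arr)
nan_mean = np.nanmean(arr)
nan_max = np.nanmax(arr)
nan_min = np.nanmin(arr)

print(f"Sum ignoring NaN: {nan_sum}")
# Output: Sum ignoring NaN: 52.0

print(f"Mean ignoring NaN: {nan_mean}")
# Output: Mean ignoring NaN: 13.0

print(f"Max ignoring NaN: {nan_max}")
# Output: Max ignoring NaN: 15.0

print(f"Min ignoring NaN: {nan_min}")
# Output: Min ignoring NaN: 11.0
```

Note that the mean is computed over the four valid values only, so it differs from what you would get after replacing NaN with 0.0.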

What about element-wise operations? In that case, only the positions that hold NaN produce NaN; every other element is computed normally.

import numpy as np

arr = np.array([11, 21, np.nan, 41, 51])

# Element-wise operations
print(f"Array + 10: {arr + 10}")
# Output: Array + 10: [21. 31. nan 51. 61.]

print(f"Array * 2: {arr * 2}")
# Output: Array * 2: [ 22.  42.  nan  82. 102.]

You can see in the above code that, in the case of addition, every element of the input array is incremented by 10 except the NaN, which stays NaN because any arithmetic with NaN yields NaN.

The same thing happens with multiplication.

NaN in a Pandas DataFrame

You can check for NaN values in a pandas DataFrame with isnull(), which is an alias of isna(). The output is a boolean mask with the same shape as the original DataFrame.

import numpy as np
import pandas as pd

df = pd.DataFrame([(1.0, np.nan, -1.0, 21.0),
                   (np.nan, 12.0, np.nan, 11),
                   (21.0, 15.0, np.nan, 91.0),
                   (np.nan, 14.0, -31.0, 19.0)],
                  columns=list('abcd'))

print(df)

We have created a DataFrame that contains NaN values.

Now, let's write some code to detect them:

import numpy as np
import pandas as pd

df = pd.DataFrame([(1.0, np.nan, -1.0, 21.0),
                   (np.nan, 12.0, np.nan, 11),
                   (21.0, 15.0, np.nan, 91.0),
                   (np.nan, 14.0, -31.0, 19.0)],
                  columns=list('abcd'))

print(pd.isnull(df))
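Once the NaN values are detected, the usual next steps are counting, filling, or dropping them. The sketch below uses the same DataFrame with isna(), fillna(), and dropna(); the choice of 0.0 as the fill value and of column 'b' as the subset is arbitrary and only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([(1.0, np.nan, -1.0, 21.0),
                   (np.nan, 12.0, np.nan, 11),
                   (21.0, 15.0, np.nan, 91.0),
                   (np.nan, 14.0, -31.0, 19.0)],
                  columns=list('abcd'))

# Count NaN values per column: a has 2, b has 1, c has 2, d has 0.
nan_counts = df.isna().sum()
print(nan_counts)

# Replace every NaN with 0.0 (returns a new DataFrame).
filled = df.fillna(0.0)
print(filled)

# Drop only the rows where column 'b' is NaN (3 of the 4 rows remain).
rows_with_b = df.dropna(subset=['b'])
print(rows_with_b)
```

Calling dropna() without arguments would remove every row here, because each row contains at least one NaN.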

That’s all!

Krunal Lathiya

With a career spanning over eight years in the field of Computer Science, Krunal’s expertise is rooted in a solid foundation of hands-on experience, complemented by a continuous pursuit of knowledge.