np.where: What is Numpy where() Function in Python

The np.where() method returns elements chosen from x or y depending on the condition. The np.where() function accepts a conditional expression as an argument and returns a new numpy array.

To select the elements based on condition, use the np.where() function.

With np.where(), you can replace the elements of a NumPy array (ndarray) that satisfy a condition, or apply a specified operation to them. You need NumPy installed to follow this tutorial (for example, with pip install numpy).

You can also check your numpy version.
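As a quick check that NumPy is importable, the short snippet below prints the installed version string. The exact value depends on your environment.

```python
import numpy as np

# Print the installed NumPy version string (the value varies by environment)
print(np.__version__)
```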

Syntax

numpy.where(condition[, x, y])

Parameters

condition: A boolean array, or a conditional expression that evaluates to one (for example, arr > 0).
x, y: Arrays or scalars to choose values from (optional; pass either both or neither).

  • If all three arguments (condition, x, and y) are given, numpy.where() returns elements selected from x and y depending on the values in the boolean array yielded by the condition. The three arguments must have the same shape or be broadcastable to a common shape.
  • If x & y arguments are not passed, and only the condition argument is passed, then it returns a tuple of arrays (one for each axis) containing the indices of the elements that are True in the bool numpy array returned by the condition.

This says that if the condition returns True for some element in our array, the new array will choose items from x.

Otherwise, if it’s False, items from y will be taken.

With that, our final output array will be an array with items from x wherever condition = True and items from y whenever condition = False.

One thing to note here is that although x and y are optional, if you specify x, you MUST also specify y. Every position in the output is filled from either x or y, so both sources are needed; passing only one of them raises an error.
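The condition-only form described in the parameter list can be sketched as follows. The array values and the names rows and cols are illustrative choices, not part of the original tutorial.

```python
import numpy as np

# Illustrative 2D array (values chosen for this sketch)
arr = np.array([[1, 5, 2],
                [7, 0, 9]])

# With only a condition, np.where returns one index array per axis
rows, cols = np.where(arr > 4)
print(rows)  # [0 1 1]
print(cols)  # [1 0 2]

# The index arrays can be used together to pull out the matching values
print(arr[rows, cols])  # [5 7 9]
```

Each (rows[i], cols[i]) pair is the position of one element for which the condition is True.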

Return Value

When x and y are given, where() returns a new NumPy array with elements taken from x where the condition is True and from y elsewhere. When only the condition is given, it returns a tuple of index arrays, one for each axis.

Example

import numpy as np

data = np.where([True, False, True], [11, 21, 46], [19, 29, 18])
print(data)

Output

[11 29 46]

Conceptually, numpy.where() walks over the bool array, and for every True, it yields the corresponding element from array x, and for every False, it yields the corresponding element from array y. So, it returns an array of items from x where the condition is True and elements from y elsewhere.
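For one-dimensional inputs, the element-by-element selection described above behaves like a plain Python list comprehension. The sketch below reuses the values from the example; the name manual is introduced here for illustration only.

```python
import numpy as np

cond = [True, False, True]
x = [11, 21, 46]
y = [19, 29, 18]

# Pure-Python equivalent for 1D inputs: pick from x on True, from y on False
manual = [xv if c else yv for c, xv, yv in zip(cond, x, y)]
result = np.where(cond, x, y)

print(manual)           # [11, 29, 46]
print(result.tolist())  # [11, 29, 46]
```

NumPy performs this selection in vectorized code rather than a Python loop, so the comprehension is only a mental model.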

The condition can be a boolean array such as array([[True, True, True]]). (Non-boolean values work too: NumPy treats nonzero values as True and zero as False.)

Let’s take another example. If the condition is the one-dimensional array([True, True, False]) and our array is a = np.array([[1, 2, 3]]), then applying the condition as a column mask (a[:, condition]) gives array([[1, 2]]).
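The mask indexing described above can be tried directly. This is a minimal sketch that assumes a one-dimensional boolean mask whose length matches the number of columns; the names mask and selected are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3]])

# 1D boolean mask with one entry per column (assumption for this sketch)
mask = np.array([True, True, False])

# Keep only the columns where the mask is True
selected = a[:, mask]
print(selected)        # [[1 2]]
print(selected.shape)  # (1, 2)
```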

Replacing Elements with numpy.where()

We will use the np.random.randn() function to generate a two-dimensional array, keep its positive elements, and replace the rest with 0. See the code.

import numpy as np

# Random initialization of (2D array)
arr = np.random.randn(2, 3)
print(arr)

# result keeps the elements of arr wherever the condition holds (i.e., only positive elements)
# Otherwise, the element is set to 0
result = np.where(arr > 0, arr, 0)

print(result)

Output

[[-1.49929393  0.68739761 -0.59852165]
 [ 0.59212319  1.81549763 -0.32777714]]
[[0.         0.68739761 0.        ]
 [0.59212319 1.81549763 0.        ]]

From the output, you can see that the negative elements have been replaced with 0, while the positive elements are unchanged.

Using np where() with Multiple conditions

To combine multiple conditions, enclose each conditional expression in () and join them with & (and) or | (or).

import numpy as np

# Random initialization of a (2D array)
arr = np.random.randn(2, 3)
print(arr)

result = np.where((arr > 0.1) & (arr < 1) | (arr == 0.5), -1, 19)

print(result)

Output

[[-0.51877986  2.29435425  0.76549418]
 [-0.94666634  1.74349695 -0.82869105]]
[[19 19 -1]
 [19 19 19]]

You can see from the output that we have combined three conditions with the & (and) and | (or) operators. If an element is greater than 0.1 and less than 1, or equal to 0.5, the result is -1; otherwise, it is 19. Because & binds more tightly than |, the expression is evaluated as ((arr > 0.1) & (arr < 1)) | (arr == 0.5).

Even with multiple conditions, you do not need np.where() to obtain the boolean ndarray itself; the combined expression already evaluates to one.
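As a small illustration of that point, the combined expression can be stored and used as a mask on its own. The array values and the name mask are chosen for this sketch.

```python
import numpy as np

# Fixed values chosen for this sketch so the result is reproducible
arr = np.array([[-0.5, 2.3, 0.8],
                [-0.9, 1.7, 0.3]])

# The combined comparison is already a boolean ndarray
mask = (arr > 0.1) & (arr < 1)
print(mask.dtype)     # bool
print(mask.tolist())  # [[False, False, True], [False, False, True]]

# It can index the array directly, without np.where()
print(arr[mask].tolist())  # [0.8, 0.3]
```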

Processing the elements that satisfy the condition

Instead of the original ndarray, you can pass an expression as x or y, so that an operation is applied to the elements that satisfy the condition.

import numpy as np

# Random initialization of a (2D array)
arr = np.random.randn(2, 3)
print(arr)

result = np.where(arr < 1, arr * 10, 19)
print('\n')
print('The array after performing the operation')
print(result)

Output

[[ 0.4934594  -0.43502907 -0.01968412]
 [-0.52953907  0.41415299 -0.29620816]]


The array after performing the operation
[[ 4.93459402 -4.35029075 -0.19684123]
 [-5.2953907   4.14152986 -2.96208157]]

You can see that it multiplies every element that is less than 1 by 10 and puts 19 in every other position. In this sample, every element happens to be less than 1, so 19 does not appear in the output.
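A fixed input makes both branches of the same expression visible. The values below are chosen for this sketch and are not from the original run.

```python
import numpy as np

# Fixed values chosen so that some elements are >= 1
arr = np.array([0.5, 1.5, -2.0, 3.0])

# Elements below 1 are multiplied by 10; the rest become 19
result = np.where(arr < 1, arr * 10, 19)
print(result.tolist())  # [5.0, 19.0, -20.0, 19.0]
```

The result is a float array because arr * 10 is a float array, so the integer 19 is stored as 19.0.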

Broadcasting with numpy.where()

If we provide condition, x, and y arrays of different but compatible shapes, NumPy broadcasts them together.

import numpy as np

arr = np.arange(6).reshape(2, 3)

brr = np.arange(3).reshape(1, 3)

print(arr)
print(brr)

# Broadcasts arr < 2, arr, and brr * 5,
# of shapes (2, 3), (2, 3), and (1, 3)
c = np.where(arr < 2, arr, brr * 5)
print('After broadcasting:')
print(c)

Output

[[0 1 2]
 [3 4 5]]
[[0 1 2]]
After broadcasting:
[[ 0  1 10]
 [ 0  5 10]]
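To see what broadcasting does to the (1, 3) array in this example, one option is to inspect the common shape with np.broadcast and expand the array explicitly with np.broadcast_to. This is a sketch for inspection only; np.where() does the equivalent step internally without needing these calls.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)
brr = np.arange(3).reshape(1, 3)

cond = arr < 2
y = brr * 5

# np.broadcast reports the common shape of its inputs
common_shape = np.broadcast(cond, arr, y).shape
print(common_shape)  # (2, 3)

# np.broadcast_to shows the (1, 3) array repeated along the first axis
y_expanded = np.broadcast_to(y, common_shape)
print(y_expanded.tolist())  # [[0, 5, 10], [0, 5, 10]]

# Using the expanded array gives the same result as letting np.where broadcast
print(np.where(cond, arr, y_expanded).tolist())  # [[0, 1, 10], [0, 5, 10]]
```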

That’s it.
