How to Convert a Python Set to a NumPy Array

For large sets, the most efficient way to convert a set to a NumPy array is np.fromiter(), which reads the elements straight from the set without building an intermediate list, reducing memory overhead.

The alternative is np.array(list(my_set)). It is simpler to read, but it builds an intermediate list first, which makes it less memory-efficient for large sets.

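NumPy arrays store elements of a single data type in one contiguous block, so they are usually more memory-efficient than Python’s built-in containers. Because a set cannot contain duplicates, the resulting array contains only unique elements. A set is also unordered, so the order of elements in the array is arbitrary unless you sort them.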
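As a rough illustration of the memory and uniqueness points above, the sketch below compares the size of a set's own hash table with the size of the equivalent array's data buffer. The sample data and variable names are made up for this example, and sys.getsizeof() counts only the set container, not the integer objects it refers to, so the real gap is larger than shown.

```python
import sys
import numpy as np

# Hypothetical sample data: 1000 distinct integers
values = set(range(1000))
arr = np.fromiter(values, dtype=np.int64, count=len(values))

# Size of the set's hash table only (the int objects are not included)
set_container_bytes = sys.getsizeof(values)
# Size of the array's data buffer: 1000 elements * 8 bytes each
array_bytes = arr.nbytes

print(array_bytes)                        # 8000
print(array_bytes < set_container_bytes)  # True in CPython
# Every element of the array is unique, just like in the set
print(np.unique(arr).size == arr.size)    # True
```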

Method 1: Using np.fromiter()

The np.fromiter() function accepts an iterable (here, the set) and a required dtype argument, and returns a 1D NumPy array.

import numpy as np

my_set = set(range(1000000))

# Printing the set itself would print all one million elements
print(len(my_set))
# Output: 1000000

print(type(my_set))
# Output: <class 'set'>

array = np.fromiter(my_set, dtype=np.int32)

print(array)
# Output: [     0      1      2 ... 999997 999998 999999]

print(type(array))
# Output: <class 'numpy.ndarray'>
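Because np.fromiter() reads from any iterable, it can also consume a generator that transforms the set's elements on the fly, again without an intermediate list. The following is a minimal sketch with a small, made-up set; the result is sorted only so that the printed output does not depend on set iteration order.

```python
import numpy as np

my_set = {1, 2, 3, 4, 5}

# The generator yields one squared value at a time; no list is built
squares = np.fromiter((x * x for x in my_set), dtype=np.int64, count=len(my_set))

# Sort before printing so the output is independent of set order
print(np.sort(squares))
# Output: [ 1  4  9 16 25]
```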

Pre-allocation

The count argument of np.fromiter() specifies how many elements to read from the iterable. When it is given, NumPy can allocate the output array once up front instead of growing it as elements arrive. Passing count=len(my_set) reads every element of the set.

import numpy as np

my_set = {11, 24, 43, 33, 21, 19}

print(my_set)
# Output: {33, 19, 21, 24, 43, 11}

print(type(my_set))
# Output: <class 'set'>

array = np.fromiter(my_set, dtype=np.float64, count=len(my_set))

print(array)
# Output: [33. 19. 21. 24. 43. 11.]

print(type(array))
# Output: <class 'numpy.ndarray'>
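To make the role of count more concrete, here is a small sketch of what happens when it does not match the size of the set. A smaller count reads only that many elements (which ones depends on the set's arbitrary order), and a count larger than the set raises a ValueError. The variable names are chosen for this illustration.

```python
import numpy as np

my_set = {11, 24, 43, 33, 21, 19}

# A smaller count stops after three elements
first_three = np.fromiter(my_set, dtype=np.float64, count=3)
print(first_three.size)
# Output: 3

# A count larger than the set cannot be satisfied
try:
    np.fromiter(my_set, dtype=np.float64, count=len(my_set) + 1)
    too_short_raised = False
except ValueError:
    too_short_raised = True

print(too_short_raised)
# Output: True
```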

Method 2: Using numpy.array()

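The numpy.array() function creates an array from a sequence such as a list. A set is not a sequence, so if you pass the set to np.array() directly, NumPy does not unpack it: you get a 0-dimensional array of dtype object that wraps the whole set.

To get a 1D array of the elements, convert the set to a list with list() first, then pass the list to numpy.array(). The data type (dtype) is inferred from the elements, but you can set it explicitly. Note that list() does not impose any particular order; the elements appear in the set’s arbitrary iteration order.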

import numpy as np

my_set = {1, 2, 3, 4}

print(my_set)
# Output: {1, 2, 3, 4}

print(type(my_set))
# Output: <class 'set'>

# Convert the set to a list first so NumPy sees a sequence of elements
array = np.array(list(my_set))

print(array)
# Output: [1 2 3 4]

print(type(array))
# Output: <class 'numpy.ndarray'>
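The difference between passing the set itself and passing a list can be checked by inspecting the shape and dtype of each result. This is a minimal sketch; the names wrapped and converted are illustrative.

```python
import numpy as np

my_set = {1, 2, 3, 4}

# Passing the set directly wraps it in a 0-dimensional object array
wrapped = np.array(my_set)
print(wrapped.shape, wrapped.dtype)
# Output: () object

# Converting to a list first gives a 1D array of the elements
converted = np.array(list(my_set))
print(converted.shape)
# Output: (4,)
```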

Mixed Data Types

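If your input set contains mixed data types, convert it to a list and pass dtype=object so that each element keeps its original Python type. Passing the set directly to np.array() also reports dtype object, but only because the result is a 0-dimensional array wrapping the whole set.

import numpy as np

my_set = {"Krunal", 21, True}

print(my_set)
# Output (order may vary): {True, 21, 'Krunal'}

print(type(my_set))
# Output: <class 'set'>

array = np.array(list(my_set), dtype=object)

print(array)
# Output (order may vary): [True 21 'Krunal']

print(array.dtype)
# Output: object

You can see from the above output that the final array’s data type is object.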
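For comparison, the sketch below converts the same mixed set through a list with and without dtype=object. Without an explicit dtype, NumPy typically coerces mixed strings and numbers to a string dtype, so the original types are lost. The variable names are illustrative.

```python
import numpy as np

my_set = {"Krunal", 21, True}

# Without dtype=object, the elements are coerced to strings
as_text = np.array(list(my_set))
print(as_text.dtype.kind)
# Output: U

# With dtype=object, each element keeps its Python type
as_objects = np.array(list(my_set), dtype=object)
print(as_objects.dtype)
# Output: object

print({type(x).__name__ for x in as_objects} == {"str", "int", "bool"})
# Output: True
```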

Preserving Order via Sorting

A set has no inherent order, so if you need a deterministic order in the array, pass the set to the sorted() function, which returns a sorted list.

In the next step, we pass that list to the numpy.array() method to get an array.

import numpy as np

my_set = {3, 2, 4, 1}

print(my_set)
# Output: {1, 2, 3, 4}

print(type(my_set))
# Output: <class 'set'>

# Sorting before conversion
array = np.array(sorted(my_set))

print(array)
# Output: [1 2 3 4]

print(type(array))
# Output: <class 'numpy.ndarray'>
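An alternative, sketched here under the assumption that the elements are numeric, is to convert first and sort afterwards on the NumPy side. This avoids the intermediate list that sorted() builds.

```python
import numpy as np

my_set = {3, 2, 4, 1}

# Convert without an intermediate list, then sort the array in place
array = np.fromiter(my_set, dtype=np.int64, count=len(my_set))
array.sort()

print(array)
# Output: [1 2 3 4]

# A reversed view gives descending order
descending = array[::-1]
print(descending)
# Output: [4 3 2 1]
```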

Specifying Data Type (dtype)

If you want to control the output array data type, you can use the dtype argument in the np.array() method.

import numpy as np

my_set = {3, 2, 4, 1}

print(my_set)
# Output: {1, 2, 3, 4}

print(type(my_set))
# Output: <class 'set'>

# Changing the array type to float32
array = np.array(list(my_set), dtype=np.float32)

print(array)
# Output: [1. 2. 3. 4.]

print(type(array))
# Output: <class 'numpy.ndarray'>

print(array.dtype)
# Output: float32

In the above code, we changed the numpy array’s type to float32.
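The choice of dtype also determines how much memory each element occupies. The following sketch compares float32 and float64 versions of the same data using the itemsize and nbytes attributes; the variable names are illustrative.

```python
import numpy as np

my_set = {3, 2, 4, 1}

as_float32 = np.array(list(my_set), dtype=np.float32)
# astype() returns a new array with the requested dtype
as_float64 = as_float32.astype(np.float64)

print(as_float32.itemsize, as_float32.nbytes)
# Output: 4 16

print(as_float64.itemsize, as_float64.nbytes)
# Output: 8 32
```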

Handling Empty Sets

If your input set is empty, the output numpy array would be empty, too.

import numpy as np

# Initializing an empty set
my_set = set()

print(my_set)
# Output: set()

print(type(my_set))
# Output: <class 'set'>

# Converting an empty set to numpy array
# Explicitly set dtype to int; an empty list would otherwise default to float64
array = np.array(list(my_set), dtype=int)

print(array)
# Output: []

print(type(array))
# Output: <class 'numpy.ndarray'>

print(array.dtype)
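# Output: int64 (may be int32 on some platforms, such as Windows with NumPy 1.x)

In the above code, the data type of the output array is an integer type rather than the default float64 that NumPy assigns to an empty array.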
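The effect of the explicit dtype is easiest to see side by side. This sketch converts an empty set three ways: with no dtype, with an explicit np.int64, and through np.fromiter(). The variable names are illustrative.

```python
import numpy as np

empty = set()

# With no elements to inspect, NumPy falls back to float64
default_array = np.array(list(empty))
print(default_array.dtype)
# Output: float64

# An explicit dtype is kept even though the array is empty
typed = np.array(list(empty), dtype=np.int64)
print(typed.dtype, typed.shape)
# Output: int64 (0,)

# np.fromiter() always requires a dtype, so the result is typed as requested
from_iter = np.fromiter(empty, dtype=np.int64)
print(from_iter.size)
# Output: 0
```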

Structured Arrays

If you are working with a set of tuples, you can convert it to a structured NumPy array by passing a structured dtype to np.array(). Using sorted() first turns the set into a list of tuples in a deterministic order.

import numpy as np

# Creating a set of tuples
my_set = {(1, 'a'), (2, 'b')}

print(my_set)
# Output: {(2, 'b'), (1, 'a')}

dt = [('id', int), ('name', 'U1')]

print(type(my_set))
# Output: <class 'set'>

# Converting a set of tuples to structured array
array = np.array(sorted(my_set), dtype=dt)

print(array)
# Output: [(1, 'a') (2, 'b')]

print(type(array))
# Output: <class 'numpy.ndarray'>

The above output shows that we get a structured array with one record per tuple, with fields named id and name.
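Once the data is in a structured array, each field can be read as its own array by name. The sketch below reuses the same set and dtype and shows field access and a simple filter on the id field.

```python
import numpy as np

my_set = {(1, 'a'), (2, 'b')}
dt = [('id', int), ('name', 'U1')]

array = np.array(sorted(my_set), dtype=dt)

# Field names come from the structured dtype
print(array.dtype.names)
# Output: ('id', 'name')

# Each field behaves like a regular 1D array
print(array['id'])
# Output: [1 2]

print(array['name'])
# Output: ['a' 'b']

# Boolean masks on a field select whole records
print(array[array['id'] > 1])
# Output: [(2, 'b')]
```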

Conclusion

For large sets, np.fromiter() is recommended for efficiency, while np.array(list(set)) remains a simple alternative for small sets.

Krunal Lathiya

With a career spanning over eight years in the field of Computer Science, Krunal’s expertise is rooted in a solid foundation of hands-on experience, complemented by a continuous pursuit of knowledge.