To convert a Python set to a NumPy array efficiently, use the np.fromiter() function. For large sets, it avoids building an intermediate list, which reduces memory overhead and improves performance.
import numpy as np
my_set = set(range(1000000))
# Print the size rather than the set itself: printing a set shows every element
print(len(my_set))
# Output: 1000000
print(type(my_set))
# Output: <class 'set'>
array = np.fromiter(my_set, dtype=np.int32)
print(array)
# Output (typical in CPython; set order is not guaranteed): [     0      1      2 ... 999997 999998 999999]
print(type(array))
# Output: <class 'numpy.ndarray'>
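The np.fromiter() function takes an iterable and a required dtype (plus an optional count) and returns a one-dimensional NumPy array.
Another option is np.array(). It cannot consume a set directly, so the set must first be copied into an intermediate list, which makes it less efficient for large sets.
NumPy arrays of numeric data are more memory-efficient than Python’s built-in containers because the values are stored in one contiguous, typed buffer. Since a set does not allow duplicate elements, the resulting array contains only unique elements. Keep in mind that a set is unordered, so the order of the array elements is not guaranteed.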
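As a quick check of the two routes mentioned above, the following sketch converts the same set with np.fromiter() and with np.array(list(...)) and compares the results. The names sample, via_fromiter, and via_list are illustrative only.

```python
import numpy as np

sample = {5, 3, 8, 1}

# Route 1: stream the set straight into an array (dtype is required).
via_fromiter = np.fromiter(sample, dtype=np.int64, count=len(sample))

# Route 2: materialize a list first, then let NumPy infer the dtype.
via_list = np.array(list(sample))

# Both arrays hold the same elements; compare after sorting because
# set iteration order is not something to rely on.
same_values = np.array_equal(np.sort(via_fromiter), np.sort(via_list))
print(same_values)

# No duplicates survive, because the source was a set.
all_unique = len(np.unique(via_fromiter)) == len(sample)
print(all_unique)
```

Both checks should print True; the difference between the two routes is only in how the array is built, not in what it contains.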
Pre-allocation
The count argument of np.fromiter() tells NumPy how many elements to read from the iterable, which lets it allocate the output array once instead of growing it as items arrive. Passing len(my_set) ensures that every element of the set is converted.
import numpy as np
my_set = {11, 24, 43, 33, 21, 19}
print(my_set)
# Output: {33, 19, 21, 24, 43, 11}
print(type(my_set))
# Output: <class 'set'>
array = np.fromiter(my_set, dtype=np.float64, count=len(my_set))
print(array)
# Output: [33. 19. 21. 24. 43. 11.]
print(type(array))
# Output: <class 'numpy.ndarray'>
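To see how count behaves at its boundaries, here is a minimal sketch. It assumes the same six-element set as above; the names values, full, partial, and too_many_failed are made up for this illustration.

```python
import numpy as np

values = {11, 24, 43, 33, 21, 19}

# count=len(values): the output buffer is sized once, up front.
full = np.fromiter(values, dtype=np.float64, count=len(values))

# A smaller count stops reading after that many items.
partial = np.fromiter(values, dtype=np.float64, count=3)

# A count larger than the iterable cannot be satisfied and raises ValueError.
try:
    np.fromiter(values, dtype=np.float64, count=len(values) + 1)
    too_many_failed = False
except ValueError:
    too_many_failed = True

print(full.shape, partial.shape, too_many_failed)
# Output: (6,) (3,) True
```

Because a set is unordered, which three elements end up in partial is not predictable, so a reduced count is rarely useful with sets.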
Using numpy.array()
The numpy.array() function creates an array from a sequence such as a list. A set is not a sequence, so if you pass it directly, np.array() does not iterate over it. Instead, it wraps the whole set in a zero-dimensional array of dtype object.
Convert the set into a list with list() first, then pass the list to numpy.array(). The data type (dtype) is inferred from the elements, but you can set it explicitly. Note that list() keeps whatever iteration order the set happens to have; it does not sort the elements.
import numpy as np
my_set = {1, 2, 3, 4}
print(my_set)
# Output: {1, 2, 3, 4}
print(type(my_set))
# Output: <class 'set'>
# Convert the set to a list first so NumPy receives a sequence of elements
array = np.array(list(my_set))
print(array)
# Output: [1 2 3 4]
print(type(array))
# Output: <class 'numpy.ndarray'>
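A point worth checking here is what np.array() returns when it receives the set itself rather than a list. The short sketch below compares both calls; wrapped and flat are illustrative names.

```python
import numpy as np

my_set = {1, 2, 3, 4}

# Passing the set directly: NumPy does not iterate it, so the whole set
# becomes the single element of a zero-dimensional object array.
wrapped = np.array(my_set)
print(wrapped.ndim, wrapped.shape, wrapped.dtype)
# Output: 0 () object

# Converting to a list first gives a 1D integer array.
flat = np.array(list(my_set))
print(flat.ndim, flat.shape)
# Output: 1 (4,)
```

The zero-dimensional result does not raise an error, so it is easy to miss until a later operation such as indexing or arithmetic fails.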
Mixed Data Types
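If your input set contains mixed data types, pass dtype=object so that each element keeps its original Python type. Without it, np.array() converts the elements of the list to a common type (strings, in this example). Passing the set itself to np.array() also reports an object dtype, but only because the whole set is stored as the single element of a zero-dimensional array.
import numpy as np
my_set = {"Krunal", 21, True}
print(my_set)
# Output (order may vary between runs): {True, 21, 'Krunal'}
print(type(my_set))
# Output: <class 'set'>
# Convert to a list first so that each element becomes one array entry
array = np.array(list(my_set), dtype=object)
print(array)
# Output (order may vary between runs): [True 21 'Krunal']
print(array.dtype)
# Output: object
As shown in the output above, the final array’s data type is object.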
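To make the role of the dtype concrete, this sketch converts a mixed set once with an inferred dtype and once with dtype=object. The names mixed, as_text, and as_objects are illustrative.

```python
import numpy as np

mixed = {"Krunal", 21, True}

# Without dtype, NumPy looks for a common type and settles on strings.
as_text = np.array(list(mixed))
print(as_text.dtype.kind)
# Output: U

# With dtype=object, each element keeps its original Python type.
as_objects = np.array(list(mixed), dtype=object)
print(as_objects.dtype)
# Output: object
type_names = sorted(type(x).__name__ for x in as_objects)
print(type_names)
# Output: ['bool', 'int', 'str']
```

Object arrays give up most of NumPy's speed and memory advantages, so they are best kept for cases where the original types really matter.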
Preserving Order via Sorting
A set has no defined order, so there is nothing to preserve. If you need a predictable order in the array, pass the set to the sorted() function, which returns a sorted list.
In the next step, we pass that list to numpy.array() to get an array.
import numpy as np
my_set = {3, 2, 4, 1}
print(my_set)
# Output (typical in CPython; set order is not guaranteed): {1, 2, 3, 4}
print(type(my_set))
# Output: <class 'set'>
# Sorting before conversion
array = np.array(sorted(my_set))
print(array)
# Output: [1 2 3 4]
print(type(array))
# Output: <class 'numpy.ndarray'>
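For larger sets, one alternative is to build the array first and sort it afterwards, which skips the intermediate list that sorted() creates. This is a sketch of that variation, not a replacement for the approach above.

```python
import numpy as np

my_set = {3, 2, 4, 1}

# Build the array directly from the set, then sort it in place.
array = np.fromiter(my_set, dtype=np.int64, count=len(my_set))
array.sort()
print(array)
# Output: [1 2 3 4]
```

ndarray.sort() modifies the array in place and returns None; use np.sort(array) when a sorted copy is preferred.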
Specifying Data Type (dtype)
If you want to control the data type of the output array, use the dtype argument of the np.array() function.
import numpy as np
my_set = {3, 2, 4, 1}
print(my_set)
# Output (typical in CPython; set order is not guaranteed): {1, 2, 3, 4}
print(type(my_set))
# Output: <class 'set'>
# Changing the array type to float32
array = np.array(list(my_set), dtype=np.float32)
print(array)
# Output: [1. 2. 3. 4.]
print(type(array))
# Output: <class 'numpy.ndarray'>
print(array.dtype)
# Output: float32
In the above code, we changed the type of the numpy array to float32.
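The choice of dtype also determines how much memory the array uses. The sketch below compares two dtypes for the same four values and converts an existing array with astype(); the names small, wide, and as_float32 are illustrative.

```python
import numpy as np

my_set = {3, 2, 4, 1}

small = np.array(sorted(my_set), dtype=np.int8)     # 1 byte per element
wide = np.array(sorted(my_set), dtype=np.float64)   # 8 bytes per element
print(small.nbytes, wide.nbytes)
# Output: 4 32

# astype() returns a converted copy of an existing array.
as_float32 = small.astype(np.float32)
print(as_float32.dtype)
# Output: float32
```

A narrow dtype such as int8 only holds values from -128 to 127, so pick a type wide enough for the data in the set.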
Handling empty sets
If your input set is empty, the output numpy array would be empty, too.
import numpy as np
# Initializing an empty set
my_set = set()
print(my_set)
# Output: set()
print(type(my_set))
# Output: <class 'set'>
# Converting an empty set to numpy array
# Set dtype explicitly; an empty list would otherwise default to float64
array = np.array(list(my_set), dtype=int)
print(array)
# Output: []
print(type(array))
# Output: <class 'numpy.ndarray'>
print(array.dtype)
# Output: int64 (on most platforms)
In the above code, the output array's dtype is int64 rather than float64, which is what NumPy assigns to an empty array when no dtype is given.
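The sketch below shows what happens with and without an explicit dtype for an empty set, and that np.fromiter() handles the empty case as well. The names empty, default_arr, and typed_arr are illustrative.

```python
import numpy as np

empty = set()

# No dtype given: NumPy falls back to its default, float64.
default_arr = np.array(list(empty))
print(default_arr.dtype, default_arr.shape)
# Output: float64 (0,)

# np.fromiter() always needs a dtype, so the result is typed explicitly.
typed_arr = np.fromiter(empty, dtype=np.int64)
print(typed_arr.dtype, typed_arr.size)
# Output: int64 0
```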
Structured Arrays
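If you are working with a set of tuples, you can convert it to a structured NumPy array by passing a structured dtype (a list of field names and types) to np.array(). Using sorted() on the set first gives the records a predictable order.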
import numpy as np
# Creating a set of tuples
my_set = {(1, 'a'), (2, 'b')}
print(my_set)
# Output: {(2, 'b'), (1, 'a')}
dt = [('id', int), ('name', 'U1')]
print(type(my_set))
# Output: <class 'set'>
# Converting a set of tuples to structured array
array = np.array(sorted(my_set), dtype=dt)
print(array)
# Output: [(1, 'a') (2, 'b')]
print(type(array))
# Output: <class 'numpy.ndarray'>
The above output shows a structured array built from the set of tuples: each record has two named fields, id and name.
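Once the data is in a structured array, the fields can be accessed by name. The following sketch reuses the same set and dtype as above to show field access and filtering.

```python
import numpy as np

my_set = {(1, 'a'), (2, 'b')}
dt = [('id', int), ('name', 'U1')]
array = np.array(sorted(my_set), dtype=dt)

# Each field can be pulled out as its own regular array.
print(array['id'])
# Output: [1 2]
print(array['name'])
# Output: ['a' 'b']
print(array.dtype.names)
# Output: ('id', 'name')

# Boolean masks built from a field select whole records.
selected = array[array['id'] > 1]
print(selected)
# Output: [(2, 'b')]
```

The 'U1' type stores a single Unicode character, so longer names would be truncated; a wider type such as 'U10' would be needed for them.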
That’s all!


