Pandas DataFrame.to_numpy() is a built-in method that converts a DataFrame to a NumPy array. A DataFrame is a two-dimensional, size-mutable, tabular data structure. To convert it to a NumPy array, call the DataFrame.to_numpy() method on it.
By default, the data type of the returned array is the common NumPy dtype of all the column types in the DataFrame.
For example, if the columns are float16 and float32, the resulting dtype will be float32.
Syntax
DataFrame.to_numpy(dtype=None, copy=False)
Parameters
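The DataFrame.to_numpy() method takes the following two parameters. (pandas 1.1.0 and later also accept a third parameter, na_value, which sets the value used for missing entries.)
- dtype: The data type to use for the returned array, given as a string or a numpy.dtype (for example, 'float32' or 'int64'). The default, None, lets pandas pick the common dtype.
- copy: A boolean that defaults to False. Passing copy=True ensures that the returned array is a copy rather than a view on another array. Note that copy=False does not guarantee that no copy is made; pandas may still need to copy the data.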
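The two parameters above can be sketched as follows. This is a minimal illustration, not part of the original examples; the DataFrame df and its column names are made up for the demonstration.

```python
import pandas as pd

# Hypothetical sample data, chosen only to illustrate the parameters.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# dtype: request a specific element type for the returned array.
as_float = df.to_numpy(dtype="float32")
print(as_float.dtype)  # float32

# copy=True: the result is an independent copy, so writing to it
# cannot change the values stored in the DataFrame.
independent = df.to_numpy(copy=True)
independent[0, 0] = 100
print(df.loc[0, "a"])  # 1
```

If you plan to modify the array in place and want the DataFrame left untouched, copy=True is the safer choice.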
Return Value
The to_numpy() method returns a numpy array.
Example
Write a program to show the working of DataFrame.to_numpy().
See the following code.
    import pandas as pd

    data = pd.DataFrame({'year': [2015, 2016, 2017, 2018, 2019, 2020],
                         'month': [2, 3, 4, 5, 6, 7],
                         'day': [4, 5, 6, 7, 8, 9]})
    data_numpy = data.to_numpy()
    print(data_numpy)
Output
    [[2015    2    4]
     [2016    3    5]
     [2017    4    6]
     [2018    5    7]
     [2019    6    8]
     [2020    7    9]]
In the above example, we created a DataFrame named data that contains year, month, and day columns.
Then, we converted it with to_numpy() and got a two-dimensional array with one row per DataFrame row and one column per DataFrame column.
You can check the type of each object using the built-in type() function.
    import pandas as pd

    data = pd.DataFrame({'year': [2015, 2016, 2017, 2018, 2019, 2020],
                         'month': [2, 3, 4, 5, 6, 7],
                         'day': [4, 5, 6, 7, 8, 9]})
    # Type of the object before conversion
    print('The data type of data is:', type(data))

    data_numpy = data.to_numpy()
    # Type of the object after conversion
    print('The data type of data_numpy is:', type(data_numpy))
Output
    The data type of data is: <class 'pandas.core.frame.DataFrame'>
    The data type of data_numpy is: <class 'numpy.ndarray'>
You can see that the two objects have different types: to_numpy() has converted the DataFrame to a NumPy ndarray.
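Note that type() reports the class of the object, not the type of its elements. To inspect the element type and the dimensions of the converted array, you can look at its dtype and shape attributes. A minimal sketch, using a shortened version of the DataFrame above:

```python
import pandas as pd

# Shortened version of the year/month/day DataFrame from the example.
data = pd.DataFrame({'year': [2015, 2016, 2017],
                     'month': [2, 3, 4],
                     'day': [4, 5, 6]})
data_numpy = data.to_numpy()

print(type(data_numpy).__name__)  # ndarray (the class of the object)
print(data_numpy.dtype)           # element type, an integer dtype such as int64
print(data_numpy.shape)           # (3, 3): three rows, three columns
```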
Example 2: Write a program to show the working of DataFrame.to_numpy() on heterogeneous data.
See the following code.
    import pandas as pd

    data = pd.DataFrame({'science marks': [84, 77, 66, 44, 37, 89],
                         'maths marks': [62.5, 73.6, 84.3, 67.5, 56.9, 87.5]})
    data_innumpy = data.to_numpy()
    print(data_innumpy)
Output
    [[84.  62.5]
     [77.  73.6]
     [66.  84.3]
     [44.  67.5]
     [37.  56.9]
     [89.  87.5]]
In the above code, we created a DataFrame that contains marks for science and maths.
Notice that the science marks are integers, while the maths marks are floating-point numbers.
A NumPy array holds a single dtype, so when the DataFrame is converted, the integers are upcast to float, the common type that can represent both columns. That is why 84 appears as 84. in the output.
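The upcasting can be checked directly by comparing the column dtypes with the array dtype. The sketch below uses a shortened version of the marks data; the second DataFrame, mixed, is a hypothetical addition showing that combining text and numbers falls back to the generic object dtype.

```python
import pandas as pd

# Shortened version of the marks DataFrame from the example.
marks = pd.DataFrame({'science marks': [84, 77],
                      'maths marks': [62.5, 73.6]})
print(marks.dtypes)          # one integer column, one float column

arr = marks.to_numpy()
print(arr.dtype)             # float64: the integers were upcast

# Hypothetical example: text and numbers have no common numeric type,
# so the resulting array uses the object dtype.
mixed = pd.DataFrame({'name': ['a', 'b'], 'score': [1, 2]})
mixed_arr = mixed.to_numpy()
print(mixed_arr.dtype)       # object
```

Arrays with the object dtype lose most of NumPy's speed advantage, so it is often worth selecting only the numeric columns before converting.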
When working with large, real-world datasets, clean the data first (for example, by handling missing values), because missing or inconsistent entries also affect the dtype of the resulting array.
Import CSV Data and convert it to numpy array
To import CSV data, you can use the pandas read_csv() function.
It reads the CSV file and returns its contents as a DataFrame.
This example uses a file named shows_data.csv. You can name your file whatever you like, as long as you pass the same name to read_csv().
In this example, we will get the data of the Title column of the first five rows.
    import pandas as pd

    data = pd.read_csv('shows_data.csv')
    data.dropna(inplace=True)
    shows = pd.DataFrame(data['Title'].head())
    print(shows.to_numpy())
Output
    [['Breaking Bad']
     ['Stranger Things']
     ['Money Heist']
     ['Sherlock']
     ['Better Call Saul']]
You can see that we only got the title of the first five shows in the numpy array.
We can also pass the dtype argument to the to_numpy() function.
    import pandas as pd

    data = pd.read_csv('shows_data.csv')
    data.dropna(inplace=True)
    shows = pd.DataFrame(data['Netflix'].head())
    print(shows.to_numpy(dtype='float32'))
Output
    [[1.]
     [1.]
     [1.]
     [1.]
     [1.]]
You can see that the values in the output array are floats (float32), as requested through the dtype argument.
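If you do not have the CSV file at hand, the same steps can be tried on an in-memory CSV. The sketch below is only an approximation: the csv_text contents are a hypothetical stand-in for shows_data.csv, assuming it has Title and Netflix columns as in the examples above, and io.StringIO is used in place of a file path.

```python
import io

import pandas as pd

# Hypothetical stand-in for shows_data.csv; the last row has a missing
# Netflix value so that dropna() has something to remove.
csv_text = """Title,Netflix
Breaking Bad,1
Stranger Things,1
Money Heist,1
Sherlock,
"""

data = pd.read_csv(io.StringIO(csv_text))
data = data.dropna()  # drop rows that contain missing values

# Double brackets select a one-column DataFrame, so the result is 2-D.
titles = data[['Title']].head().to_numpy()
print(titles.shape)   # (3, 1)

# Convert the Netflix column and force the element type to float32.
netflix = data[['Netflix']].head().to_numpy(dtype='float32')
print(netflix.dtype)  # float32
```

Selecting with data[['Title']] gives the same one-column DataFrame as wrapping data['Title'] in pd.DataFrame(), without the extra constructor call.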
So, to convert a Pandas DataFrame to a NumPy array, use the DataFrame.to_numpy() method.
That concludes the Pandas DataFrame to_numpy() example.