The Pandas DataFrame transform() method calls a function on the DataFrame and produces a new DataFrame of transformed values with the same axis length as the original. The transform() function is super useful when you want to manipulate rows or columns.
Pandas DataFrame transform()
Pandas DataFrame transform() is a built-in method that calls a function on self, producing a DataFrame with transformed values that has the same axis length as self. transform() is often used in conjunction with groupby (one of the most useful operations in pandas).
Most pandas users have likely used aggregate, filter, or apply with groupby to summarize data. However, the transform() method is a little more challenging to understand, especially for someone coming from an Excel world.
To import and read Excel files in Python, use the Pandas read_excel() function. It reads the sheet data into a DataFrame object, which represents the data in a two-dimensional tabular view.
Pandas Transform vs. Pandas Aggregate
While aggregation must return a reduced version of the data, the transformation can return some transformed version of the full data to recombine.
For such a transformation, the output has the same shape as the input. A common example is centering the data by subtracting the group-wise mean.
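The group-wise centering described above can be sketched as follows. The sample data and the column names key and value are made up for illustration; only groupby(), mean(), and transform() come from pandas itself.

```python
import pandas as pd

# Hypothetical sample data: two groups with three values each.
df = pd.DataFrame({
    "key": ["a", "a", "a", "b", "b", "b"],
    "value": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})

# Aggregation reduces the data: one mean per group.
group_means = df.groupby("key")["value"].mean()

# Transformation keeps one value per original row:
# each value minus the mean of its own group.
centered = df.groupby("key")["value"].transform(lambda x: x - x.mean())

print(group_means)
print(centered)
```

The aggregated result has one entry per group, while the transformed result keeps the original index, so it can be assigned straight back to the DataFrame as a new column.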
Difference Between Apply And Transform Function
The apply() function is the more general of the two. The function passed to it may return a result of any shape, including an aggregated one, and when used with groupby it receives each group as a whole DataFrame, so several columns can be manipulated together.
The transform() function works on one column or row at a time, depending on the axis value, and its result must have the same axis length as the input. So, we can use either apply() or transform() depending on the requirement.
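One practical difference between the two is the shape of the result each accepts. The following is a minimal sketch with made-up data: apply() is allowed to reduce each column to a single value, while transform() raises a ValueError when the function aggregates instead of transforming.

```python
import pandas as pd

# Hypothetical sample data: three rows, two columns.
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# apply() may reduce: the result is a Series with one sum per column.
reduced = df.apply(lambda col: col.sum())

# transform() with an element-wise function keeps the original shape.
same_shape = df.transform(lambda col: col * 2)

# transform() rejects a function that aggregates.
try:
    df.transform(lambda col: col.sum())
    transform_rejected = False
except ValueError:
    transform_rejected = True

print(reduced)
print(same_shape)
print("transform rejected the aggregation:", transform_rejected)
```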
Let’s see the syntax of the df.transform() method.
Syntax
DataFrame.transform(func, axis=0, *args, **kwargs)
Parameters
It has four parameters, which are briefly defined below.
- func: The function used for transforming the data. It can be a function, a string function name, a list of functions or names, or a dictionary that maps axis labels to functions or names.
- axis: It takes either 0 or 1. If 0 (also called ‘index’), the function is applied to each column. If 1 (also called ‘columns’), the function is applied to each row.
- *args: The positional arguments to pass to func.
- **kwargs: The keyword arguments to pass to func.
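The parameter forms listed above can be sketched as follows. The data and the helper add_offset are hypothetical and exist only for this illustration; the pandas and NumPy calls themselves are standard.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"A": [1, 4, 9], "B": [16, 25, 36]})

# func as a string: the name of a DataFrame method.
by_name = df.transform("cumsum")

# func as a list: every function is applied to every column,
# and the result has two-level column labels.
by_list = df.transform([np.sqrt, np.exp])

# func as a dictionary: a different function per column.
by_dict = df.transform({"A": np.sqrt, "B": lambda x: x + 1})

# axis=1: the function receives each row instead of each column.
by_row = df.transform(lambda row: row - row.min(), axis=1)

# **kwargs: extra keyword arguments are forwarded to func.
def add_offset(x, offset):
    # Hypothetical helper: shift every value by a fixed offset.
    return x + offset

with_kwargs = df.transform(add_offset, offset=10)

for result in (by_name, by_list, by_dict, by_row, with_kwargs):
    print(result, "\n")
```

In every case the result has as many rows as df, which is the condition transform() enforces.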
Return Value
The transform() function returns a transformed DataFrame that has the same axis length as the original.
Example program on pandas.DataFrame.transform()
Write a program to show the working of pandas.DataFrame.transform().
import pandas as pd

df = pd.DataFrame({"A": [3, 4, 5, 6, 7],
                   "B": [8, 9, 10, 11, 12],
                   "C": [13, 64, 74, 23, 76],
                   "D": [53, 35, 64, 76, 85]})
print(df)

resultdf = df.transform(func=lambda x: x + 2)
print("\nDataFrame after being transformed:\n")
print("\n", resultdf)
Output
   A   B   C   D
0  3   8  13  53
1  4   9  64  35
2  5  10  74  64
3  6  11  23  76
4  7  12  76  85

DataFrame after being transformed:

   A   B   C   D
0  5  10  15  55
1  6  11  66  37
2  7  12  76  66
3  8  13  25  78
4  9  14  78  87
In the above code, we created a DataFrame, transformed it by adding 2 to each element, and printed the transformed DataFrame.
Write a program to multiply each element of the DataFrame by 5 and then print the resulting DataFrame.
See the following code.
import pandas as pd

df = pd.DataFrame({"A": [3, 4, 5, 6, 7],
                   "B": [8, 9, 10, 11, 12],
                   "C": [13, 64, 74, 23, 76],
                   "D": [53, 35, 64, 76, 85]})
print(df)

resultdf = df.transform(func=lambda x: x*5)
print("\nDataFrame after being transformed:\n")
print("\n", resultdf)
Output
   A   B   C   D
0  3   8  13  53
1  4   9  64  35
2  5  10  74  64
3  6  11  23  76
4  7  12  76  85

DataFrame after being transformed:

    A   B    C    D
0  15  40   65  265
1  20  45  320  175
2  25  50  370  320
3  30  55  115  380
4  35  60  380  425
In the above example, we created a DataFrame, transformed it by multiplying each element by 5, and printed the transformed DataFrame.
Pandas DataFrame and NumPy
Let’s create a DataFrame from a NumPy array and use the transform() function.
import pandas as pd
import numpy as np

df = pd.DataFrame(
    np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
    columns=['a', 'b', 'c'])
print(df)

resultdf = df.transform(func=lambda x: x*5)
print("\nDataFrame after being transformed:\n")
print("\n", resultdf)
Output
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9

DataFrame after being transformed:

    a   b   c
0   5  10  15
1  20  25  30
2  35  40  45
Conclusion
The DataFrame.transform() function applies the function specified in its func parameter and returns a new DataFrame with the transformed values. This output DataFrame has the same axis length as the DataFrame it was called on.