AppDividend
Latest Code Tutorials

Pandas DataFrame apply() Function Example

The Pandas DataFrame apply() function lets you pass a function and apply it along an axis of the DataFrame, that is, to each column or to each row. The objects passed to that function are Series objects whose index is either the DataFrame’s index (axis=0) or the DataFrame’s columns (axis=1).

Pandas DataFrame apply()

To apply a function to every column or every row of a Pandas DataFrame, use the df.apply() function. By default it works column by column; pass axis=1 to work row by row.

Syntax

DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwds)

Parameters

The apply() method has the following parameters: 

  • func: It is the function to apply to each row or column.
  • axis: It accepts 0 (or ‘index’) and 1 (or ‘columns’), and defaults to 0. With axis=0, the function is applied to each column; with axis=1, it is applied to each row.
  • raw: It takes boolean values and defaults to False. It determines whether each row or column is passed to the function as a Series (False) or as an ndarray object (True).
  • result_type: It can be ‘expand’, ‘reduce’, ‘broadcast’, or None (the default). These values only act when axis=1, that is, when the function is applied to each row. With the default None, the result depends on the return value of the applied function.
    • ‘expand’ = List-like results will be turned into columns.
    • ‘reduce’ = It returns a Series if possible rather than expanding list-like results.
    • ‘broadcast’ = Results will be broadcast to the original shape of the DataFrame; the original index and columns will be retained.
  • args: It is a tuple of positional arguments to pass to func in addition to the array/Series.
  • **kwds: These are additional keyword arguments to pass to func.
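
The parameters above can be tried out in one short sketch. The helper name scale and its factor and offset arguments are made up for illustration; they are not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 4], [9, 16], [25, 36]], columns=['1st', '2nd'])

# axis=0 (default): the function receives each column as a Series
col_max = df.apply(lambda col: col.max())

# axis=1: the function receives each row as a Series
row_max = df.apply(lambda row: row.max(), axis=1)

# raw=True: the function receives a NumPy ndarray instead of a Series
raw_flags = df.apply(lambda arr: isinstance(arr, np.ndarray), raw=True)

# result_type='expand' with axis=1: list-like results become columns
expanded = df.apply(lambda row: [row['1st'], row['2nd'] * 2],
                    axis=1, result_type='expand')


# Hypothetical helper: extra arguments arrive through args and **kwds
def scale(series, factor, offset=0):
    return series * factor + offset


scaled = df.apply(scale, args=(2,), offset=1)

print(col_max, row_max, raw_flags, expanded, scaled, sep='\n\n')
```

Here args=(2,) fills the factor parameter and offset=1 is forwarded as a keyword argument, so each column is doubled and then incremented by one.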

Return Value

The DataFrame apply() method returns a Series or a DataFrame, which is the result of applying the function along the given axis of the DataFrame.
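
Whether a Series or a DataFrame comes back depends on what the applied function returns. The following is a minimal sketch of both cases, reusing the small DataFrame from the examples in this article.

```python
import pandas as pd

df = pd.DataFrame([[1, 4], [9, 16], [25, 36]], columns=['1st', '2nd'])

# One scalar per column: the result collapses to a Series
totals = df.apply(lambda col: col.sum())

# A same-length Series per column: the result keeps the DataFrame shape
doubled = df.apply(lambda col: col * 2)

print(type(totals).__name__)
print(type(doubled).__name__)
```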

Example program on DataFrame.apply()

Write a program to show the working of DataFrame.apply()

import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 4], [9, 16], [25, 36]], columns=['1st', '2nd'])
print(df, '\n')
df2 = df.apply(np.sqrt)
print(df2)

Output

   1st  2nd
0    1    4
1    9   16
2   25   36

   1st  2nd
0  1.0  2.0
1  3.0  4.0
2  5.0  6.0

In the above code, we created a DataFrame named df that holds the values 1, 4, 9, 16, and so on.

After that, we passed the NumPy universal function np.sqrt() to the apply() method to replace each DataFrame value with its square root (user-defined functions can also be passed to apply()). Finally, we printed the resulting DataFrame.
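
Because np.sqrt() is a NumPy universal function, it can also be called on the DataFrame directly. The sketch below compares the two approaches on the same data; it is an aside, not a replacement for the apply() call used in this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 4], [9, 16], [25, 36]], columns=['1st', '2nd'])

via_apply = df.apply(np.sqrt)  # function passed to apply()
direct = np.sqrt(df)           # ufunc called on the DataFrame itself

# Both approaches should produce the same DataFrame of square roots
print(via_apply.equals(direct))
```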

Example 2

In this example, we will add a new column called add, which holds the sum of the values in each row.

import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 4], [9, 16], [25, 36]], columns=['1st', '2nd'])
print(df, '\n')
df['add'] = df.apply(np.sum, axis=1)

print('\nAfter Applying Function: ')

# printing the new dataframe
print(df)

Output

   1st  2nd
0    1    4
1    9   16
2   25   36


After Applying Function:
   1st  2nd  add
0    1    4    5
1    9   16   25
2   25   36   61

From the output, you can see that the new column add holds the sum of the values in each row.
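
For a plain row sum, the built-in DataFrame.sum() method gives the same numbers without going through apply(). A minimal sketch for comparison, using the same data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 4], [9, 16], [25, 36]], columns=['1st', '2nd'])

via_apply = df.apply(np.sum, axis=1)  # the approach used in this example
via_sum = df.sum(axis=1)              # built-in row-wise sum

print(via_apply.tolist())
print(via_sum.tolist())
```

The apply() form remains useful when the row-wise logic is something that has no built-in method.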

Apply a lambda function to each row or each column in a DataFrame

A Python lambda, or anonymous function, is a function that is defined without a name. While standard functions are defined using the def keyword, anonymous functions are defined using the lambda keyword.

Let’s say we have a lambda function that accepts a Series as an argument and returns a new Series object by multiplying each value of the given Series by 11. For example,

lambda a : a * 11
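
To see what this lambda does on its own, it can be called on a single Series before it is handed to apply(). In the sketch below, the names times_eleven and times_eleven_def are chosen only for illustration; the second one is the same logic written with def.

```python
import pandas as pd

# The lambda from above, bound to a name so it can be called directly
times_eleven = lambda a: a * 11


# The equivalent standard function written with def
def times_eleven_def(a):
    return a * 11


s = pd.Series([11, 22, 33])
print(times_eleven(s).tolist())      # [121, 242, 363]
print(times_eleven_def(s).tolist())  # [121, 242, 363]
```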

Okay, now let’s see how to apply the above lambda function to each row or column of our DataFrame.

To apply the lambda a: a * 11 function to each column in the DataFrame, pass the lambda function as the only argument to DataFrame.apply() on the DataFrame object created below.

See the following code.

import pandas as pd

matrix = [(11, 21, 19), (22, 42, 38), (33, 63, 57), (44, 84, 76),
          (55, 105, 95)]

# Create a DataFrame object
dfObj = pd.DataFrame(matrix, columns=list('xyz'))
print('Before Lambda Function applied')
print(dfObj)
print('------------------')

# modify the dataframe by applying lambda function
modDfObj = dfObj.apply(lambda a: a * 11)
print('After Lambda Function applied')
print(modDfObj)

Output

Before Lambda Function applied
    x    y   z
0  11   21  19
1  22   42  38
2  33   63  57
3  44   84  76
4  55  105  95
------------------
After Lambda Function applied
     x     y     z
0  121   231   209
1  242   462   418
2  363   693   627
3  484   924   836
4  605  1155  1045

Apply a lambda function to each row

To apply the lambda function to each row in the DataFrame, pass the lambda function as the first argument to DataFrame.apply() on the same DataFrame object.

We also have to pass axis=1, which tells apply() to call the function on each row instead of each column.

import pandas as pd

matrix = [(11, 21, 19), (22, 42, 38), (33, 63, 57), (44, 84, 76),
          (55, 105, 95)]

# Create a DataFrame object
dfObj = pd.DataFrame(matrix, columns=list('xyz'))
print('Before Lambda Function applied')
print(dfObj)
print('------------------')

# modify the dataframe by applying lambda function
modDfObj = dfObj.apply(lambda a: a * 11, axis=1)
print('After Lambda Function applied')
print(modDfObj)

Output

Before Lambda Function applied
    x    y   z
0  11   21  19
1  22   42  38
2  33   63  57
3  44   84  76
4  55  105  95
------------------
After Lambda Function applied
     x     y     z
0  121   231   209
1  242   462   418
2  363   693   627
3  484   924   836
4  605  1155  1045

So, DataFrame.apply() calls the passed lambda function once for each row and passes the contents of that row to it as a Series.

Finally, the apply() function returns a new DataFrame built from the rows returned by the lambda function; it does not alter the original DataFrame.
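
Both points can be checked with a short sketch: with axis=1 each row is a Series indexed by column label, and the original DataFrame keeps its values after apply() runs. It uses the first three rows of the same matrix.

```python
import pandas as pd

matrix = [(11, 21, 19), (22, 42, 38), (33, 63, 57)]
dfObj = pd.DataFrame(matrix, columns=list('xyz'))

# Each row arrives as a Series, so values can be picked by column label
row_totals = dfObj.apply(lambda row: row['x'] + row['y'] + row['z'], axis=1)
print(row_totals.tolist())  # [51, 102, 153]

# apply() returns a new object; dfObj itself is left untouched
modDfObj = dfObj.apply(lambda a: a * 11, axis=1)
print(dfObj['x'].tolist())     # [11, 22, 33]
print(modDfObj['x'].tolist())  # [121, 242, 363]
```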

Apply a User Defined function

Instead of passing a lambda function, we can pass a user-defined function to the apply() method, and it will return the output based on the logic of that function.

import pandas as pd


def sicmundus(x):
    return x + 33


matrix = [(11, 21, 19), (22, 42, 38), (33, 63, 57), (44, 84, 76),
          (55, 105, 95)]

# Create a DataFrame object
dfObj = pd.DataFrame(matrix, columns=list('xyz'))
print('Before User defined Function applied')
print(dfObj)
print('------------------')

# modify the dataframe by applying user defined function
modDfObj = dfObj.apply(sicmundus)
print('After User defined Function applied')
print(modDfObj)

Output

Before User defined Function applied
    x    y   z
0  11   21  19
1  22   42  38
2  33   63  57
3  44   84  76
4  55  105  95
------------------
After User defined Function applied
    x    y    z
0  44   54   52
1  55   75   71
2  66   96   90
3  77  117  109
4  88  138  128

In this example, we add 33 to every value in the DataFrame using a user-defined function.
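
A user-defined function is most helpful when the logic needs more than one expression. The sketch below applies a hypothetical classify() function row by row; the function name and the threshold of 100 are assumptions made purely for illustration.

```python
import pandas as pd

matrix = [(11, 21, 19), (22, 42, 38), (33, 63, 57)]
dfObj = pd.DataFrame(matrix, columns=list('xyz'))


# Hypothetical helper: label a row by comparing its total to a threshold
def classify(row):
    total = row['x'] + row['y'] + row['z']
    return 'high' if total > 100 else 'low'


# axis=1 passes each row to classify() as a Series
labels = dfObj.apply(classify, axis=1)
print(labels.tolist())  # ['low', 'high', 'high']
```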

Conclusion

In this article, we have discussed how to apply a given lambda function or the user-defined function or numpy function to each row or column in a DataFrame.

That is it for the Pandas DataFrame apply() function.

See also

How To Apply Formula To Entire Column And Row

Pandas DataFrame merge()

Pandas DataFrame groupby()
