AppDividend

Pandas Assign: How to Assign New Columns to DataFrame

The Pandas DataFrame.assign() method adds new columns to a DataFrame and returns a new object (a copy) that contains the original columns plus the new ones. Be careful with column names: if you assign to a name that already exists, that column is overwritten in the returned DataFrame.

Pandas assign example

To assign new columns to a DataFrame, use the Pandas assign() method. It returns a new object with all the original columns in addition to the new ones, and any existing column that is re-assigned is overwritten. When the new value is a list or array, its length must match the number of rows in the DataFrame; a scalar is repeated for every row, and a Series is aligned on the index.
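The two caveats above (overwriting and length matching) can be sketched as follows. This is a minimal illustration, not part of the original tutorial; the names df, overwritten, discount, and length_error are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'price': [520, 500]})

# Re-assigning an existing column name replaces that column in the returned copy.
overwritten = df.assign(price=lambda x: x.price * 2)
print(overwritten)   # the price column now holds 1040 and 1000
print(df)            # the original DataFrame still holds 520 and 500

# A list whose length differs from the number of rows is rejected.
length_error = None
try:
    df.assign(discount=[10, 20, 30])   # three values for two rows
except ValueError as exc:
    length_error = str(exc)
    print('ValueError:', length_error)
```

If overwriting is not intended, pick a column name that does not already exist in the DataFrame.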

Syntax

DataFrame.assign(**kwargs)

Parameters

**kwargs: dict of {str: callable or Series}

The keyword names become the column names. If a value is callable, it is called with the DataFrame and the result is assigned to the new column. The callable must not modify the input DataFrame. If a value is not callable (for example, a Series, scalar, or array), it is simply assigned.
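As a rough sketch of the kinds of values described above, the following call passes a callable, a scalar, a NumPy array, and a Series in one go. The column names half_price, currency, stock, and cost are invented for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'price': [520, 500]})

result = df.assign(
    half_price=lambda x: x['price'] / 2,   # callable: receives the DataFrame
    currency='USD',                        # scalar: repeated for every row
    stock=np.array([3, 7]),                # array: one value per row
    cost=pd.Series([400, 380]),            # Series: aligned on the index
)
print(result)
```

The new columns appear in the order in which the keywords were passed.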

Return Value

It returns a new DataFrame with the new columns in addition to all the existing columns. The DataFrame on which assign() was called is not changed.
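A short check of the return value, assuming the same single-column price data used in this tutorial. The column names doubled and tripled are illustrative only. Because each call returns a new DataFrame, calls can also be chained.

```python
import pandas as pd

df1 = pd.DataFrame({'price': [520, 500]})

# assign() returns a new object; df1 keeps its original columns.
df2 = df1.assign(doubled=lambda x: x.price * 2)
print(list(df1.columns))   # ['price']
print(list(df2.columns))   # ['price', 'doubled']

# Each call returns a DataFrame, so a second assign() can follow the first.
chained = df1.assign(doubled=lambda x: x.price * 2).assign(tripled=lambda x: x.price * 3)
print(chained)
```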

Example of Pandas assign() function

Let’s define a DataFrame which has only one column called price.

Now suppose the price increases by 5%, so we will add a new column called revised_price.

Let’s see how we can add a new column with the help of the pandas DataFrame.assign() method.

See the following code.

import pandas as pd

dt = {'price': [520, 500]}
df1 = pd.DataFrame(data=dt)
print(df1)
print('----------------------')
print('After assign new column of Revised Price')
df2 = df1.assign(revised_price=lambda x: x.price + x.price * 0.05)
print(df2)

Output

python3 app.py
   price
0    520
1    500
----------------------
After assign new column of Revised Price
   price  revised_price
0    520          546.0
1    500          525.0

We used a Python lambda function to add 5% to each value in the price column and assigned the result to a new column called revised_price in the returned DataFrame.

Add new column in DataFrame with values based on other columns

You can get the same result without a lambda by directly referencing the existing Series or sequence.

# app.py

import pandas as pd

dt = {'price': [520, 500]}
df1 = pd.DataFrame(data=dt)
print(df1)
print('----------------------')
print('After assign new column of Revised Price')
df2 = df1.assign(revised_price=df1['price'] + df1['price'] * 0.05)
print(df2)

Output

python3 app.py
   price
0    520
1    500
----------------------
After assign new column of Revised Price
   price  revised_price
0    520          546.0
1    500          525.0

Pandas assign multiple columns

Let’s add two new columns called revised_price and changed_price.

# app.py

import pandas as pd

dt = {'price': [520, 500]}
df1 = pd.DataFrame(data=dt)
print(df1)
print('----------------------')
print('After assigning two new columns of Revised Price')
df2 = df1.assign(revised_price=df1['price'] + df1['price'] * 0.05,
                 changed_price=df1['price'] + df1['price'] * 0.10)
print(df2)

Output

python3 app.py
   price
0    520
1    500
----------------------
After assigning two new columns of Revised Price
   price  revised_price  changed_price
0    520          546.0          572.0
1    500          525.0          550.0

The first new column, revised_price, is the price increased by 5%.

The second new column, changed_price, is the price increased by 10%.

So, we can add multiple new columns to a DataFrame with a single call to the DataFrame.assign() method.
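When several columns are passed as callables, recent pandas versions evaluate the keywords in order, so a later column can refer to one created earlier in the same call. The sketch below assumes such a version; the column name increase is made up for the example.

```python
import pandas as pd

df1 = pd.DataFrame({'price': [520, 500]})

df2 = df1.assign(
    # First new column: price plus 5%.
    revised_price=lambda x: x['price'] + x['price'] * 0.05,
    # Second new column: refers to revised_price, created just above.
    increase=lambda x: x['revised_price'] - x['price'],
)
print(df2)
```

This only works with callables. An expression such as df1['revised_price'] would fail here, because df1 itself never receives the new column.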

Pandas: Add a new column with values in the list

Let’s say we want to add a new column ‘items’ with values from a list. This example uses direct column assignment instead of assign().

# app.py

import pandas as pd

dt = {'price': [520, 500]}
df1 = pd.DataFrame(data=dt)
print(df1)
print('----------------------')
print('After adding new column')
df1['items'] = ['Apple Watch', 'Air Pod']
print(df1)

Output

python3 app.py
   price
0    520
1    500
----------------------
After adding new column
   price        items
0    520  Apple Watch
1    500      Air Pod

DataFrame df1 did not have a column named ‘items’, so the assignment adds a new column to it. Unlike assign(), this modifies df1 in place instead of returning a new DataFrame.
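For comparison, the same list can be passed to assign(), which leaves df1 untouched and returns a new DataFrame. A minimal sketch using the same data:

```python
import pandas as pd

df1 = pd.DataFrame({'price': [520, 500]})

# The list must contain one value per row.
df2 = df1.assign(items=['Apple Watch', 'Air Pod'])

print(list(df1.columns))   # ['price']
print(df2)
```

Note that the new column has to be read back as df2['items'], because df2.items refers to the DataFrame.items() method.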
