Pandas Drop Column: How to Drop Column in DataFrame

Pandas DataFrame drop() method allows us to remove columns and rows from the DataFrame object.

Pandas Drop Column

To drop a column from a DataFrame, use the Pandas DataFrame drop() method. The df.drop() method removes the specified labels from rows or columns. You can either pass label names together with the corresponding axis, or pass index or column names directly through the index and columns parameters.

When using a multi-index, labels on different levels can be removed by specifying the level.
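As a minimal sketch of the level parameter, the example below builds a small DataFrame with two-level column labels and drops a label from the second level. The level names Platform and Field and the sample values are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical two-level column index: (Platform, Field)
columns = pd.MultiIndex.from_tuples(
    [("Netflix", "Show"), ("Netflix", "Season"),
     ("Amazon", "Show"), ("Amazon", "Season")],
    names=["Platform", "Field"],
)
multi_df = pd.DataFrame([["Stranger Things", 3, "The Boys", 2]], columns=columns)

# Drop every column whose "Field" level label is "Season"
shows_only = multi_df.drop(columns="Season", level="Field")
print(shows_only.columns.tolist())
```

Only the columns whose Field label is Show should remain, one per platform.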

Syntax

DataFrame.drop(labels=None, axis=0, index=None, columns=None, 
               level=None, inplace=False, errors='raise')

Parameters

labels: single label or list-like

Index or column labels to drop.

axis: {0 or ‘index’, 1 or ‘columns’}, default 0

Whether to drop labels from an index (0 or ‘index’) or columns (1 or ‘columns’).

index: single label or list-like

Alternative to defining the axis (labels, axis=0 is equivalent to index=labels).

columns: single label or list-like

Alternative to specifying axis (labels, axis=1 is equivalent to columns=labels).
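The equivalence between the labels/axis form and the index/columns keywords can be checked with a short sketch. The two-row DataFrame here is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Show": ["Mandalorian", "The Boys"],
    "Streaming": ["Disney Plus", "Amazon Prime"],
    "Season": [1, 2],
})

# Column removal: labels + axis=1 versus the columns keyword
by_axis = df.drop("Season", axis=1)
by_keyword = df.drop(columns="Season")
print(by_axis.equals(by_keyword))

# Row removal: labels + axis=0 versus the index keyword
rows_by_axis = df.drop(0, axis=0)
rows_by_keyword = df.drop(index=0)
print(rows_by_axis.equals(rows_by_keyword))
```

The keyword form tends to read more clearly because the call itself says whether rows or columns are being removed.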

level: int or level name, optional

For MultiIndex, the level from which the labels will be removed.

inplace: bool, default False

If False, return a copy and leave the original DataFrame unchanged. If True, perform the operation in place and return None.
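The effect of inplace is easy to get wrong, so here is a small sketch. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"Show": ["Mandalorian", "The Boys"], "Season": [1, 2]})

# Default (inplace=False): a new DataFrame is returned, df keeps its columns
trimmed = df.drop(columns="Season")
columns_after_copy = list(df.columns)

# inplace=True: df itself is modified and the call returns None
result = df.drop(columns="Season", inplace=True)
columns_after_inplace = list(df.columns)

print(columns_after_copy, columns_after_inplace, result)
```

A common mistake is writing df = df.drop(..., inplace=True), which rebinds df to None.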

errors: {‘ignore’, ‘raise’}, default ‘raise’

If ‘ignore’, suppress error, and only existing labels are dropped.

Return Value

The drop() method returns a new DataFrame without the removed index or column labels, or None if inplace=True.

Raises

The drop() method raises a KeyError if any of the labels are not found in the selected axis.

How to Drop Column in DataFrame

You can drop one or more columns from a DataFrame in several ways.

  1. Drop columns by name using the df.drop() method.
  2. Drop columns by position using iloc[ ] together with the drop() method.
  3. Drop columns by label range using loc[ ] together with the drop() method.

Removing columns using df.drop()

First, create a DataFrame from a dictionary using the pd.DataFrame.from_dict() function.

import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
print(df)

Output

              Show     Streaming  Season  Main Actor
0  Stranger Things       Netflix       3      Millie
1      The X-Files            Fx      12     Gillian
2      Mandalorian   Disney Plus       1       Padro
3         The Boys  Amazon Prime       2  Karl Urban

You can see that the DataFrame is created with four rows and four columns.

To drop a single column from the DataFrame, call the drop() method and pass a one-element list to the columns parameter, as shown below.

# app.py

import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
df.drop(columns=['Season'], inplace=True)
print(df)

Output

python3 app.py
              Show     Streaming  Main Actor
0  Stranger Things       Netflix      Millie
1      The X-Files            Fx     Gillian
2      Mandalorian   Disney Plus       Padro
3         The Boys  Amazon Prime  Karl Urban

You can see that the Season column has been removed from the DataFrame.

Removing multiple columns from DataFrame

To remove multiple columns from the DataFrame, pass the list of columns that need to be removed to the drop() method.

# app.py

import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
df.drop(columns=['Season', 'Streaming'], inplace=True)
print(df)

Output

python3 app.py
              Show  Main Actor
0  Stranger Things      Millie
1      The X-Files     Gillian
2      Mandalorian       Padro
3         The Boys  Karl Urban

You can see that we passed a list containing the Season and Streaming columns, and both are removed from the DataFrame in the output.

Removing columns based on the column index

To remove columns by their integer position, look up their labels with the df.columns attribute and pass the result to drop().

# app.py

import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
df.drop(df.columns[[1, 2]], axis=1, inplace=True)
print(df)

In this example, we want to remove the columns at positions 1 and 2, which are Streaming and Season. Indexing the df.columns attribute with the list [1, 2] returns the labels at those positions, and drop() removes the matching columns.
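To make the intermediate step visible, the sketch below prints the labels that df.columns[[1, 2]] resolves to before dropping them. It reuses the tutorial's data; the names to_drop and remaining are illustrative.

```python
import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)

# Translate integer positions into column labels
to_drop = df.columns[[1, 2]]
print(list(to_drop))

# Drop those labels and keep the original DataFrame untouched
remaining = df.drop(columns=to_drop)
print(list(remaining.columns))
```

Positions are zero-based, so position 0 is the Show column and position 3 is Main Actor.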

Drop Columns using iloc[ ] and drop()

To remove a contiguous range of columns by position, select them with iloc[ ] and pass the selection to the drop() method.

# app.py

import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
# Pass the labels of the selected columns rather than the sliced DataFrame
df.drop(df.iloc[:, 1:3].columns, axis=1, inplace=True)
print(df)

Output

python3 app.py
              Show  Main Actor
0  Stranger Things      Millie
1      The X-Files     Gillian
2      Mandalorian       Padro
3         The Boys  Karl Urban

DataFrame.iloc is an integer-location based indexer for selection by position. Here df.iloc[:, 1:3] selects the columns at positions 1 and 2, and drop() then removes the columns that selection contains.

Drop Columns using loc[ ] and drop()

Pandas DataFrame loc[ ] is a label-based indexer used to access a group of rows and columns by labels or a Boolean array. See the following code.

# app.py

import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
df.drop(df.loc[:, 'Streaming':'Season'].columns, axis=1, inplace=True)
print(df)

Output

python3 app.py
              Show  Main Actor
0  Stranger Things      Millie
1      The X-Files     Gillian
2      Mandalorian       Padro
3         The Boys  Karl Urban

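In this example, we use the loc[ ] indexer to select the columns from Streaming through Season by label and then remove those columns from the DataFrame using the df.drop() method.

The difference between loc[ ] and iloc[ ] is that loc[ ] selects by label and its slices include both endpoints, while iloc[ ] selects by integer position and its slices exclude the stop position. That is why 'Streaming':'Season' and 1:3 select the same two columns.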
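The slicing behavior of the two indexers can be compared side by side. This sketch reuses the tutorial's data; the variable names are illustrative.

```python
import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)

# Label slice: both 'Streaming' and 'Season' are included
by_label = df.loc[:, 'Streaming':'Season'].columns

# Position slice: start 1 is included, stop 3 is excluded
by_position = df.iloc[:, 1:3].columns

# Using the position of 'Season' (2) as the stop leaves 'Season' out
by_position_short = df.iloc[:, 1:2].columns

print(list(by_label), list(by_position), list(by_position_short))
```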

Suppressing Errors in Dropping Columns and Rows

If the DataFrame doesn’t contain the given labels, KeyError is raised.

# app.py

import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
df.drop(columns=['ABC'])
print(df)

Output

python3 app.py
Traceback (most recent call last):
KeyError: "['ABC'] not found in axis"

We can suppress this error by specifying errors='ignore' in the drop() function call.

# app.py

import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
df.drop(columns=['ABC'], errors='ignore')
print(df)

Output

python3 app.py
              Show     Streaming  Season  Main Actor
0  Stranger Things       Netflix       3      Millie
1      The X-Files            Fx      12     Gillian
2      Mandalorian   Disney Plus       1       Padro
3         The Boys  Amazon Prime       2  Karl Urban
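When the list mixes existing and missing labels, errors='ignore' still drops the ones that exist. The sketch below contrasts the two settings on the tutorial's first two rows; the label ABC is deliberately absent from the DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    'Show': ['Stranger Things', 'The X-Files'],
    'Streaming': ['Netflix', 'Fx'],
    'Season': [3, 12],
})

# Default errors='raise': a missing label aborts the whole call
try:
    df.drop(columns=['Season', 'ABC'])
    raised = False
except KeyError:
    raised = True

# errors='ignore': 'Season' is dropped, the missing 'ABC' is skipped silently
result = df.drop(columns=['Season', 'ABC'], errors='ignore')
print(raised, list(result.columns))
```

Because missing labels are skipped silently, a typo in a column name will go unnoticed with errors='ignore'.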

Conclusion

The Pandas DataFrame drop() method removes unwanted columns and rows. We have seen how to drop columns by name, by position with iloc[ ], and by label range with loc[ ], and how to suppress errors for missing labels.

See also

DataFrame.loc
Returns label-location based indexer for selection by the label.

DataFrame.dropna
Returns DataFrame with labels on given axis omitted, where (all or any) data are missing.

DataFrame.drop_duplicates
Returns DataFrame with duplicate rows removed, optionally only considering specific columns.
