The Pandas DataFrame drop() method removes rows and columns from a DataFrame object.
Pandas Drop Column
To drop a column from a DataFrame, use the Pandas DataFrame drop() method. The df.drop() method deletes the specified labels from rows or columns. You identify what to remove either by giving label names together with the corresponding axis, or by giving index or column names directly.
When using a multi-index, labels on different levels can be removed by specifying the level.
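As a minimal sketch of the level parameter, the snippet below builds a small DataFrame with two-level column labels and drops every column whose label at one level matches. The level names "platform" and "field" and the variable names are assumptions made up for this illustration.

```python
import pandas as pd

# Hypothetical two-level column index: (platform, field)
columns = pd.MultiIndex.from_tuples(
    [("Netflix", "Show"), ("Netflix", "Season"),
     ("Disney Plus", "Show"), ("Disney Plus", "Season")],
    names=["platform", "field"],
)
df_multi = pd.DataFrame([["Stranger Things", 3, "Mandalorian", 1]], columns=columns)

# Drop every column whose label on the "field" level is "Season"
trimmed = df_multi.drop(columns="Season", level="field")
print(trimmed.columns.tolist())
# [('Netflix', 'Show'), ('Disney Plus', 'Show')]
```

Without level, drop() would look for "Season" among the first-level labels, which here hold only the platform names.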
Syntax
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
Parameters
labels: single label or list-like
Index or column labels to drop.
axis: {0 or 'index', 1 or 'columns'}, default 0
Whether to drop labels from the index (0 or 'index') or from the columns (1 or 'columns').
index: single label or list-like
Alternative to defining the axis (labels, axis=0 is equivalent to index=labels).
columns: single label or list-like
Alternative to specifying axis (labels, axis=1 is equivalent to columns=labels).
level: int or level name, optional
For MultiIndex, the level from which the labels will be removed.
inplace: bool, default False
If False, return a copy. Otherwise, perform the operation in place and return None.
errors: {'ignore', 'raise'}, default 'raise'
If 'ignore', suppress the error and drop only the labels that exist.
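The equivalence between the labels/axis form and the index/columns keywords can be checked with a short sketch. The two-row DataFrame and the variable names are illustrative only.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Show": ["The Boys", "Mandalorian"],
    "Streaming": ["Amazon Prime", "Disney Plus"],
    "Season": [2, 1],
})

# Columns: labels with axis=1 is the same as the columns keyword
col_by_axis = df_demo.drop("Season", axis=1)
col_by_keyword = df_demo.drop(columns="Season")
print(col_by_axis.equals(col_by_keyword))  # True

# Rows: labels with axis=0 is the same as the index keyword
row_by_axis = df_demo.drop(0, axis=0)
row_by_keyword = df_demo.drop(index=0)
print(row_by_axis.equals(row_by_keyword))  # True
```

The keyword form tends to read more clearly because the call states whether rows or columns are being removed.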
Return Value
The drop() method returns a new DataFrame without the removed index or column labels. If inplace=True, it modifies the original DataFrame and returns None.
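A small sketch of the two return behaviors, using an illustrative two-column DataFrame (the variable names are assumptions for this example):

```python
import pandas as pd

df_demo = pd.DataFrame({"Show": ["Stranger Things", "The X-Files"], "Season": [3, 12]})

# Default (inplace=False): a new DataFrame comes back, the original is untouched
result = df_demo.drop(columns="Season")
print(list(result.columns))   # ['Show']
print(list(df_demo.columns))  # ['Show', 'Season']

# inplace=True: the original is modified and the call returns None
in_place_result = df_demo.drop(columns="Season", inplace=True)
print(in_place_result)        # None
print(list(df_demo.columns))  # ['Show']
```

A common mistake is writing df = df.drop(..., inplace=True), which rebinds df to None.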
Raises
The drop() method raises a KeyError if any of the labels is not found in the selected axis.
How to Drop Column in DataFrame
One or more columns can be dropped from a DataFrame in several ways.
- Drop columns by label using the df.drop() method.
- Drop columns by position using iloc[ ] together with drop().
- Drop a range of column labels using loc[ ] together with drop().
Removing columns using df.drop()
To create a DataFrame from a dictionary, use the pd.DataFrame.from_dict() function.
import pandas as pd
data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
print(df)
Output
              Show     Streaming  Season  Main Actor
0  Stranger Things       Netflix       3      Millie
1      The X-Files            Fx      12     Gillian
2      Mandalorian   Disney Plus       1       Padro
3         The Boys  Amazon Prime       2  Karl Urban
You can see that the DataFrame is created with four rows and four columns.
To drop a single column from the DataFrame, call the drop() method and pass a one-element list as the columns argument, as in the following example.
# app.py
import pandas as pd
data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
df.drop(columns=['Season'], inplace=True)
print(df)
Output
python3 app.py
              Show     Streaming  Main Actor
0  Stranger Things       Netflix      Millie
1      The X-Files            Fx     Gillian
2      Mandalorian   Disney Plus       Padro
3         The Boys  Amazon Prime  Karl Urban
The Season column has been removed from the DataFrame.
Removing multiple columns from DataFrame
To remove multiple columns from a DataFrame, pass the list of columns that need to be removed to the drop() method.
# app.py
import pandas as pd
data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
df.drop(columns=['Season', 'Streaming'], inplace=True)
print(df)
Output
python3 app.py
              Show  Main Actor
0  Stranger Things      Millie
1      The X-Files     Gillian
2      Mandalorian       Padro
3         The Boys  Karl Urban
We passed a list containing the Season and Streaming columns, and both are removed from the DataFrame in the output.
Removing columns based on the column index
To remove columns by position, index the df.columns attribute with the integer positions and pass the resulting labels to drop(). Note that df.columns is an attribute, not a function.
# app.py
import pandas as pd
data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
df.drop(df.columns[[1, 2]], axis=1, inplace=True)
print(df)
In this example, we want to remove the columns at positions 1 and 2, which are Streaming and Season. Indexing df.columns with the list [1, 2] returns the labels at those positions, and drop() then removes those columns.
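To make the positional lookup explicit, the sketch below prints the labels that df.columns[[1, 2]] resolves to before dropping them. It reuses the article's sample data; the names to_drop and remaining are illustrative.

```python
import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame(data)

# Positions 1 and 2 map to these column labels
to_drop = df.columns[[1, 2]]
print(list(to_drop))            # ['Streaming', 'Season']

# Drop by the resolved labels; df itself is left unchanged here
remaining = df.drop(columns=to_drop)
print(list(remaining.columns))  # ['Show', 'Main Actor']
```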
Drop Columns using iloc[ ] and drop()
To remove a contiguous range of columns by position, use iloc[ ] together with the drop() method.
# app.py
import pandas as pd
data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
# Select the columns at positions 1 and 2, then drop them by their labels
df.drop(df.iloc[:, 1:3].columns, axis=1, inplace=True)
print(df)
Output
python3 app.py
              Show  Main Actor
0  Stranger Things      Millie
1      The X-Files     Gillian
2      Mandalorian       Padro
3         The Boys  Karl Urban
DataFrame.iloc is an integer-location based indexer for selection by position. Here df.iloc[:, 1:3] selects the columns at positions 1 and 2, and drop() removes the columns in that selection.
Drop Columns using loc[ ] and drop()
DataFrame.loc[] accesses a group of rows and columns by label or by a Boolean array. See the following code.
# app.py
import pandas as pd
data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
df.drop(df.loc[:, 'Streaming':'Season'].columns, axis=1, inplace=True)
print(df)
Output
python3 app.py
              Show  Main Actor
0  Stranger Things      Millie
1      The X-Files     Gillian
2      Mandalorian       Padro
3         The Boys  Karl Urban
In this example, we use the loc[ ] indexer to select a range of columns by label and then remove those columns from the DataFrame using the df.drop() method.
The difference between loc[ ] and iloc[ ] when slicing is that iloc[ ] excludes the end position of the range, while loc[ ] includes the end label.
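The slicing difference can be checked directly. The sketch below reuses the article's sample data; the variable names by_position, by_label, and short_slice are illustrative.

```python
import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame(data)

# iloc: the end position 3 is excluded, so positions 1 and 2 are selected
by_position = list(df.iloc[:, 1:3].columns)
print(by_position)  # ['Streaming', 'Season']

# loc: the end label 'Season' is included in the selection
by_label = list(df.loc[:, 'Streaming':'Season'].columns)
print(by_label)     # ['Streaming', 'Season']

# iloc with 1:2 stops before position 2, so only 'Streaming' is selected
short_slice = list(df.iloc[:, 1:2].columns)
print(short_slice)  # ['Streaming']
```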
Suppressing Errors in Dropping Columns and Rows
If the DataFrame doesn’t contain the given labels, KeyError is raised.
# app.py
import pandas as pd
data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
df.drop(columns=['ABC'])
print(df)
Output
python3 app.py
Traceback (most recent call last):
KeyError: "['ABC'] not found in axis"
We can suppress this error by specifying errors='ignore' in the drop() method call.
# app.py
import pandas as pd
data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
df.drop(columns=['ABC'], errors='ignore')
print(df)
Output
python3 app.py
              Show     Streaming  Season  Main Actor
0  Stranger Things       Netflix       3      Millie
1      The X-Files            Fx      12     Gillian
2      Mandalorian   Disney Plus       1       Padro
3         The Boys  Amazon Prime       2  Karl Urban
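With errors='ignore', labels that do exist are still dropped while the missing ones are skipped. A minimal sketch mixing one existing and one missing label, using the article's sample data (the name cleaned is illustrative):

```python
import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Padro', 'Karl Urban']}
df = pd.DataFrame(data)

# 'Season' exists and is dropped; 'ABC' does not exist and is silently skipped
cleaned = df.drop(columns=['Season', 'ABC'], errors='ignore')
print(list(cleaned.columns))  # ['Show', 'Streaming', 'Main Actor']

# Without errors='ignore', the same call raises KeyError
try:
    df.drop(columns=['Season', 'ABC'])
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```

Because errors='ignore' can also hide a misspelled column name, it is worth using only when missing labels are genuinely expected.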
Conclusion
The Pandas DataFrame drop() method is a convenient way to remove unwanted columns and rows. We have seen how to drop columns by label and by position, how to combine iloc[] and loc[] with drop(), and how to suppress errors for missing labels.
See also
DataFrame.loc
Returns label-location based indexer for selection by the label.
DataFrame.dropna
Returns DataFrame with labels on given axis omitted, where (all or any) data are missing.
DataFrame.drop_duplicates
Returns DataFrame with duplicate rows removed, optionally only considering specific columns.