We can remove one or more rows from a Pandas DataFrame in several ways. For example, we can drop a single row by its index label, or pass a list of index labels to remove multiple rows at once.
How To Remove Rows In DataFrame
To remove rows from a Pandas DataFrame, use the drop() method. It removes the rows whose index labels are passed to it and, by default, returns a new DataFrame instead of modifying the original.
The syntax of DataFrame.drop() is as follows.
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
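Parameters:
labels: a single label or a list of labels to drop.
axis: 0 (or 'index') to drop rows, 1 (or 'columns') to drop columns. The default is 0.
index: row labels to drop; index=labels is equivalent to labels, axis=0.
columns: column labels to drop; columns=labels is equivalent to labels, axis=1.
level: for a MultiIndex, the level from which the labels are removed.
inplace: if True, modify the DataFrame itself and return None. The default is False, which returns a new DataFrame.
errors: 'raise' (the default) raises a KeyError for a label that does not exist; 'ignore' drops only the labels that exist.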
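The labels/axis pair and the index/columns keywords in the signature above are two spellings of the same request. Here is a minimal sketch of that equivalence. The two-row DataFrame and the variable names are made up for illustration and are not part of the original example.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the parameters
df = pd.DataFrame(
    {'Series': ['The Witcher', 'Stranger Things'],
     'Name': ['Henry Cavil', 'Millie Brown']},
    index=['a', 'b'],
)

# Rows: labels with axis=0 is equivalent to the index= keyword
rows_by_axis = df.drop('a', axis=0)
rows_by_keyword = df.drop(index='a')

# Columns: labels with axis=1 is equivalent to the columns= keyword
cols_by_axis = df.drop('Name', axis=1)
cols_by_keyword = df.drop(columns='Name')

print(rows_by_axis.equals(rows_by_keyword))  # True
print(cols_by_axis.equals(cols_by_keyword))  # True
```

The keyword forms tend to read more clearly, since the call itself says whether rows or columns are being removed.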
See the following code example.
# app.py

import pandas as pd

shows = [('The Witcher', 'Henry Cavil', 'Geralt'),
         ('Stranger Things', 'Millie Brown', 'Eleven'),
         ('BoJack Horseman', 'Will', 'BoJack'),
         ('Adventures of Sabrina', 'Kiernan Shipka', 'Spellman'),
         ('House of Cards', 'Kevin Spacey', 'Frank Underwood')]

df = pd.DataFrame(shows,
                  columns=['Series', 'Name', 'Character Name'],
                  index=['a', 'b', 'c', 'd', 'e'])

print(df)
print('------------------------------')
print("After dropping 'C indexed' row")
print('------------------------------')
print(df.drop('c'))
In the above code, we define a DataFrame with five rows and print it. Each row has an index label, so we can remove a particular row by passing its label to drop().
Here, we drop the row with index ‘c’ and print the result. Note that drop() returns a new DataFrame; df itself is left unchanged.
Output
python3 app.py
                  Series            Name   Character Name
a            The Witcher     Henry Cavil           Geralt
b        Stranger Things    Millie Brown           Eleven
c        BoJack Horseman            Will           BoJack
d  Adventures of Sabrina  Kiernan Shipka         Spellman
e         House of Cards    Kevin Spacey  Frank Underwood
------------------------------
After dropping 'C indexed' row
------------------------------
                  Series            Name   Character Name
a            The Witcher     Henry Cavil           Geralt
b        Stranger Things    Millie Brown           Eleven
d  Adventures of Sabrina  Kiernan Shipka         Spellman
e         House of Cards    Kevin Spacey  Frank Underwood
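Because drop() returns a new DataFrame by default, the original is only changed when inplace=True is passed. The following sketch shows the difference. It uses a shortened, illustrative version of the data, and names such as without_c and result are assumptions made for this example.

```python
import pandas as pd

# Shortened, illustrative version of the shows data
df = pd.DataFrame(
    {'Series': ['The Witcher', 'Stranger Things', 'BoJack Horseman']},
    index=['a', 'b', 'c'],
)

# Default behaviour: a new DataFrame is returned, df keeps all its rows
without_c = df.drop('c')
print(list(df.index))         # ['a', 'b', 'c']
print(list(without_c.index))  # ['a', 'b']

# With inplace=True, df itself is modified and the call returns None
result = df.drop('c', inplace=True)
print(result)                 # None
print(list(df.index))         # ['a', 'b']
```

Assigning the return value (df = df.drop('c')) is a common alternative to inplace=True and makes the change explicit.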
Remove Multiple rows in Pandas DataFrame
If we pass a list of index labels to the drop() function, it removes all of the matching rows.
See the following code.
# app.py

import pandas as pd

shows = [('The Witcher', 'Henry Cavil', 'Geralt'),
         ('Stranger Things', 'Millie Brown', 'Eleven'),
         ('BoJack Horseman', 'Will', 'BoJack'),
         ('Adventures of Sabrina', 'Kiernan Shipka', 'Spellman'),
         ('House of Cards', 'Kevin Spacey', 'Frank Underwood')]

df = pd.DataFrame(shows,
                  columns=['Series', 'Name', 'Character Name'],
                  index=['a', 'b', 'c', 'd', 'e'])

print(df)
print('------------------------------')
print("After dropping 'C indexed' row")
print('------------------------------')
print(df.drop(['c', 'd', 'e']))
Output
python3 app.py
                  Series            Name   Character Name
a            The Witcher     Henry Cavil           Geralt
b        Stranger Things    Millie Brown           Eleven
c        BoJack Horseman            Will           BoJack
d  Adventures of Sabrina  Kiernan Shipka         Spellman
e         House of Cards    Kevin Spacey  Frank Underwood
------------------------------
After dropping 'C indexed' row
------------------------------
            Series          Name  Character Name
a      The Witcher   Henry Cavil          Geralt
b  Stranger Things  Millie Brown          Eleven
From the output, you can see that we have removed the three rows whose index labels are c, d, and e.
So, this is one way to remove single or multiple rows from a Pandas DataFrame.
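When a list of labels is passed, drop() expects every label to exist. The errors parameter from the signature controls what happens otherwise. A minimal sketch, assuming a shortened version of the data and a label 'z' that is deliberately missing:

```python
import pandas as pd

# Shortened, illustrative version of the shows data
df = pd.DataFrame(
    {'Series': ['The Witcher', 'Stranger Things', 'BoJack Horseman']},
    index=['a', 'b', 'c'],
)

# 'z' is not an index label, so the default errors='raise' raises KeyError
raised = False
try:
    df.drop(['c', 'z'])
except KeyError as exc:
    raised = True
    print('KeyError:', exc)

# errors='ignore' drops the labels that exist and skips the missing ones
trimmed = df.drop(['c', 'z'], errors='ignore')
print(list(trimmed.index))  # ['a', 'b']
```

The default is usually the safer choice, since a misspelled label fails loudly instead of being silently skipped.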
Delete rows based on condition on a column
As in SQL, we can also remove rows based on a condition.
See the following code.
# app.py

import pandas as pd

shows = [('The Witcher', 'Henry Cavil', 'Geralt'),
         ('Stranger Things', 'Millie Brown', 'Eleven'),
         ('BoJack Horseman', 'Will', 'BoJack'),
         ('Adventures of Sabrina', 'Kiernan Shipka', 'Spellman'),
         ('House of Cards', 'Kevin Spacey', 'Frank Underwood')]

df = pd.DataFrame(shows,
                  columns=['Series', 'Name', 'Character Name'],
                  index=['a', 'b', 'c', 'd', 'e'])

print(df)
print('------------------------------')
print("After dropping 'Spellman' row")
print('------------------------------')
index = df[df['Character Name'] == 'Spellman'].index
df.drop(index, inplace=True)
print(df)
In the above code, we first get the index labels of the rows that satisfy the condition Character Name == ‘Spellman’.
index = df[df['Character Name'] == 'Spellman'].index
It gives an Index object containing the labels of the rows in which the ‘Character Name’ column has the value ‘Spellman’. Here, that is the label d.
Index(['d'], dtype='object')
Now pass this Index object to DataFrame.drop() to delete those rows:
df.drop(index, inplace=True)
It deletes all rows in which the ‘Character Name’ column has the value ‘Spellman’. Because inplace=True is set, df itself is modified and drop() returns None.
Output
python3 app.py
                  Series            Name   Character Name
a            The Witcher     Henry Cavil           Geralt
b        Stranger Things    Millie Brown           Eleven
c        BoJack Horseman            Will           BoJack
d  Adventures of Sabrina  Kiernan Shipka         Spellman
e         House of Cards    Kevin Spacey  Frank Underwood
------------------------------
After dropping 'Spellman' row
------------------------------
            Series          Name   Character Name
a      The Witcher   Henry Cavil           Geralt
b  Stranger Things  Millie Brown           Eleven
c  BoJack Horseman          Will           BoJack
e   House of Cards  Kevin Spacey  Frank Underwood
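An alternative to looking up the index and calling drop() is to keep only the rows that do not match the condition, using a boolean mask. This is a different technique from the drop() approach above, sketched here for comparison. The names filtered and dropped are assumptions made for this example.

```python
import pandas as pd

shows = [('The Witcher', 'Henry Cavil', 'Geralt'),
         ('Stranger Things', 'Millie Brown', 'Eleven'),
         ('BoJack Horseman', 'Will', 'BoJack'),
         ('Adventures of Sabrina', 'Kiernan Shipka', 'Spellman'),
         ('House of Cards', 'Kevin Spacey', 'Frank Underwood')]

df = pd.DataFrame(shows,
                  columns=['Series', 'Name', 'Character Name'],
                  index=['a', 'b', 'c', 'd', 'e'])

# Boolean mask: keep every row whose 'Character Name' is not 'Spellman'
filtered = df[df['Character Name'] != 'Spellman']

# drop() approach from the article, without inplace so df stays intact
dropped = df.drop(df[df['Character Name'] == 'Spellman'].index)

print(list(filtered.index))      # ['a', 'b', 'c', 'e']
print(filtered.equals(dropped))  # True
```

The two approaches agree here because the index labels are unique. If the index contains duplicate labels, drop() removes every row sharing a matched label, whereas the mask removes only the rows that actually satisfy the condition.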
Drop rows that match any of multiple conditions
Let’s delete all rows in which the ‘Character Name’ column has the value ‘BoJack’ or the ‘Name’ column has the value ‘Will’.
See the following code.
# app.py

import pandas as pd

shows = [('The Witcher', 'Henry Cavil', 'Geralt'),
         ('Stranger Things', 'Millie Brown', 'Eleven'),
         ('BoJack Horseman', 'Will', 'BoJack'),
         ('Adventures of Sabrina', 'Kiernan Shipka', 'Spellman'),
         ('House of Cards', 'Kevin Spacey', 'Frank Underwood')]

df = pd.DataFrame(shows,
                  columns=['Series', 'Name', 'Character Name'],
                  index=['a', 'b', 'c', 'd', 'e'])

print(df)
print('------------------------------')
print("After dropping 'BoJack' row")
print('------------------------------')
indexNames = df[(df['Character Name'] == 'BoJack') | (df['Name'] == 'Will')].index
df.drop(indexNames, inplace=True)
print(df)
Output
python3 app.py
                  Series            Name   Character Name
a            The Witcher     Henry Cavil           Geralt
b        Stranger Things    Millie Brown           Eleven
c        BoJack Horseman            Will           BoJack
d  Adventures of Sabrina  Kiernan Shipka         Spellman
e         House of Cards    Kevin Spacey  Frank Underwood
------------------------------
After dropping 'BoJack' row
------------------------------
                  Series            Name   Character Name
a            The Witcher     Henry Cavil           Geralt
b        Stranger Things    Millie Brown           Eleven
d  Adventures of Sabrina  Kiernan Shipka         Spellman
e         House of Cards    Kevin Spacey  Frank Underwood
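The OR condition above can also be written as a named mask and inverted with ~. When several values of a single column should be removed, Series.isin() avoids chaining many | terms. The sketch below shows both. The variable names and the choice of 'BoJack' and 'Spellman' for the isin() case are illustrative assumptions.

```python
import pandas as pd

shows = [('The Witcher', 'Henry Cavil', 'Geralt'),
         ('Stranger Things', 'Millie Brown', 'Eleven'),
         ('BoJack Horseman', 'Will', 'BoJack'),
         ('Adventures of Sabrina', 'Kiernan Shipka', 'Spellman'),
         ('House of Cards', 'Kevin Spacey', 'Frank Underwood')]

df = pd.DataFrame(shows,
                  columns=['Series', 'Name', 'Character Name'],
                  index=['a', 'b', 'c', 'd', 'e'])

# Rows where either condition holds; ~ inverts the mask to keep the rest
either = (df['Character Name'] == 'BoJack') | (df['Name'] == 'Will')
remaining = df[~either]
print(list(remaining.index))  # ['a', 'b', 'd', 'e']

# Several values of one column: isin() replaces a chain of | conditions
unwanted = df['Character Name'].isin(['BoJack', 'Spellman'])
without_unwanted = df.drop(df[unwanted].index)
print(list(without_unwanted.index))  # ['a', 'b', 'e']
```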
Remove rows based on multiple conditions on different columns
Let’s delete all rows in which the ‘Character Name’ column has the value ‘Eleven’ and the ‘Series’ column has the value ‘Stranger Things’.
See the following code.
# app.py

import pandas as pd

shows = [('The Witcher', 'Henry Cavil', 'Geralt'),
         ('Stranger Things', 'Millie Brown', 'Eleven'),
         ('BoJack Horseman', 'Will', 'BoJack'),
         ('Adventures of Sabrina', 'Kiernan Shipka', 'Spellman'),
         ('House of Cards', 'Kevin Spacey', 'Frank Underwood')]

df = pd.DataFrame(shows,
                  columns=['Series', 'Name', 'Character Name'],
                  index=['a', 'b', 'c', 'd', 'e'])

print(df)
print('------------------------------')
print("After dropping 'Eleven' row")
print('------------------------------')
indexNames = df[(df['Character Name'] == 'Eleven') & (df['Series'] == 'Stranger Things')].index
df.drop(indexNames, inplace=True)
print(df)
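In the above case, we combine the conditions with &, so a row is selected only when both conditions hold. Each condition must be wrapped in parentheses, because & binds more tightly than == in Python.
Rows that satisfy both conditions are removed; all other rows are kept.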
Output
python3 app.py
                  Series            Name   Character Name
a            The Witcher     Henry Cavil           Geralt
b        Stranger Things    Millie Brown           Eleven
c        BoJack Horseman            Will           BoJack
d  Adventures of Sabrina  Kiernan Shipka         Spellman
e         House of Cards    Kevin Spacey  Frank Underwood
------------------------------
After dropping 'Eleven' row
------------------------------
                  Series            Name   Character Name
a            The Witcher     Henry Cavil           Geralt
c        BoJack Horseman            Will           BoJack
d  Adventures of Sabrina  Kiernan Shipka         Spellman
e         House of Cards    Kevin Spacey  Frank Underwood
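To make the effect of & concrete, the sketch below contrasts a pair of conditions that one row satisfies together with a pair that no single row satisfies. In the second case the index is empty and drop() removes nothing. The second pair of conditions and the variable names are assumptions made for illustration.

```python
import pandas as pd

shows = [('The Witcher', 'Henry Cavil', 'Geralt'),
         ('Stranger Things', 'Millie Brown', 'Eleven'),
         ('BoJack Horseman', 'Will', 'BoJack'),
         ('Adventures of Sabrina', 'Kiernan Shipka', 'Spellman'),
         ('House of Cards', 'Kevin Spacey', 'Frank Underwood')]

df = pd.DataFrame(shows,
                  columns=['Series', 'Name', 'Character Name'],
                  index=['a', 'b', 'c', 'd', 'e'])

# Both conditions hold for row 'b', so that row is dropped
both = (df['Character Name'] == 'Eleven') & (df['Series'] == 'Stranger Things')
without_eleven = df.drop(df[both].index)
print(list(without_eleven.index))  # ['a', 'c', 'd', 'e']

# No row has 'Eleven' together with 'The Witcher', so nothing is dropped
mismatch = (df['Character Name'] == 'Eleven') & (df['Series'] == 'The Witcher')
unchanged = df.drop(df[mismatch].index)
print(len(unchanged) == len(df))   # True
```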
Conclusion
The Pandas DataFrame drop() function removes rows by their index labels. To remove rows by condition, we first select the rows for which the condition (or combination of conditions) holds, take their index, and pass that index to drop().