To compare two DataFrames in Pandas, use the .compare() method. It compares the two DataFrames element by element, checking each value against the value at the same row and column label in the other DataFrame.
This method requires the two DataFrames to have the exact same shape (same number of rows and columns) and identical row and column labels. If these conditions are not met, the method raises a ValueError.
This method is especially helpful for identifying changes between two datasets that are supposed to be similar or identical.
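The labeling requirement above can be checked before calling compare(). The following is a minimal sketch, not part of the Pandas API: the helper name can_compare and the third DataFrame df3 are assumptions made for illustration.

```python
import pandas as pd


def can_compare(left: pd.DataFrame, right: pd.DataFrame) -> bool:
    """Return True when both frames carry identical row and column labels.

    Hypothetical helper for illustration; it is not a Pandas function.
    """
    # Index.equals checks that the labels match element by element, in the same order
    return left.index.equals(right.index) and left.columns.equals(right.columns)


df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [4, 5, 7]})
df3 = pd.DataFrame({'A': [1, 2], 'B': [4, 5]})  # one row fewer than df1

print(can_compare(df1, df2))  # True: same rows and columns
print(can_compare(df1, df3))  # False: the row labels differ
```

A check like this makes it possible to report a label mismatch in your own words instead of letting compare() raise.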
Syntax
DataFrame.compare(other, align_axis=1, keep_shape=False, keep_equal=False)
Parameters
Name | Description |
other | The DataFrame to compare with. |
align_axis | The axis on which the comparison is aligned. With 0, the differences are stacked vertically, with rows drawn alternately from the two DataFrames. With 1, the default value, the differences are placed side by side, with columns drawn alternately from the two DataFrames. |
keep_shape | A boolean parameter, False by default. With False, rows and columns in which all values are equal are dropped from the result. Setting this to True keeps every row and column of the original DataFrames. |
keep_equal | A boolean parameter, False by default. With False, positions where the two DataFrames hold the same value are shown as NaN. Setting this to True shows the equal values as they are. |
Return value
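This method returns a new DataFrame that shows the differences side by side.
With the default align_axis=1, the new DataFrame has a MultiIndex in the columns: the first level holds the original column name, and the second level holds self (the value from the first DataFrame) and other (the corresponding value from the second DataFrame).
The structure of the returned DataFrame depends on the keep_shape and align_axis parameters. With align_axis=0, the self and other labels appear as an extra level of the row index instead, and with keep_shape=True the result keeps every original row and column.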
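As a rough illustration of the returned structure, the sketch below inspects the column labels of a default comparison and selects only the values that came from the first DataFrame. The sample data and the variable names column_labels and self_values are assumptions made for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [4, 5, 7]})

diff = df1.compare(df2)

# Each column label is a pair: (original column name, 'self' or 'other')
column_labels = diff.columns.tolist()
print(column_labels)

# Select only the values that came from df1 (the 'self' side)
self_values = diff.xs('self', axis=1, level=1)
print(self_values)
```

Selecting one side with xs() can be handy when only the old or only the new values are needed.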
Example 1: Comparing two DataFrames
import pandas as pd
# Creating two DataFrames for comparison
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [4, 5, 7]})
# Comparing the DataFrames
diff = df1.compare(df2)
print(diff)
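Output
Only row 2 appears in the result, because it is the only row where the values differ: column A holds 3 in df1 and 4 in df2, and column B holds 6 in df1 and 7 in df2.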
Example 2: Keeping shape and equal values
import pandas as pd
# Creating two DataFrames for comparison
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [4, 5, 7]})
# Comparing the DataFrames
diff = df1.compare(df2, keep_shape=True, keep_equal=True)
print(diff)
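Output
The result keeps all three rows. Rows 0 and 1 show the same value under self and other, and row 2 shows the differing values (3 and 4 in column A, 6 and 7 in column B).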
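To see what each flag contributes on its own, the sketch below contrasts keep_shape=True alone with keep_shape=True and keep_equal=True together. The variable names shape_only and shape_and_equal are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [4, 5, 7]})

# keep_shape=True alone: every row is kept, equal positions are shown as NaN
shape_only = df1.compare(df2, keep_shape=True)
print(shape_only)

# Both flags: every row is kept and equal values are shown as they are
shape_and_equal = df1.compare(df2, keep_shape=True, keep_equal=True)
print(shape_and_equal)
```

With keep_shape=True alone, rows 0 and 1 are still present but filled with NaN, which makes the changed row easy to spot while keeping the original layout.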
Example 3: Comparing with alignment along an index
import pandas as pd
# Creating two DataFrames for comparison
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [4, 5, 7]})
# Compare with align_axis set to 0
result = df1.compare(df2, align_axis=0)
print(result)
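Output
The result has two rows, labeled (2, self) and (2, other). They hold the values of row 2 from df1 (3 and 6) and from df2 (4 and 7), under the original columns A and B.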
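With align_axis=0, the self and other labels move into the row index. The following sketch, using the same sample data, lists the row labels and looks up a single value. The variable names row_labels and b_from_df2 are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [4, 5, 7]})

result = df1.compare(df2, align_axis=0)

# Each row label is a pair: (original row label, 'self' or 'other')
row_labels = result.index.tolist()
print(row_labels)

# Look up the value of column B in row 2 as it appears in df2
b_from_df2 = result.loc[(2, 'other'), 'B']
print(b_from_df2)
```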
Example 4: DataFrames with different shapes
import pandas as pd
# Creating two DataFrames for comparison
df1 = pd.DataFrame({'A': [1, 2], 'B': [4, 5]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [4, 5, 7]})
# Compare
result = df1.compare(df2)
print(result)
Output
ValueError: Can only compare identically-labeled (both index and columns) DataFrame objects
We got the ValueError because the two DataFrames have different shapes and therefore different row labels: df1 has rows 0 and 1, while df2 has rows 0, 1, and 2.
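One possible workaround, shown here only as a sketch, is to give the shorter DataFrame the same row labels as the longer one before comparing. This uses reindex(), which fills the missing row with NaN. Whether a missing row should count as a difference is an assumption that depends on your data.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [4, 5]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [4, 5, 7]})

# Give df1 the same row labels as df2; the missing row 2 is filled with NaN
df1_aligned = df1.reindex(df2.index)

# The labels now match, so compare() no longer raises
result = df1_aligned.compare(df2)
print(result)
```

Here the result contains only row 2, with NaN on the self side and the df2 values on the other side.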
Example 5: Comparing with non-overlapping columns
import pandas as pd
# Creating two DataFrames with non-overlapping columns
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'C': [1, 2, 4], 'D': [4, 5, 7]})
# Compare
result = df1.compare(df2)
print(result)
Output
ValueError: Can only compare identically-labeled (both index and columns) DataFrame objects
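We got the same error because the column labels differ: df1 has columns A and B, while df2 has columns C and D. The row labels are identical here, but compare() requires both the index and the columns to match.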
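The sketch below shows one way to handle this case: catch the ValueError, report which columns are not shared, and then rename the columns before comparing again. The mapping of C to A and D to B is a hypothetical assumption made for this example. Only rename columns when you know they represent the same data.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'C': [1, 2, 4], 'D': [4, 5, 7]})

try:
    result = df1.compare(df2)
except ValueError as err:
    result = None
    # Columns that appear in only one of the two DataFrames
    mismatched = df1.columns.symmetric_difference(df2.columns).tolist()
    print(f"compare() failed: {err}")
    print(f"Columns not shared by both DataFrames: {mismatched}")

# Hypothetical mapping: assumes 'C' corresponds to 'A' and 'D' to 'B'
df2_renamed = df2.rename(columns={'C': 'A', 'D': 'B'})
renamed_result = df1.compare(df2_renamed)
print(renamed_result)
```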
Krunal Lathiya is a seasoned Computer Science expert with over eight years in the tech industry. He has deep knowledge of Data Science and Machine Learning and is versed in Python, JavaScript, PHP, R, and Golang. He is skilled in frameworks like Angular and React and platforms such as Node.js, and his expertise spans both front-end and back-end development. His proficiency in the Python language stands as a testament to his versatility and commitment to the craft.
The main problem is just that you don't have brackets inside the where function. So let's see this example:
dfA['matchPrice?'] = np.where((dfA['price']) == (dfB['mrp']), 'True', 'False')
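For reference, here is a runnable sketch of the np.where pattern from the comment above. The column names price and mrp come from the comment, while the sample values are made up for illustration. Both DataFrames are assumed to have the same row labels, since comparing two Series with == requires that.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the comment
dfA = pd.DataFrame({'price': [10, 20, 30]})
dfB = pd.DataFrame({'mrp': [10, 25, 30]})

# Mark each row 'True' when the price matches the mrp, 'False' otherwise
dfA['matchPrice?'] = np.where(dfA['price'] == dfB['mrp'], 'True', 'False')
print(dfA)
```

Note that this stores the strings 'True' and 'False' rather than boolean values, as in the original comment.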