Pandas raises a “ValueError: Columns must be same length as key” error when you assign to several DataFrame columns at once and the number of column names in the key does not match the number of columns in the value being assigned.
To fix the ValueError: Columns must be same length as key error in Pandas, make sure the list of column names on the left-hand side of the assignment has the same length as the number of columns in the DataFrame or array on the right-hand side.
Why does this ValueError occur in Pandas?
- When you attempt to assign a list-like object (for example, lists, tuples, sets, numpy arrays, and pandas Series) to a list of DataFrame columns as new arrays, but the number of columns doesn’t match the second (or last) dimension (found using np.shape) of the list-like object.
- When you attempt to assign a DataFrame to a list (or pandas Series or numpy array or pandas Index) of columns, but the respective numbers of columns don’t match.
- When you attempt to replace the values of an existing column with a DataFrame (or a list-like object) whose number of columns doesn’t match the number of columns it’s replacing.
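The first two cases above can be reproduced with a short sketch. The frame, the array, and the column names a, b, c, x, and y are made up for illustration; the only point is that the key lists three names while each value has two columns. The exact message text can vary between pandas versions, so the sketch only collects whatever ValueError is raised.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"column1": [11, 21, 19]})

# Hypothetical values to assign; both have 3 rows and 2 columns.
arr = np.array([[1, 2], [3, 4], [5, 6]])
other = pd.DataFrame(arr, columns=["x", "y"])

messages = []
for value in (arr, other):
    try:
        # Three column names in the key, but only two columns in the value.
        df[["a", "b", "c"]] = value
    except ValueError as exc:
        messages.append(str(exc))

print(messages)
```

Both assignments fail, one for the numpy array and one for the DataFrame.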
Python code that generates the error
import pandas as pd
list1 = [11, 21, 19]
list2 = [[46, 51, 61], [71, 81, 91]]
df1 = pd.DataFrame(list1, columns=['column1'])
df2 = pd.DataFrame(list2, columns=['column2', 'column3', 'column4'])
df1[['a']] = df2
Output
ValueError: Columns must be same length as key
In the above code example, pandas raised a ValueError: Columns must be same length as key error because the key ['a'] names only one column, while df2 has three columns. The number of rows in df1 plays no part in this error.
How to fix it?
Code that fixes the error
When you assign to a list of columns, pandas requires the number of column names in the key to match the number of columns in the value being assigned.
import pandas as pd
list1 = [11, 21, 19]
list2 = [[46, 51, 61], [71, 81, 91]]
df1 = pd.DataFrame(list1, columns=['column1'])
# Repeat df1 len(list2) times so it has 6 rows (optional; this is not what prevents the error)
df1 = pd.concat([df1] * len(list2), ignore_index=True)
df2 = pd.DataFrame(list2, columns=['column2','column3','column4'])
df1[['column2', 'column3', 'column4']] = df2
print(df1)
Output
   column1  column2  column3  column4
0       11     46.0     51.0     61.0
1       21     71.0     81.0     91.0
2       19      NaN      NaN      NaN
3       11      NaN      NaN      NaN
4       21      NaN      NaN      NaN
5       19      NaN      NaN      NaN
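In this code example, df1 is concatenated with itself len(list2) times, which gives it six rows, and the three columns of df2 are then assigned to three new column names in df1. Because the key lists three names and df2 has three columns, the ValueError is not raised. The concatenation only changes the number of rows; the matching key length is what avoids the error.
Pandas aligns the assigned DataFrame on the index, so rows of df1 whose index labels do not exist in df2 (here, rows 2 through 5) are filled with NaN.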
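If the extra rows are not needed, the concatenation can be skipped. The sketch below reuses the same data and shows two variants under that assumption: listing as many column names as df2 has columns, or assigning a single column of df2 to a single name. The variable names wide and narrow are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([11, 21, 19], columns=["column1"])
df2 = pd.DataFrame([[46, 51, 61], [71, 81, 91]],
                   columns=["column2", "column3", "column4"])

# Option 1: list as many column names as df2 has columns.
wide = df1.copy()
wide[list(df2.columns)] = df2

# Option 2: assign a single column (a Series) to a single name.
narrow = df1.copy()
narrow["a"] = df2["column2"]

print(wide)
print(narrow)
```

In both variants df1 keeps its three rows, and the last row is NaN in the new columns because df2 has no row with index 2.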
You can also check the shape of the object you’re trying to assign to the DataFrame columns using np.shape.
The second (or the last) dimension must match the number of columns you’re trying to assign to. For example, if you try to assign a 2-column numpy array to 3 columns, you’ll see the ValueError.
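One way to apply this check before assigning is sketched below. The helper columns_match and the sample names are hypothetical and are not part of pandas; the helper simply compares the last dimension reported by np.shape with the number of column names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"column1": [11, 21, 19]})
new_columns = ["a", "b", "c"]
values = np.array([[1, 2], [3, 4], [5, 6]])  # shape (3, 2)


def columns_match(key, value):
    """Hypothetical helper: True if the last dimension of value equals len(key)."""
    return np.shape(value)[-1] == len(key)


if columns_match(new_columns, values):
    df[new_columns] = values
else:
    print(f"Cannot assign shape {np.shape(values)} to {len(new_columns)} columns")

# Dropping one name makes the key length match the array's last dimension.
df[new_columns[:2]] = values
print(df)
```

Here the check fails for three names, so the first assignment is skipped, and the assignment to two names succeeds.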
I hope this article helped you resolve your error.