How to Fix ValueError: columns must be same length as key in Pandas

Pandas raises “ValueError: Columns must be same length as key” when you assign to several DataFrame columns at once and the number of column labels in the key does not match the number of columns in the value being assigned.

To fix the error, make sure the number of column labels on the left-hand side of the assignment equals the number of columns (the last dimension) of the object on the right-hand side.

Why does this ValueError occur in Pandas?

  1. When you attempt to assign a list-like object (for example, a list, tuple, numpy array, or pandas Series) to a list of DataFrame columns, but the number of columns doesn’t match the second (or last) dimension of the list-like object, as reported by np.shape.
  2. When you attempt to assign a DataFrame to a list (or pandas Series or numpy array or pandas Index) of columns but the respective numbers of columns don’t match.
  3. When you attempt to replace the values of an existing column with a DataFrame (or a list-like object) whose number of columns doesn’t match the number of columns it’s replacing.
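
The three cases above can be reproduced with a small sketch. The frame names (`base`, `wide`) and the helper `try_assign` are made up for illustration, and the exact wording of the message may differ between pandas versions, so the sketch only relies on a ValueError being raised.

```python
import numpy as np
import pandas as pd

base = pd.DataFrame({"column1": [11, 21, 19]})
wide = pd.DataFrame(
    [[46, 51, 61], [71, 81, 91]],
    columns=["column2", "column3", "column4"],
)


def try_assign(key, value):
    """Hypothetical helper: attempt frame[key] = value on a copy of `base`.

    Returns the ValueError raised by pandas, or None if the assignment worked.
    """
    frame = base.copy()
    try:
        frame[key] = value
    except ValueError as exc:
        return exc
    return None


# Case 1: three target columns, but the array's last dimension is 2.
case_1 = try_assign(["a", "b", "c"], np.zeros((3, 2)))

# Case 2: one label in the key list, but the DataFrame has 3 columns.
case_2 = try_assign(["a"], wide)

# Case 3: replacing the single existing column with a 3-column DataFrame.
case_3 = try_assign("column1", wide)

for name, err in [("case 1", case_1), ("case 2", case_2), ("case 3", case_3)]:
    print(name, "->", type(err).__name__, err)
```

In each case the count on the left (number of target columns) differs from the count on the right (number of columns in the value), which is the condition pandas checks.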

Python code that generates the error

import pandas as pd

list1 = [11, 21, 19]
list2 = [[46, 51, 61], [71, 81, 91]]

df1 = pd.DataFrame(list1, columns=['column1'])
df2 = pd.DataFrame(list2, columns=['column2', 'column3', 'column4'])

df1[['a']] = df2

Output

ValueError: Columns must be same length as key

In the above code example, pandas raised ValueError: Columns must be same length as key because the key on the left-hand side lists one column (['a']), while df2 on the right-hand side has 3 columns. The number of rows in df1 plays no part in this check.

How to fix it?

Code that fixes the error

When you assign to a list of columns, pandas requires the list to contain exactly as many labels as the assigned object has columns.

import pandas as pd

list1 = [11, 21, 19]
list2 = [[46, 51, 61], [71, 81, 91]]

df1 = pd.DataFrame(list1, columns=['column1'])

# Optional: repeat df1's rows len(list2) times (3 rows become 6). The assignment below works without this step.
df1 = pd.concat([df1] * len(list2), ignore_index=True)

df2 = pd.DataFrame(list2, columns=['column2','column3','column4'])

df1[['column2', 'column3', 'column4']] = df2

print(df1)

Output

   column1  column2  column3  column4
0       11     46.0     51.0     61.0
1       21     71.0     81.0     91.0
2       19      NaN      NaN      NaN
3       11      NaN      NaN      NaN
4       21      NaN      NaN      NaN
5       19      NaN      NaN      NaN

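In this code example, df1 is first concatenated with itself len(list2) times, which turns its 3 rows into 6, and then the three columns of df2 are assigned to three new column labels. The assignment succeeds because the key lists three labels and df2 has three columns. The row repetition changes the shape of the result, but matching the number of labels to the number of columns is what prevents the ValueError.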
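
As a minimal sketch of the fix on its own, without repeating rows, the example below shows two ways to make both sides agree. The names `option_a` and `option_b` are illustrative and not part of the original code.

```python
import pandas as pd

df1 = pd.DataFrame([11, 21, 19], columns=["column1"])
df2 = pd.DataFrame(
    [[46, 51, 61], [71, 81, 91]],
    columns=["column2", "column3", "column4"],
)

# Option A: give the key as many labels as df2 has columns (3 and 3).
option_a = df1.copy()
option_a[list(df2.columns)] = df2

# Option B: keep a single target column and pick one column from df2.
option_b = df1.copy()
option_b["a"] = df2["column2"]

print(option_a)
print(option_b)
```

Option A suits the case where every column of df2 is wanted. Option B suits the case where only one new column was intended, as in the failing df1[['a']] = df2 line.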

Pandas aligns the assigned values on the row index, so rows of df1 that have no matching index label in df2 (rows 2 to 5 here) are filled with NaN.
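
The sketch below illustrates where the NaN values come from, assuming small made-up frames (`left`, `shifted`). It contrasts assignment by index label with assignment by position through to_numpy(), which only works when the row counts are equal.

```python
import pandas as pd

left = pd.DataFrame({"column1": [11, 21, 19]})  # index labels 0, 1, 2

# Same number of rows as `left`, but different index labels.
shifted = pd.DataFrame({"column2": [46, 71, 81]}, index=[10, 11, 12])

# Assigning a Series aligns on index labels: no labels match, so all NaN.
by_label = left.copy()
by_label["column2"] = shifted["column2"]

# Assigning a plain numpy array ignores labels and fills by position.
by_position = left.copy()
by_position["column2"] = shifted["column2"].to_numpy()

print(by_label)
print(by_position)
```

If the NaN values are unexpected, comparing the index of both frames is usually the first thing to check.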

You can also check the shape of the object you’re assigning to the DataFrame columns with np.shape.

The second (or the last) dimension must match the number of columns you’re trying to assign to. For example, if you try to assign a 2-column numpy array to 3 columns, you’ll see the ValueError.
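
A pre-check along these lines can be sketched as follows. The helper `columns_match` is hypothetical, not a pandas or numpy function, and it is meant for list-like values only (scalars are broadcast by pandas and do not need this check).

```python
import numpy as np
import pandas as pd


def columns_match(keys, value):
    """Hypothetical helper: does the last dimension of `value` equal len(keys)?"""
    shape = np.shape(value)
    return len(shape) > 0 and shape[-1] == len(keys)


df = pd.DataFrame({"column1": [11, 21, 19]})
two_cols = np.array([[1, 2], [3, 4], [5, 6]])  # np.shape gives (3, 2)

# A 2-column array cannot fill 3 target columns.
bad_keys = ["a", "b", "c"]
bad_ok = columns_match(bad_keys, two_cols)

# With 2 target columns the shapes agree, so the assignment is safe.
good_keys = ["a", "b"]
good_ok = columns_match(good_keys, two_cols)
if good_ok:
    df[good_keys] = two_cols

print(bad_ok, good_ok)
print(df)
```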

I hope this article helped you resolve your error.
