Python Pandas: How To Iterate Columns In DataFrame


A Pandas DataFrame is a two-dimensional, size-mutable tabular data structure with labeled axes (rows and columns). Its columns can hold different data types. In this tutorial, we will see different ways to iterate over all or specific columns of a DataFrame.

Pandas iterate over columns

A Pandas DataFrame consists of rows and columns, and iterating over it works much like iterating over a dictionary: just as a loop over a dictionary yields its keys, a loop over a DataFrame yields its column labels.

In a Pandas DataFrame, we can iterate in two ways:

  1. Iterating over rows
  2. Iterating over columns

Let’s create a DataFrame first.

# app.py

import pandas as pd

tigerking = [
    ('Joe', 'Exotic', 7),
    ('Carole', 'Baskin', 7),
    ('Howard', 'Baskin', 6),
    ('John', 'Finlay', 6),
    ('Bhagavan', 'Antle', 6),
]
df = pd.DataFrame(tigerking, columns=['First Name', 'Last Name', 'Total Episodes'])
print(df)

Output

python3 app.py
  First Name Last Name  Total Episodes
0        Joe    Exotic               7
1     Carole    Baskin               7
2     Howard    Baskin               6
3       John    Finlay               6
4   Bhagavan     Antle               6
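
The dictionary analogy mentioned earlier can be checked directly. The following is a minimal sketch on a shortened copy of the DataFrame above; the variable name labels is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [('Joe', 'Exotic', 7), ('Carole', 'Baskin', 7)],
    columns=['First Name', 'Last Name', 'Total Episodes'],
)

# Iterating a DataFrame yields its column labels, like iterating dict keys.
labels = list(df)
print(labels)

# Membership tests also look at column labels, not at cell values.
print('Last Name' in df)   # a column label
print('Baskin' in df)      # a cell value, not a label
```

The first membership test is true and the second is false, because 'Baskin' appears only as data inside a column.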

Iterate columns in Dataframe using column names

DataFrame.columns returns an Index holding the column names. Iterating over the DataFrame itself yields the same names.

We can iterate over these column names, and for each one, select the column contents by name.

See the following code.

# app.py

import pandas as pd

tigerking = [
    ('Joe', 'Exotic', 7),
    ('Carole', 'Baskin', 7),
    ('Howard', 'Baskin', 6),
    ('John', 'Finlay', 6),
    ('Bhagavan', 'Antle', 6),
]
df = pd.DataFrame(tigerking,
                  columns=['First Name', 'Last Name', 'Total Episodes'])

for column in df:
    # Select column contents by column name using [] operator
    columnObj = df[column]
    print('Column Name : ', column)
    print('Column Contents : ', columnObj.values)

Output

python3 app.py
Column Name :  First Name
Column Contents :  ['Joe' 'Carole' 'Howard' 'John' 'Bhagavan']
Column Name :  Last Name
Column Contents :  ['Exotic' 'Baskin' 'Baskin' 'Finlay' 'Antle']
Column Name :  Total Episodes
Column Contents :  [7 7 6 6 6]

If you analyze the output, you can see that for each column we first get the column name and then the contents of the column as a NumPy array (the values attribute of a Series returns an array, not a Python list).

This approach needs no method call to iterate the columns. The next approach uses a DataFrame method instead.
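
The column contents printed by the loop above come from the values attribute. As a small sketch of the difference between that array and a plain Python list, the snippet below converts one column both ways; the names as_array and as_list are illustrative only.

```python
import numpy as np
import pandas as pd

tigerking = [
    ('Joe', 'Exotic', 7),
    ('Carole', 'Baskin', 7),
    ('Howard', 'Baskin', 6),
    ('John', 'Finlay', 6),
    ('Bhagavan', 'Antle', 6),
]
df = pd.DataFrame(tigerking,
                  columns=['First Name', 'Last Name', 'Total Episodes'])

episodes = df['Total Episodes']

# .values gives a NumPy array, which prints without commas.
as_array = episodes.values
print(type(as_array).__name__, as_array)

# .tolist() gives an ordinary Python list of ints.
as_list = episodes.tolist()
print(type(as_list).__name__, as_list)
```

Use tolist() when the rest of the code expects a real list, for example when serializing to JSON.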

How to Iterate columns using DataFrame.items()

The DataFrame class provides a member function items(). It returns an iterator over (column name, Series) pairs, one per column, so each step of the loop gives the column name together with the column contents as a Series.

Older Pandas versions offered the same behavior under the name iteritems(). That method was deprecated in Pandas 1.5 and removed in Pandas 2.0, so use items() in current code.

Let’s apply the Pandas DataFrame items() function.

# app.py

import pandas as pd

tigerking = [
    ('Joe', 'Exotic', 7),
    ('Carole', 'Baskin', 7),
    ('Howard', 'Baskin', 6),
    ('John', 'Finlay', 6),
    ('Bhagavan', 'Antle', 6),
]
df = pd.DataFrame(tigerking,
                  columns=['First Name', 'Last Name', 'Total Episodes'])

# items() replaces iteritems(), which was removed in Pandas 2.0
for columnName, columnData in df.items():
    print('Column Name : ', columnName)
    print('Column Values : ', columnData.values)

Output

Column Name :  First Name
Column Values :  ['Joe' 'Carole' 'Howard' 'John' 'Bhagavan']
Column Name :  Last Name
Column Values :  ['Exotic' 'Baskin' 'Baskin' 'Finlay' 'Antle']
Column Name :  Total Episodes
Column Values :  [7 7 6 6 6]

Here, each iteration gives us the column name first and then the values of that column as a NumPy array.
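
Because each column arrives as a Series, Series methods can be called inside the loop. The sketch below is one possible use, not part of the original example: it counts the distinct values per column with nunique() and stores them in a dictionary named summary (an illustrative name).

```python
import pandas as pd

tigerking = [
    ('Joe', 'Exotic', 7),
    ('Carole', 'Baskin', 7),
    ('Howard', 'Baskin', 6),
    ('John', 'Finlay', 6),
    ('Bhagavan', 'Antle', 6),
]
df = pd.DataFrame(tigerking,
                  columns=['First Name', 'Last Name', 'Total Episodes'])

# Map each column name to its number of distinct values.
summary = {}
for name, series in df.items():
    summary[name] = series.nunique()

print(summary)
```

For this data the counts are 5 first names, 4 last names ('Baskin' appears twice) and 2 episode totals.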

How to Iterate specific columns in DataFrame

Let’s say we only need some of the columns of a DataFrame. We can select those columns first by passing a list of column names to the [] operator, and then iterate over the result.

# app.py

import pandas as pd

tigerking = [
    ('Joe', 'Exotic', 7),
    ('Carole', 'Baskin', 7),
    ('Howard', 'Baskin', 6),
    ('John', 'Finlay', 6),
    ('Bhagavan', 'Antle', 6),
]
df = pd.DataFrame(tigerking,
                  columns=['First Name', 'Last Name', 'Total Episodes'])

for column in df[['First Name', 'Total Episodes']]:
    # Select column contents by column name using [] operator
    columnsObj = df[column]
    print('Column Name : ', column)
    print('Column Contents : ', columnsObj.values)

Output

python3 app.py
Column Name :  First Name
Column Contents :  ['Joe' 'Carole' 'Howard' 'John' 'Bhagavan']
Column Name :  Total Episodes
Column Contents :  [7 7 6 6 6]

We have selected two columns, and in the output, we got the two columns with their values.
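
Columns can also be chosen by data type rather than by name. The following sketch assumes we only want the numeric columns; it uses select_dtypes() to pick them and then sums each one. The names numeric and totals are illustrative.

```python
import pandas as pd

tigerking = [
    ('Joe', 'Exotic', 7),
    ('Carole', 'Baskin', 7),
    ('Howard', 'Baskin', 6),
    ('John', 'Finlay', 6),
    ('Bhagavan', 'Antle', 6),
]
df = pd.DataFrame(tigerking,
                  columns=['First Name', 'Last Name', 'Total Episodes'])

# Keep only the columns with a numeric dtype.
numeric = df.select_dtypes(include='number')

# Iterate the remaining columns and total each one.
totals = {}
for name, series in numeric.items():
    totals[name] = int(series.sum())

print(totals)
```

Only 'Total Episodes' survives the selection here, so the loop body runs once.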

Iterate columns in dataframe by index using iloc[]

We can iterate over the columns of the DataFrame by their integer position.

For example, we can iterate over the range from 0 up to, but not including, the number of columns (df.shape[1]); then, for each position, we can select the column contents using iloc[].

See the following code.

# app.py

import pandas as pd

tigerking = [
    ('Joe', 'Exotic', 7),
    ('Carole', 'Baskin', 7),
    ('Howard', 'Baskin', 6),
    ('John', 'Finlay', 6),
    ('Bhagavan', 'Antle', 6),
]
df = pd.DataFrame(tigerking,
                  columns=['First Name', 'Last Name', 'Total Episodes'])

for index in range(df.shape[1]):
    print('Column Number : ', index)
    # Select column by index position using iloc[]
    columnsObj = df.iloc[:, index]
    print('Column Contents : ', columnsObj.values)

Output

python3 app.py
Column Number :  0
Column Contents :  ['Joe' 'Carole' 'Howard' 'John' 'Bhagavan']
Column Number :  1
Column Contents :  ['Exotic' 'Baskin' 'Baskin' 'Finlay' 'Antle']
Column Number :  2
Column Contents :  [7 7 6 6 6]

In the above code, we print the position of each column instead of its name, followed by the content of the column.
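
If both the position and the name are needed, one option is to enumerate df.columns and still select with iloc[]. This is a sketch of that variation, with pairs as an illustrative name for the collected results.

```python
import pandas as pd

tigerking = [
    ('Joe', 'Exotic', 7),
    ('Carole', 'Baskin', 7),
    ('Howard', 'Baskin', 6),
    ('John', 'Finlay', 6),
    ('Bhagavan', 'Antle', 6),
]
df = pd.DataFrame(tigerking,
                  columns=['First Name', 'Last Name', 'Total Episodes'])

# enumerate() pairs each column name with its integer position.
pairs = []
for position, name in enumerate(df.columns):
    column = df.iloc[:, position]   # select by position, as before
    pairs.append((position, name, column.tolist()))
    print(position, name, column.tolist())
```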

That concludes the examples of iterating over columns in Pandas.

