AppDividend
Latest Code Tutorials

Pandas Set Index Example | Python DataFrame.set_index() Tutorial


This tutorial covers the pandas DataFrame.set_index() method. set_index() sets the DataFrame index (row labels) using one or more existing columns, or arrays such as a Series, Index, or NumPy array of the same length as the DataFrame. A pandas DataFrame is a two-dimensional labeled data structure whose columns can hold different types. You can think of it as an in-memory representation of a spreadsheet, such as an Excel sheet, in Python.

A pandas Index is an immutable, array-like object that holds the row labels. Setting a meaningful index lets us access rows by label instead of by position.
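As a minimal sketch of label-based access, the snippet below builds a small DataFrame with a labeled index and reads a row and a cell with .loc. The athlete names and values are made up for illustration and are not taken from the tutorial's data file.

```python
import pandas as pd

# Hypothetical toy data; names and values are invented for illustration.
df = pd.DataFrame(
    {"sport": ["Swimming", "Athletics"], "medal": ["Gold", "Silver"]},
    index=["Alice", "Bob"],
)

# Select a whole row by its index label.
row = df.loc["Bob"]

# Select a single cell by row label and column label.
cell = df.loc["Alice", "medal"]

print(row["sport"])  # Athletics
print(cell)          # Gold
```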


Pandas Set Index Example

The syntax of set_index() is as follows.

DataFrame.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False)

Set the DataFrame index (row labels) using one or more existing columns. By default, it returns a new DataFrame and leaves the original unchanged.

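keys: A column label, an array of the same length as the DataFrame, or a list of column labels and/or arrays.
drop: Boolean, default True. If True, the columns used as the new index are removed from the DataFrame's columns.
append: Boolean, default False. If True, the new index is appended to the existing index instead of replacing it.
inplace: Boolean, default False. If True, the DataFrame is modified in place and the method returns None.
verify_integrity: Boolean, default False. If True, the new index is checked for duplicates and an error is raised if any are found.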
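The following sketch shows what drop, append, and verify_integrity do on a tiny DataFrame. The column names id and city and their values are hypothetical and chosen only so that id contains a duplicate.

```python
import pandas as pd

# Hypothetical toy data; 'id' deliberately contains a duplicate value.
df = pd.DataFrame({"id": [1, 2, 2], "city": ["Oslo", "Rome", "Lima"]})

# drop=False keeps 'id' as a regular column in addition to using it as the index.
kept = df.set_index("id", drop=False)

# append=True adds 'city' as an extra level next to the existing index.
appended = df.set_index("city", append=True)

# verify_integrity=True raises ValueError because 'id' has duplicate values.
try:
    df.set_index("id", verify_integrity=True)
    duplicate_detected = False
except ValueError:
    duplicate_detected = True

print(list(kept.columns))       # ['id', 'city']
print(appended.index.nlevels)   # 2
print(duplicate_detected)       # True
```

Leaving verify_integrity at its default of False lets duplicate labels through silently, so turning it on can be useful when the index is expected to be unique.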

We will use real data, which is available in the following Google Sheets document.

https://docs.google.com/spreadsheets/d/1zeeZQzFoHE2j_ZrqDkVJK9eF7OH1yvg75c8S-aBcxaU/edit#gid=0

Now, open the Jupyter Notebook and import the Pandas Library first.

Write the following code inside the first cell in Jupyter Notebook.

import pandas as pd

Run the cell by hitting Ctrl + Enter.

Next, we will use the pandas read_csv() function to load the data into a DataFrame (this assumes the sheet has been saved as data.csv in the notebook's working directory). Write the following code in the next cell.

data = pd.read_csv('data.csv', skiprows=4)
data

Here, read_csv() skips the first four rows of the file and loads the rest, and the last line of the cell displays the resulting DataFrame. Run the cell and see the output. Because the file contains more than 29,000 rows, the notebook shows a truncated view: only the first and last rows appear (30 of each in the output here; the exact number depends on the pandas version and display settings).
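To see the effect of skiprows without downloading the file, here is a small self-contained sketch that reads CSV text from memory. The preamble lines, column names, and rows are hypothetical stand-ins for the real data file.

```python
import io
import pandas as pd

# Hypothetical file content: four preamble lines, then the header row and data.
raw = (
    "Example medal data\n"
    "Source: made up for illustration\n"
    "Preamble line three\n"
    "Preamble line four\n"
    "Athlete,Sport,Medal\n"
    "Alice,Swimming,Gold\n"
    "Bob,Athletics,Silver\n"
)

# skiprows=4 discards the first four lines, so the fifth line becomes the header.
sample = pd.read_csv(io.StringIO(raw), skiprows=4)

print(list(sample.columns))  # ['Athlete', 'Sport', 'Medal']
print(len(sample))           # 2
```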


If you get the above output, then you have successfully imported the data.

Now, let's look at the type of the index object. In the next cell, type the following code.

type(data.index)

The output shows that the index is an object with its own type. For a DataFrame loaded without an explicit index column, this is normally a RangeIndex.

Remember that an Index is immutable: its individual labels cannot be modified in place. To change the index, you replace it as a whole, for example with set_index().
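A short sketch of what immutability means in practice: assigning to one element of an Index fails with a TypeError, while swapping in a whole new index is allowed. The data here is hypothetical.

```python
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({"medal": ["Gold", "Silver"]}, index=["Alice", "Bob"])

# Assigning to a single element of the Index raises TypeError.
try:
    df.index[0] = "Carol"
    element_assignment_failed = False
except TypeError:
    element_assignment_failed = True

# Replacing the entire index with a new set of labels works.
df.index = ["Carol", "Bob"]

print(element_assignment_failed)  # True
print(list(df.index))             # ['Carol', 'Bob']
```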

#Pandas DataFrame set_index()

Now, we will set an index for the Python DataFrame using the set_index() method.

There are two ways to set the DataFrame index.

  1. Pass inplace=True to modify the current DataFrame directly.
  2. Assign the DataFrame returned by set_index() to a variable and use that variable from then on.

Let's start with the first way and use the Athlete column as the index.

Write the following code in the next cell. With inplace=True the call returns nothing, so the cell itself shows no output.

data.set_index('Athlete', inplace=True)

Run the cell and now display the DataFrame using the following code in the next cell.

data

In the output, the DataFrame is now indexed by athlete name.

Here, passing inplace=True means that set_index() modifies the current DataFrame directly instead of returning a new one.
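The in-place behavior can be checked on a small example. This sketch uses a hypothetical two-row DataFrame rather than the tutorial's data file.

```python
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({"Athlete": ["Alice", "Bob"], "Medal": ["Gold", "Silver"]})

# With inplace=True the method returns None and changes df itself.
result = df.set_index("Athlete", inplace=True)

print(result)            # None
print(df.index.name)     # Athlete
print(list(df.columns))  # ['Medal']
```

Because the call returns None, writing df = df.set_index("Athlete", inplace=True) would overwrite df with None, which is a common mistake.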

#Reset Index in Pandas DataFrame

We can use the reset_index() method to reset the index. Let's see the following code.

data.reset_index(inplace=True)
data

In the output, Athlete is a regular column again and the index is back to the default integer range.
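As a small sketch of the round trip, the code below sets an index and resets it again, and also shows the drop=True option of reset_index(), which discards the old index instead of restoring it as a column. The data is hypothetical.

```python
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({"Athlete": ["Alice", "Bob"], "Medal": ["Gold", "Silver"]})
indexed = df.set_index("Athlete")

# reset_index() moves 'Athlete' back into the columns and restores 0..n-1.
restored = indexed.reset_index()

# drop=True throws the old index away instead of turning it into a column.
dropped = indexed.reset_index(drop=True)

print(list(restored.columns))  # ['Athlete', 'Medal']
print(list(restored.index))    # [0, 1]
print(list(dropped.columns))   # ['Medal']
```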

Now, see the second way to use the set_index() method.

Write the following code in the next cell.

indexedData = data.set_index('Athlete')
indexedData

Here, we have not passed the inplace parameter. Instead, we have saved the returned DataFrame in another variable, indexedData, and displayed it in the Jupyter Notebook. The original data DataFrame is left unchanged.
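The sketch below illustrates this second approach on hypothetical data: the returned DataFrame carries the new index, while the original keeps Athlete as an ordinary column.

```python
import pandas as pd

# Hypothetical toy data for illustration.
data = pd.DataFrame({"Athlete": ["Alice", "Bob"], "Medal": ["Gold", "Silver"]})

# Without inplace=True, set_index() returns a new DataFrame.
indexed_data = data.set_index("Athlete")

print("Athlete" in data.columns)         # True
print(indexed_data.index.name)           # Athlete
print(indexed_data.loc["Bob", "Medal"])  # Silver
```

Keeping the original untouched makes it easy to try different index columns without reloading the data.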

So, in this tutorial, we have seen both ways to use a column as the index, and how to reset that index with the reset_index() method.

#Other Examples of Python Set Index

Python is a great language for data analysis, largely because of its ecosystem of data-centric Python packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.

The set_index() method sets the index of a DataFrame from existing columns or from arrays such as a Series or Index. The index can also be set when the DataFrame is created. But sometimes a DataFrame is built from two or more DataFrames, and in that case the index can be changed afterwards with this method.

>>> df = pd.DataFrame({'month': [1, 4, 7, 10],
...                    'year': [2012, 2014, 2013, 2014],
...                    'sale': [55, 40, 84, 31]})
>>> df
   month  year  sale
0      1  2012    55
1      4  2014    40
2      7  2013    84
3     10  2014    31

Set the index to become the ‘month’ column:

>>> df.set_index('month')
       year  sale
month
1      2012    55
4      2014    40
7      2013    84
10     2014    31

Create a MultiIndex using columns ‘year’ and ‘month’:

>>> df.set_index(['year', 'month'])
            sale
year  month
2012  1     55
2014  4     40
2013  7     84
2014  10    31

Create a MultiIndex using an Index and a column:

>>> df.set_index([pd.Index([1, 2, 3, 4]), 'year'])
         month  sale
   year
1  2012  1      55
2  2014  4      40
3  2013  7      84
4  2014  10     31

Create a MultiIndex using two Series:

>>> s = pd.Series([1, 2, 3, 4])
>>> df.set_index([s, s**2])
      month  year  sale
1 1       1  2012    55
2 4       4  2014    40
3 9       7  2013    84
4 16     10  2014    31
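Once a MultiIndex is in place, rows can be looked up by one or more levels. The sketch below reuses the df from the examples above; the sort_index() call and the variable names multi, single, and year_2014 are additions for illustration, with sorting included as a common precaution before MultiIndex lookups.

```python
import pandas as pd

df = pd.DataFrame({'month': [1, 4, 7, 10],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]})

# Build the (year, month) MultiIndex and sort it before doing lookups.
multi = df.set_index(['year', 'month']).sort_index()

# A full (year, month) tuple selects a single row.
single = multi.loc[(2013, 7), 'sale']

# A label from the outer level selects every row for that year.
year_2014 = multi.loc[2014]

print(single)                      # 84
print(year_2014['sale'].tolist())  # [40, 31]
```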

That concludes the pandas DataFrame.set_index() tutorial.
