Pandas DataFrame

A Pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). A DataFrame can be built from the following kinds of data.

  1. A Pandas Series: a one-dimensional labeled array capable of holding any data type. A single column of a DataFrame is a Series.
  2. A structured or record NumPy ndarray.
  3. A two-dimensional NumPy ndarray.
  4. A dictionary of one-dimensional ndarrays, lists, dictionaries, or Series.
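
As a rough illustration of the input kinds listed above, the sketch below builds a small DataFrame from each one. The variable names and sample values are made up for this example.

```python
import numpy as np
import pandas as pd

# 1. From a Series: the Series becomes a single column named after it.
ages = pd.Series([18, 19, 21], name='age')
from_series = pd.DataFrame(ages)

# 2. From a structured ndarray: field names become column labels.
records = np.array([(18, 'krunal'), (19, 'ankit')],
                   dtype=[('age', 'i4'), ('name', 'U10')])
from_records = pd.DataFrame(records)

# 3. From a two-dimensional ndarray: columns get default integer labels.
from_2d = pd.DataFrame(np.array([[1, 2], [3, 4]]))

# 4. From a dictionary of lists: keys become column labels.
from_dict = pd.DataFrame({'age': [18, 19], 'name': ['krunal', 'ankit']})

for frame in (from_series, from_records, from_2d, from_dict):
    print(frame.shape, list(frame.columns))
```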

How to Create a Pandas DataFrame

To create a Pandas DataFrame, you can use the “pandas.DataFrame()” constructor.

pandas.DataFrame(data, index, columns, dtype, copy)

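The data parameter accepts an ndarray, Series, list, dict, scalar constant, or another DataFrame.

The index parameter sets the row labels of the resulting frame. It is optional; if no index is passed and the data carries none, pandas uses the default integer labels 0 to n-1.

The columns parameter sets the column labels. It is also optional; if no column labels are passed and the data carries none, pandas uses the default integer labels 0 to n-1.

The dtype parameter forces a single data type on the frame. If it is omitted, pandas infers the type of each column.

The copy parameter controls whether the input data is copied. Its default depends on the pandas version and on the type of data passed.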
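
To make the parameters more concrete, here is a minimal sketch that passes index, columns, and dtype explicitly and then compares the result with the defaults. The sample values and labels are hypothetical; only the keyword names come from the constructor.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[18, 160], [19, 172]],
    index=['a', 'b'],              # row labels
    columns=['age', 'height'],     # column labels
    dtype='float64',               # force one dtype on every column
)
print(df)
print(df.dtypes)

# Without index/columns, pandas falls back to 0, 1, 2, ... labels.
default = pd.DataFrame([[18, 160], [19, 172]])
print(list(default.index), list(default.columns))
```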

Create an Empty DataFrame in Python

To create an empty DataFrame in Pandas, call the “pd.DataFrame()” constructor without any arguments.

import pandas as pd
df1 = pd.DataFrame()

print(df1)
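
One way to confirm that a frame really is empty is to check its empty attribute and its shape. The sketch below also shows an empty frame that already has column labels; the names 'name' and 'age' are made up for illustration.

```python
import pandas as pd

df_empty = pd.DataFrame()

print(df_empty.empty)   # True: the frame has no elements
print(df_empty.shape)   # (0, 0): zero rows, zero columns

# Columns can be declared up front and rows filled in later.
df_cols = pd.DataFrame(columns=['name', 'age'])
print(df_cols.shape)    # (0, 2)
```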


Create DataFrame from ndarrays

import pandas as pd
import numpy as np

data = np.array([18, 19, 21])
df1 = pd.DataFrame(data, index=[1, 2, 3])
print(df1)

In the above example, we created a one-dimensional NumPy ndarray and passed it to the DataFrame constructor together with custom row labels.
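
Because no column labels were given, the single column gets the default label 0. The following sketch shows that default and one way to name the column instead; the label 'age' is an assumption chosen for this example.

```python
import numpy as np
import pandas as pd

data = np.array([18, 19, 21])

# Without `columns`, the only column gets the default label 0.
df_default = pd.DataFrame(data, index=[1, 2, 3])
print(list(df_default.columns))   # [0]

# Passing `columns` gives the column a readable name.
df_named = pd.DataFrame(data, index=[1, 2, 3], columns=['age'])
print(df_named.loc[2, 'age'])     # 19
```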


Let’s pass a two-dimensional ndarray together with column labels to construct a full table.

import pandas as pd
import numpy as np

data = np.array([['Game Of Thrones', 'HBO'],
                 ['Stranger Things', 'Netflix'],
                 ['Casual', 'Hulu']])
df1 = pd.DataFrame(data, index=[1, 2, 3],
                   columns=['Show Name', 'Streaming Service'])
print(df1)

The DataFrame now has two labeled columns, Show Name and Streaming Service.
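
Once the columns are labeled, they can be used for lookups. A minimal sketch, reusing the same table, of selecting a whole column and a single cell:

```python
import numpy as np
import pandas as pd

data = np.array([['Game Of Thrones', 'HBO'],
                 ['Stranger Things', 'Netflix'],
                 ['Casual', 'Hulu']])
df1 = pd.DataFrame(data, index=[1, 2, 3],
                   columns=['Show Name', 'Streaming Service'])

# Selecting one column by label returns a Series.
services = df1['Streaming Service']
print(type(services).__name__)   # Series

# .loc looks up a value by row label and column label.
print(df1.loc[2, 'Show Name'])   # Stranger Things
```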


Create DataFrame from Dictionary

import pandas as pd
import numpy as np

data = {'Show Name': ['GameOfThrones', 'StrangerThings', 'Casual'], 
        'Streaming Service': ['HBO', 'Netflix', 'Hulu']}

df1 = pd.DataFrame(data)
print(df1)
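
With a dictionary of lists, the keys become the column labels and the rows get the default labels 0, 1, 2. The lists must all have the same length. The sketch below checks both points; the second dictionary, with keys 'a' and 'b', is a made-up example of mismatched lengths.

```python
import pandas as pd

data = {'Show Name': ['GameOfThrones', 'StrangerThings', 'Casual'],
        'Streaming Service': ['HBO', 'Netflix', 'Hulu']}
df1 = pd.DataFrame(data)

print(list(df1.columns))  # ['Show Name', 'Streaming Service']
print(list(df1.index))    # [0, 1, 2]

# Lists of different lengths cannot be aligned into rows.
try:
    pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2]})
except ValueError as err:
    print('ValueError:', err)
```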


Create a DataFrame from the Series

Let’s create a DataFrame from a Series. The Series index becomes the row labels, and the values form a single column.

import pandas as pd
import numpy as np

data = {'name' : 'krunal', 'website' : 'appdividend.com', 'role' : 'author'}
series = pd.Series(data)
df1 = pd.DataFrame(series)
print(df1)
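
The resulting frame has one column with the default label 0. As a hedged sketch of two alternatives, to_frame() can name that column, and transposing turns the Series keys into column labels. The column name 'value' is an assumption made for this example.

```python
import pandas as pd

series = pd.Series({'name': 'krunal',
                    'website': 'appdividend.com',
                    'role': 'author'})

# to_frame() lets the single column be named.
as_column = series.to_frame(name='value')
print(as_column.shape)        # (3, 1)

# Transposing turns the Series keys into column labels instead.
as_row = series.to_frame().T
print(list(as_row.columns))   # ['name', 'website', 'role']
```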


Adding a Column to Your DataFrame

import pandas as pd
import numpy as np

data = {'age': [18, 19, 21], 'name': ['krunal', 'ankit', 'tejash']}
df1 = pd.DataFrame(data, index=[1, 2, 3])
print('Before column added')
print(df1)
df1['education'] = ['BE', 'MCA', 'MBA']
print('After column added')
print(df1)

In the above example, we have added one more column called education.
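
Bracket assignment changes the DataFrame in place and puts the new column last. Two other options, sketched here, are assign(), which returns a new DataFrame, and insert(), which places the column at a chosen position. The 'id' column and its values are made up for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'age': [18, 19, 21],
                    'name': ['krunal', 'ankit', 'tejash']},
                   index=[1, 2, 3])

# assign() returns a new DataFrame and leaves df1 unchanged.
df2 = df1.assign(education=['BE', 'MCA', 'MBA'])

# insert() modifies df1 in place and places the column at position 0.
df1.insert(0, 'id', [101, 102, 103])

print(list(df2.columns))  # ['age', 'name', 'education']
print(list(df1.columns))  # ['id', 'age', 'name']
```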


Removing a Column from Your DataFrame

import pandas as pd
import numpy as np

data = {'age': [18, 19, 21], 
        'name': ['krunal', 'ankit', 'tejash'],
        'education': ['BE', 'MCA', 'MBA']
}
df1 = pd.DataFrame(data, index=[1, 2, 3])
print('After column deleted')
del df1['education']
print(df1)

In the above example, we deleted the education column using the del statement.
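
The del statement removes the column in place. As a sketch of two alternatives, drop() returns a new DataFrame without the column, and pop() removes the column in place and hands it back as a Series.

```python
import pandas as pd

df1 = pd.DataFrame({'age': [18, 19, 21],
                    'name': ['krunal', 'ankit', 'tejash'],
                    'education': ['BE', 'MCA', 'MBA']},
                   index=[1, 2, 3])

# drop() returns a new DataFrame; df1 keeps all three columns.
without_edu = df1.drop(columns=['education'])

# pop() removes the column from df1 in place and returns it as a Series.
ages = df1.pop('age')

print(list(without_edu.columns))  # ['age', 'name']
print(list(df1.columns))          # ['name', 'education']
print(list(ages))                 # [18, 19, 21]
```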


Adding a Row to Your DataFrame

We can add new rows to a DataFrame using the pd.concat() function, which returns a new DataFrame with the rows added at the end. The older DataFrame.append() method was deprecated in pandas 1.4 and removed in pandas 2.0.

import pandas as pd
import numpy as np

data = {'age': [18, 19, 21],
        'name': ['krunal', 'ankit', 'tejash'],
        'education': ['BE', 'MCA', 'MBA']
}
df1 = pd.DataFrame(data, index=[1, 2, 3])
print('Before row added')
print(df1)

# A one-row DataFrame holding the new record
data2 = {'age': 22, 'name': 'rushabh', 'education': 'CA'}
df2 = pd.DataFrame(data2, index=[4])
print('After row added')
# pd.concat() replaces the removed DataFrame.append() method
dfAdd = pd.concat([df1, df2])
print(dfAdd)

In the above example, we defined two DataFrames, df1 and df2. Our goal is to add a row to df1.

Here df2 holds a single row, so combining it with df1 adds that row after the existing rows of df1.

The result, dfAdd, is a new DataFrame that contains the rows of both df1 and df2.
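
A minimal sketch of row addition with pd.concat(), assuming a recent pandas release. It also shows the ignore_index option and direct assignment to a new label with .loc; the shortened frames here are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'age': [18, 19], 'name': ['krunal', 'ankit']},
                   index=[1, 2])
df2 = pd.DataFrame({'age': [22], 'name': ['rushabh']}, index=[4])

# concat stacks the frames row-wise and keeps the original labels.
combined = pd.concat([df1, df2])
print(list(combined.index))      # [1, 2, 4]

# ignore_index=True discards the old labels and renumbers from 0.
renumbered = pd.concat([df1, df2], ignore_index=True)
print(list(renumbered.index))    # [0, 1, 2]

# A single row can also be assigned directly to a new label with .loc.
df1.loc[3] = [21, 'tejash']
print(list(df1.index))           # [1, 2, 3]
```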


Deleting a Row from Your DataFrame

We can delete a row by passing its index label to the drop() method. If the label is duplicated, every row with that label is dropped.

import pandas as pd
import numpy as np

data = {'age': [18, 19, 21],
        'name': ['krunal', 'ankit', 'tejash'],
        'education': ['BE', 'MCA', 'MBA']
}
df1 = pd.DataFrame(data, index=[1, 2, 3])
print('Before row deleted')
print(df1)
print('After row deleted')
df2 = df1.drop(2)
print(df2)

In the above example, we removed the row whose index label is 2. The drop() method returns a new DataFrame, so df1 itself is unchanged.
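
The behavior with duplicated labels can be sketched as follows. The frame below is a made-up example in which the row label 2 appears twice on purpose.

```python
import pandas as pd

# Row label 2 is deliberately duplicated here.
df = pd.DataFrame({'name': ['krunal', 'ankit', 'tejash', 'rushabh']},
                  index=[1, 2, 2, 3])

# drop() returns a new DataFrame; every row labeled 2 is removed.
without_2 = df.drop(2)
print(list(without_2.index))   # [1, 3]

# A list of labels drops several rows at once.
only_2 = df.drop([1, 3])
print(list(only_2['name']))    # ['ankit', 'tejash']

# The original frame is unchanged.
print(len(df))                 # 4
```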


That’s it.
