A Pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). A DataFrame can be built from the following kinds of data.
- A Pandas Series: a one-dimensional labeled array capable of holding any data type. A single column of a DataFrame is a Series.
- A structured or record NumPy ndarray.
- A two-dimensional NumPy ndarray.
- A dictionary of one-dimensional ndarrays, lists, dictionaries, or Series.
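As a rough illustration of the input types listed above, the sketch below builds one small DataFrame from each. The variable names and sample values are made up for this example.

```python
import numpy as np
import pandas as pd

# A single Series becomes a one-column DataFrame; its name is the column label.
ages = pd.Series([18, 19, 21], name='age')
from_series = pd.DataFrame(ages)

# A structured ndarray: the field names become the column labels.
records = np.array([(18, 'krunal'), (19, 'ankit')],
                   dtype=[('age', 'i4'), ('name', 'U10')])
from_structured = pd.DataFrame(records)

# A plain two-dimensional ndarray: columns get default integer labels.
from_2d = pd.DataFrame(np.array([[1, 2], [3, 4]]))

# A dictionary of lists (ndarrays or Series also work): keys become column labels.
from_dict = pd.DataFrame({'age': [18, 19], 'name': ['krunal', 'ankit']})

for frame in (from_series, from_structured, from_2d, from_dict):
    print(list(frame.columns), frame.shape)
```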
How to Create a Pandas DataFrame
To create a Pandas DataFrame, you can use the “pandas.DataFrame()” constructor.
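pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=None)
The data parameter accepts an ndarray, Series, dict, list, scalar constant, or another DataFrame.
The index parameter sets the row labels. It is optional; if no index is passed, the labels default to 0 through n-1, as np.arange(n) would produce.
The columns parameter sets the column labels. It is also optional and defaults to 0 through n-1 when the data carries no column labels and none are passed.
The dtype parameter forces a single data type on all columns; if it is omitted, the type of each column is inferred.
The copy parameter controls whether the input data is copied. Older pandas releases defaulted to False; recent releases default to None, which copies dict input but not DataFrame or two-dimensional ndarray input.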
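The parameters described above can be sketched as follows. The array contents and the labels r1, r2, a, and b are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

values = np.array([[1, 2], [3, 4]])

# No index or columns passed: both default to 0..n-1.
defaults = pd.DataFrame(values)

# Explicit row labels, column labels, and one forced dtype for every column.
labeled = pd.DataFrame(values, index=['r1', 'r2'], columns=['a', 'b'], dtype='float64')

# copy=True gives the frame its own data, so later edits to `values` do not reach it.
copied = pd.DataFrame(values, copy=True)
values[0, 0] = 99

print(list(defaults.index), list(defaults.columns))  # [0, 1] [0, 1]
print(labeled.dtypes.to_dict())
print(copied.iloc[0, 0])  # still 1
```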
Create an Empty DataFrame in Python
To create an empty DataFrame in Pandas, you can use the “pd.DataFrame()” function without any arguments.
import pandas as pd
df1 = pd.DataFrame()
print(df1)
Output
Empty DataFrame
Columns: []
Index: []
Create DataFrame from ndarrays
import pandas as pd
import numpy as np
data = np.array([18, 19, 21])
df1 = pd.DataFrame(data, index=[1, 2, 3])
print(df1)
In the above example, we created a NumPy ndarray and passed it to the DataFrame constructor. Because no column labels were given, the single column is labeled 0.
Output
    0
1  18
2  19
3  21
Let’s pass a two-dimensional array together with column labels to build a full table.
import pandas as pd
import numpy as np
data = np.array([['Game Of Thrones', 'HBO'],
['Stranger Things', 'Netflix'],
['Casual', 'Hulu']])
df1 = pd.DataFrame(data, index=[1, 2, 3], columns=['Show Name', 'Streaming Service'])
print(df1)
The DataFrame now has two labeled columns, Show Name and Streaming Service.
Output
         Show Name Streaming Service
1  Game Of Thrones               HBO
2  Stranger Things           Netflix
3           Casual              Hulu
Create DataFrame from Dictionary
import pandas as pd
import numpy as np
data = {'Show Name': ['GameOfThrones', 'StrangerThings', 'Casual'],
'Streaming Service': ['HBO', 'Netflix', 'Hulu']}
df1 = pd.DataFrame(data)
print(df1)
Output
        Show Name Streaming Service
0   GameOfThrones               HBO
1  StrangerThings           Netflix
2          Casual              Hulu
Create a DataFrame from the Series
Let’s create a DataFrame from Series.
import pandas as pd
import numpy as np
data = {'name' : 'krunal', 'website' : 'appdividend.com', 'role' : 'author'}
series = pd.Series(data)
df1 = pd.DataFrame(series)
print(df1)
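The Series index (name, website, role) becomes the row labels, and the single unnamed column is labeled 0.
Output
                       0
name              krunal
website  appdividend.com
role              author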
Adding a Column to Your DataFrame
import pandas as pd
import numpy as np
data = {'age': [18, 19, 21], 'name': ['krunal', 'ankit', 'tejash']}
df1 = pd.DataFrame(data, index=[1, 2, 3])
print('Before column added')
print(df1)
df1['education'] = ['BE', 'MCA', 'MBA']
print('After column added')
print(df1)
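In the above example, we added a column called education by assigning a list whose length matches the number of rows.
Output
Before column added
   age    name
1   18  krunal
2   19   ankit
3   21  tejash
After column added
   age    name education
1   18  krunal        BE
2   19   ankit       MCA
3   21  tejash       MBA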
Removing a Column from Your DataFrame
import pandas as pd
import numpy as np
data = {'age': [18, 19, 21],
'name': ['krunal', 'ankit', 'tejash'],
'education': ['BE', 'MCA', 'MBA']
}
df1 = pd.DataFrame(data, index=[1, 2, 3])
print('After column deleted')
del df1['education']
print(df1)
In the above example, we deleted the education column using the del statement, which modifies df1 in place.
Output
After column deleted
   age    name
1   18  krunal
2   19   ankit
3   21  tejash
Adding a Row to Your DataFrame
We can add new rows to a DataFrame by concatenating it with another DataFrame using the pd.concat() function, which places the new rows at the end. Older tutorials use DataFrame.append() for this, but that method was deprecated in pandas 1.4 and removed in pandas 2.0.
import pandas as pd
import numpy as np
data = {'age': [18, 19, 21],
'name': ['krunal', 'ankit', 'tejash'],
'education': ['BE', 'MCA', 'MBA']
}
df1 = pd.DataFrame(data, index=[1, 2, 3])
print('Before row added')
print(df1)
data2 = {'age': 22, 'name': 'rushabh', 'education': 'CA'}
df2 = pd.DataFrame(data2, index=[4])
print('After row added')
# pd.concat() returns a new DataFrame; df1 and df2 are left unchanged.
dfAdd = pd.concat([df1, df2])
print(dfAdd)
In the above example, we defined two DataFrames, df1 and df2.
Our goal is to add a row to the first DataFrame.
Here df2 holds a single row, so concatenating it to df1 adds that row after the existing ones.
The result, dfAdd, is a new DataFrame that contains the rows of df1 followed by the row of df2.
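Output
Before row added
   age    name education
1   18  krunal        BE
2   19   ankit       MCA
3   21  tejash       MBA
After row added
   age     name education
1   18   krunal        BE
2   19    ankit       MCA
3   21   tejash       MBA
4   22  rushabh        CA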
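When the combined frame does not need to keep the original row labels, pd.concat() accepts ignore_index=True to renumber the rows from 0. The sketch below is a minimal illustration using the same column names as the example above; the smaller frames and the repeated label are assumptions made for the demonstration.

```python
import pandas as pd

df1 = pd.DataFrame({'age': [18, 19], 'name': ['krunal', 'ankit']}, index=[1, 2])
df2 = pd.DataFrame({'age': [21], 'name': ['tejash']}, index=[1])  # label 1 repeats

# Default: the original labels are kept, so label 1 now appears twice.
kept = pd.concat([df1, df2])

# ignore_index=True discards the old labels and renumbers from 0.
renumbered = pd.concat([df1, df2], ignore_index=True)

print(list(kept.index))        # [1, 2, 1]
print(list(renumbered.index))  # [0, 1, 2]
```

Renumbering is a reasonable choice when the row labels carry no meaning of their own; when they do, keeping them and checking for duplicates is usually safer.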
Deleting a Row from Your DataFrame
We can delete a row by passing its index label to the drop() function. If the label is duplicated, all rows with that label are dropped.
import pandas as pd
import numpy as np
data = {'age': [18, 19, 21],
'name': ['krunal', 'ankit', 'tejash'],
'education': ['BE', 'MCA', 'MBA']
}
df1 = pd.DataFrame(data, index=[1, 2, 3])
print('Before row deleted')
print(df1)
print('After row deleted')
df2 = df1.drop(2)
print(df2)
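In the above example, we removed the row whose index label is 2. The drop() function returns a new DataFrame, so df1 itself is unchanged.
Output
Before row deleted
   age    name education
1   18  krunal        BE
2   19   ankit       MCA
3   21  tejash       MBA
After row deleted
   age    name education
1   18  krunal        BE
3   21  tejash       MBA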
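The note about duplicated labels can be checked with a small sketch. The index below deliberately repeats the label 2; the data is illustrative only.

```python
import pandas as pd

# Two rows share the index label 2.
df = pd.DataFrame({'name': ['krunal', 'ankit', 'tejash']}, index=[1, 2, 2])

# Dropping label 2 removes every row that carries it.
remaining = df.drop(2)

print(len(df), len(remaining))  # 3 1
print(list(remaining.index))    # [1]
```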
That’s it.