The easiest way to convert a Pandas DataFrame into a dictionary is the .to_dict() method. Its “orient” argument determines the structure of the result: how the keys are chosen and what form the values take.
When working with Web APIs, you often encounter JSON data, which is a string representation of lists and dictionaries.
Some machine learning algorithms might also expect data in the form of dictionaries or lists of dictionaries instead of a DataFrame. That’s where we need the conversion.
Types of “Orient”
The orient argument accepts the following seven values:
Orient | Result |
"dict" (default) | It will create a nested dictionary, with outer keys being column names and inner dictionaries mapping index labels to cell values. |
"list" | It will create a dictionary in which keys are column names and values are lists of the values of those columns. |
"series" | It will create a dictionary in which keys are column names and values are Pandas Series objects containing column data. |
"split" | It will create a dictionary with three keys: "index", "columns", and "data". |
"tight" | Like "split", with two extra keys, "index_names" and "column_names", that store the names of the index and columns (None if they were never set). |
"records" | It will create a list of dictionaries where each dictionary represents one row of the DataFrame. |
"index" | It will create a dictionary where keys are index labels and values are dictionaries representing each row. |
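To see the table in action, here is a minimal sketch that calls to_dict() once per orient and records the Python type of each result. The two-column DataFrame and the names orients and result_types are illustrative assumptions, and "tight" assumes pandas 1.4 or newer.

```python
import pandas as pd

# Small illustrative DataFrame (not from the examples in this article).
df = pd.DataFrame({"Name": ["Krunal", "Ankit"], "Age": [32, 30]})

# "tight" assumes pandas 1.4 or newer.
orients = ["dict", "list", "series", "split", "tight", "records", "index"]

# Map each orient to the type name of the object that to_dict() returns.
result_types = {o: type(df.to_dict(orient=o)).__name__ for o in orients}

for orient, type_name in result_types.items():
    print(f"{orient:8} -> {type_name}")
```

Only "records" should report a list; the other six orients return a dictionary.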
orient='dict' (default)
If you want a nested, column-centric dictionary, you can use this orientation.
import pandas as pd

data = {
    'Name': ['Krunal', 'Ankit', 'Rushabh'],
    'Age': [32, 30, 33],
    'City': ['New York', 'London', 'Paris']
}
df = pd.DataFrame(data)

# With no arguments, to_dict() uses orient='dict'
df_to_dict = df.to_dict()
print(df_to_dict)
Output
{
    'Name': {0: 'Krunal', 1: 'Ankit', 2: 'Rushabh'},
    'Age': {0: 32, 1: 30, 2: 33},
    'City': {0: 'New York', 1: 'London', 2: 'Paris'}
}
The output is a nested dictionary: each column name is an outer key, and its value is an inner dictionary that maps index labels to that column’s cell values.
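Reading a single cell from this structure takes two lookups: first the column name, then the index label. A minimal sketch, where nested, first_name, and ages_by_label are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Krunal", "Ankit", "Rushabh"], "Age": [32, 30, 33]})
nested = df.to_dict()  # default orient='dict'

# Outer key is the column name, inner key is the index label.
first_name = nested["Name"][0]

# The inner dictionary for one column maps index labels to values.
ages_by_label = nested["Age"]

print(first_name)
print(ages_by_label)
```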
orient='list'
If you need a simple dictionary that maps each column name to a list of its values, you can use this approach.
import pandas as pd

data = {
    'Name': ['Krunal', 'Ankit', 'Rushabh'],
    'Age': [32, 30, 33],
    'City': ['New York', 'London', 'Paris']
}
df = pd.DataFrame(data)

# Each column becomes a plain Python list
dict_with_list = df.to_dict(orient='list')
print(dict_with_list)
Output
{
    'Name': ['Krunal', 'Ankit', 'Rushabh'],
    'Age': [32, 30, 33],
    'City': ['New York', 'London', 'Paris']
}
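One thing to keep in mind is that the "list" orient keeps only the column values, so index labels are not part of the result. The sketch below illustrates this with a hypothetical custom index ("a", "b", "c"); rebuilding a DataFrame from the dictionary yields a fresh default index.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Krunal", "Ankit", "Rushabh"], "Age": [32, 30, 33]},
    index=["a", "b", "c"],  # hypothetical custom labels for illustration
)

# The index labels do not appear anywhere in the result.
columns_as_lists = df.to_dict(orient="list")

# Rebuilding from the dictionary assigns a new default integer index.
rebuilt = pd.DataFrame(columns_as_lists)

print(columns_as_lists)
print(list(rebuilt.index))
```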
orient='records'
Another popular orientation is “records,” which suits JSON serialization and row-by-row iteration. Use it when you need a list of dictionaries, one per row.
import pandas as pd

data = {
    'Name': ['Krunal', 'Ankit', 'Rushabh'],
    'Age': [32, 30, 33],
    'City': ['New York', 'London', 'Paris']
}
df = pd.DataFrame(data)

# The result is a list of dictionaries, one per row
dict_with_json = df.to_dict(orient='records')
print(dict_with_json)
Output
[
    {'Name': 'Krunal', 'Age': 32, 'City': 'New York'},
    {'Name': 'Ankit', 'Age': 30, 'City': 'London'},
    {'Name': 'Rushabh', 'Age': 33, 'City': 'Paris'}
]
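As a sketch of the JSON use case, the list of records can be passed to the standard library's json.dumps(). This assumes a recent pandas version in which to_dict() returns native Python scalars for these columns; the names records, payload, and decoded are illustrative.

```python
import json

import pandas as pd

df = pd.DataFrame({"Name": ["Krunal", "Ankit"], "Age": [32, 30]})
records = df.to_dict(orient="records")

# Serialize the rows to a JSON string, e.g. for an API request body.
payload = json.dumps(records)

# Decoding the string gives back the same list of dictionaries.
decoded = json.loads(payload)

# The same list also supports plain row-by-row iteration.
for row in records:
    print(row["Name"], row["Age"])
```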
orient='index'
If you need to look up rows by their index label, you can use this approach.
import pandas as pd

data = {
    'Name': ['Krunal', 'Ankit', 'Rushabh'],
    'Age': [32, 30, 33],
    'City': ['New York', 'London', 'Paris']
}
df = pd.DataFrame(data)

# Each index label maps to a dictionary holding that row
dict_with_index = df.to_dict(orient='index')
print(dict_with_index)
Output
{
    0: {'Name': 'Krunal', 'Age': 32, 'City': 'New York'},
    1: {'Name': 'Ankit', 'Age': 30, 'City': 'London'},
    2: {'Name': 'Rushabh', 'Age': 33, 'City': 'Paris'}
}
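The "index" orient becomes more useful when the index carries meaningful labels. In the sketch below, making "Name" the index via set_index() is an illustrative choice that is not part of the example above; it turns the result into a lookup table keyed by name. Index labels need to be unique for this orient.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Krunal", "Ankit", "Rushabh"],
        "Age": [32, 30, 33],
        "City": ["New York", "London", "Paris"],
    }
).set_index("Name")  # use the names as row labels

# Keys are now the names; values are the remaining columns of each row.
by_name = df.to_dict(orient="index")

ankit = by_name["Ankit"]
print(ankit)
```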
orient='series'
You can use this option if you want to retain the Series functionality (like indexing, vectorized operations, etc.) while working with a dictionary structure.
import pandas as pd

data = {
    'Name': ['Krunal', 'Ankit', 'Rushabh'],
    'Age': [32, 30, 33],
    'City': ['New York', 'London', 'Paris']
}
df = pd.DataFrame(data)

# Each column name maps to a pandas Series
df_to_dict = df.to_dict(orient='series')
print(df_to_dict)
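Output
The result is a dictionary with one Pandas Series per column ('Name', 'Age', and 'City'). Printing it shows each Series with its index labels, name, and dtype.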
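To check that the values behave like ordinary Series, a minimal sketch (the names series_by_column, ages, and ages_next_year are illustrative) might look like this:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Krunal", "Ankit", "Rushabh"], "Age": [32, 30, 33]})
series_by_column = df.to_dict(orient="series")

ages = series_by_column["Age"]
print(type(ages).__name__)

# Vectorized arithmetic still works on the dictionary's values.
ages_next_year = ages + 1
print(ages_next_year.tolist())

# So do label-based lookups on the original index.
print(ages.loc[1])
```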
orient='split'
The “split” argument will create a dictionary with three key-value pairs:
- index: A list of the row index labels.
- columns: A list of the column names.
- data: A list of lists, where each inner list holds the values of one row.
This orientation is useful when you want to transfer the data together with its row and column labels, because the three parts match the data, index, and columns arguments of the DataFrame constructor.
import pandas as pd

data = {
    'Name': ['Krunal', 'Ankit', 'Rushabh'],
    'Age': [32, 30, 33],
    'City': ['New York', 'London', 'Paris']
}
df = pd.DataFrame(data)

# Labels and values are stored under separate keys
dict_with_split = df.to_dict(orient='split')
print(dict_with_split)
Output
{
    'index': [0, 1, 2],
    'columns': ['Name', 'Age', 'City'],
    'data': [['Krunal', 32, 'New York'], ['Ankit', 30, 'London'], ['Rushabh', 33, 'Paris']]
}
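Because the three keys line up with the data, index, and columns parameters of pd.DataFrame, a "split" dictionary can be turned back into a DataFrame. The following is a minimal sketch of that round trip; the names split and rebuilt are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Krunal", "Ankit", "Rushabh"],
        "Age": [32, 30, 33],
        "City": ["New York", "London", "Paris"],
    }
)
split = df.to_dict(orient="split")

# Feed the three parts back into the DataFrame constructor.
rebuilt = pd.DataFrame(
    data=split["data"],
    index=split["index"],
    columns=split["columns"],
)

# Converting the rebuilt frame again should give the same dictionary.
print(rebuilt.to_dict(orient="split") == split)
```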
orient='tight'
This orient returns the same keys as “split” plus index_names and column_names, so it is helpful when you need to preserve the names of the DataFrame’s index and columns. It is available in pandas 1.4.0 and later.
import pandas as pd

data = {
    'Name': ['Krunal', 'Ankit', 'Rushabh'],
    'Age': [32, 30, 33],
    'City': ['New York', 'London', 'Paris']
}
df = pd.DataFrame(data)

# Like 'split', plus the names of the index and columns
dict_with_tight = df.to_dict(orient='tight')
print(dict_with_tight)
Output
{
    'index': [0, 1, 2],
    'columns': ['Name', 'Age', 'City'],
    'data': [['Krunal', 32, 'New York'], ['Ankit', 30, 'London'], ['Rushabh', 33, 'Paris']],
    'index_names': [None],
    'column_names': [None]
}
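In the output above, index_names and column_names are [None] because neither name was set. The sketch below assumes a hypothetical index named "Name" to show the name being stored, then restores the DataFrame with pd.DataFrame.from_dict(..., orient="tight"), which also requires pandas 1.4 or newer.

```python
import pandas as pd

# Hypothetical example: the index is given an explicit name.
df = pd.DataFrame(
    {"Age": [32, 30]},
    index=pd.Index(["Krunal", "Ankit"], name="Name"),
)

tight = df.to_dict(orient="tight")
print(tight["index_names"])

# from_dict with orient="tight" rebuilds the frame, including the index name.
restored = pd.DataFrame.from_dict(tight, orient="tight")
print(restored.index.name)
```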
That’s all!