The Pandas DataFrame to_dict() method converts a DataFrame into a dictionary. Its “orient” argument determines how the result is structured, that is, what the keys are and what shape the values take.
When working with Web APIs, you often encounter JSON data, a string representation of lists and dictionaries.
Some machine learning algorithms might expect data in the form of dictionaries or lists of dictionaries instead of a DataFrame. That’s where we need the conversion.
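As a minimal sketch of the Web API case, the snippet below converts a small DataFrame into a list of row dictionaries and serializes it with the standard json module. The sample data and the variable names (records, payload) are illustrative assumptions, and the sketch assumes a recent Pandas version in which to_dict() returns native Python scalars.

```python
import json

import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"Name": ["Krunal", "Ankit"], "Age": [32, 30]})

# orient="records" yields one plain dictionary per row.
records = df.to_dict(orient="records")

# json.dumps turns the list of dictionaries into a JSON string,
# which could then be sent as a request body.
payload = json.dumps(records)
print(payload)
```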
Types of “Orient”
to_dict() supports the following seven orientations:
Orient | Result |
“dict” (default) | A nested dictionary: outer keys are column names, and each inner dictionary maps index labels to cell values. |
“list” | A dictionary in which keys are column names and values are lists of the column values. |
“series” | A dictionary in which keys are column names and values are Pandas Series objects containing the column data. |
“split” | A dictionary with three keys: “index”, “columns”, and “data”. |
“tight” | Like “split”, with two extra keys, “index_names” and “column_names”, that hold the names of the index and columns (None when no names are set). |
“records” | A list of dictionaries, one per row, each mapping column names to cell values. |
“index” | A dictionary in which keys are index labels and values are dictionaries representing each row. |
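To compare the orientations side by side, the following sketch loops over all seven values and records the Python type of each result. The sample data and the result_types name are assumptions made for illustration, and “tight” requires Pandas 1.4.0 or later.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"Name": ["Krunal", "Ankit"], "Age": [32, 30]})

orients = ["dict", "list", "series", "split", "tight", "records", "index"]

# Map each orientation to the type name of the object it returns.
result_types = {}
for orient in orients:
    result = df.to_dict(orient=orient)
    result_types[orient] = type(result).__name__
    print(f"{orient:8} -> {result_types[orient]}")
```

Every orientation returns a dict except “records”, which returns a list.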
orient='dict' (default)
Use this orientation when you want a nested, column-oriented dictionary.

    import pandas as pd

    data = {
        'Name': ['Krunal', 'Ankit', 'Rushabh'],
        'Age': [32, 30, 33],
        'City': ['New York', 'London', 'Paris']
    }

    df = pd.DataFrame(data)

    # No orient argument, so the default orient='dict' is used
    df_to_dict = df.to_dict()
    print(df_to_dict)
Output
{ 'Name': {0: 'Krunal', 1: 'Ankit', 2: 'Rushabh'}, 'Age': {0: 32, 1: 30, 2: 33}, 'City': {0: 'New York', 1: 'London', 2: 'Paris'} }
The output is a nested dictionary: each column name becomes a key, and its value is an inner dictionary that maps index labels to that column's cell values.
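A short sketch of how such a nested dictionary is read: the first key selects the column and the second key selects the row by its index label. The variable names here are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Krunal", "Ankit", "Rushabh"], "Age": [32, 30, 33]})
nested = df.to_dict()

# Outer key: column name. Inner key: index label.
first_name = nested["Name"][0]
ages_by_row = nested["Age"]

print(first_name)   # Krunal
print(ages_by_row)  # {0: 32, 1: 30, 2: 33}
```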
orient='list'
If you need a simple dictionary of columns and their values as lists from a DataFrame, you can use this approach.

    import pandas as pd

    data = {
        'Name': ['Krunal', 'Ankit', 'Rushabh'],
        'Age': [32, 30, 33],
        'City': ['New York', 'London', 'Paris']
    }

    df = pd.DataFrame(data)

    dict_with_list = df.to_dict(orient='list')
    print(dict_with_list)
Output
{ 'Name': ['Krunal', 'Ankit', 'Rushabh'], 'Age': [32, 30, 33], 'City': ['New York', 'London', 'Paris'] }
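Because this shape matches what the DataFrame constructor accepts, the result can be passed straight back to pd.DataFrame(). The sketch below checks that round trip; the names as_lists and rebuilt are assumptions for illustration.

```python
import pandas as pd

data = {
    "Name": ["Krunal", "Ankit", "Rushabh"],
    "Age": [32, 30, 33],
    "City": ["New York", "London", "Paris"],
}
df = pd.DataFrame(data)

as_lists = df.to_dict(orient="list")

# A dictionary of lists is valid constructor input, so this rebuilds the frame.
rebuilt = pd.DataFrame(as_lists)

print(as_lists == data)
print(rebuilt.equals(df))
```

Note that “list” does not store the index labels, so a DataFrame with a custom index would come back with the default integer index.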
orient='records'
Another widely used orientation is “records,” which suits JSON serialization and row-by-row iteration. Use it when you need a list of dictionaries, one per row.

    import pandas as pd

    data = {
        'Name': ['Krunal', 'Ankit', 'Rushabh'],
        'Age': [32, 30, 33],
        'City': ['New York', 'London', 'Paris']
    }

    df = pd.DataFrame(data)

    # The result is a list of row dictionaries, not a single dictionary
    records = df.to_dict(orient='records')
    print(records)
Output
[ {'Name': 'Krunal', 'Age': 32, 'City': 'New York'}, {'Name': 'Ankit', 'Age': 30, 'City': 'London'}, {'Name': 'Rushabh', 'Age': 33, 'City': 'Paris'} ]
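Since each row is an ordinary dictionary, the list can be processed with plain Python. The sketch below filters rows with a list comprehension; the over_30 name and the age threshold are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Krunal", "Ankit", "Rushabh"],
    "Age": [32, 30, 33],
    "City": ["New York", "London", "Paris"],
})

records = df.to_dict(orient="records")

# Each row is a dict, so fields are read by column name.
over_30 = [row["Name"] for row in records if row["Age"] > 30]
print(over_30)  # ['Krunal', 'Rushabh']
```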
orient='index'
If you have to access data by row index, you can use this approach.

    import pandas as pd

    data = {
        'Name': ['Krunal', 'Ankit', 'Rushabh'],
        'Age': [32, 30, 33],
        'City': ['New York', 'London', 'Paris']
    }

    df = pd.DataFrame(data)

    dict_with_index = df.to_dict(orient='index')
    print(dict_with_index)
Output
{ 0: {'Name': 'Krunal', 'Age': 32, 'City': 'New York'}, 1: {'Name': 'Ankit', 'Age': 30, 'City': 'London'}, 2: {'Name': 'Rushabh', 'Age': 33, 'City': 'Paris'} }
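This orientation is most useful when the index carries meaning. As a sketch, the example below first makes the Name column the index with set_index(), so each row can be looked up by name. The by_name variable is an assumption for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Krunal", "Ankit", "Rushabh"],
    "Age": [32, 30, 33],
    "City": ["New York", "London", "Paris"],
})

# Use the Name column as the index, then key the dictionary by it.
by_name = df.set_index("Name").to_dict(orient="index")

print(by_name["Ankit"])  # {'Age': 30, 'City': 'London'}
```

The index labels must be unique for this orientation; Pandas raises a ValueError otherwise.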
orient='series'
You can use this option if you want to retain the Series functionality (like indexing, vectorized operations, etc.) while working with a dictionary structure.

    import pandas as pd

    data = {
        'Name': ['Krunal', 'Ankit', 'Rushabh'],
        'Age': [32, 30, 33],
        'City': ['New York', 'London', 'Paris']
    }

    df = pd.DataFrame(data)

    df_to_dict = df.to_dict(orient='series')
    print(df_to_dict)
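Output
The printed dictionary has three keys: 'Name', 'Age', and 'City'. Each value is displayed as a Series, with the index labels 0 to 2 beside the column values, followed by the Series name and dtype.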
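To illustrate what retaining Series functionality means in practice, the sketch below applies a reduction and a vectorized operation to one of the dictionary values. The variable names are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Krunal", "Ankit", "Rushabh"],
    "Age": [32, 30, 33],
})

columns = df.to_dict(orient="series")

# Each value is still a Series, so Series methods remain available.
age_type = type(columns["Age"]).__name__
oldest = columns["Age"].max()

# Vectorized arithmetic works on the whole column at once.
next_year = (columns["Age"] + 1).tolist()

print(age_type)   # Series
print(oldest)     # 33
print(next_year)  # [33, 31, 34]
```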
orient='split'
The “split” orientation creates a dictionary with three key-value pairs:
- index: A list of the row index labels.
- columns: A list of the column names.
- data: A list of lists, with one inner list per row.
Because labels and values are stored separately, this orientation is a good fit when the dictionary will be transferred and the DataFrame rebuilt elsewhere.

    import pandas as pd

    data = {
        'Name': ['Krunal', 'Ankit', 'Rushabh'],
        'Age': [32, 30, 33],
        'City': ['New York', 'London', 'Paris']
    }

    df = pd.DataFrame(data)

    dict_with_split = df.to_dict(orient='split')
    print(dict_with_split)
Output
{ 'index': [0, 1, 2], 'columns': ['Name', 'Age', 'City'], 'data': [['Krunal', 32, 'New York'], ['Ankit', 30, 'London'], ['Rushabh', 33, 'Paris']] }
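The three keys map directly onto the data, index, and columns arguments of the DataFrame constructor, so rebuilding is straightforward. The following is a sketch of that round trip; the custom index labels and the names parts and rebuilt are assumptions chosen to show that the index survives.

```python
import pandas as pd

data = {
    "Name": ["Krunal", "Ankit", "Rushabh"],
    "Age": [32, 30, 33],
    "City": ["New York", "London", "Paris"],
}
# Hypothetical custom index labels, to show that "split" keeps them.
df = pd.DataFrame(data, index=["a", "b", "c"])

parts = df.to_dict(orient="split")

# Pass each piece back to the matching constructor argument.
rebuilt = pd.DataFrame(
    data=parts["data"],
    index=parts["index"],
    columns=parts["columns"],
)

print(rebuilt.index.tolist())                  # ['a', 'b', 'c']
print(rebuilt.to_dict(orient="list") == data)  # True
```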
orient='tight'
This orientation, available since Pandas 1.4.0, is helpful when you need to preserve metadata about the DataFrame’s structure. It returns the same three keys as “split” plus “index_names” and “column_names”.

    import pandas as pd

    data = {
        'Name': ['Krunal', 'Ankit', 'Rushabh'],
        'Age': [32, 30, 33],
        'City': ['New York', 'London', 'Paris']
    }

    df = pd.DataFrame(data)

    dict_with_tight = df.to_dict(orient='tight')
    print(dict_with_tight)
Output
{ 'index': [0, 1, 2], 'columns': ['Name', 'Age', 'City'], 'data': [['Krunal', 32, 'New York'], ['Ankit', 30, 'London'], ['Rushabh', 33, 'Paris']], 'index_names': [None], 'column_names': [None] }
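In the output above both name lists contain None because neither the index nor the columns were named. The sketch below names the index by calling set_index() and then restores the DataFrame with DataFrame.from_dict(), which also accepts orient='tight' in Pandas 1.4.0 and later. The variable names are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Krunal", "Ankit", "Rushabh"],
    "Age": [32, 30, 33],
    "City": ["New York", "London", "Paris"],
})

# set_index gives the index a name ("Name"), which "tight" records.
indexed = df.set_index("Name")
tight = indexed.to_dict(orient="tight")

print(tight["index_names"])   # ['Name']
print(tight["column_names"])  # [None]

# from_dict with orient="tight" rebuilds the frame, index name included.
restored = pd.DataFrame.from_dict(tight, orient="tight")
print(restored.index.name)    # Name
```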
That’s all!