AppDividend
Latest Code Tutorials

Pandas to_json: How to export DataFrame to JSON File

Pandas DataFrame to_json() function converts the object to a JSON string. If your DataFrame contains NaN’s and None values, it will be converted to Null, and the datetime objects will be converted to the UNIX timestamps.

Pandas to_json

To convert the object to a JSON string, then use the Pandas DataFrame.to_json() function. Pandas to_json() is an inbuilt DataFrame function that converts the object to a JSON string. To export pandas DataFrame to a JSON file, then use the to_json() function.

Syntax

DataFrame.to_json(self, path_or_buf=None, orient=None, 
date_format=None, double_precision=10, 
force_ascii=True, 
date_unit='ms', 
default_handler=None, lines=False, 
compression='infer', index=True)

Parameters

path_or_buf:  File path or object. If the path is not specified, the result is returned as a string. The parameter is optional.

orient: Indication of expected JSON string format. The parameter is required.

  • Series
    • default is ‘index’
    • allowed values are: {‘split’,’ records’,’ index’,’table’}
  • DataFrame
    • default is ‘columns’.
    • allowed values are: {‘split’,’ records’,’ index’,’ columns’,’ values’,’ table’}
  • The format of the JSON string
    • ‘split’: dictionary like {‘index’ -> [index], ‘columns’ ->
      , ‘data’ -> [values]}
    • ‘records’: list like [{column -> value}, … , {column -> value}]
    • ‘index’: dictionary like {index -> {column -> value}}
    • ‘columns’: dictionary like {column -> {index -> value}}
    • ‘values’: just the values array.
    • ‘table’: dictionary like {‘schema’: {schema}, ‘data’: {data}} describing the data, and a data component is like orient=’records’.

date_format: The time unit to encode to govern timestamp and ISO8601 precision. One of ‘s’, ‘ms’, ‘us’, ‘ns’ for a second, millisecond, microsecond, and nanosecond, respectively. The parameter is optional.

default_handler: Handler to call if the object cannot otherwise be converted to a suitable format for JSON. Should receive a single argument, which is the object to convert and return a serializable object. This parameter is optional.

lines: If ‘orient’ is ‘records’ write outline delimited json format. Will throw ValueError if incorrect ‘orient’ since others are not like lists. The parameter is optional.

compression: A string representing a compression in the output file, only used when the first argument is the filename. By default, compression is inferred from the filename. This parameter is optional.

index: Whether to include index values in the JSON string. Not including an index (index=False) is only supported when the orient is ‘split’ or ‘table’. This parameter is optional.

Return Value

If path_or_buf is None, it returns the resulting json format as a string. Otherwise returns None.

Pandas DataFrame to JSON Example

The first step will be let’s create a DataFrame.

import pandas as pd

# Creating Dataframe
df = pd.DataFrame(
    [['Stranger Things', 'Money Heist'], ['Most Dangerous Game', 'The Stranger']],
    columns=['Netflix', 'Quibi'])

print(df)

Output

python3 app.py
               Netflix         Quibi
0      Stranger Things   Money Heist
1  Most Dangerous Game  The Stranger

Now, let’s convert DataFrame to JSON using to_json() function.

# app.py

import pandas as pd

# Creating Dataframe
df = pd.DataFrame(
    [['Stranger Things', 'Money Heist'], ['Most Dangerous Game', 'The Stranger']],
    columns=['Netflix', 'Quibi'])


data = df.to_json(orient='index')
print(data)

Output

python3 app.py
{"0":{"Netflix":"Stranger Things","Quibi":"Money Heist"},"1":{"Netflix":"Most Dangerous Game","Quibi":"The Stranger"}}

Different JSON Formats in to_json() function

There are different ways to format a JSON string, as described in the Parameters section.

You will need to set the orient to your desired format. Here are the options:

  1. split
  2. records
  3. index
  4. values
  5. table
  6. columns (the default format)

For example, the general syntax would look like if you set orient=’columns’.

# app.py

import pandas as pd

# Creating Dataframe
df = pd.DataFrame(
    [['Stranger Things', 'Money Heist'], ['Most Dangerous Game', 'The Stranger']],
    columns=['Netflix', 'Quibi'])


data = df.to_json(orient='columns')
print(data)

Output

python3 app.py
{"Netflix":{"0":"Stranger Things","1":"Most Dangerous Game"},"Quibi":{"0":"Money Heist","1":"The Stranger"}}

orient=’table’

See the following code.

# app.py

import pandas as pd

# Creating Dataframe
df = pd.DataFrame(
    [['Stranger Things', 'Money Heist'], ['Most Dangerous Game', 'The Stranger']],
    columns=['Netflix', 'Quibi'])


data = df.to_json(orient='table')
print(data)

Output

 python3 app.py
{"schema": {"fields":[{"name":"index","type":"integer"},
{"name":"Netflix","type":"string"},{"name":"Quibi","type":"string"}],
"primaryKey":["index"],"pandas_version":"0.20.0"}, 
"data": [{"index":0,"Netflix":"Stranger Things","Quibi":"Money Heist"},{"index":1,
"Netflix":"Most Dangerous Game","Quibi":"The Stranger"}]}

Specific to orient=’table’, if a DataFrame with a literal Index name of index gets written with to_json(), the subsequent read operation will incorrectly set the Index name to None.

This is because DataFrame also uses an index .to_json() to denote a missing Index name, and the subsequent read_json() operation cannot distinguish between the two.

The same limitation is encountered with a MultiIndex and any names beginning with ‘level_’.

orient=’ values’

See the following code.

# app.py

import pandas as pd

# Creating Dataframe
df = pd.DataFrame([['Stranger Things', 'Money Heist'],
                   ['Most Dangerous Game', 'The Stranger']],
                  columns=['Netflix', 'Quibi'])

# Convert DataFrame to JSON
data = df.to_json(orient='values')
print(data)

Output

python3 app.py
[["Stranger Things","Money Heist"],["Most Dangerous Game","The Stranger"]]

orient=’index’

See the following code.

# app.py

import pandas as pd

# Creating Dataframe
df = pd.DataFrame([['Stranger Things', 'Money Heist'],
                   ['Most Dangerous Game', 'The Stranger']],
                  columns=['Netflix', 'Quibi'])

# Convert DataFrame to JSON
data = df.to_json(orient='index')
print(data)

Output

python3 app.py
{"0":{"Netflix":"Stranger Things","Quibi":"Money Heist"},
"1":{"Netflix":"Most Dangerous Game","Quibi":"The Stranger"}}

orient=’records’

See the following code.

# app.py

import pandas as pd

# Creating Dataframe
df = pd.DataFrame([['Stranger Things', 'Money Heist'],
                   ['Most Dangerous Game', 'The Stranger']],
                  columns=['Netflix', 'Quibi'])

# Convert DataFrame to JSON
data = df.to_json(orient='records')
print(data)

Output

python3 app.py
[{"Netflix":"Stranger Things","Quibi":"Money Heist"},
{"Netflix":"Most Dangerous Game","Quibi":"The Stranger"}]

orient=’split’

See the following code.

# app.py

import pandas as pd

# Creating Dataframe
df = pd.DataFrame([['Stranger Things', 'Money Heist'],
                   ['Most Dangerous Game', 'The Stranger']],
                  columns=['Netflix', 'Quibi'])

# Convert DataFrame to JSON
data = df.to_json(orient='split')
print(data)

Output

python3 app.py
{"columns":["Netflix","Quibi"],"index":[0,1],"data":[["Stranger Things","Money Heist"],
["Most Dangerous Game","The Stranger"]]}

Export Pandas DataFrame to JSON File

If we want to export DataFrame to JSON File, we have to pass the parameter called a path, and the function will return None.

See the following code.

# app.py

import pandas as pd

# Creating Dataframe
df = pd.DataFrame([['Stranger Things', 'Money Heist'],
                   ['Most Dangerous Game', 'The Stranger']],
                  columns=['Netflix', 'Quibi'])

# Convert DataFrame to JSON
data = df.to_json('./export.json', orient='index')
print(data)

Output

python3 app.py
None

A newly exported file is created called export.json inside the same directory as the app.py file.

The content of the file is following.

{"0":{"Netflix":"Stranger Things","Quibi":"Money Heist"},"1":{"Netflix":"Most Dangerous Game","Quibi":"The Stranger"}}

Conclusion

DataFrame stores the data. It aligns the data in a tabular fashion. Hence, it is a two-dimensional data structure. If you want to communicate with the servers, you have to convert the DataFrame to JSON and convert DataFrame to JSON, and you have to use Pandas DataFrame.to_json() function.

See also

How to rename DataFrame column

Pandas iloc[]

Pandas Pivot table

Pandas iterate columns

How to assign new columns to Pandas DataFrame

Leave A Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.