Pandas DataFrame hist: How to Create Histogram in Pandas

When exploring a dataset, you will often want to get a quick understanding of the distribution of certain numerical variables within it.

The standard way of visualizing the distribution of a single numerical variable is a histogram. A histogram divides the range of values of the variable into “bins” and counts the number of observations that fall into each bin.

By plotting these binned counts as columns, we get an immediate and intuitive sense of how the values of the variable are distributed.
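The binning step described above can be sketched with numpy.histogram(), which does the counting without drawing anything. The sample values here are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration.
values = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 9.0])

# Split the range [min, max] into 4 equal-width bins and count
# how many values fall into each bin.
counts, edges = np.histogram(values, bins=4)

print(counts)  # [4 3 0 1]
print(edges)   # [1. 3. 5. 7. 9.]
```

Four bins need five edges, and each bin includes its left edge; only the last bin also includes its right edge, which is why 9.0 is counted.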

The Pandas DataFrame hist() method is a wrapper around matplotlib.pyplot.hist().

Pandas DataFrame hist()

The hist() method is a handy tool for assessing the distribution of values. It calls matplotlib.pyplot.hist() on each Series in the DataFrame, resulting in one histogram per column.

The hist() function creates a histogram, which gives a clear picture of how certain numerical variables in the dataset are distributed.
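As a minimal sketch of the "one histogram per column" behavior, the example below builds a small DataFrame with hypothetical data and reads the title of each subplot. The Agg backend is an assumption made so the code runs without a display, and the text column is included to show that only numerical columns are plotted.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import pandas as pd

# Hypothetical data: two numerical columns and one text column
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid", "Dee", "Eve"],
    "height": [150, 160, 165, 170, 180],
    "age": [21, 25, 30, 35, 40],
})

axes = df.hist()

# Each subplot is titled with the column it shows
titles = [ax.get_title() for ax in axes.ravel() if ax.get_title()]
print(sorted(titles))  # ['age', 'height']
```

The text column "name" gets no subplot, because hist() only draws the numerical columns.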

Syntax

DataFrame.hist(column=None, by=None, grid=True, 
xlabelsize=None, xrot=None, ylabelsize=None, yrot=None, 
ax=None, sharex=False, sharey=False, figsize=None, layout=None, 
bins=10, backend=None, **kwargs)

Parameters

It has the following parameters.

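  1. data: It is the DataFrame, the Pandas object holding the data. When hist() is called as a method (df.hist()), the DataFrame itself plays this role, so it is not passed explicitly.
  2. column: It takes a str or a sequence. If passed, it is used to limit the data to a subset of columns.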
  3. by: It is an object and is an optional parameter. If passed, then it is used to form histograms for separate groups.
  4. grid: It takes boolean values, and by default, it is True. The grid parameter exists to tell whether to show the grid lines or not.
  5. xlabelsize: It takes an integer and is None by default. If it is specified, it changes the x-axis label size.
  6. xrot: It takes float datatype, and by default, it is None. It defines the rotation of x-axis labels. For instance, a value of 90 displays the x labels rotated 90 degrees clockwise.
  7. ylabelsize: It takes an integer and is None by default. If it is specified, it changes the y-axis label size.
  8. yrot: It takes float datatype, and by default, it is None. It defines the rotation of y-axis labels. For instance, a value of 90 displays the y labels rotated 90 degrees clockwise.
  9. ax: It’s the Matplotlib axes object to plot the histogram on. By default, it is None.
  10. sharex: It takes a boolean. The signature above lists False, while the pandas documentation describes the default as True if ax is None, else False. In the case of subplots=True, it shares the x-axis and sets some x-axis labels to invisible.
  11. sharey: It also takes a boolean, and by default, it is False. In the case of subplots=True, it shares the y-axis and sets some y-axis labels to invisible.
  12. figsize: It takes a tuple. It is the size, in inches, of the figure to create.
  13. layout: It is an optional parameter and takes a tuple as input. A tuple of (rows, columns) for the layout of the histograms.
  14. bins: It takes an integer or a sequence, and by default, it is 10. It is the number of histogram bins to be used. If an integer is given, bins + 1 bin edges are calculated and returned. If a sequence is given, its values are used as the bin edges.
  15. backend: It takes str, and by default, it is None. Backend to use instead of a backend specified in the option plotting.backend.
  16. **kwargs: All other plotting keyword arguments to be passed to matplotlib.pyplot.hist().
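Several of the parameters listed above can be combined in a single call. The sketch below uses hypothetical exam scores for two groups; the column names, the values and the Agg backend are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import pandas as pd

# Hypothetical data: exam scores for two groups
df = pd.DataFrame({
    "group": ["A"] * 5 + ["B"] * 5,
    "score": [55, 60, 65, 70, 90, 30, 45, 50, 75, 80],
})

# column + by: one histogram of "score" for each value of "group",
# laid out in 1 row and 2 columns on an 8 x 4 inch figure
grouped_axes = np.ravel(df.hist(
    column="score",
    by="group",
    bins=5,
    figsize=(8, 4),
    layout=(1, 2),
    sharey=True,
))
print(len(grouped_axes))  # 2

# bins as a sequence: explicit bin edges; grid=False hides the grid lines
single_ax = np.ravel(df.hist(column="score", bins=[0, 50, 100], grid=False))[0]
print(len(single_ax.patches))  # 2
```

Passing bins as a sequence fixes the bin edges, which is useful when histograms of different groups need to be comparable.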

Return Value

The hist() method returns a matplotlib Axes object (AxesSubplot) or a numpy.ndarray of them.
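To see both kinds of return value, the sketch below calls hist() on a DataFrame and on a single Series and checks the types. The Agg backend and the explicit plt.figure() call are assumptions made so the example runs without a display and the two plots do not share a figure.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.axes import Axes

df = pd.DataFrame({
    "length": [2.5, 3.6, 4.6, 4.8, 5.0],
    "width": [2.7, 3.7, 6.4, 0.22, 4.7],
})

# DataFrame.hist(): an array holding one Axes per plotted column
axes = df.hist(bins=3)
print(isinstance(axes, np.ndarray))  # True

# Series.hist(): a single Axes (drawn on a fresh figure here)
plt.figure()
single = df["length"].hist(bins=3)
print(isinstance(single, Axes))  # True
```

Because the DataFrame call returns an array, use axes.ravel() to loop over the individual subplots.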

Example program on hist()

Example: Write a program to show how hist() works.

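import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'length': [2.5, 3.6, 4.6, 4.8, 5.0],
    'width': [2.7, 3.7, 6.4, 0.22, 4.7]
})
# Draw one histogram per column, each with 3 bins
hist = df.hist(bins=3)
print(hist)  # the array of matplotlib Axes objects
plt.show()   # display the figure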

Output

[Figure: Pandas DataFrame hist() Method in Python]

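In the above example, we created one histogram for each column of the DataFrame (length and width), each with three bins. The print(hist) call displays the array of matplotlib Axes objects that hist() returns.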
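When no display is available, for instance in a script running on a server, the figure can be written to an image file instead. This is a minimal sketch: the Agg backend, the temporary directory and the file name hist.png are assumptions made for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "length": [2.5, 3.6, 4.6, 4.8, 5.0],
    "width": [2.7, 3.7, 6.4, 0.22, 4.7],
})

axes = df.hist(bins=3)

# All subplots belong to the same figure; take it from the first one
fig = axes.ravel()[0].figure

# Hypothetical output location: a file in a fresh temporary directory
out_path = os.path.join(tempfile.mkdtemp(), "hist.png")
fig.savefig(out_path)
plt.close(fig)  # release the figure once it is saved
```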

Conclusion

To create a histogram, use the Pandas hist() method. Calling the hist() method on a Pandas DataFrame draws a histogram for every numerical (non-nuisance) Series in the DataFrame.

That is it for the Pandas hist() function example.
