AppDividend

Pandas mean: How to Find Mean in Pandas DataFrame


The mean() function in Pandas calculates the arithmetic mean of a set of numbers. It can compute the mean of a single column (a Series), the column-wise mean of a DataFrame, or the row-wise mean of a DataFrame.

Pandas mean 

To find the mean of a DataFrame, use the Pandas DataFrame.mean() function. DataFrame.mean() returns the mean of the values along the requested axis.

If the mean() method is applied to a Pandas Series object, it returns a scalar value, which is the mean of all the values in the Series.

If the mean() method is applied to a Pandas DataFrame object, it returns a Pandas Series that contains the mean of the values along the specified axis.
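As a small illustration of the two return types described above, the sketch below calls mean() on a single column and on a whole DataFrame. The column names and values are only sample data, and the variable names are chosen for this sketch.

```python
import pandas as pd

df = pd.DataFrame({'X': [29, 46, 10, 36],
                   'Y': [11, 18, 19, 21]})

# Series.mean() collapses the column into a single number
series_mean = df['X'].mean()

# DataFrame.mean() returns a Series with one entry per column
frame_mean = df.mean()

print(type(series_mean))
print(type(frame_mean))
print(frame_mean)
```

The index of the returned Series holds the column labels, so an individual mean can be read back with frame_mean['X'].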

Syntax

DataFrame.mean(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)

Parameters

axis: {index (0), columns (1)}

The axis along which the mean is computed.

skipna: bool, default True

Exclude NA/None values when computing the result.

level: int or level name, default None

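If the axis is a MultiIndex (hierarchical), compute the mean along a particular level, collapsing into a Series. Note that this parameter was deprecated in pandas 1.3 and removed in pandas 2.0.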

numeric_only: bool, default None

Include only float, int, and boolean columns. If None, pandas attempts to use everything and then falls back to only numeric data. Not implemented for Series. In pandas 2.0 and later, the default is False.

**kwargs

Additional keyword arguments to be passed to the function.

Return Value

It returns a Series, or a DataFrame if level is specified.
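To make the skipna and numeric_only parameters more concrete, here is a minimal sketch. The DataFrame, with its name, score, and age columns, is invented for this illustration. Depending on the pandas version, a text column may be silently dropped or may raise an error when numeric_only is left at its default, so the sketch passes numeric_only=True explicitly.

```python
import pandas as pd

# Hypothetical sample data: one text column, one column with a missing value
df = pd.DataFrame({'name': ['a', 'b', 'c'],
                   'score': [10.0, None, 20.0],
                   'age': [30, 40, 50]})

# Only numeric columns are averaged; missing values are skipped (skipna=True)
numeric_means = df.mean(numeric_only=True)

# With skipna=False, any column containing a missing value yields NaN
strict_means = df.mean(numeric_only=True, skipna=False)

print(numeric_means)
print(strict_means)
```

In the first result, score is averaged over its two present values. In the second, score becomes NaN because the missing value is no longer ignored.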

DataFrame mean example

In the df.mean() method, if we don't specify the axis, it uses the index axis (axis=0) by default, which produces one mean per column.

In the example below, we find the mean of each column of the DataFrame.

# app.py

import pandas as pd

data = {'X': [29, 46, 10, 36],
        'Y': [11, 18, 19, 21],
        'Z': [3, 12, 1, 2]}
df = pd.DataFrame.from_dict(data)
meanDf = df.mean()
print(meanDf)

Output

X    30.25
Y    17.25
Z     4.50
dtype: float64

In this example, we got a Series of mean values computed along the index axis, one per column. This is how each value is calculated.

For X, the sum is 29 + 46 + 10 + 36 = 121, and dividing by the 4 values gives 30.25. Y and Z are calculated the same way.
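The arithmetic above can be reproduced by hand as a quick check. This sketch divides each column sum by the number of rows; manual is just a name chosen for the sketch.

```python
import pandas as pd

df = pd.DataFrame({'X': [29, 46, 10, 36],
                   'Y': [11, 18, 19, 21],
                   'Z': [3, 12, 1, 2]})

# Column sums (121, 69, 18) divided by the number of rows (4)
manual = df.sum() / len(df)

print(manual)
print(df.mean())
```

Both print statements show the same values here. Dividing by len(df) only matches mean() because this DataFrame has no missing values.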

To calculate the mean row-wise in the DataFrame, pass axis=1.

# app.py

import pandas as pd

data = {'X': [29, 46, 10, 36],
        'Y': [11, 18, 19, 21],
        'Z': [3, 12, 1, 2]}
df = pd.DataFrame.from_dict(data)
meanDf = df.mean(axis=1)
print(meanDf)

Output

0    14.333333
1    25.333333
2    10.000000
3    19.666667
dtype: float64

Here, we passed axis=1 to the df.mean() function.

The mean is calculated as follows.

For the first row, the sum is 29 + 11 + 3 = 43, and dividing by 3 gives 14.33. The second, third, and fourth rows are calculated the same way.

With axis=0, df.mean() calculates the column-wise mean, so the result has one value per column. With axis=1, it calculates the row-wise mean, so the result has one value per row.

So, to calculate row-wise or column-wise means, pass the appropriate axis. If you omit it, you get the column-wise mean (axis=0) by default.
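The axis parameter also accepts the names 'index' and 'columns' in place of 0 and 1, as listed in the Parameters section. The sketch below uses the names and checks the length of each result; col_means and row_means are names picked for this example.

```python
import pandas as pd

df = pd.DataFrame({'X': [29, 46, 10, 36],
                   'Y': [11, 18, 19, 21],
                   'Z': [3, 12, 1, 2]})

col_means = df.mean(axis='index')    # same as axis=0: one mean per column
row_means = df.mean(axis='columns')  # same as axis=1: one mean per row

print(len(col_means))  # 3, one entry for each of X, Y, Z
print(len(row_means))  # 4, one entry for each row
```

The name describes the axis that gets collapsed: axis='index' reduces over the rows and leaves one value per column, while axis='columns' reduces over the columns and leaves one value per row.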

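Find mean in a DataFrame with None values

There are times when a DataFrame contains many None or NaN values. In that case, we can still compute the mean: with skipna=True (the default), mean() ignores the missing values and averages only the values that are present.

See the following code, which calculates the row-wise mean.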

# app.py

import pandas as pd

data = {'X': [29, 46, None, 36],
        'Y': [11, None, 19, 21],
        'Z': [3, 12, 1, None]}
df = pd.DataFrame.from_dict(data)
meanDf = df.mean(axis=1, skipna=True)
print(meanDf)

Output

0    14.333333
1    29.000000
2    10.000000
3    28.500000
dtype: float64
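For contrast, the sketch below runs the same row-wise calculation with skipna=False, so the effect of the parameter is visible. The names skipped and strict are chosen for this illustration.

```python
import pandas as pd

data = {'X': [29, 46, None, 36],
        'Y': [11, None, 19, 21],
        'Z': [3, 12, 1, None]}
df = pd.DataFrame(data)

# Missing values are ignored: row 1 averages 46 and 12 only
skipped = df.mean(axis=1, skipna=True)

# Missing values propagate: any row containing NaN gives NaN
strict = df.mean(axis=1, skipna=False)

print(skipped)
print(strict)
```

Only the first row has no missing values, so it is the only row that keeps a numeric mean when skipna=False.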

Finding mean of specific DataFrame column

To find the mean of a specific DataFrame column, select it with df['column name'] and call mean() on the resulting Series.

# app.py

import pandas as pd

data = {'X': [29, 46, None, 36],
        'Y': [11, None, 19, 21],
        'Z': [3, 12, 1, None]}
df = pd.DataFrame.from_dict(data)
meanZ = df['Z'].mean()
print(meanZ)

Output

5.333333333333333

In this example, we got the mean of column Z, which contains a None value. Because missing values are skipped by default, only the three values that are present are averaged.

The output is calculated like this: 3 + 12 + 1 = 16, divided by the 3 non-missing values, which gives 5.3333.
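The same calculation can be sketched by hand using sum() and count(). count() returns the number of non-missing values, which is the divisor mean() uses when missing values are skipped. The variable names are chosen for this sketch.

```python
import pandas as pd

df = pd.DataFrame({'Z': [3, 12, 1, None]})

total = df['Z'].sum()      # 16.0, the missing value is ignored
present = df['Z'].count()  # 3, counts only non-missing values
rows = len(df['Z'])        # 4, counts every row

manual_mean = total / present

print(manual_mean)
print(df['Z'].mean())
```

Dividing by len(df['Z']) instead would give 4.0, which is not what mean() returns for this column.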

Conclusion

To calculate the mean of a Pandas DataFrame, use the pandas.DataFrame.mean() method. With mean(), you can calculate the mean along an axis or over the complete DataFrame. Just remember the following points.

To find the average for each column in DataFrame.

df.mean(axis=0)

To find the average for each row in DataFrame.

df.mean(axis=1)
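The conclusion also mentions the mean of the complete DataFrame. One possible way to get a single overall value, assuming every column is numeric and has no missing values, is to convert the DataFrame to a NumPy array and average that. This is a sketch of one approach rather than the only option.

```python
import pandas as pd

df = pd.DataFrame({'X': [29, 46, 10, 36],
                   'Y': [11, 18, 19, 21],
                   'Z': [3, 12, 1, 2]})

# Flattens all 12 values into one array and averages them: 208 / 12
overall_mean = df.to_numpy().mean()

print(overall_mean)
```

If the DataFrame contained NaN values, the NumPy mean would return NaN, because NumPy's mean() does not skip missing values.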

That is it for Pandas DataFrame mean() function.

