Pandas DataFrame mean() Method

The Pandas DataFrame mean() method “returns the mean of the values for the requested axis.” Called on a DataFrame, it returns a Series holding one mean per column (or per row). Called on a Pandas Series, it returns a scalar value.
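The difference between the two return types can be sketched with a small made-up DataFrame. The column names and values here are illustrative assumptions, not part of any dataset used later.

```python
import pandas as pd

# Toy data for illustration only
df = pd.DataFrame({"a": [1, 2, 3], "b": [10.0, 20.0, 30.0]})

# DataFrame.mean() returns a Series: one mean per column by default
column_means = df.mean()
print(column_means)

# Series.mean() returns a single scalar
scalar_mean = df["a"].mean()
print(scalar_mean)
```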

Syntax

DataFrame.mean(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)

Parameters

  1. axis: The axis along which the mean is computed: 0 (or 'index') gives one mean per column, 1 (or 'columns') gives one mean per row.
  2. skipna: Exclude NA/None values when computing the result.
  3. level: If the axis is a MultiIndex, compute the mean along a particular level, collapsing into a Series. This argument was deprecated in pandas 1.3 and removed in pandas 2.0.
  4. numeric_only: Include only float, int, and boolean columns. If None, pandas will attempt to use everything, then use only numeric data. Not implemented for Series.
  5. **kwargs: Additional keyword arguments to be passed to the function.
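The effect of axis, skipna, and numeric_only can be seen in a minimal sketch. The DataFrame below is made up for illustration, and numeric_only=True is passed explicitly so the example does not depend on a version-specific default.

```python
import numpy as np
import pandas as pd

# Toy data: one missing value and one non-numeric column
df = pd.DataFrame({
    "x": [1.0, np.nan, 3.0],
    "y": [4.0, 5.0, 6.0],
    "label": ["a", "b", "c"],
})

# numeric_only=True leaves the text column "label" out of the result
col_means = df.mean(numeric_only=True)           # x -> 2.0, y -> 5.0

# axis=1 computes one mean per row instead of one per column
row_means = df.mean(axis=1, numeric_only=True)   # 2.5, 5.0, 4.5

# skipna=False lets a missing value propagate into the result
strict_means = df.mean(skipna=False, numeric_only=True)  # x -> NaN, y -> 5.0

print(col_means, row_means, strict_means, sep="\n\n")
```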

Return Value

It returns a Series, or a DataFrame if level is specified.
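On pandas 2.0 and later, where mean() no longer accepts level, a per-level mean can be obtained with groupby(level=...). The sketch below assumes a made-up two-level index (year, quarter) and a hypothetical price column.

```python
import pandas as pd

# Toy MultiIndex data for illustration only
index = pd.MultiIndex.from_tuples(
    [("2020", "Q1"), ("2020", "Q2"), ("2021", "Q1"), ("2021", "Q2")],
    names=["year", "quarter"],
)
df = pd.DataFrame({"price": [100.0, 200.0, 300.0, 500.0]}, index=index)

# Group on the "year" level and average within each group;
# the result is a DataFrame indexed by year
yearly_mean = df.groupby(level="year").mean()
print(yearly_mean)
```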

Use Pandas DataFrame mean() method on a Real-world project

To demonstrate a practical use of the df.mean() method, we will use the Kaggle dataset EtherPriceHistory(USD).

Step 1: Load the dataset

To read the CSV file into a Pandas DataFrame, use the pd.read_csv() function.

import pandas as pd

ether_data = pd.read_csv('EtherPriceHistory(USD).csv')
ether_data.head()

Output

Loading the dataset

You can see that it provides three columns:

  1. Date(UTC): It is the date of the record.
  2. UnixTimeStamp: The Unix timestamp corresponding to the date.
  3. Value: The value of Ethereum in USD on that date.
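Before computing anything, it can help to confirm that the columns have the expected types. The sketch below uses a few made-up rows that mimic the three-column layout; the date format (month/day/year) and the values are assumptions for illustration, not taken from the Kaggle file.

```python
import io
import pandas as pd

# Made-up rows mimicking the layout described above (not real prices)
sample_csv = io.StringIO(
    "Date(UTC),UnixTimeStamp,Value\n"
    "7/30/2015,1438214400,0.5\n"
    "7/31/2015,1438300800,1.0\n"
    "8/1/2015,1438387200,1.5\n"
)
sample = pd.read_csv(sample_csv)

# Dates are read as plain text, so convert them explicitly
# (the format string is an assumption about how the dates are written)
sample["Date(UTC)"] = pd.to_datetime(sample["Date(UTC)"], format="%m/%d/%Y")

# Inspect the column types: a datetime column and two numeric columns
print(sample.dtypes)
```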

Step 2: Data Visualization

You can plot the chart of Ethereum prices over the years based on the dataset. We will plot the Date(UTC) on the x-axis and the Value on the y-axis.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

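ether_data = pd.read_csv('EtherPriceHistory(USD).csv')

# Parse the dates so the x-axis is treated as a time axis, not as text labels
ether_data["Date(UTC)"] = pd.to_datetime(ether_data["Date(UTC)"])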

# Set the style for the plot
sns.set_style("whitegrid")

# Plot the trend of Ethereum's value over time
plt.figure(figsize=(14, 7))
sns.lineplot(data=ether_data, x="Date(UTC)", y="Value")
plt.title("Ethereum Value Over Time (USD)")
plt.xlabel("Date")
plt.ylabel("Value (USD)")
plt.show()

Output

Data Visualization

Step 3: Calculate the mean

To calculate the mean price of Ethereum, use the .mean() method.

# Calculate the mean of the 'Value' column
ether_mean_value = ether_data['Value'].mean()
print(ether_mean_value)

Output

Calculate the mean

The average value of Ethereum over the provided period in the dataset is approximately USD 211.73.

From our analysis, we can say that Ethereum’s value has seen significant fluctuations over time. It started very low, saw rapid growth, and witnessed periods of decline.
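Because a single overall mean hides those fluctuations, one option is to compute a mean per calendar year as well. This is a minimal sketch on made-up values. The column names follow the dataset, and it assumes the date column has already been converted with pd.to_datetime.

```python
import pandas as pd

# Toy prices for illustration only (not the real Ethereum history)
prices = pd.DataFrame({
    "Date(UTC)": pd.to_datetime(
        ["2016-01-01", "2016-07-01", "2017-01-01", "2017-07-01"]
    ),
    "Value": [1.0, 11.0, 10.0, 250.0],
})

# One mean over the whole period
overall_mean = prices["Value"].mean()            # 68.0

# One mean per year: group rows by the year of their date
yearly_mean = prices.groupby(prices["Date(UTC)"].dt.year)["Value"].mean()

print(overall_mean)
print(yearly_mean)                               # 2016 -> 6.0, 2017 -> 130.0
```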

You can visualize the mean like this:

Visual of Applying df.mean() function on dataset
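A plot of this kind could be produced by drawing the price line and adding a horizontal line at the mean. The helper name plot_with_mean and the toy data below are hypothetical. This is a sketch using matplotlib only, not necessarily how the figure above was made.

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_with_mean(df, date_col="Date(UTC)", value_col="Value"):
    """Plot value_col over date_col with a dashed line at the mean."""
    mean_value = df[value_col].mean()
    fig, ax = plt.subplots(figsize=(14, 7))
    ax.plot(df[date_col], df[value_col], label=value_col)
    # Horizontal reference line at the overall mean
    ax.axhline(mean_value, color="red", linestyle="--",
               label=f"Mean: {mean_value:.2f}")
    ax.set_title("Value Over Time with Mean")
    ax.set_xlabel("Date")
    ax.set_ylabel("Value (USD)")
    ax.legend()
    return ax, mean_value

# Toy data for illustration only
toy = pd.DataFrame({
    "Date(UTC)": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
    "Value": [10.0, 20.0, 30.0],
})
ax, toy_mean = plot_with_mean(toy)
```

With the real data, the same helper could be called as plot_with_mean(ether_data) once the date column has been converted with pd.to_datetime, followed by plt.show() to display the figure.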

This project can serve as a basic introduction to time-series analysis and visualization, and it provides a foundation for more advanced analyses, such as forecasting or understanding factors influencing Ethereum’s price.

I hope you learned something from this project!
