Python's statistics.median() function returns the median of a dataset, that is, the middle value once the data is sorted in ascending order. The dataset can be any iterable of sortable values, such as a list, a tuple, a set, or a dictionary.
For an odd number of values in a dataset, it selects the middle value after the data is sorted.
Let’s calculate the median value of a list:
import statistics

dataset = [11, 19, 48, 18, 21]
print(statistics.median(dataset))
# Output: 19 (middle value of the sorted data)
# Sorted data: [11, 18, 19, 21, 48]
In the above code, the dataset is a list that contains five elements.
The values in the dataset are unordered, so the first thing median() does is build a sorted copy of the data in ascending order. The original list is left unchanged.
The sorted copy is [11, 18, 19, 21, 48]. Its middle value is 19, so the function returns 19.
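To make the sorting step concrete, here is a minimal sketch that checks two things: the input list keeps its original order, and the result matches the middle element of a manually sorted copy. The names ordered and manual are only illustrative.

```python
import statistics

dataset = [11, 19, 48, 18, 21]
result = statistics.median(dataset)

# Sorting happens on a copy, so the original order is preserved.
print(dataset)  # [11, 19, 48, 18, 21]

# For an odd-length dataset, the median is the middle element of the sorted copy.
ordered = sorted(dataset)
manual = ordered[len(ordered) // 2]
print(result, manual)  # 19 19
```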
Even-length dataset
For an even number of values, the statistics.median() function returns the arithmetic mean of the two central values.
With integer inputs the result is a float, because the two values are averaged with true division.
import statistics

dataset_even = [11, 19, 21, 48]
middle_average = statistics.median(dataset_even)
print(middle_average)
# Output: 20.0 ((19 + 21) / 2)
In this code, the central values of dataset_even are 19 and 21. As the output shows, the function returns their average, which is 20.0.
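The odd and even rules described so far can be sketched by hand. The helper below, manual_median, is a hypothetical name used only for illustration; it is not part of the statistics module, and it is a simplified sketch rather than the library's actual source.

```python
import statistics

def manual_median(data):
    """Hypothetical helper that follows the odd/even rule described above."""
    ordered = sorted(data)  # sorted copy; the input is not modified
    n = len(ordered)
    if n == 0:
        # The standard library raises statistics.StatisticsError here instead.
        raise ValueError("no median for empty data")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd length: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even length: mean of the two middle values

odd = [11, 19, 48, 18, 21]
even = [11, 19, 21, 48]
print(manual_median(odd), statistics.median(odd))    # 19 19
print(manual_median(even), statistics.median(even))  # 20.0 20.0
```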
Empty dataset
If the dataset is empty and you attempt to find its median, the function raises a StatisticsError.
import statistics

data = []
try:
    result = statistics.median(data)
except statistics.StatisticsError as e:
    print(e)
# Output: no median for empty data
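If empty input is expected in your program, one option is to wrap the call and return a fallback value. The sketch below assumes a hypothetical helper named median_or_default; both the name and the default parameter are illustrative, not part of the statistics module.

```python
import statistics

def median_or_default(data, default=None):
    """Hypothetical wrapper: return `default` instead of raising on empty data."""
    try:
        return statistics.median(data)
    except statistics.StatisticsError:
        return default

print(median_or_default([]))         # None
print(median_or_default([], 0))      # 0
print(median_or_default([4, 1, 3]))  # 3
```

Whether a fallback value or an exception is the better choice depends on the caller; silently returning a default can hide missing data.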
All identical values in a dataset
If all the values in the dataset are the same, the median equals that value, regardless of the dataset's length. For an even number of integers, it is returned as a float (for example, 3.0).
import statistics

same_dataset = [3, 3, 3, 3, 3]
value = statistics.median(same_dataset)
print(value)
# Output: 3
Median of a tuple
A tuple behaves just like a list. The function first sorts a copy of its values. Then, if the number of elements is odd, it returns the middle value; if it is even, it returns the average of the two central values.
import statistics

# Named values_tuple to avoid shadowing the built-in tuple type.
values_tuple = (10, 2, 8, 4)
print(statistics.median(values_tuple))
# Output: 6.0
Median of a set
Sets are unordered, but median() accepts them since they are iterable. Internally, Python sorts them before performing the calculation.
import statistics

# Named numbers to avoid shadowing the built-in set type.
numbers = {11, 21, 26, 14, 25}
print(statistics.median(numbers))
# Output: 21
In this code, the function first sorts the elements of the set into a list: [11, 14, 21, 25, 26]. The set itself has no order and is not changed.
The middle value of that sorted list is 21, so 21 is the output.
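One thing to keep in mind is that a set discards duplicate values, so the median of a set can differ from the median of the list it was built from. The sketch below uses made-up sample data (readings) to show the difference.

```python
import statistics

readings = [1, 1, 1, 2, 9]
unique_readings = set(readings)  # duplicates are dropped: {1, 2, 9}

print(sorted(unique_readings))             # [1, 2, 9]
print(statistics.median(readings))         # 1
print(statistics.median(unique_readings))  # 2
```

If duplicates carry meaning in your data, pass the original list or tuple rather than a set.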
Median of a dictionary
If the input is a dictionary, you have two choices:
- Median of the keys
- Median of the values
By default, the median() function calculates the median of the dictionary's keys, because iterating over a dictionary yields its keys.
To calculate the median of the values instead, call the dict.values() method, which returns a view of the values, and pass the result to the median() function.
Median of the keys
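Dictionary keys are often strings. Strings can be sorted, but they can't be averaged. With an even number of string keys, median() tries to average the two middle keys and fails with this error: TypeError: unsupported operand type(s) for /: 'str' and 'int'. With an odd number of string keys, no averaging is needed, so it returns the middle key in sorted order.
import statistics

# Named data to avoid shadowing the built-in dict type.
data = {'k': 21, 'b': 19, 'a': 4, 'y': 18}
print(statistics.median(data))
# TypeError: unsupported operand type(s) for /: 'str' and 'int'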
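The error above comes from averaging two strings, so it only appears when the number of keys is even. The sketch below uses illustrative sample dictionaries to show that an odd number of string keys returns the middle key, and that statistics.median_low() and statistics.median_high() pick one of the two middle keys without averaging them.

```python
import statistics

# Odd number of string keys: the middle key in sorted order is returned.
odd_keys = {'k': 21, 'b': 19, 'a': 4}
print(statistics.median(odd_keys))  # b

# Even number of string keys: median() would raise TypeError,
# but median_low() and median_high() return an actual key.
even_keys = {'k': 21, 'b': 19, 'a': 4, 'y': 18}
print(statistics.median_low(even_keys))   # b
print(statistics.median_high(even_keys))  # k
```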
Median of the values
In our case, the values are integers. Therefore, we can get a view of the values by calling the dictionary's values() method.
The median() function then sorts those values in ascending order and finds the middle value.
import statistics

# Named data to avoid shadowing the built-in dict type.
data = {'k': 21, 'b': 19, 'a': 4, 'y': 18}
values = data.values()
print(values)
# Output: dict_values([21, 19, 4, 18])
print(statistics.median(values))
# Output: 18.5
The output is 18.5 because the sorted values are [4, 18, 19, 21], and the average of the two central values is (18 + 19) / 2 = 18.5.