This article illustrates, in a fun way, the difference between the mean and the median, and shows when one of them is a poor description of a data set.

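The mean of a data set is obtained by summing the values and dividing by the number of elements. The median, on the other hand, is the value that occupies the central position once the data are sorted. When the number of elements is even there is no single central value, and the median is conventionally taken as the average of the two central ones.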
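As a minimal sketch of these two definitions, the Python snippet below computes both by hand. The function names `manual_mean` and `manual_median` and the sample numbers are made up for illustration; the standard library's `statistics.mean` and `statistics.median` do the same job.

```python
from statistics import median  # used only to cross-check the manual version


def manual_mean(values):
    """Sum the values and divide by how many there are."""
    return sum(values) / len(values)


def manual_median(values):
    """Return the central value of the sorted data.

    With an even number of elements, average the two central values
    (the usual convention).
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Illustrative numbers only, not taken from the article.
odd_sample = [3, 1, 4, 1, 5]
even_sample = [3, 1, 4, 1, 5, 9]

print(manual_mean(odd_sample))     # 2.8
print(manual_median(odd_sample))   # 3
print(manual_median(even_sample))  # 3.5
```

Note that the median only looks at the position of values in the sorted list, while the mean uses the size of every value. That is why the two react so differently to an extreme value.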

To clearly illustrate the difference between mean and median, we will use a hypothetical example. Let’s say there are 10 people in a bar:

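The average annual salary of those present in the bar is €29,700. With an even number of people there is no single central value: the median lies between the two central salaries, €27,000 and €29,000, and is conventionally taken as their average, €28,000.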

And then Amancio Ortega walks in (ta-da…).

Since the richest person in the bar is now vastly wealthier than the rest, the average annual salary has skyrocketed to €104 million. Yes, Amancio earns enough to do that (roughly €1.1 billion a year, working back from the new mean), just in dividends.

Meanwhile, the median barely moves. With 11 people it is now a single value, the sixth in the ordered list, which is Pilar's €29,000.
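The effect can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the ten salaries below are hypothetical, chosen so that they match the summary figures quoted above (mean €29,700, central values €27,000 and €29,000), and the newcomer's income is a hypothetical value back-computed from the €104 million mean. None of these are real individual figures.

```python
from statistics import mean, median

# Hypothetical salaries (in euros) for the 10 people in the bar.
# Assumption: invented to match the quoted mean and central values.
bar = [15_000, 18_000, 22_000, 25_000, 27_000,
       29_000, 32_000, 36_000, 43_000, 50_000]

# Hypothetical income for the newcomer, back-computed so that the
# mean of all 11 people comes out at the quoted 104 million euros.
newcomer = 1_143_703_000

before = (mean(bar), median(bar))
bar_after = bar + [newcomer]
after = (mean(bar_after), median(bar_after))

print(f"Before: mean = {before[0]:,.0f}, median = {before[1]:,.0f}")
print(f"After:  mean = {after[0]:,.0f}, median = {after[1]:,.0f}")
# Before: mean = 29,700, median = 28,000
# After:  mean = 104,000,000, median = 29,000
```

One extra person multiplies the mean by about 3,500, while the median moves by only €1,000.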

Most people would agree that €29,000 is a better representation of the annual salary in Spain than €104 million.

The example aims to illustrate that the mean is a “summary” figure that is affected by extreme values, whereas the median is much less so. Therefore, for skewed data distributions, the median is a better measure of central tendency.

So let’s start talking less about means and more about medians.

Inspiration: https://introductorystats.wordpress.com