## Median and Mean Absolute Error

Interactive histogram with mean absolute error graph

Recall also that in our general notation, we have a data set
with *n* points arranged in a frequency
distribution with *k* classes. The class mark of the *i*'th
class is denoted *x*_{i}; the frequency
of the *i*'th class is denoted *f*_{i}
and the relative frequency of the *i*'th class is denoted *p*_{i}
= *f*_{i} / *n*.

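Recall that the median is the value
that is halfway through the ordered data set. Specifically, if *n*
is odd then the median is the value with rank (*n* + 1)/2, namely *x*_{j} where *j*
is the smallest integer satisfying

$$f_1 + f_2 + \cdots + f_j \ge \frac{n + 1}{2};$$

if *n* is even then the median is (*x*_{j} + *x*_{l})/2,
where *j* and *l* are the smallest integers
satisfying

$$f_1 + f_2 + \cdots + f_j \ge \frac{n}{2}, \qquad f_1 + f_2 + \cdots + f_l \ge \frac{n}{2} + 1.$$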
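The rank-based definition of the median can be sketched in Python as follows. This is only an illustration: the function name `median_from_frequencies` and its arguments are hypothetical, and the class marks are assumed to be listed in increasing order.

```python
def median_from_frequencies(marks, freqs):
    """Median of a frequency distribution.

    marks: class marks x_1 < x_2 < ... < x_k (assumed sorted ascending)
    freqs: frequencies f_1, ..., f_k (non-negative integers)
    """
    n = sum(freqs)

    def value_with_rank(r):
        # Find the first class whose cumulative frequency reaches rank r.
        cumulative = 0
        for x, f in zip(marks, freqs):
            cumulative += f
            if cumulative >= r:
                return x
        raise ValueError("rank exceeds the number of data points")

    if n % 2 == 1:
        # Odd n: the single value with rank (n + 1) / 2.
        return value_with_rank((n + 1) // 2)
    # Even n: average the values with ranks n / 2 and n / 2 + 1.
    return (value_with_rank(n // 2) + value_with_rank(n // 2 + 1)) / 2
```

For example, with class marks 1, 2, 4 and frequencies 1, 1, 2, the values with ranks 2 and 3 are 2 and 4, so the sketch returns their average.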

A measure of center and the corresponding measure of spread
are sometimes best thought of in the context of an *error
function*. Generally, the error function gives a measure of
the overall error when a number *t* is used to represent
the entire distribution. Thus, the best measure of center,
relative to this function, is the value of *t* that *minimizes*
the error function, and the *minimum value* of the error
function is the corresponding measure of spread.

In the previous section, for example, we saw that if we start with the mean square error function, then the best measure of center is the mean and the minimum error is the variance. If we start with the root mean square error function, then the best measure of center is again the mean, but the minimum error is the standard deviation.

In this section, we will explore an error function that seems very natural at first, and indeed is related to the median, but upon closer inspection has some definite drawbacks. The main point of this section is that the mean square error function has very special properties that make it the compelling choice. It is important that you understand this point, because other mean square error functions occur throughout statistics.

**Mean Absolute Error**

The *mean absolute error* function is given by

$$\text{MAE}(t) = \sum_{i=1}^{k} |x_i - t| \, p_i$$

As the name suggests, the mean absolute error is a weighted average of the absolute errors, with the relative frequencies as the weight factors.
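As a minimal Python sketch of this weighted average (the function name `mae` and its arguments are illustrative, not part of the applet), the relative frequencies can be computed from the raw frequencies and used as weights:

```python
def mae(t, marks, freqs):
    """Mean absolute error when t represents the distribution.

    marks: class marks x_1, ..., x_k
    freqs: frequencies f_1, ..., f_k
    """
    n = sum(freqs)
    # Weight each absolute error |x_i - t| by the relative frequency f_i / n.
    return sum(abs(x - t) * f / n for x, f in zip(marks, freqs))
```

For instance, `mae(2, [1, 3], [1, 1])` averages the absolute errors |1 - 2| and |3 - 2| with weight 1/2 each.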

Recall also that we can think of the relative frequency
distribution as the probability
distribution of a random
variable *X* that gives the mark of the class
containing a randomly chosen value from the data set. With this
interpretation, MAE(*t*) is the first absolute moment of *X*
about *t*:

$$\text{MAE}(t) = E\left[\,|X - t|\,\right]$$

MAE(*t*) may seem to be the simplest measure of overall
error when *t* is used to represent the distribution.

As before, you can construct a frequency distribution and
histogram for a continuous variable *x* by clicking on the
horizontal axis from 0.1 to 5.0. In the applet above, when you
click on points in the left graph to generate the distribution,
MAE is shown in the right graph.

**1.** Note that MAE(*t*) is a
continuous function of *t* for a fixed data set (that is,
for given values of *x*_{i} and *p*_{i})
and its graph is composed of line segments.
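One way to see why the graph consists of line segments, sketched here as a hint rather than a full proof: MAE(*t*) is the weighted average of the absolute errors |*x*_{i} − *t*| with weights *p*_{i}, and each such term is linear in *t* on either side of *x*_{i}. For *t* strictly between two consecutive class marks, the slope is therefore constant:

$$\frac{d}{dt}\,\text{MAE}(t) = \sum_{i:\, x_i < t} p_i \;-\; \sum_{i:\, x_i > t} p_i$$

The slope can change only at a class mark *x*_{i}, where it increases by 2*p*_{i}.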

**2.** In the applet, click on two
distinct points to generate a distribution with two distinct
points. Note the shape of the MAE graph.

**3.** Explicitly compute MAE(*t*)
for the distribution in Exercise 2 and show that you get the same
function as the one graphed in the applet.

Exercises 2 and 3 show a serious flaw in the mean absolute
error function--in general, there does not exist a unique value
of *t* minimizing MAE(*t*)!
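To make the flaw concrete, consider a worked case under the assumption that the data set consists of two distinct points *a* < *b*, each with frequency 1 (so each has relative frequency 1/2). Then

$$\text{MAE}(t) = \tfrac{1}{2}|a - t| + \tfrac{1}{2}|b - t| =
\begin{cases}
\dfrac{a + b}{2} - t, & t < a \\[1ex]
\dfrac{b - a}{2}, & a \le t \le b \\[1ex]
t - \dfrac{a + b}{2}, & t > b
\end{cases}$$

The function is constant on the whole interval [*a*, *b*], so every point of that interval minimizes it.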

**4.** Click on additional points to
generate a more complicated distribution. Note how the shape of
the MAE graph changes as you add points. Try to formulate a
conjecture about the set of *t* values that minimize MAE(*t*).

In Exercise 4, you should have observed the following general
behavior of the mean absolute error function: If the number of
points *n* is odd, then the median *x*_{j}
(in the notation above) is the unique value of *t* that
minimizes MAE(*t*). However, if *n* is even, then
the set of values minimizing MAE(*t*) is the "median
interval" [*x*_{j}, *x*_{l}].
If *x*_{j} = *x*_{l},
then once again the median is the unique value of *t*
minimizing MAE(*t*). However, if *x*_{j}
and *x*_{l} are different, then the
median (*x*_{j} + *x*_{l}) / 2
has no better claim as the center of the distribution than any other point in the median interval!
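The behavior described here can be checked numerically. The following Python sketch works directly with a list of raw data values rather than a frequency table; the names `median_interval` and `mae_of_data` are hypothetical helpers introduced only for this illustration.

```python
def median_interval(data):
    """Return the endpoints (x_j, x_l) of the median interval."""
    s = sorted(data)
    n = len(s)
    if n % 2 == 1:
        # Odd n: the interval collapses to the single middle value.
        middle = s[n // 2]
        return (middle, middle)
    # Even n: the values with ranks n / 2 and n / 2 + 1.
    return (s[n // 2 - 1], s[n // 2])


def mae_of_data(t, data):
    """Mean absolute error of t for a list of raw data values."""
    return sum(abs(x - t) for x in data) / len(data)


data = [1, 2, 4, 7]
low, high = median_interval(data)

# MAE is the same at both endpoints and at the midpoint of the interval...
inside = [mae_of_data(t, data) for t in (low, (low + high) / 2, high)]
# ...and strictly larger just outside it.
outside = [mae_of_data(t, data) for t in (low - 0.5, high + 1)]
```

With this data set the median interval is [2, 4], and MAE takes the same value at 2, 3, and 4, while the values at 1.5 and 5 are larger.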

The minimum value of MAE is referred to as the *mean
absolute deviation* or MAD.
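As a small illustration, the MAD can be computed by evaluating the mean absolute error at the median, since the median always lies in the set of minimizers. This Python sketch assumes raw data values as input; the function name `mean_absolute_deviation` is illustrative.

```python
import statistics


def mean_absolute_deviation(data):
    """Minimum value of MAE, obtained by evaluating it at the median."""
    center = statistics.median(data)
    # Average the absolute errors |x - median| over all data points.
    return sum(abs(x - center) for x in data) / len(data)
```

For the data 1, 2, 3, 4, 100 the median is 3, and the absolute errors 2, 1, 0, 1, 97 average to 101/5.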

In the applet, the median ± MAD is drawn in the histogram,
analogous to the mean ± standard deviation bar in the previous
section. In the graph of the MAE function, a vertical red line is
drawn from the median on the *x*-axis to the graph of MAE;
the height of this line is the MAD.

**5.** Reset the applet and click on
points to generate a distribution. Note the general behavior of
the MAE function described in the previous paragraph.

**6.** Try to prove algebraically that
the MAE function has the behavior described above.

**7.** Construct a distribution of each
of the types indicated below. In each case, note the position and
size of the boxplot and the shape of the MAE graph.

- A uniform distribution.
- A symmetric, unimodal distribution.
- A unimodal distribution that is skewed right.
- A unimodal distribution that is skewed left.
- A symmetric bimodal distribution.
- A *U*-distribution.
