
## What is best?

I have been learning a lot browsing through John D Cook’s blog posts. I found “Are men better than women at chess” particularly illuminating, as well as its associated hyperlinks, especially the 1961 paper he refers to (available in full text).

I reproduce the paper’s equations:

Consider a random sample of size $n$ drawn from a standard normal distribution ($x_j \sim N(0,1)$), and let $\phi (x)$ be the probability density function and $\Phi (x)$ the corresponding cumulative distribution function. To determine the order statistics, consider the ordered sample:

$x_{(1)} \leq x_{(2)}\leq \ldots \leq x_{(n)}$

The probability distribution of the $k$-th order statistic can be derived (as shown in the paper) as follows:

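• the number of ways of choosing which observations fall below, at and above the $k$-th order statistic: $\frac{n!}{(k-1)!(n-k)!}$ (the $n!$ orderings of the sample are divided by the $(k-1)!$ ways of arranging the observations below the $k$-th order statistic and the $(n-k)!$ ways of arranging the observations above it)
• the density at $y$ follows from requiring that $k-1$ observations are less than $y$ (probability $\Phi (y)^{k-1}$), $n-k$ observations are greater than $y$ (probability $(1-\Phi (y))^{n-k}$) and one observation falls at $y$ (density $\phi (y)$)
• this leads to the probability density function: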

$f(y)= \frac{n!}{(k-1)!(n-k)!}\Phi (y)^{k-1} (1-\Phi(y))^{n-k}\phi (y)$

• the corresponding cumulative distribution function:

$F(y)= \frac{n!}{(k-1)!(n-k)!} \int_{-\infty}^y \Phi (x)^{k-1} (1-\Phi (x))^{n-k} \phi (x) \, dx$
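The two formulas above can be checked numerically. The following is a minimal sketch in Python using only the standard library; the function names `order_stat_pdf` and `order_stat_cdf`, the lower integration limit of -10 and the trapezoidal rule are my own choices for illustration, not something taken from the paper.

```python
from math import factorial
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def order_stat_pdf(y, k, n):
    """Density f(y) of the k-th order statistic of n standard normals."""
    # n! / ((k-1)! (n-k)!) is always an integer, so integer division is exact.
    # For very large n this coefficient no longer fits in a float.
    coeff = factorial(n) // (factorial(k - 1) * factorial(n - k))
    Phi = STD_NORMAL.cdf(y)
    return coeff * Phi ** (k - 1) * (1 - Phi) ** (n - k) * STD_NORMAL.pdf(y)


def order_stat_cdf(y, k, n, lower=-10.0, steps=20000):
    """F(y) by trapezoidal integration of f from `lower` (standing in for -inf) to y."""
    if y <= lower:
        return 0.0
    h = (y - lower) / steps
    total = 0.5 * (order_stat_pdf(lower, k, n) + order_stat_pdf(y, k, n))
    for i in range(1, steps):
        total += order_stat_pdf(lower + i * h, k, n)
    return total * h


# The density should integrate to (approximately) 1 over the real line.
print(order_stat_cdf(10.0, 3, 10))
```

The lower limit of -10 is an approximation to $-\infty$; the standard normal density is negligible beyond that point.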

The $n$-th order statistic is the maximum of the sample. As observed in the paper, and as is evident from the above, setting $k=n$ yields:

$F(y)=\int_{0}^{\Phi (y)}n\Phi ^{n-1}\,d\Phi =\Phi (y)^n$
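The intermediate step can be spelled out as follows (the substitution variable $u$ is my own notation, not the paper's). With $k=n$ the coefficient becomes $\frac{n!}{(n-1)!\,0!}=n$ and the factor $(1-\Phi (x))^{n-k}$ equals 1, so

$F(y)= n \int_{-\infty}^y \Phi (x)^{n-1} \phi (x) \, dx$

Substituting $u=\Phi (x)$, $du=\phi (x)\,dx$ maps the limits of integration to $0$ and $\Phi (y)$:

$F(y)=\int_{0}^{\Phi (y)} n u^{n-1}\,du=\Phi (y)^n$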

Using the paper’s notation, let $\alpha = F(y)$ (e.g. $\alpha=0.5$ for the median). Solving $\Phi (y)^n=\alpha$ for $y$ gives:

$y = \Phi^{-1} (\alpha^{1/n})$
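This quantile formula is easy to evaluate directly. Here is a minimal sketch in Python, where `max_quantile` is a name I made up for illustration and `NormalDist().inv_cdf` from the standard library plays the role of $\Phi^{-1}$:

```python
from statistics import NormalDist


def max_quantile(alpha, n):
    """Return y with P(max of n standard normals <= y) = alpha."""
    return NormalDist().inv_cdf(alpha ** (1.0 / n))


# Median (alpha = 0.5) of the maximum for the sample sizes used in this post.
for n in (100, 1000, 10000):
    print(n, round(max_quantile(0.5, n), 3))
```

As a sanity check, raising $\Phi$ of the returned value to the power $n$ should give back $\alpha$.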

I explored these considerations both by simulation and analytically. The 100 samples of sizes 100, 1000 and 10000 yielded the following:
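A simulation along these lines can be sketched in Python. This is not the code behind the original figures: it assumes 100 samples were drawn for each sample size, and the function name `simulate_maxima` and the fixed seed are my own choices.

```python
import random
from statistics import NormalDist, median


def simulate_maxima(n, trials=100, seed=0):
    """Draw `trials` samples of n standard normals and return each sample's maximum."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    return [max(rng.gauss(0.0, 1.0) for _ in range(n)) for _ in range(trials)]


for n in (100, 1000, 10000):
    maxima = simulate_maxima(n)
    # Analytic median of the maximum: Phi^{-1}(0.5^(1/n)).
    analytic = NormalDist().inv_cdf(0.5 ** (1.0 / n))
    print(f"n={n:>5}  simulated median={median(maxima):.3f}  analytic median={analytic:.3f}")
```

With only 100 maxima per sample size, the simulated median should be close to the analytic one but will not match it exactly.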

The analytic relationship, with the simulation data overlaid, is shown in the following graphics (the second with a logarithmic horizontal axis):

In relation to John D Cook’s observations, the gap between the medians of the maxima of samples of size $n$ and size $0.1n$ is shown in the next graphic:
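The same gap can be computed from the analytic median $\Phi^{-1}(0.5^{1/n})$. The following is a minimal sketch, with `median_of_max` a hypothetical helper name and the list of sample sizes chosen by me for illustration:

```python
from statistics import NormalDist


def median_of_max(n):
    """Analytic median of the maximum of n standard normals."""
    return NormalDist().inv_cdf(0.5 ** (1.0 / n))


# Gap between the median maximum of a sample of size n and one of size 0.1 n.
for n in (100, 1000, 10000, 100000, 1000000):
    gap = median_of_max(n) - median_of_max(n / 10)
    print(f"n={n:>7}  gap={gap:.3f}")
```

Under this formula the gap stays positive for every $n$ in the list and narrows slowly as $n$ grows.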