A Central Limit Theorem is any of a set of weak-convergence results in probability theory. They all express the fact that a sum of many independent and identically distributed random variables tends to be distributed according to a particular "attractor distribution". The most important and famous result is called The Central Limit Theorem, which states that if the variables have a finite variance, then their sum will be approximately normally distributed (i.e., following a Gaussian distribution).
Since many real processes yield distributions with finite variance, this explains the ubiquity of the normal probability distribution.
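The statement above can be illustrated with a small simulation. The following is a minimal sketch, not a proof: it assumes Uniform(0, 1) summands (mean 1/2, variance 1/12), n = 30 terms per sum and 20000 repetitions, all chosen only for illustration. It compares the empirical distribution function of the standardised sum with the standard normal distribution function.

```python
import math
import random

def standardized_sum(n, rng):
    """Standardised sum of n Uniform(0, 1) draws (illustrative choice of distribution)."""
    mu = 0.5                       # mean of Uniform(0, 1)
    sigma = math.sqrt(1.0 / 12.0)  # standard deviation of Uniform(0, 1)
    total = sum(rng.random() for _ in range(n))
    return (total - n * mu) / (sigma * math.sqrt(n))

def empirical_cdf(samples, x):
    """Fraction of samples that are <= x."""
    return sum(1 for s in samples if s <= x) / len(samples)

def normal_cdf(x):
    """Distribution function of N(0, 1), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [standardized_sum(30, rng) for _ in range(20000)]

# Print the two distribution functions side by side at a few points.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, round(empirical_cdf(samples, x), 3), round(normal_cdf(x), 3))
```

The two printed columns should agree closely, up to sampling noise. Replacing the uniform draws with another finite-variance distribution (and adjusting mu and sigma accordingly) should give a similar picture.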
Several generalizations for finite variance exist which do not require identical distribution but incorporate some condition which guarantees that none of the variables exerts a much larger influence than the others. Two such conditions are the Lindeberg condition and the Lyapunov condition. Other generalizations even allow some "weak" dependence of the random variables. Also, a generalization due to Gnedenko and Kolmogorov states that the sum of a number of independent random variables with power-law tail distributions decreasing as 1/|x|^(α+1) with 0 < α < 2 (and therefore having infinite variance) tends to a stable Lévy distribution (a Lévy skew alpha-stable distribution) as the number of variables grows. This article will only be concerned with the central limit theorem as it applies to distributions with finite variance.
History:
“The central limit theorem has an interesting history. The first version of this theorem was postulated by the French-born English mathematician Abraham de Moivre, who, in a remarkable article published in 1733, used the normal distribution to approximate the distribution of the number of heads resulting from many tosses of a fair coin. This finding was far ahead of its time, and was nearly forgotten until the famous French mathematician Pierre-Simon Laplace rescued it from obscurity in his monumental work Théorie Analytique des Probabilités, which was published in 1812. Laplace expanded De Moivre's finding by approximating the binomial distribution with the normal distribution. But as with De Moivre, Laplace's finding received little attention in his own time. It was not until the nineteenth century was at an end that the importance of the central limit theorem was discerned, when, in 1901, Russian mathematician Aleksandr Lyapunov defined it in general terms and proved precisely how it worked mathematically. Nowadays, the central limit theorem is considered to be the unofficial sovereign of probability theory. ”
See Bernstein (1945) for a historical discussion focusing on the work of Pafnuty Chebyshev and his students Andrey Markov and Aleksandr Lyapunov that led to the first proofs of the C.L.T. in a general setting.
Proof of the central limit theorem:
For a theorem of such fundamental importance to statistics and applied probability, the central limit theorem has a remarkably simple proof using characteristic functions. It is similar to the proof of a (weak) law of large numbers. For any random variable, Y, with zero mean and unit variance (var(Y) = 1), the characteristic function of Y is, by Taylor's theorem,
\varphi_Y(t) = 1 - {t^2 \over 2} + o(t^2), \quad t \rightarrow 0
where o(t^2) is "little o notation" for some function of t that goes to zero more rapidly than t^2. Letting Y_i be (X_i − μ)/σ, the standardised value of X_i, it is easy to see that the standardised mean of the observations X_1, X_2, ..., X_n is just
Z_n = \frac{n\overline{X}_n-n\mu}{\sigma\sqrt{n}} = \sum_{i=1}^n {Y_i \over \sqrt{n}}.
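The identity above is pure algebra, and it can be checked numerically. The sketch below assumes illustrative values μ = 3, σ = 2 and n = 50, with normally distributed observations; none of these choices matters for the identity itself, and the function names are hypothetical.

```python
import math
import random

def standardized_mean(xs, mu, sigma):
    """Left-hand side: (n * mean(xs) - n * mu) / (sigma * sqrt(n))."""
    n = len(xs)
    xbar = sum(xs) / n
    return (n * xbar - n * mu) / (sigma * math.sqrt(n))

def scaled_sum_of_standardized(xs, mu, sigma):
    """Right-hand side: sum over i of Y_i / sqrt(n), with Y_i = (X_i - mu) / sigma."""
    n = len(xs)
    return sum(((x - mu) / sigma) / math.sqrt(n) for x in xs)

rng = random.Random(1)  # fixed seed so the run is repeatable
mu, sigma = 3.0, 2.0    # illustrative parameters, not taken from the text
xs = [rng.gauss(mu, sigma) for _ in range(50)]

lhs = standardized_mean(xs, mu, sigma)
rhs = scaled_sum_of_standardized(xs, mu, sigma)
print(lhs, rhs)  # the two values agree up to floating-point rounding
```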
By simple properties of characteristic functions, the characteristic function of Z_n is
\left[\varphi_Y\left({t \over \sqrt{n}}\right)\right]^n = \left[ 1 - {t^2 \over 2n} + o\left({t^2 \over n}\right) \right]^n \, \rightarrow \, e^{-t^2/2}, \quad n \rightarrow \infty.
But this limit is just the characteristic function of a standard normal distribution, N(0,1), and the central limit theorem follows from the Lévy continuity theorem, which confirms that the convergence of characteristic functions implies convergence in distribution.
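The limit used in the proof can be watched numerically for a concrete example. The sketch below assumes, purely for illustration, that Y takes the values +1 and −1 with probability 1/2 each; this variable has zero mean and unit variance, and its characteristic function is cos(t). The code evaluates [φ_Y(t/√n)]^n at the arbitrary point t = 1.5 for growing n and records its distance from e^{−t²/2}.

```python
import math

def phi_y(t):
    """Characteristic function of Y = +1 or -1 with probability 1/2 each: E[exp(itY)] = cos(t)."""
    return math.cos(t)

def phi_z(t, n):
    """Characteristic function of Z_n = sum of Y_i / sqrt(n) for n independent copies of Y."""
    return phi_y(t / math.sqrt(n)) ** n

def phi_normal(t):
    """Characteristic function of the standard normal distribution N(0, 1)."""
    return math.exp(-t * t / 2.0)

t = 1.5  # arbitrary evaluation point chosen for illustration
gaps = []
for n in (1, 10, 100, 1000, 10000):
    gap = abs(phi_z(t, n) - phi_normal(t))
    gaps.append(gap)
    print(n, gap)  # the gap shrinks as n grows
```

For this particular Y the gap shrinks roughly in proportion to 1/n, which is consistent with the o(t²/n) term in the expansion vanishing in the limit.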