# Introduction

In the previous chapter we discussed probability theory, which we expressed in terms of a variable $X$. We defined $X$ as a set of realizations of some process, which in turn is governed by rules of probability regarding potential outcomes in the sample space.

The variables we have been discussing are what are called random variables, meaning that their values are governed by a probability distribution. As we noted before, broadly speaking, there are two kinds of random variables: discrete and continuous.

Discrete variables can take on any one of several distinct, mutually exclusive values. For example:

• A congressperson's ideology score {0, 1, 2, 3, ..., 100}
• An individual's political affiliation {Democrat, Republican, Independent}
• Whether or not a country is a member of the European Union (true/false)

A continuous variable can take on any value in its range. For example:

• Individual income
• National population

This chapter focuses on a family of continuous distributions that are the most widely used in statistical inference and are found in a wide variety of contexts, both applied and theoretical. The Normal distribution is the well-known "bell-shaped curve" that most students first encounter in the artificial context of academic testing. Due to a powerful result called the Central Limit Theorem, however, it also occurs in a wide variety of uncontrolled situations where the value of a random variable is determined by the average effect of a large number of other random variables with any combination of distributions. The χ², t and F distributions can be derived from various combinations of normally distributed variables, and are used extensively in statistical inference and applied statistics, so it is useful to understand them in some depth.

## Need to do

<a href="User:Philip Schrodt">Philip Schrodt</a> 06:57, 13 July 2011 (PDT)

• Probably need to get most of the probability chapter---which at the moment hasn't been started---written before this one. In particular, will the PDF and CDF be defined there or here?
• Add some of the discrete distributions, particularly the binomial
• Do we add---or link to on another page---the derivation of the mean and standard errors for these: that code is available in CCL in an assortment of places on the web

# The Normal Distribution

We are all used to seeing normal distributions described, and to hearing that something is "normally distributed." We know that a normal distribution is "bell-shaped," and symmetrical, and probably that it has some mean and some standard deviation.

Formally, if X is a normally distributed variate with mean μ and variance σ², then:

<img _fckfakelement="true" _fck_mw_math="f(x) = \frac{1}{\sigma \sqrt{2\pi}} \text{exp} \left( - \frac{(x - \mu)^{2}}{2 \sigma^{2}} \right)" src="/images/math/0/5/c/05c01fdab44e6d59e0edc24028e1206a.png" />.

We denote this X ∼ N(μ, σ²), and say X is distributed normally with mean μ and variance σ². The symbol φ is often used as a shorthand to represent the normal density above:

<img _fckfakelement="true" _fck_mw_math="X \sim \phi_{\mu, \sigma^{2}}" src="/images/math/d/6/4/d64347ee6feb5ed546c6c65e3674dfb5.png" />.

The corresponding normal CDF -- which is the probability of a normal random variate taking on a value less than or equal to some specified number -- is (as always) the integral of the density from −∞ up to that value. This integral has no closed-form solution in terms of elementary functions, so we typically just write:

<img _fckfakelement="true" _fck_mw_math="F(x) \equiv \Phi_{\mu, \sigma^{2}}(x) = \int_{-\infty}^{x} \phi_{\mu, \sigma^{2}}(t) \, dt." src="/images/math/4/b/0/4b02e1a28b9bbc37b9bec48dcc04b239.png" />
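Although Φ has no elementary closed form, it is easy to evaluate numerically. As an illustration, here is a minimal Python sketch (not part of the original text) that computes the normal PDF directly from the formula above and the CDF via the standard library's error function, to which Φ is related by a change of variables:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density f(x) with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal CDF Phi(x), evaluated via the error function erf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

print(normal_pdf(0.0))                       # 1/sqrt(2*pi), about 0.3989
print(normal_cdf(0.0))                       # 0.5: half the mass lies below the mean
print(normal_cdf(1.96) - normal_cdf(-1.96))  # about 0.95
```

Any statistical package provides equivalent functions; the point is only that the CDF must be computed numerically rather than written down in elementary terms.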

The figures below show several normal density functions, along with the corresponding cumulative distribution functions.

<img src="/images/thumb/5/55/StatDist.Normals.png/512px-StatDist.Normals.png" _fck_mw_filename="StatDist.Normals.png" _fck_mw_width="512" alt="StatDist.Normals.png" />

<img src="/images/thumb/b/b7/StatDist.NormalCDFs.png/512px-StatDist.NormalCDFs.png" _fck_mw_filename="StatDist.NormalCDFs.png" _fck_mw_width="512" alt="Normal cumulative distribution functions" />

## Bases for the Normal Distribution

The most common justification for the normal distribution has its roots in the central limit theorem. Consider i = 1, 2, ..., N independent, real-valued random variates $X_i$, each with finite mean $\mu_i$ and variance <img _fckfakelement="true" _fck_mw_math="\sigma^{2}_{i} > 0" src="/images/math/8/e/3/8e3e18b02caaa63b535d363869d670c9.png" />. If we consider a new variable $X$ defined as the sum of these variables:

<img _fckfakelement="true" _fck_mw_math="X = \sum_{i=1}^{N} X_{i}" src="/images/math/a/7/5/a752c37c7aaf9b5d42055a00f9b5fd37.png" />

then we know that

<img _fckfakelement="true" _fck_mw_math=" \text{E}(X) = \sum_{i=1}^{N} \mu_{i} " src="/images/math/d/a/7/da7eb476d715554622bbefea75301103.png" />

and

<img _fckfakelement="true" _fck_mw_math=" \text{Var}(X) = \sum_{i=1}^{N} \sigma^{2}_{i} " src="/images/math/7/7/d/77de1add6aafc0249d728571a9683b18.png" />

The central limit theorem states that, under quite general conditions, the standardized sum converges in distribution to a standard normal:

<img _fckfakelement="true" _fck_mw_math=" \frac{X - \text{E}(X)}{\sqrt{\text{Var}(X)}} \overset{D}{\rightarrow} N(0,1) \text{ as } N \rightarrow \infty " src="/images/math/8/a/1/8a127d6ed5c5e2dd71ebdf34d8682057.png" />

where the notation <img _fckfakelement="true" _fck_mw_math="\overset{D}{\rightarrow}" src="/images/math/0/9/3/0931fec8f6726354023e382d5c71be2c.png" /> indicates convergence in distribution. That is, as N gets sufficiently large, the distribution of the (standardized) sum of N independent random variates with finite means and variances converges to a normal distribution. As such, we often think of a normal distribution as being appropriate when the observed variable X can take on a range of continuous values, and when the observed value of X can be thought of as the sum of a large number of relatively small, independent shocks or perturbations.
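This convergence is easy to see by simulation. The following Python sketch (the choice of uniform summands, the sample sizes, and the seed are all arbitrary illustrative assumptions) sums uniform variates, standardizes the sum, and checks that the result behaves like an N(0,1) draw:

```python
import math
import random

random.seed(42)  # arbitrary seed, for reproducibility

def standardized_sum(n_terms):
    """Sum n_terms independent Uniform(0,1) draws, then standardize the sum."""
    s = sum(random.random() for _ in range(n_terms))
    mean = n_terms * 0.5       # each uniform has mean 1/2...
    var = n_terms / 12.0       # ...and variance 1/12
    return (s - mean) / math.sqrt(var)

# Draw many standardized sums; by the CLT they should behave like N(0,1).
draws = [standardized_sum(30) for _ in range(20000)]
m = sum(draws) / len(draws)
v = sum((d - m) ** 2 for d in draws) / len(draws)
inside_1sd = sum(abs(d) < 1.0 for d in draws) / len(draws)
print(round(m, 2), round(v, 2), round(inside_1sd, 2))  # near 0, 1, and 0.68
```

The fraction of draws within one standard deviation of zero lands near 0.683, the normal benchmark, even though the summands themselves are uniform rather than normal.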

## Properties of the Normal Distribution

• A normal variate X has support in <img _fckfakelement="true" _fck_mw_math="\mathfrak{R}" src="/images/math/6/1/0/610bc52ec5a62efd154a01deb92a0d5c.png" />.
• The normal is a two-parameter distribution, where <img _fckfakelement="true" _fck_mw_math="\mu \in (-\infty, \infty)" src="/images/math/8/9/0/8907d5e8f9cb4328f76778e16b69fba7.png" /> and <img _fckfakelement="true" _fck_mw_math="\sigma^{2} \in (0, \infty)" src="/images/math/c/1/0/c100ff23356ea61cfd105a64d8774c53.png" />.
• The normal distribution is always symmetrical (its third central moment is zero) and mesokurtic (its excess kurtosis is zero).
• The normal distribution is preserved under a linear transformation. That is, if X ∼ N(μ, σ²), then aX + b ∼ N(aμ + b, a²σ²). (Why? Recall our earlier results on μ and σ².)

## The Standard Normal Distribution

One linear transformation is especially useful:

<img _fckfakelement="true" _fck_mw_math=" \begin{align} b & = \frac{-\mu}{\sigma} \\ a & = \frac{1}{\sigma} \end{align} " src="/images/math/8/e/e/8ee8272b17591d286ca783dd0a7b5dd0.png" />.

This yields:

<img _fckfakelement="true" _fck_mw_math=" \begin{align} aX + b & \sim N(a\mu+b, a^{2} \sigma^{2}) \\ & \sim N(0,1) \end{align} " src="/images/math/0/4/0/040ae3bbba3a7ab522dca56833ad2722.png" />

This is the standard normal distribution. We often denote its density <img _fckfakelement="true" _fck_mw_math="\phi(\cdot)" src="/images/math/5/2/d/52d16e95602c985d5f23b36ddc663415.png" />, and say that such a variable "is distributed as standard normal." We can also get this by transforming ("standardizing") the normal variate X:

• If X ∼ N(μ, σ²), then <img _fckfakelement="true" _fck_mw_math="Z = \frac{(X - \mu)}{\sigma} \sim N(0,1)" src="/images/math/8/1/e/81e060afad46dc085b84bc31bf94454c.png" />.
• The density function then reduces to:

<img _fckfakelement="true" _fck_mw_math=" f(z) \equiv \phi(z) = \frac{1}{\sqrt{2\pi}} \text{exp} \left[ - \frac{z^{2}}{2} \right] " src="/images/math/e/3/f/e3f634ea319de07e04630c939d8e364f.png" />

Similarly, we often write the CDF for the standard normal as <img _fckfakelement="true" _fck_mw_math="\Phi(\cdot)" src="/images/math/8/9/d/89d767697c1931e19b576aef0e242f9b.png" />.
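A quick simulation confirms that standardizing a normal variate yields (approximately) standard normal draws. In this Python sketch, the particular μ, σ, sample size, and seed are arbitrary illustrative assumptions:

```python
import random

random.seed(7)  # arbitrary seed

mu, sigma = 10.0, 3.0  # arbitrary illustrative parameters
xs = [random.gauss(mu, sigma) for _ in range(50000)]

# Standardize: Z = (X - mu) / sigma should be (approximately) N(0, 1).
zs = [(x - mu) / sigma for x in xs]

z_mean = sum(zs) / len(zs)
z_var = sum((z - z_mean) ** 2 for z in zs) / len(zs)
print(round(z_mean, 2), round(z_var, 2))  # near 0 and 1
```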

## Why do we care about the normal distribution?

The normal distribution's importance lies in its relationship to the central limit theorem. As we'll discuss at more length later, the central limit theorem means that as one's sample size increases, the distribution of sample means (or other estimates) approaches a normal distribution.

## Additional points needed on the normal

<a href="User:Philip Schrodt">Philip Schrodt</a> 07:00, 13 July 2011 (PDT)

• More extended discussion of the CLT, and a note that if we are dealing with a data generating process where the "error" is the average (or cumulative) effect of a large number of random variables with a variety of distributions, the CLT tells us that the net effect will be normally distributed. This, in turn, explains why linear models that assume Normally distributed error---regression and ANOVA---have proven to be so robust in practice
• Link to a number of examples of normally distributed data...should be easy to find these on the web. E.g. the classical height. Maybe SAT scores, though these are artificially normal
• ref to the wikipedia article; there is also a nice graphic to snag from there---introductory sidebar---which shows the standard normal
• sidebar on the log-normal?
• something about the bivariate normal and some nice graphics of this?
• sidebar on the issue of fat tails and how these destroyed the economy in 2007?---there is a fairly readable Wired article on this: http://www.wired.com/techbiz/it/magazine/17-03/wp_quant

# The χ2 Distribution

The chi-square (χ²) distribution is a one-parameter distribution defined only over nonnegative values. If Z ∼ N(0,1), then <img _fckfakelement="true" _fck_mw_math="Z^{2} \sim \chi^{2}_{1}" src="/images/math/6/3/2/6328cdfbc5adb977368c7175e79b1484.png" />. That is, the square of an N(0,1) variable is chi-squared with one degree of freedom; the fact that it is a square also explains why a chi-squared variate is only defined for nonnegative real numbers. If W1, W2, ..., Wk are all independent <img _fckfakelement="true" _fck_mw_math="\chi^{2}_{1}" src="/images/math/9/e/b/9eb85f77631ff93a56a6bd530579baac.png" /> variables, then <img _fckfakelement="true" _fck_mw_math="\sum_{i=1}^{k}W_{i} \sim \chi^{2}_{k}" src="/images/math/f/2/8/f2879b38c36a66d597e5669963650a44.png" />. (The sum of k independent one-degree-of-freedom chi-squared variables is chi-squared with k degrees of freedom.) By extension, the sum of the squares of k independent N(0,1) variables is also <img _fckfakelement="true" _fck_mw_math="\sim \chi^{2}_{k}" src="/images/math/d/f/b/dfb0914c2a449a81c51f1d89ea4ec283.png" />.

The χ² distribution is positively skewed, with E(W) = k and Var(W) = 2k.
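Both the definition (a sum of squared standard normals) and these moments are easy to verify by simulation; in this Python sketch the degrees of freedom, sample size, and seed are arbitrary choices:

```python
import random

random.seed(1)  # arbitrary seed

def chi2_draw(k):
    """Draw chi-squared_k as the sum of k squared N(0,1) draws."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))

k = 5  # arbitrary degrees of freedom
ws = [chi2_draw(k) for _ in range(40000)]
m = sum(ws) / len(ws)
v = sum((w - m) ** 2 for w in ws) / len(ws)
print(round(m, 1), round(v, 1))  # near E(W) = k = 5 and Var(W) = 2k = 10
```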

The figure below presents five χ² densities with different values of k.

<img src="/images/thumb/8/89/StatDist.ChiSquares.png/512px-StatDist.ChiSquares.png" _fck_mw_filename="StatDist.ChiSquares.png" _fck_mw_width="512" alt="StatDist.ChiSquares.png" />

Need to define degrees of freedom here


### Characteristics of the χ2 Distribution

If Wj and Wk are independent <img _fckfakelement="true" _fck_mw_math="\chi^{2}_{j}" src="/images/math/6/b/0/6b0005e43b70a25520c6a6abc4d0ea47.png" /> and <img _fckfakelement="true" _fck_mw_math="\chi^{2}_{k}" src="/images/math/f/0/f/f0ff6604e7eae2605b1b118a3528ea32.png" /> variables, respectively, then Wj + Wk is <img _fckfakelement="true" _fck_mw_math="\sim \chi^{2}_{j+k}" src="/images/math/1/0/c/10cddb6bd869d6187a32e742d90a0891.png" />; this result can be extended to any number of independent chi-squared variables. This in turn implies the earlier result that the sum of the squares of k independent N(0,1) variables is also <img _fckfakelement="true" _fck_mw_math="\sim \chi^{2}_{k}" src="/images/math/d/f/b/dfb0914c2a449a81c51f1d89ea4ec283.png" />.

## Derivation of the χ2 from Gamma functions

Gill discusses the χ2 distribution as a special case of the gamma PDF. That's fine, but there's actually a much more intuitive way of thinking about it, and one that comports more closely with how it is (most commonly) used in statistics. Formally, a variable W that is distributed as χ2 with k degrees of freedom has a density of:

<img _fckfakelement="true" _fck_mw_math=" f(w) = \frac{w^{\frac{k-2}{2}} \exp\left(\frac{-w}{2}\right)}{2^{\frac{k}{2}} \Gamma\left(\frac{k}{2}\right)} " src="/images/math/5/1/5/5158ca285caaeda5490da80182369578.png" />

where <img _fckfakelement="true" _fck_mw_math="\Gamma(k) = \int_{0}^{\infty} t^{k - 1} \text{exp}(-t) \, dt" src="/images/math/c/9/d/c9d5de0db64c93347442795b22a9f129.png" /> is the gamma integral (see, e.g., Gill, p. 222). As with the normal distribution, the chi-square CDF has no closed-form solution in terms of elementary functions; it must be written in terms of gamma integrals. The corresponding CDF is

<img _fckfakelement="true" _fck_mw_math=" F(w)=\frac{\gamma(k/2,w/2)}{\Gamma(k/2)} " src="/images/math/1/3/d/13d302967fd94bd45df3aa569e5503f2.png" />

where <img _fckfakelement="true" _fck_mw_math="\Gamma(\cdot)" src="/images/math/1/1/e/11ee491fb6e261ad0b4f721d59ea7318.png" /> is as before and <img _fckfakelement="true" _fck_mw_math="\gamma(\cdot)" src="/images/math/e/4/8/e4812a607c060d3c6b4680f1b884021d.png" /> is the <a href="http://en.wikipedia.org/wiki/Incomplete_Gamma_function">lower incomplete gamma function</a>. We write this as <img _fckfakelement="true" _fck_mw_math="W \sim \chi^{2}_{k}" src="/images/math/4/2/9/4299c7d4b1834264833030cc65295f32.png" />, and say W is distributed as chi-squared with k degrees of freedom. (One also occasionally sees W ∼ χ²(k), with the degrees of freedom in parentheses.)
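The density itself is straightforward to compute with the standard library's gamma function. A minimal Python sketch (the numerical-integration check at the end, with its step size and cutoff, is just an illustrative sanity test):

```python
import math

def chi2_pdf(w, k):
    """Chi-squared density with k degrees of freedom, via math.gamma."""
    if w <= 0:
        return 0.0
    return (w ** ((k - 2) / 2) * math.exp(-w / 2)) / (2 ** (k / 2) * math.gamma(k / 2))

# Sanity check: the density should integrate to (approximately) 1.
k = 4
dx = 0.001
total = sum(chi2_pdf(i * dx, k) * dx for i in range(1, 60000))
print(round(total, 3))  # about 1.0
```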

## Additional points needed on the chi-square

<a href="User:Philip Schrodt">Philip Schrodt</a> 07:00, 13 July 2011 (PDT)

• Probably want to mention the use in contingency tables here, since the connection isn't obvious.
• Agresti and Finlay state this was introduced by Pearson in 1900, apparently in the context of contingency tables---confirm this, any sort of story here?
• As df becomes very large, the chi-square approximates the normal; this is an asymptotic distribution and for practical purposes can be used if df > 50
• Discuss more about the assumption of statistical independence?
• Chi-square as the test for comparing whether an observed frequency fits a known distribution

# Student's t Distribution

For a variable X which is distributed as t with k degrees of freedom, the PDF is:

<img _fckfakelement="true" _fck_mw_math=" f(x) = \frac{\Gamma(\frac{k+1}{2})} {\sqrt{k\pi}\,\Gamma(\frac{k}{2})} \left(1+\frac{x^2}{k} \right)^{-(\frac{k+1}{2})}\! " src="/images/math/e/6/d/e6d0efa21a2ef9e5400f5d5cfdac879f.png" />

where once again <img _fckfakelement="true" _fck_mw_math="\Gamma(\cdot)" src="/images/math/1/1/e/11ee491fb6e261ad0b4f721d59ea7318.png" /> is the gamma integral. We write X ∼ t<sub>k</sub>, and say X is distributed as Student's t with k degrees of freedom. The figure below presents <i>t</i> densities for five different values of <i>k</i>, along with a standard normal density for comparison.

<img src="/images/thumb/d/d8/StatDist.tDists.png/512px-StatDist.tDists.png" _fck_mw_filename="StatDist.tDists.png" _fck_mw_width="512" alt="StatDist.tDists.png" />
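The comparison with the standard normal can also be made numerically. The following Python sketch evaluates the density above with the standard library's gamma function; the particular evaluation points (x = 3, and k = 2 versus k = 200) are arbitrary illustrative choices:

```python
import math

def t_pdf(x, k):
    """Student's t density with k degrees of freedom, via math.gamma."""
    c = math.gamma((k + 1) / 2) / (math.sqrt(k * math.pi) * math.gamma(k / 2))
    return c * (1.0 + x * x / k) ** (-(k + 1) / 2)

def std_normal_pdf(x):
    """Standard normal density, for comparison."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

# The t has fatter tails than the normal, but converges to it as k grows:
print(t_pdf(3.0, 2) > std_normal_pdf(3.0))                # True: more tail mass
print(abs(t_pdf(3.0, 200) - std_normal_pdf(3.0)) < 1e-3)  # True: nearly normal
```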

The t-distribution is sometimes known as "Student's t," after a then-anonymous student of the statistician Karl Pearson. The story, from Wikipedia:

The t-statistic was introduced in 1908 by William Sealy Gosset, a chemist working for the Guinness brewery in Dublin, Ireland ("Student" was his pen name). Gosset had been hired due to Claude Guinness's innovative policy of recruiting the best graduates from Oxford and Cambridge to apply biochemistry and statistics to Guinness' industrial processes. Gosset devised the t-test as a way to cheaply monitor the quality of stout. He published the test in Biometrika in 1908, but was forced to use a pen name by his employer, who regarded the fact that they were using statistics as a trade secret. Gosset's identity was, however, known to fellow statisticians.

Note a few things about t:

• The mean, mode, and median of a t-distributed variate are all zero; its variance (defined for k > 2) is <img _fckfakelement="true" _fck_mw_math="\frac{k}{k - 2}" src="/images/math/d/7/5/d75c6ef4ee2b360ba8f65eb687e33f1e.png" />.
• t looks like a standard normal distribution (symmetrical, bell-shaped) but has thicker tails (read: higher probabilities of draws being relatively far from the mean/mode). However...
• ...as k gets larger, t converges to a standard normal distribution; at or above k = 30 or so, the two are effectively indistinguishable.

The importance of the t distribution lies in its relationship to the normal and chi-square distributions. In particular, if Z ∼ N(0,1) and <img _fckfakelement="true" _fck_mw_math="W \sim \chi^{2}_{k}" src="/images/math/4/2/9/4299c7d4b1834264833030cc65295f32.png" />, and Z and W are independent, then

<img _fckfakelement="true" _fck_mw_math="\frac{Z}{\sqrt{W/k}} \sim t_{k} " src="/images/math/a/a/2/aa26d30be9d0c50902042558dcf5f532.png" />

That is, the ratio of an N(0,1) variable to the square root of an (appropriately scaled) chi-squared variable follows a t distribution, with d.f. equal to the number of d.f. of the chi-squared variable. Squaring this ratio yields the square of a t variate, which (as we will see below) follows an F distribution: <img _fckfakelement="true" _fck_mw_math="\frac{Z^{2}}{W/k} \sim F_{1,k}." src="/images/math/6/c/c/6cc32b16a52a7fcdaf5f98e77177b6b2.png" />

Since we know that <img _fckfakelement="true" _fck_mw_math="Z^{2} \sim \chi^{2}_{1}" src="/images/math/6/3/2/6328cdfbc5adb977368c7175e79b1484.png" />, this means that the square of a t variate can also be derived as the ratio of a <img _fckfakelement="true" _fck_mw_math="\chi^{2}_{1}" src="/images/math/9/e/b/9eb85f77631ff93a56a6bd530579baac.png" /> variate to an (appropriately scaled) <img _fckfakelement="true" _fck_mw_math="\chi^{2}_{k}" src="/images/math/f/0/f/f0ff6604e7eae2605b1b118a3528ea32.png" /> variate.
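This construction can be checked by simulation; in the following Python sketch, the degrees of freedom, sample size, and seed are arbitrary illustrative choices:

```python
import math
import random

random.seed(3)  # arbitrary seed

def t_draw(k):
    """Draw t_k as Z / sqrt(W/k), with Z ~ N(0,1) and W ~ chi-squared_k independent."""
    z = random.gauss(0.0, 1.0)
    w = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))
    return z / math.sqrt(w / k)

k = 10  # arbitrary degrees of freedom
ts = [t_draw(k) for _ in range(40000)]
m = sum(ts) / len(ts)
v = sum((t - m) ** 2 for t in ts) / len(ts)
print(round(m, 2), round(v, 2))  # near 0 and k/(k-2) = 1.25
```

The simulated variance exceeding 1 reflects the t's fatter-than-normal tails.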

## Additional points needed on the t distribution

<a href="User:Philip Schrodt">Philip Schrodt</a> 07:00, 13 July 2011 (PDT)

• May want to note that it is ubiquitous in the inference on regression coefficients
• Might want to note somewhere---this might go earlier in the discussion of df---that in most social science research (e.g. survey research and time-series cross-sections), the sample sizes are well above the point where the t is asymptotically normal. The t is actually important only in very small samples, though these can be found in situations such as small subsamples in survey research (are Hispanic ferret owners in Wyoming more likely to support the Tea Party?), situations where the population itself is small (e.g. state membership in the EU, Latin America, or ECOWAS), and experiments with a small number of subjects or cases (this is commonly found in medical research, for example, and it also motivated Gosset's original development of the test, albeit with yeast and hops---we presume---rather than experimental subjects). In these instances, using the conventional normal approximation to the t---in particular, the rule-of-thumb that a coefficient estimate at least twice the size of its standard error establishes two-tailed 0.05 significance---will be misleading.

# The F Distribution

An F distribution is the ratio of two chi-squared variates, each divided by its degrees of freedom. If W1 and W2 are independent and <img _fckfakelement="true" _fck_mw_math="\sim \chi^{2}_{k}" src="/images/math/d/f/b/dfb0914c2a449a81c51f1d89ea4ec283.png" /> and <img _fckfakelement="true" _fck_mw_math="\chi^{2}_{\ell}" src="/images/math/3/f/c/3fc30391893e01b4c49b1a8ec41b574c.png" />, respectively, then <img _fckfakelement="true" _fck_mw_math="\frac{W_{1}/k}{W_{2}/\ell} \sim F_{k,\ell} " src="/images/math/0/5/1/05164dba515bcc3a894d2ce3731268cc.png" />

That is, the ratio of two chi-squared variables, each divided by its degrees of freedom, is distributed as F, with d.f. equal to the number of d.f. in the numerator and denominator variables, respectively.

Formally, if X is distributed as F with k and <img _fckfakelement="true" _fck_mw_math="\ell" src="/images/math/3/3/4/334ce9eb79df1178b0380461c9eaa09e.png" /> degrees of freedom, then the PDF of X is:

<img _fckfakelement="true" _fck_mw_math=" f(x) = \frac{\left(\frac{k\,x}{k\,x + \ell}\right)^{k/2} \left(1-\frac{k\,x}{k\,x + \ell}\right)^{\ell/2}}{x\; \mathrm{B}(k/2, \ell/2)} " src="/images/math/7/6/8/76893d623336cf8f6974dd4a40735ec7.png" />

where <img _fckfakelement="true" _fck_mw_math="\mathrm{B}(\cdot)" src="/images/math/c/2/d/c2d3433e3640c11e1f072c4006e17c11.png" /> is the beta function; that is, <img _fckfakelement="true" _fck_mw_math="\mathrm{B}(x,y) = \int_0^1t^{x-1}(1-t)^{y-1}\,dt" src="/images/math/0/6/8/0689adb68c7ec29099fc40c34aa0dad5.png" />. We write <img _fckfakelement="true" _fck_mw_math="X \sim F_{k,\ell}" src="/images/math/a/2/c/a2c04e5e0527e6a1156e6bbc58d89c7f.png" />, and say X is distributed as F with k and <img _fckfakelement="true" _fck_mw_math="\ell" src="/images/math/3/3/4/334ce9eb79df1178b0380461c9eaa09e.png" /> degrees of freedom.

The F is a two-parameter distribution, with degrees of freedom parameters (say k and <img _fckfakelement="true" _fck_mw_math="\ell" src="/images/math/3/3/4/334ce9eb79df1178b0380461c9eaa09e.png" />), both of which are limited to the positive integers. An F variate X takes values only on the non-negative real line; for <img _fckfakelement="true" _fck_mw_math="\ell" src="/images/math/3/3/4/334ce9eb79df1178b0380461c9eaa09e.png" /> > 2 it has expected value <img _fckfakelement="true" _fck_mw_math="\text{E}(X) = \frac{\ell}{\ell - 2}," src="/images/math/8/d/a/8dab1cce0da2d33b88c188a1cab3c153.png" /> which implies that the mean of an F-distributed variable converges on 1.0 as <img _fckfakelement="true" _fck_mw_math="\ell \rightarrow \infty" src="/images/math/d/1/3/d132d0a78b8c0819b6187998c23cd1fb.png" />. Likewise, for <img _fckfakelement="true" _fck_mw_math="\ell" src="/images/math/3/3/4/334ce9eb79df1178b0380461c9eaa09e.png" /> > 4 it has variance

<img _fckfakelement="true" _fck_mw_math="\text{Var}(X) = \frac{2\,\ell^2\,(k+\ell-2)}{k (\ell-2)^2 (\ell-4)}, " src="/images/math/3/5/1/35190b821a12585a7f4b779386f719ac.png" /> which bears no simple relationship to either k or <img _fckfakelement="true" _fck_mw_math="\ell" src="/images/math/3/3/4/334ce9eb79df1178b0380461c9eaa09e.png" />.

The F distribution is (generally) positively skewed. Examples of some F densities with different values of k and <img _fckfakelement="true" _fck_mw_math="\ell" src="/images/math/3/3/4/334ce9eb79df1178b0380461c9eaa09e.png" /> are presented in the figure below.

<img src="/images/thumb/4/46/StatDist.FDists.png/512px-StatDist.FDists.png" _fck_mw_filename="StatDist.FDists.png" _fck_mw_width="512" alt="StatDist.FDists.png" />

If <img _fckfakelement="true" _fck_mw_math="X \sim F(k, \ell)" src="/images/math/3/0/a/30a268520cf8c7ecc505b7280344f1b2.png" />, then <img _fckfakelement="true" _fck_mw_math="\frac{1}{X} \sim F(\ell, k)" src="/images/math/c/2/b/c2b8fcb83c04471d8606d0cb75b97fbe.png" /> (because <img _fckfakelement="true" _fck_mw_math="\frac{1}{X} = \frac{W_{2}/\ell}{W_{1}/k}" src="/images/math/0/c/b/0cbf35f2c3c12afa441855bf91a2185d.png" />). In addition, the square of a t-distributed variable is ∼ F(1,k). (Why? Take the formula for t, and square it...)
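Both the construction and the reciprocal property can be verified by simulation; in this Python sketch the degrees of freedom, sample size, and seed are arbitrary illustrative choices:

```python
import random

random.seed(5)  # arbitrary seed

def chi2_draw(k):
    """Draw chi-squared_k as the sum of k squared N(0,1) draws."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))

def f_draw(k, l):
    """Draw F_{k,l} as the ratio of two chi-squareds, each divided by its d.f."""
    return (chi2_draw(k) / k) / (chi2_draw(l) / l)

k, l = 5, 20  # arbitrary degrees of freedom
fs = [f_draw(k, l) for _ in range(40000)]
m = sum(fs) / len(fs)
inv_m = sum(1.0 / f for f in fs) / len(fs)
print(round(m, 2))      # near E(X) = l/(l-2) = 20/18, about 1.11
print(round(inv_m, 2))  # 1/X ~ F_{l,k}, with mean k/(k-2) = 5/3, about 1.67
```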

## Additional points needed on the F distribution

<a href="User:Philip Schrodt">Philip Schrodt</a> 10:00, 13 July 2011 (PDT)

• Discovered by Fisher in 1922, hence "F"
• Mention how it will be used for R² and ANOVA: F = MS_{between} / MS_{within}
• Square of a tk statistic is an F1,k statistic

# Summary: Relationships Among Continuous Distributions

The substantive importance of all these distributions will become apparent as we move on to sampling distributions and statistical inference. In the meantime, it is useful to consider the relationships among the four distributions we discussed above.

<img src="/images/thumb/2/2e/Continuous.dists.png/512px-Continuous.dists.png" _fck_mw_filename="Continuous.dists.png" _fck_mw_width="512" alt="Continuous.dists.png" />