Estimator Distributions (Estimation) (GWU EMSE-271)

Lecture 3 starts with samples of 5 estimator distributions. "Normal, t, F and Chi-squared distributions are the ones that are used the most in hypothesis testing for univariate and multivariate analysis. Hence," the notes "show what they look like and describe their parameters." - Van Dorp

In retrospect, I think that the discrete and continuous modeling and estimator distributions can be categorized (at least from an EMSE 271 as-taught perspective) into the following groups:
 * Those used in "classical" (and Bayesian) statistical problems: Bernoulli, Binomial, Hypergeometric, Poisson, Geometric (sometimes as the distribution in Bayesian problems).
 * Those used to help normalize dependent variables for regression analyses: Exponential, Gamma.
 * Other continuous distributions, not as frequently used (in class, except Harris): Beta, Weibull.
 * Estimator distributions (in hypothesis testing): Normal (Student's t), F distribution and Chi-squared. The t-test and F-test are used most often in regression. - pws (January 2010) (really goes somewhere else TBD)

The five sample estimator distributions from the lecture:
 * **Normal distribution** from a sample with a normal distribution with mean mu and variance sigma squared.
   * If n is large enough, the Central Limit Theorem permits approximating with the Normal distribution.
   * If the underlying distribution is normal, the answer is exact.
 * **Chi-squared distribution with n degrees of freedom** (n = 5) from a sample with a normal distribution with mean 0 and variance 1.
   * Only true when the sample is from a normal distribution. (Also important for the t distribution and F distribution. So, in regression analysis, need to test for normality.)
 * **Chi-squared distribution with n-1 degrees of freedom** (n=4) from a sample with a normal distribution with mean mu and variance sigma squared.
   * Lose a degree of freedom in the calculation of ( check sum)
   * Useful for goodness-of-fit testing.
 * **t distribution with n-1 degrees of freedom** (n=5) from a sample with a normal distribution with mean mu and variance sigma squared.
   * Sample graph compares t and standard normal PDFs.
   * There is more variation in the t distribution because with the Normal we know sigma, but with the t we only have the sample estimate S.
 * **F distribution** (Ys independent of Xs) with n-1, m-1 degrees of freedom from a sample with a normal distribution with mean mu and variance sigma squared.
   * Ratio is a reasonable way to see if the variances? ( check) are the same or not.
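
The five sampling distributions listed above can be checked by simulation. The following is a minimal sketch using only the Python standard library, not something taken from the slides. The function names (`sample_var`, `simulate`, `average`) and all parameter values (n = 5, m = 12, mu = 10, sigma = 2, the seed, the number of repetitions) are assumptions chosen for illustration. The second sample size m is set larger than n so that the F statistic has a finite variance and its simulated mean settles quickly.

```python
import math
import random


def sample_var(values):
    """Sample variance S^2 with divisor n - 1."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def average(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def simulate(n=5, m=12, mu=10.0, sigma=2.0, reps=20000, seed=271):
    """Draw `reps` normal samples and record the five statistics from the notes.

    All defaults (n, m, mu, sigma, reps, seed) are illustrative assumptions.
    """
    rng = random.Random(seed)
    stats = {"mean": [], "chi2_n": [], "chi2_n1": [], "t": [], "F": []}
    for _ in range(reps):
        x = [rng.gauss(mu, sigma) for _ in range(n)]  # sample of size n
        y = [rng.gauss(mu, sigma) for _ in range(m)]  # independent sample of size m
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]   # standard normal sample
        xbar, s2x, s2y = sum(x) / n, sample_var(x), sample_var(y)
        stats["mean"].append(xbar)                            # Normal(mu, sigma^2 / n)
        stats["chi2_n"].append(sum(v * v for v in z))         # chi-squared, n df
        stats["chi2_n1"].append((n - 1) * s2x / sigma ** 2)   # chi-squared, n - 1 df
        stats["t"].append((xbar - mu) / math.sqrt(s2x / n))   # t, n - 1 df
        stats["F"].append(s2x / s2y)                          # F, (n - 1, m - 1) df
    return stats


if __name__ == "__main__":
    s = simulate()
    share_t = sum(abs(t) > 1.96 for t in s["t"]) / len(s["t"])
    print("mean of sample means  :", round(average(s["mean"]), 3), "(theory: mu = 10)")
    print("mean of chi2, n df    :", round(average(s["chi2_n"]), 3), "(theory: n = 5)")
    print("mean of chi2, n-1 df  :", round(average(s["chi2_n1"]), 3), "(theory: n - 1 = 4)")
    print("mean of F             :", round(average(s["F"]), 3), "(theory: 11/9)")
    print("share of |t| > 1.96   :", round(share_t, 3), "(about 0.12; a standard normal gives 0.05)")
```

The last line illustrates the "more variation" remark: with n = 5 the t statistic exceeds 1.96 in absolute value far more often than a standard normal would, because S is estimated from only five observations.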


|| **Distribution** || **Reference** || **Comment** || **Used For** (Dig up PAD 201 Chart - check) ||
|| Normal || Wikipedia || Can approximately describe "any variable that tends to cluster about the mean." || "Variable that is sum of a large number of independent factors." ||
|| (Student's) t-distribution || Wikipedia || "population standard deviation is unknown and has to be estimated from the data." OR "when the sample size is small." || "popular Student's //t//-tests for the statistical significance of the difference between two sample means, and for confidence intervals for the difference between two population means." ||
|| Chi-squared || Wikipedia || "easily calculated quantities can be proven to have distributions that approximate to the chi-square distribution if the null hypothesis is true." || "common chi-square tests for goodness of fit of an observed distribution to a theoretical one, and of the independence of two criteria of classification of qualitative data." ||
|| F Distribution || Wikipedia || "The //F//-distribution arises frequently as the null distribution of a test statistic" || "in likelihood-ratio tests, perhaps most notably in the analysis of variance;" ||
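
As a worked illustration of the chi-squared goodness-of-fit use in the table, here is a small sketch of the Pearson statistic. The die-roll counts are made-up data and the helper name `chi_square_statistic` is hypothetical. The critical value 11.070 is the standard tabulated 0.95 quantile of the chi-squared distribution with 5 degrees of freedom.

```python
def chi_square_statistic(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts from 120 rolls of a die (made-up data for illustration).
observed = [18, 22, 16, 25, 19, 20]
# Under the null hypothesis of a fair die, each face is expected 120 / 6 = 20 times.
expected = [sum(observed) / len(observed)] * len(observed)

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # 6 categories minus 1
critical_value = 11.070                 # tabulated 0.95 quantile, 5 df

reject = statistic > critical_value

if __name__ == "__main__":
    print("chi-squared statistic:", round(statistic, 3))
    print("degrees of freedom   :", degrees_of_freedom)
    print("reject fair die at 5%:", reject)
```

With these counts the statistic is about 2.5, well below the critical value, so the fair-die hypothesis would not be rejected at the 5% level.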

[1] "Quite often, however, textbook problems will treat the population standard deviation as if it were known and thereby avoid the need to use the Student's //t//-test. These problems are generally of two kinds: (1) those in which the sample size is so large that one may treat a data-based estimate of the variance as if it were certain, and (2) those that illustrate mathematical reasoning, in which the problem of estimating the standard deviation is temporarily ignored because that is not the point that the author or instructor is then explaining." - Wikipedia
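
Case (1) in the quote above can be illustrated numerically: as the sample size grows, the t statistic behaves more and more like a standard normal. The sketch below is an assumption-laden illustration, not part of the lecture. The function name `t_tail_fraction`, the cutoff 1.96, the sample sizes, the seed, and the repetition count are all chosen for demonstration only.

```python
import math
import random
from statistics import NormalDist


def t_tail_fraction(n, cutoff=1.96, reps=10000, seed=271):
    """Share of simulated t statistics with |t| > cutoff for samples of size n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]      # true mean 0, true sigma 1
        xbar = sum(x) / n
        s2 = sum((v - xbar) ** 2 for v in x) / (n - 1)   # estimated variance S^2
        t = xbar / math.sqrt(s2 / n)                     # sigma replaced by S
        hits += abs(t) > cutoff
    return hits / reps


# Two-sided tail probability when sigma is treated as known (standard normal).
normal_tail = 2 * (1 - NormalDist().cdf(1.96))

if __name__ == "__main__":
    print("sigma known (normal):", round(normal_tail, 4))
    for n in (5, 30, 100):
        print("n =", n, "sigma estimated (t):", t_tail_fraction(n))
```

For small n the simulated share sits well above the normal value of about 0.05, and it moves toward 0.05 as n increases, which is why a large sample lets one treat the estimated variance as if it were certain.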


**Sources:**
 * EMSE 271, Fall 2009 (Slides 63-67)
 * Van Dorp email. Subject: "Re: Lecture 3 - Question 1," Fri 9/11/2009 1:51 PM
 * Normal distribution. (2009, September 11). In //Wikipedia, The Free Encyclopedia//. Retrieved 13:48, September 11, 2009, from []
 * Student's t-distribution. (2009, August 24). In Wikipedia, The Free Encyclopedia. Retrieved 02:07, August 24, 2009, from http://en.wikipedia.org/w/index.php?title=Student%27s_t-distribution&oldid=309707855
 * Chi-square distribution. (2009, September 2). In Wikipedia, The Free Encyclopedia. Retrieved 12:20, September 2, 2009, from http://en.wikipedia.org/w/index.php?title=Chi-square_distribution&oldid=311458263
 * F-distribution. (2009, August 25). In Wikipedia, The Free Encyclopedia. Retrieved 10:29, August 25, 2009, from http://en.wikipedia.org/w/index.php?title=F-distribution&oldid=309949928