# Statistics for Environmental Engineers

Usually the population variance, $\sigma^2$, is not known and we cannot use the normal distribution as the reference distribution for the sample average. Instead, we substitute $s_{\bar{y}}$ for $\sigma_{\bar{y}}$ and use the t distribution. If the parent distribution is normal and the population variance is estimated by $s^2$, the quantity:

$$t = \frac{\bar{y} - \eta}{s/\sqrt{n}}$$

which is known as the standardized mean or as the t statistic, will have a t distribution with $\nu = n - 1$ degrees of freedom. If the parent population is not normal but the sampling is random, the t statistic will tend toward the t distribution (just as the distribution of $\bar{y}$ tends toward being normal).
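As a rough illustration, the t statistic can be computed from a sample in a few lines of Python. This is a minimal sketch rather than a procedure taken from the text: the function name `t_statistic`, the argument name `eta` (the hypothesized population mean), and the four data values are all assumptions made up for the example.

```python
import math
import statistics


def t_statistic(sample, eta):
    """Return (t, nu) for a sample and a hypothesized population mean eta.

    t  = (sample average - eta) / (s / sqrt(n))
    nu = n - 1 degrees of freedom
    """
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are needed to estimate s")
    y_bar = statistics.fmean(sample)      # sample average
    s = statistics.stdev(sample)          # sample standard deviation (n - 1 divisor)
    std_error = s / math.sqrt(n)          # estimated standard error of the average
    return (y_bar - eta) / std_error, n - 1


# Hypothetical data, chosen only to show the call.
t, nu = t_statistic([2.0, 4.0, 6.0, 8.0], eta=4.0)
print(f"t = {t:.4f} with nu = {nu} degrees of freedom")
```

The value returned would then be referred to a t distribution with `nu` degrees of freedom, either from a table or from a statistics library.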

If the parent population is $N(\eta, \sigma^2)$, and assuming once again that the observations are random and independent, the sample variance $s^2$ has especially attractive properties. For these conditions, $s^2$ is distributed independently of $\bar{y}$ in a scaled $\chi^2$ (chi-square) distribution. The scaled quantity is:

$$\chi^2 = \frac{\nu s^2}{\sigma^2}$$

This distribution is skewed to the right. The exact form of the $\chi^2$ distribution depends on the number of degrees of freedom, $\nu$, on which $s^2$ is based. The spread of the distribution increases as $\nu$ increases. The tail area under the chi-square distribution is the probability of a value of $\chi^2 = \nu s^2/\sigma^2$ exceeding a given value.
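The tail area described above can be evaluated numerically. The sketch below uses only the Python standard library and a textbook power series for the regularized incomplete gamma function; it is one possible approach, not the method used in the text. The function names `chi2_upper_tail` and `variance_tail_area`, and the numbers in the usage lines, are hypothetical.

```python
import math


def chi2_upper_tail(x, nu, max_terms=1000):
    """P(chi-square with nu degrees of freedom exceeds x).

    Uses the series P(a, z) = z**a * exp(-z) * sum_k z**k / Gamma(a + k + 1)
    with a = nu / 2 and z = x / 2, then returns 1 - P(a, z).
    Intended for modest nu and x; math.gamma overflows for very large nu.
    """
    if x <= 0:
        return 1.0
    a = nu / 2.0
    z = x / 2.0
    term = 1.0 / math.gamma(a + 1.0)      # k = 0 term of the series
    total = term
    for k in range(1, max_terms):
        term *= z / (a + k)               # ratio of successive series terms
        total += term
        if term < 1e-16 * total:          # remaining terms are negligible
            break
    lower = math.exp(a * math.log(z) - z) * total
    return max(0.0, 1.0 - lower)


def variance_tail_area(s2, sigma2, nu):
    """Probability that a sample variance based on nu degrees of freedom
    is at least s2 when the population variance is sigma2."""
    return chi2_upper_tail(nu * s2 / sigma2, nu)


# Hypothetical case: nu = 3 and a sample variance twice the population variance.
print(variance_tail_area(s2=2.0, sigma2=1.0, nu=3))
```

For routine work a library routine such as a chi-square survival function would normally be preferred; the series is written out here only to keep the example self-contained.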

Figure 2.11 illustrates these properties of the sampling distributions of $\bar{y}$, $s^2$, and $t$ for a random sample of size $n = 4$.
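A small simulation can make these sampling distributions concrete. The sketch below repeatedly draws random samples of size 4 from a normal population and records the sample average, the sample variance, and the t statistic for each. It does not reproduce the figure: the population mean of 10, the standard deviation of 1, the number of repetitions, the seed, and the function name `simulate_sampling` are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def simulate_sampling(n=4, eta=10.0, sigma=1.0, reps=20000, seed=1):
    """Draw `reps` random samples of size n from N(eta, sigma**2) and
    return lists of the sample averages, sample variances, and t statistics."""
    rng = random.Random(seed)
    means, variances, t_values = [], [], []
    for _ in range(reps):
        sample = [rng.gauss(eta, sigma) for _ in range(n)]
        y_bar = sum(sample) / n
        s2 = sum((y - y_bar) ** 2 for y in sample) / (n - 1)
        means.append(y_bar)
        variances.append(s2)
        t_values.append((y_bar - eta) / math.sqrt(s2 / n))
    return means, variances, t_values


means, variances, t_values = simulate_sampling()

# Sample averages: centered near eta with spread near sigma / sqrt(n).
print("mean of averages:", round(statistics.fmean(means), 3))
print("std. dev. of averages:", round(statistics.stdev(means), 3))

# Sample variances: average near sigma**2, median below the mean (right skew).
print("mean of s^2:", round(statistics.fmean(variances), 3))
print("median of s^2:", round(statistics.median(variances), 3))

# t statistics: tails heavier than the normal, so more than 5% fall beyond 1.96.
frac = sum(abs(t) > 1.96 for t in t_values) / len(t_values)
print("fraction of |t| > 1.96:", round(frac, 3))
```

Histograms of the three returned lists should show the qualitative features described in the text: a symmetric distribution for the average, a right-skewed distribution for the variance, and a symmetric but heavy-tailed distribution for t.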

Example 2.9

For the nitrate data, the sample mean concentration of $\bar{y} = 7.51$ mg/L lies a considerable distance below the true value of 8.00 mg/L (Figure 2.12). If the true mean of the sample is 8.0 mg/L and the laboratory is measuring accurately, an estimated mean as low as 7.51 mg/L would occur by chance only about four times in 100. This is established as follows. The value of the t statistic is:

