Statistics for Environmental Engineers


The test used above is valid for comparing Model A with any model that has one fewer parameter. To compare Models A and E, notice that omitting t² decreases the regression sum of squares by 20256 - 17705 = 2551. The F statistic is 2551/51.5 = 49.5. Because 49.5 >> 5.99 (the 95th percentile of the F distribution with 1 and 6 degrees of freedom), this change is significant and t² needs to be included in the model.

The test is modified slightly to compare Models A and D because Model D has two fewer terms than Model A. The decrease of 343 in the regression sum of squares results from dropping two terms (z² and zt). The F statistic is now computed using 343/2 in the numerator and 51.5 in the denominator: F = (343/2)/51.5 = 3.33. The 95th percentile of the appropriate reference distribution, which has 2 degrees of freedom for the numerator and 6 degrees of freedom for the denominator, is F = 5.14. Because the F for the model is less than the reference F (3.33 < 5.14), the terms z² and zt are not needed.
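The two comparisons above follow the same arithmetic, which can be sketched in a few lines of Python. This is only an illustration: the function name extra_ss_f and the variable names are hypothetical, the value 51.5 is taken to be the residual mean square of Model A (the denominator used in the text), and the critical values 5.99 and 5.14 are copied from the text rather than computed.

```python
def extra_ss_f(ss_drop, n_dropped, residual_ms):
    """F statistic for dropping n_dropped terms from the full model.

    ss_drop     -- decrease in the regression sum of squares
    n_dropped   -- number of terms removed (numerator degrees of freedom)
    residual_ms -- residual mean square of the full model (assumed: Model A)
    """
    return (ss_drop / n_dropped) / residual_ms


RESIDUAL_MS_A = 51.5  # denominator used in the text, with 6 degrees of freedom

# Model A vs. Model E: one term (t^2) dropped
f_ae = extra_ss_f(20256 - 17705, 1, RESIDUAL_MS_A)

# Model A vs. Model D: two terms (z^2 and zt) dropped
f_ad = extra_ss_f(343, 2, RESIDUAL_MS_A)

# Reference values quoted in the text, not computed here
F_CRIT_1_6 = 5.99  # 1 and 6 degrees of freedom
F_CRIT_2_6 = 5.14  # 2 and 6 degrees of freedom

print(f"A vs. E: F = {f_ae:.1f}, terms needed: {f_ae > F_CRIT_1_6}")
print(f"A vs. D: F = {f_ad:.2f}, terms needed: {f_ad > F_CRIT_2_6}")
```

A dropped term is judged necessary only when the computed F exceeds the reference value, so the sketch reports that t² is needed and that z² and zt are not.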

Model D is therefore as good as Model A, and it is the simplest adequate model:

Model D: y = 186 + 7.12t - 3.06z + 0.143t²

This is the same model that was obtained by starting with the simplest possible model and adding terms to make up for inadequacies.


The model-building process uses regression to estimate the parameters, followed by diagnosis to decide whether the model should be modified by adding or dropping terms. The goal is not to maximize R², because doing so puts unneeded high-order terms into the polynomial model. The best model should have the fewest possible parameters because this will minimize the prediction error of the model.
