Statistics for Environmental Engineers

Скачать в pdf «Statistics for Environmental Engineers»


Diagnosing the need for weighted least squares is easy in this case because there are triplicate measurements. The variation within replicates is evident in the tabulated data, but it is hidden in Figure 37.2. Figure 37.3 suggests the nonconstant variance, but Figure 37.4 shows it better by flattening the curve to show the residuals with respect to the average at each of the 13 standard concentration levels. The residuals are larger when the analyte concentration is large, which means that the variance is not constant at all concentration levels.


There is a further problem with the straight-line analysis given above. A check on the confidence interval of the intercept would support keeping the negative value. This confidence interval is wrong because the residuals of the fitted model are not random and they do not have constant variance. The violation of the constant variance condition of regression distorts all statements about confidence intervals, prediction intervals, and probabilities associated with these quantities. Thus, weighting is important even if it does not make a notable difference in the position of the line.

Theory: Weighted Least Squares


The following is a general statement of the least squares criterion. It is used for all models, linear and otherwise, and for constant or nonconstant variance. If the values of the response are y1, y2,…yn and if the variances of these observations are of,o%,…,ol, then the parameter estimates that individually and uniquely have the smallest variance will be obtained by minimizing the weighted sum of squares:


minimize S = ^w(yt — n)2


where n is the response calculated from the proposed model, yt is the observation at a specified value of x;, and w; is the weight assigned to observation y;. The wt will be proportional to 1/o2. If the variance is constant (of = 022 = ••• = o2), all w; = 1, and each observation has an equal opportunity to determine the calibration curve. If the variance is not constant, the least accurate measurements are assigned a small weight and the most accurate measurements are assigned large weights. This prevents the least accurate measurements from dominating the outcome of the regression.

Скачать в pdf «Statistics for Environmental Engineers»