Statistics for Environmental Engineers


5.56
7.0    4.82    7.0    7.26    7.0    6.42    8.0    7.91
5.0    5.68    5.0    4.74    5.0    5.73    8.0    6.89

Note: Each data set has n = 11, mean of x = 9.0, mean of y = 7.5, equation of the regression line y = 3.0 + 0.5x, standard error of estimate of the slope = 0.118 (t statistic = 4.24), sum of squares of x about its mean = 110.0, regression sum of squares (corrected for mean) = 27.5, residual sum of squares = 13.75, correlation coefficient r = 0.82, and R² = 0.67.


Source: Anscombe, F. J. (1973). Am. Stat., 27, 17-21.


[Figure: four scatterplot panels, (a) to (d), of y versus x, each labeled R² = 0.67; the x axes run from 0 to 20.]


FIGURE 39.2 Plot of Anscombe’s four data sets, which all have R² = 0.67 and identical results from simple linear regression analysis (data from Anscombe 1973).


gives Anscombe’s four data sets. Each data set has n = 11, mean of x = 9.0, mean of y = 7.5, fitted regression line y = 3 + 0.5x, standard error of estimate of the slope = 0.118 (t statistic = 4.24), sum of squares of x about its mean = 110.0, regression sum of squares (corrected for mean) = 27.5, residual sum of squares = 13.75, correlation coefficient r = 0.82, and R² = 0.67. Judged by these summary statistics alone, all four data sets appear to be described equally well by exactly the same linear model, at least until the data are plotted (or until the residuals are examined). Figure 39.2 shows how vividly they differ. The example is a persuasive argument for always plotting the data.
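
The summary statistics quoted above can be checked with a short calculation. The following is a minimal sketch in plain Python, not code from the book: the function names `fit_line` and `summarize` and the labels 1 to 4 are hypothetical, and the data values are the ones published by Anscombe (1973). For every set the sum of squares of x about its mean is 110.0, and the regression sum of squares is about 27.5, which together with the residual sum of squares of about 13.75 gives R² of about 0.67. The largest absolute residual is included as a simple numerical hint that the sets nevertheless differ.

```python
from math import sqrt

# Anscombe's (1973) four data sets. Sets 1-3 share the same x values.
# The labels 1..4 are illustrative and not taken from the text.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
DATA = {
    1: (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                   7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                   6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
         5.25, 12.50, 5.56, 7.91, 6.89]),
}


def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1*x with summary statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_reg = slope * sxy           # regression SS, corrected for the mean
    ss_res = syy - ss_reg          # residual SS
    se_slope = sqrt(ss_res / (n - 2) / sxx)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return {
        "n": n, "mean_x": mean_x, "mean_y": mean_y, "sxx": sxx,
        "slope": slope, "intercept": intercept,
        "se_slope": se_slope, "t": slope / se_slope,
        "ss_reg": ss_reg, "ss_res": ss_res,
        "r": sxy / sqrt(sxx * syy), "r2": ss_reg / syy,
        "max_abs_resid": max(abs(e) for e in residuals),
    }


def summarize():
    """Fit every data set and return the statistics keyed by label."""
    return {label: fit_line(x, y) for label, (x, y) in DATA.items()}


if __name__ == "__main__":
    for label, s in summarize().items():
        print(f"set {label}: y = {s['intercept']:.2f} + {s['slope']:.3f}x, "
              f"SE(slope) = {s['se_slope']:.3f}, t = {s['t']:.2f}, "
              f"R2 = {s['r2']:.2f}, SSres = {s['ss_res']:.2f}, "
              f"max |residual| = {s['max_abs_resid']:.2f}")
```

The fitted coefficients, standard errors, and R² agree across the four sets to the precision quoted in the text; only a per-point quantity such as the residuals, or a plot, reveals that the sets have quite different structure.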
