Statistics for Environmental Engineers


Case Study: Serial Dependence of BOD Data


A total of 120 biochemical oxygen demand (BOD) measurements were made at two-hour intervals to study treatment plant dynamics. The data are listed in Table 32.1 and plotted in Figure 32.1. As one would expect, measurements taken 24 h apart (12 sampling intervals) are similar. The task is to examine this daily cycle and to assess the strength of the correlation between BOD values separated by one, up to at least twelve, sampling intervals.

Correlation and Autocorrelation Coefficients


Correlation between two variables x and y is estimated by the sample correlation coefficient:


$$
r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}
$$


where $\bar{x}$ and $\bar{y}$ are the sample means. The correlation coefficient (r) is a dimensionless number that can range from -1 to +1.
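As a minimal sketch of the sample correlation coefficient described above, the function below computes r from the sums of cross-products and squared deviations. The function name `sample_correlation` and the input values are assumptions made for illustration; the values are not taken from Table 32.1.

```python
import math

def sample_correlation(x, y):
    """Sample correlation coefficient r between paired sequences x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / len(x)  # sample mean of x
    y_bar = sum(y) / len(y)  # sample mean of y
    # Sum of cross-products of deviations from the means (numerator)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Sums of squared deviations (terms under the square root)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    # Note: a constant x or y makes the denominator zero and r undefined
    return s_xy / math.sqrt(s_xx * s_yy)

# Illustrative values only (hypothetical, not the BOD data)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
r = sample_correlation(x, y)
```

Because y is an exact increasing linear function of x here, r sits at the upper end of its range; reversing the sign of y moves it to the lower end.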


Serial correlation, or autocorrelation, is the correlation of a variable with itself. If sufficient data are available, serial dependence can be evaluated by plotting each observation $y_t$ against the immediately preceding one, $y_{t-1}$. (Plotting $y_t$ vs. $y_{t+1}$ is equivalent to plotting $y_t$ vs. $y_{t-1}$.) Similar plots can be made for observations two units apart ($y_t$ vs. $y_{t-2}$), three units apart, etc.
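The pairing behind these lag plots can be sketched in code. The example below builds the pairs of each observation with the one `lag` intervals earlier and applies the ordinary sample correlation coefficient to them. This is one simple way to quantify what such a plot shows, not necessarily the estimator of the autocorrelation coefficient used elsewhere in the text. The helper names and the synthetic series (120 values with an exact 12-interval cycle) are assumptions for illustration and are not the data of Table 32.1.

```python
import math

def lagged_pairs(y, lag):
    """Return the pairs (y[t], y[t - lag]) for every t with a predecessor."""
    if lag < 1 or lag >= len(y):
        raise ValueError("lag must be between 1 and len(y) - 1")
    return [(y[t], y[t - lag]) for t in range(lag, len(y))]

def lag_correlation(y, lag):
    """Sample correlation between the series and itself shifted by `lag`."""
    pairs = lagged_pairs(y, lag)
    cur = [p[0] for p in pairs]   # current observations
    prev = [p[1] for p in pairs]  # observations `lag` intervals earlier
    cur_bar = sum(cur) / len(cur)
    prev_bar = sum(prev) / len(prev)
    s_cp = sum((c - cur_bar) * (p - prev_bar) for c, p in pairs)
    s_cc = sum((c - cur_bar) ** 2 for c in cur)
    s_pp = sum((p - prev_bar) ** 2 for p in prev)
    return s_cp / math.sqrt(s_cc * s_pp)

# Synthetic series (hypothetical): a pure cycle repeating every 12 intervals
y = [100.0 + 20.0 * math.sin(2.0 * math.pi * t / 12.0) for t in range(120)]

r_by_lag = {k: lag_correlation(y, k) for k in range(1, 13)}
```

For this noise-free cycle, the lag-12 pairs are essentially identical, so the correlation is close to +1, while the lag-6 pairs are mirror images about the mean, giving a value close to -1. Real data would show weaker correlations at every lag.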


If measurements were made daily, a plot of $y_t$ vs. $y_{t-7}$ might indicate serial dependence in the form of a weekly cycle. If y represented monthly averages, $y_t$ vs. $y_{t-12}$ might reveal an annual cycle. The distance between the observations that are examined for correlation is called the lag. The convention is to measure lag as the number of intervals between observations and not as real time elapsed. Of course, knowing the time between observations allows us to convert between real time and lag time.
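The conversion between lag and real time is a single multiplication or division by the sampling interval. The short sketch below uses the two-hour interval of the BOD case study; the function and constant names are hypothetical.

```python
SAMPLING_INTERVAL_H = 2.0  # hours between observations in the BOD case study

def lag_to_hours(lag, interval_h=SAMPLING_INTERVAL_H):
    """Convert a lag (number of sampling intervals) to elapsed hours."""
    return lag * interval_h

def hours_to_lag(hours, interval_h=SAMPLING_INTERVAL_H):
    """Convert elapsed hours to a lag; the result must be a whole number."""
    lag = hours / interval_h
    if lag != int(lag):
        raise ValueError("elapsed time is not a whole number of intervals")
    return int(lag)

daily_lag = hours_to_lag(24.0)       # the 24 h cycle expressed as a lag
daily_hours = lag_to_hours(daily_lag)
```

With two-hour sampling, the daily cycle corresponds to a lag of 12, matching the 12 sampling intervals mentioned in the case study.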
