Building the Data Warehouse



Once the estimate of the number of units of data in the data warehouse has been made (using a high and a low projection), repeat the process, this time for the five-year horizon.


After the raw data projections are made, the index data space projections are calculated. For each table, take every key or element of data that will be searched directly, identify its length, and determine whether the key will exist for each entry in the primary table.


Now the high and low numbers for the occurrences of rows in the tables are multiplied, respectively, by the maximum and minimum lengths of data. In addition, the number of index entries is multiplied by the length of the key and added to the total amount of data in order to determine the volume of data that will be required.
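
The arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure given in the text: the names `IndexEstimate`, `TableEstimate`, `coverage`, and `size_range`, as well as all sample figures, are hypothetical, and the sketch assumes that each index entry occupies exactly the key length, ignoring any DBMS overhead.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class IndexEstimate:
    key_length: int        # bytes per key or directly searched element
    coverage: float = 1.0  # assumed fraction of primary-table rows that have an entry


@dataclass
class TableEstimate:
    low_rows: int          # low projection of row occurrences
    high_rows: int         # high projection of row occurrences
    min_row_length: int    # minimum row length in bytes
    max_row_length: int    # maximum row length in bytes
    indexes: List[IndexEstimate] = field(default_factory=list)

    def size_range(self) -> Tuple[int, int]:
        """Return (low_bytes, high_bytes) for raw data plus index entries."""
        # Raw data: low rows x minimum length, high rows x maximum length.
        low = self.low_rows * self.min_row_length
        high = self.high_rows * self.max_row_length
        # Index space: number of index entries x key length, added to the total.
        for ix in self.indexes:
            low += int(self.low_rows * ix.coverage) * ix.key_length
            high += int(self.high_rows * ix.coverage) * ix.key_length
        return low, high


# Hypothetical table: 1 to 5 million rows of 100 to 250 bytes, with two indexes.
orders = TableEstimate(
    low_rows=1_000_000,
    high_rows=5_000_000,
    min_row_length=100,
    max_row_length=250,
    indexes=[IndexEstimate(key_length=8), IndexEstimate(key_length=20, coverage=0.5)],
)
low_bytes, high_bytes = orders.size_range()
print(low_bytes, high_bytes)  # 118000000 1340000000
```

Running the same calculation with the five-year row projections gives the second pair of figures; summing the ranges over all tables gives the total volume estimate.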


A word of caution: Estimates projecting the size of the data warehouse are almost always low. Furthermore, the warehouse usually grows faster than projected.

Input to the Planning Process


The estimate of rows and DASD (direct access storage device) space then serves as input to the planning process, as shown in Figure 4.2. These estimates need to be accurate only to the order of magnitude. A finer degree of accuracy is a waste of time.
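
To make "accurate to the order of magnitude" concrete, the following minimal sketch treats two estimates as equivalent when they share the same power of ten. The helper names `order_of_magnitude` and `same_order` and the sample numbers are hypothetical, and comparing powers of ten is just one possible reading of the phrase.

```python
def order_of_magnitude(n: int) -> int:
    """Return the power of ten of a positive integer (3 for 1000 through 9999)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Counting digits avoids floating-point rounding near exact powers of ten.
    return len(str(n)) - 1


def same_order(a: int, b: int) -> bool:
    """True when two estimates fall within the same power of ten."""
    return order_of_magnitude(a) == order_of_magnitude(b)


# For planning purposes, 2 million and 9 million rows count as the same estimate,
# whereas 9 million and 12 million cross into a different order of magnitude.
print(same_order(2_000_000, 9_000_000))   # True
print(same_order(9_000_000, 12_000_000))  # False
```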

Data in Overflow?


Once the raw estimate as to the size of the data warehouse is made, the next step is to compare the total number of rows in the warehouse environment to the charts shown in Figure 4.3. Depending on how many total rows will be in the warehouse environment, different approaches to design, development, and storage are necessary. For the one-year horizon, if the number of row total
