Building the Data Warehouse


database design:
- arrays of data
- merging tables
- selective redundancy
- further separation of data
- derived data
- preformatting, preallocation
- relationship artifacts
- prejoining tables

Figure 3.20 Getting good performance out of the data warehouse environment.


Figure 3.21 Getting the most out of the physical I/Os that have to be done.

The job of the data warehouse designer is to organize data physically so that each physical I/O returns the maximum number of useful records. (Note: This is not a matter of blindly transferring a large number of records from DASD to main storage; it is the more sophisticated matter of transferring a group of records that have a high probability of being accessed.)

For example, suppose a programmer must fetch five records. If those records are organized into different blocks of data on storage, then five I/Os will be required. But if the designer can anticipate that the records will be needed as a group and can physically juxtapose those records into the same block, then only one I/O will be required, thus making the program run much more efficiently.
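The five-record example can be sketched as a small count of block reads. This is only an illustration under simplifying assumptions: each record lives in exactly one block, a block is read once and stays in main storage for the rest of the request, and the record names and block numbers are hypothetical.

```python
def count_block_reads(record_ids, block_of):
    """Return how many distinct blocks must be read to fetch the given records.

    Assumes a block, once read, stays in main storage for the whole request,
    so each distinct block costs exactly one physical I/O.
    """
    return len({block_of[record_id] for record_id in record_ids})


# Hypothetical layout A: the five records sit in five different blocks.
scattered_layout = {"r1": 0, "r2": 7, "r3": 12, "r4": 31, "r5": 44}

# Hypothetical layout B: the designer has placed all five in the same block.
clustered_layout = {"r1": 3, "r2": 3, "r3": 3, "r4": 3, "r5": 3}

wanted = ["r1", "r2", "r3", "r4", "r5"]

scattered_reads = count_block_reads(wanted, scattered_layout)
clustered_reads = count_block_reads(wanted, clustered_layout)

print("scattered layout:", scattered_reads, "I/Os")
print("clustered layout:", clustered_reads, "I/O")
```

With these made-up layouts the scattered placement costs five reads and the clustered placement costs one, which is the difference the paragraph above describes.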

Another factor eases the physical placement of data in the data warehouse: data in the warehouse is normally not updated. This frees the designer to use physical design techniques that would not be acceptable if the data were regularly updated.

The Data Model and Iterative Development

In all cases, the data warehouse is best built iteratively. The following are some of the many reasons why iterative development is important:
