Building the Data Warehouse



Of course, beyond the basic issue of technology and its efficiency is the cost of storage and processing.

Managing Multiple Media


In conjunction with managing large amounts of data efficiently and cost-effectively, the technology underlying the data warehouse must handle multiple storage media. Managing a mature data warehouse on Direct Access Storage Device (DASD) alone is not sufficient. The following hierarchy ranks storage media by speed of access and cost of storage:


Storage medium    Speed of access    Cost of storage
Main memory       Very fast          Very expensive
Expanded memory   Very fast          Expensive
Cache             Very fast          Expensive
DASD              Fast               Moderate
Magnetic tape     Not fast           Not expensive
Optical disk      Not slow           Not expensive
Fiche             Slow               Cheap

The volume of data in the data warehouse and the differences in the probability of access dictate that a fully populated data warehouse reside on more than one level of storage.
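As a rough illustration of placing data by probability of access, the sketch below assigns partitions to storage levels. The tier thresholds, the partition names, and the choose_tier helper are all hypothetical; the text gives no numeric cutoffs, so treat this as one possible policy rather than a prescribed one.

```python
# Hypothetical tiers, ordered from fastest/most expensive to slowest/cheapest.
# Each threshold is an assumed minimum probability of access for that tier.
TIERS = [
    ("DASD", 0.10),
    ("optical disk", 0.01),
    ("magnetic tape", 0.0),
]


def choose_tier(access_probability):
    """Return the first (fastest) tier whose threshold the probability meets."""
    if not 0.0 <= access_probability <= 1.0:
        raise ValueError("access_probability must be between 0 and 1")
    for name, threshold in TIERS:
        if access_probability >= threshold:
            return name
    return TIERS[-1][0]


# Hypothetical partitions with an estimated probability of access.
partitions = {"sales_2024": 0.60, "sales_2019": 0.03, "sales_2005": 0.001}
placement = {name: choose_tier(p) for name, p in partitions.items()}
print(placement)
```

Recent, frequently read partitions land on DASD, while rarely read history moves to cheaper media, so the warehouse as a whole spans more than one level of storage.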

Index/Monitor Data


The very essence of the data warehouse is flexible and unpredictable access to data, which boils down to the ability to access data quickly and easily. If data in the warehouse cannot be easily and efficiently indexed, the data warehouse will not be a success. Of course, the designer uses many practices to make data as flexible as possible, such as spreading data across different storage media and partitioning data. But the technology that houses the data must be able to support easy indexing as well. Indexing techniques that often make sense include secondary indexes, sparse indexes, and dynamic, temporary indexes. Furthermore, the cost of creating and using an index must not be significant.
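To make two of these indexing techniques concrete, here is a minimal in-memory sketch of a sparse index and a secondary index. The record layout (an id and a region), the block size, and the function names are assumptions chosen for illustration; a real warehouse would rely on the index facilities of its underlying DBMS.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical records, sorted by primary key: (id, region).
records = [
    (101, "east"), (105, "west"), (110, "east"),
    (120, "north"), (131, "west"), (140, "east"),
]

# Group the sorted records into fixed-size blocks (assumed block size).
BLOCK_SIZE = 2
blocks = [records[i:i + BLOCK_SIZE] for i in range(0, len(records), BLOCK_SIZE)]

# Sparse index: one entry per block (the block's first key), not one per record.
sparse_index = [block[0][0] for block in blocks]


def find_by_id(record_id):
    """Locate the candidate block through the sparse index, then scan it."""
    pos = bisect_right(sparse_index, record_id) - 1
    if pos < 0:
        return None  # key is smaller than every indexed key
    for rec in blocks[pos]:
        if rec[0] == record_id:
            return rec
    return None


# Secondary index: maps a non-key attribute (region) to matching primary keys.
secondary_index = defaultdict(list)
for rec in records:
    secondary_index[rec[1]].append(rec[0])


def find_by_region(region):
    """Resolve a region to records through the secondary index."""
    return [find_by_id(i) for i in secondary_index.get(region, [])]


print(find_by_id(120))
print(find_by_region("west"))
```

The sparse index stays small because it holds one key per block, at the price of a short scan inside the block; the secondary index supports access by an attribute other than the primary key.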
