Building the Data Warehouse

The interface to different technologies requires several considerations:

■■ Does the data pass from one DBMS to another easily?

■■ Does it pass from one operating system to another easily?

■■ Does it change its basic format in passage (EBCDIC, ASCII, etc.)?
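The last consideration can be made concrete with a small sketch. It assumes, purely for illustration, that the source system uses the EBCDIC code page known to Python as cp037; the record content and the function name ebcdic_to_ascii are hypothetical.

```python
# Minimal sketch: the same text has different byte values in EBCDIC and ASCII,
# so data moving between platforms must be converted in passage.
# Assumption: the source uses code page cp037 (one common EBCDIC variant).

text = "CUSTOMER 00123"            # hypothetical record content

ebcdic_bytes = text.encode("cp037")  # bytes as stored on the EBCDIC side
ascii_bytes = text.encode("ascii")   # bytes as stored on the ASCII side


def ebcdic_to_ascii(raw: bytes, codepage: str = "cp037") -> bytes:
    """Decode EBCDIC bytes to text, then re-encode the text as ASCII."""
    return raw.decode(codepage).encode("ascii")


converted = ebcdic_to_ascii(ebcdic_bytes)
```

The two encodings of the same text differ byte for byte, and only after conversion does the record match what an ASCII-based system expects. Fields that are not character data (packed decimals, binary integers) would need their own handling and are outside this sketch.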

Programmer/Designer Control of Data Placement

For efficient access and update, the programmer/designer must have specific control over the placement of data at the physical block/page level, as shown in Figure 5.2.

The technology that houses the data in the data warehouse can place the data where it judges appropriate, as long as the programmer can explicitly override that placement when needed. Technology that dictates the physical placement of data and allows no override by the programmer is a serious mistake.

The programmer/designer can often arrange for the physical placement of data to coincide with its usage. When data that is accessed together is stored together, fewer physical blocks/pages have to be read, and many economies of resource utilization can be gained.
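A rough sketch of why placement matters, under simplifying assumptions: records are packed into fixed-size pages in list order, a query needs every record for one customer, and the cost is the number of distinct pages read. PAGE_SIZE, the customer keys, and pages_touched are all hypothetical names chosen for illustration.

```python
# Minimal sketch: count how many pages a query must read under two placements.

PAGE_SIZE = 4  # records per physical page (illustrative value)


def pages_touched(records, key):
    """Return the number of distinct pages holding records for `key`."""
    return len({i // PAGE_SIZE for i, r in enumerate(records) if r == key})


# Placement 1: records stored in arrival order, customers interleaved.
arrival_order = ["A", "B", "C", "D"] * 4  # 16 records, 4 pages

# Placement 2: the designer clusters records by customer to match usage.
clustered = sorted(arrival_order)

interleaved_cost = pages_touched(arrival_order, "A")
clustered_cost = pages_touched(clustered, "A")
```

In this toy layout, customer A's four records are spread over all four pages in arrival order but sit on a single page once clustered, so the same query reads a quarter of the pages.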

Parallel Storage/Management of Data

One of the most powerful features of data warehouse data management is parallel storage/management. When data is stored and managed in a parallel fashion, the gains in performance can be dramatic. As a rule, the elapsed time needed to access the data is inversely proportional to the number of physical devices over which the data is spread, so the performance boost grows roughly in proportion to the number of devices, assuming there is an even probability of access for the data.
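The rule of thumb can be sketched with a simple model. The assumptions are illustrative only: blocks are striped round-robin across devices, every device reads at the same speed, all devices work at the same time, and every block is equally likely to be needed. The function name and the timing value are hypothetical.

```python
# Minimal sketch: elapsed time for a full scan when blocks are spread
# evenly over several devices that read in parallel.


def elapsed_ms(num_blocks, num_devices, ms_per_block=10):
    """Elapsed time is set by the device that holds the most blocks."""
    per_device = [0] * num_devices
    for block in range(num_blocks):
        per_device[block % num_devices] += 1  # round-robin striping
    return max(per_device) * ms_per_block


one_device = elapsed_ms(1000, 1)    # 10000 ms
four_devices = elapsed_ms(1000, 4)  # 2500 ms
ten_devices = elapsed_ms(1000, 10)  # 1000 ms
```

Under these assumptions, elapsed time falls as 1/n with n devices. If access is skewed so that one device holds most of the requested blocks, that device dominates the elapsed time and the gain shrinks accordingly.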

The entire issue of parallel storage/management of data is too complex and important to be discussed at length here, but it should be mentioned.

Meta Data Management

