Building the Data Warehouse


First impressions, though, can be very deceiving. What at first appears to be nothing more than the movement of data from one place to another quickly turns into a large and complex task—far larger and more complex than the programmer thought.

Precisely what kind of functionality is required as data passes from the operational, legacy environment to the data warehouse environment? The following lists some of the necessary functionality:

■■ The extraction of data from the operational environment to the data warehouse environment requires a change in technology. This normally means reading the data from an operational DBMS, such as IMS, and writing it out to a newer data warehouse DBMS, such as Informix. The technology shift is not just a change of DBMS. The operating system changes, the hardware changes, and even the hardware-based structure of the data changes.

■■ The selection of data from the operational environment may be very complex. To qualify a record for extraction processing, several coordinated lookups to other records in a variety of other files may be necessary, requiring keyed reads, connecting logic, and so on. In some cases, this additional data can be read only in the online environment. When this is the case, extraction of data for the data warehouse must occur in the online operating window, a circumstance to be avoided if at all possible.

■■ Operational input keys usually need to be restructured and converted before they are written out to the data warehouse. Very seldom does an input key remain unaltered as it is read in the operational environment and written out to the data warehouse environment. In simple cases, an element of time is added to the output key structure. In complex cases, the entire input key must be rehashed or otherwise restructured.
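
The three points above can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration and is not from the text: the fixed-width record layout, the choice of EBCDIC code page `cp037` to stand in for the "hardware-based structure of the data", the in-memory set `active_customers` standing in for keyed reads against other files, and the names `decode_record`, `qualifies`, `extract`, and `WarehouseRow`. It shows only the simple case of key restructuring, where an element of time is added to the output key.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical fixed-width legacy layout (an assumption for illustration):
# chars 0-5 account id, 6-15 customer id, 16-23 balance in cents.
ACCOUNT_ID = slice(0, 6)
CUSTOMER_ID = slice(6, 16)
BALANCE = slice(16, 24)


@dataclass(frozen=True)
class WarehouseRow:
    # The output key is (account_id, snapshot_date): the operational key
    # plus an element of time.
    account_id: str
    snapshot_date: date
    customer_id: str
    balance_cents: int


def decode_record(raw: bytes) -> dict:
    """Technology shift: decode an EBCDIC record into native Python values."""
    # cp037 is a single-byte code page, so character offsets match byte offsets.
    text = raw.decode("cp037")
    return {
        "account_id": text[ACCOUNT_ID].strip(),
        "customer_id": text[CUSTOMER_ID].strip(),
        "balance_cents": int(text[BALANCE]),
    }


def qualifies(record: dict, active_customers: set) -> bool:
    """Selection: a lookup against other data decides whether to extract.

    A real extract might need several keyed reads against other files;
    a set membership test stands in for that here.
    """
    return record["customer_id"] in active_customers


def extract(raw_records, active_customers, snapshot_date):
    """Decode, select, and re-key records for the warehouse."""
    rows = []
    for raw in raw_records:
        record = decode_record(raw)
        if not qualifies(record, active_customers):
            continue
        rows.append(
            WarehouseRow(
                account_id=record["account_id"],
                snapshot_date=snapshot_date,
                customer_id=record["customer_id"],
                balance_cents=record["balance_cents"],
            )
        )
    return rows
```

Calling `extract` with a list of EBCDIC-encoded byte strings, a set of qualifying customer ids, and a snapshot date returns only the qualifying rows, each keyed by account id and date. The complex case the text mentions, where the entire input key is rehashed or restructured, is not covered by this sketch.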
