Building the Data Warehouse



programmer when setting off to load the data warehouse.


In the early days of data warehousing, there was no choice but to build the integration programs by hand. Programmers wrote them in COBOL, C, and other languages. But people soon noticed that these programs were tedious and repetitive to write, and that they required ongoing maintenance. Technology then appeared that automated the process of integrating data from the operational environment: extract/transform/load (ETL) software. The first ETL software was crude, but it quickly matured to the point where almost any transformation could be handled.


ETL software comes in two varieties: software that produces code and software that produces a parameterized runtime module. The code-producing software is much more powerful than the runtime software. The code-producing software can access legacy data in its own format. The runtime software usually requires that legacy data be flattened first; only then can the runtime module read it. Unfortunately, much intelligence is lost in the flattening of the legacy data.
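The distinction between the two varieties can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the mapping spec, the field names (CUST-NO, CUST-NAME), and the two transforms are hypothetical, and real ETL products are far more elaborate. The sketch shows only the difference in mechanism (interpreting parameters at run time versus emitting a specialized program); it does not demonstrate the difference in power described above.

```python
# Hypothetical mapping spec: (target field, legacy source field, transform name).
SPEC = [
    ("customer_id", "CUST-NO", "int"),
    ("name", "CUST-NAME", "strip"),
]

TRANSFORMS = {"int": int, "strip": str.strip}


def run_parameterized(spec, record):
    """Runtime variety: one fixed module interprets the spec for every record."""
    return {target: TRANSFORMS[fn](record[source]) for target, source, fn in spec}


def generate_source(spec):
    """Code-producing variety: emit program text specialized to the spec."""
    templates = {"int": "int({})", "strip": "{}.strip()"}
    lines = ["def transform(record):", "    return {"]
    for target, source, fn in spec:
        expr = templates[fn].format(f"record[{source!r}]")
        lines.append(f"        {target!r}: {expr},")
    lines.append("    }")
    return "\n".join(lines)


legacy_record = {"CUST-NO": "00042", "CUST-NAME": "  ACME CORP  "}

# Run the parameterized module directly against the record.
runtime_result = run_parameterized(SPEC, legacy_record)

# Generate a program from the same spec, compile it, and run it.
namespace = {}
exec(generate_source(SPEC), namespace)
generated_result = namespace["transform"](legacy_record)
```

Both paths produce the same warehouse row here. In the generated case the transformation logic lives in emitted source text that could be inspected or extended; in the runtime case it lives in the parameters.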


In any case, ETL software automates the process of converting, reformatting, and integrating data from multiple legacy operational sources. Only under very unusual circumstances does attempting to build and maintain the operational/data warehouse interface manually make sense.
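As a minimal sketch of what converting, reformatting, and integrating can involve, the Python below brings records from two legacy sources into one common layout. The source names (billing, orders), the field names, the gender codes, and the date formats are all hypothetical, chosen only to illustrate the idea.

```python
from datetime import datetime

# Hypothetical per-source rules: each legacy system encodes the same
# facts differently, so each gets its own code table and date format.
SOURCE_RULES = {
    "billing": {"gender": {"M": "m", "F": "f"}, "date_format": "%m/%d/%Y"},
    "orders": {"gender": {"1": "m", "0": "f"}, "date_format": "%Y%m%d"},
}


def integrate(source, record):
    """Convert one legacy record into the common warehouse layout."""
    rules = SOURCE_RULES[source]
    opened = datetime.strptime(record["opened"], rules["date_format"]).date()
    return {
        "customer_id": int(record["id"]),                  # convert key to integer
        "gender": rules["gender"][record["gender"]],      # unify encoding
        "opened": opened.isoformat(),                      # reformat to ISO 8601
        "source": source,                                  # keep lineage
    }


rows = [
    integrate("billing", {"id": "17", "gender": "F", "opened": "03/09/1998"}),
    integrate("orders", {"id": "0017", "gender": "0", "opened": "19980309"}),
]
```

After integration, the two rows agree on the customer key, gender code, and date format even though the legacy systems did not. Maintaining such rules by hand for every source is the tedious, repetitive work that ETL software automates.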

Triggering the Data Warehouse Record


The basic business interaction that causes the data warehouse to become populated with data is one that can be called an EVENT/SNAPSHOT interaction. In this type of interaction, some event (usually in the operational environment) triggers a snapshot of data, which in turn is moved to the data warehouse environment. Figure 3.42 symbolically depicts an EVENT/SNAPSHOT interaction.
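The EVENT/SNAPSHOT interaction can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the book's design: the account structure, the event names, and the on_event function are hypothetical, and an in-memory list stands in for the data warehouse.

```python
import copy
from datetime import datetime, timezone

# Hypothetical operational data: current values only, updated in place.
operational_accounts = {"A-100": {"balance": 250.0}}

# Append-only list standing in for the data warehouse.
warehouse = []


def on_event(event_name, account_id, moment=None):
    """Take a snapshot of the operational record and move it to the warehouse."""
    moment = moment or datetime.now(timezone.utc)
    snapshot = {
        "event": event_name,
        "taken_at": moment.isoformat(),
        "key": account_id,
        # Deep copy so later operational updates cannot alter the snapshot.
        "data": copy.deepcopy(operational_accounts[account_id]),
    }
    warehouse.append(snapshot)
    return snapshot


# Two operational events, each triggering one snapshot.
operational_accounts["A-100"]["balance"] -= 100.0
on_event("withdrawal", "A-100", datetime(2024, 1, 5, tzinfo=timezone.utc))

operational_accounts["A-100"]["balance"] += 40.0
on_event("deposit", "A-100", datetime(2024, 1, 6, tzinfo=timezone.utc))
```

The operational record holds only the latest balance, while the warehouse accumulates one time-stamped snapshot per event.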
