Building the Data Warehouse



■    Manufacturing process control. Analog data is created as a by-product of the manufacturing process. The analog data is at such a deep level of granularity that it is not useful in the data warehouse. It needs to be edited and aggregated so that its level of granularity is raised.


■    Clickstream data generated in the Web environment. Web logs collect clickstream data at a granularity that is much too fine to be placed in the data warehouse. Clickstream data must be edited, cleansed, resequenced, summarized, and so forth before it can be placed in the warehouse.


These are a few notable exceptions to the rule that business-generated data is at too high a level of granularity.
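As an illustration of raising the granularity of clickstream data, the following is a minimal Python sketch, not a prescribed implementation. The record layout (session id, timestamp, URL), the sample values, and the function name `summarize_clicks` are all assumptions made for the example; real Web logs carry many more fields.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw clickstream records: (session_id, ISO timestamp, url).
raw_clicks = [
    ("s1", "2024-01-01T10:00:05", "/home"),
    ("s1", "2024-01-01T10:00:01", "/login"),
    ("s2", "2024-01-01T11:30:00", "/home"),
    ("s1", "2024-01-01T10:02:00", "/cart"),
    ("s2", "2024-01-01T11:30:00", "/home"),  # exact duplicate log line
    ("",   "2024-01-01T12:00:00", "/home"),  # unusable: no session id
]

def summarize_clicks(clicks):
    """Edit, cleanse, resequence, and summarize clicks into one row per session."""
    # Edit/cleanse: drop records without a session id and exact duplicates.
    cleaned = {c for c in clicks if c[0]}

    # Group the surviving clicks by session.
    by_session = defaultdict(list)
    for session_id, ts, url in cleaned:
        by_session[session_id].append((datetime.fromisoformat(ts), url))

    summary = {}
    for session_id, events in by_session.items():
        events.sort()  # resequence: log order is not guaranteed to be time order
        first_ts, first_url = events[0]
        last_ts, last_url = events[-1]
        # Summarize: many click records collapse into one session record.
        summary[session_id] = {
            "page_views": len(events),
            "first_page": first_url,
            "last_page": last_url,
            "duration_seconds": int((last_ts - first_ts).total_seconds()),
        }
    return summary

session_summary = summarize_clicks(raw_clicks)
```

Six raw log lines become two session-level records here; the per-click detail is deliberately lost, which is what raising the level of granularity means.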

Levels of Granularity: Banking Environment


Consider the simple data structures shown in Figure 4.7 for a banking/financial environment.


To the left, at the operational level, is operational data, where the details of banking transactions are found. Sixty days’ worth of activity is stored in the operational online environment.


In the lightly summarized level of processing, shown to the right of the operational data, is up to 10 years of activity history. The activities for an account for a given month are stored in the lightly summarized portion of the data warehouse. While there are many records here, they are much more compact than the source records. The lightly summarized level takes much less DASD (disk storage) and holds far fewer rows than the operational detail.
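The roll-up from operational detail to the lightly summarized level can be sketched in Python as follows. This is only an illustration under stated assumptions: the transaction fields, the choice of one summary row per account per month, and the name `monthly_summary` are hypothetical and are not taken from Figure 4.7.

```python
from collections import defaultdict
from datetime import date

# Hypothetical operational transactions: (account_id, date, amount_in_cents).
# Positive amounts are deposits, negative amounts are withdrawals.
transactions = [
    ("A-100", date(2024, 1, 3), 50000),
    ("A-100", date(2024, 1, 17), -12000),
    ("A-100", date(2024, 2, 2), -4000),
    ("B-200", date(2024, 1, 9), 7500),
]

def monthly_summary(txns):
    """Collapse detailed transactions into one row per (account, year, month)."""
    rollup = defaultdict(
        lambda: {"txn_count": 0, "deposits": 0, "withdrawals": 0}
    )
    for account_id, txn_date, amount in txns:
        row = rollup[(account_id, txn_date.year, txn_date.month)]
        row["txn_count"] += 1
        if amount >= 0:
            row["deposits"] += amount
        else:
            row["withdrawals"] += -amount  # store withdrawals as a positive total
    return dict(rollup)

monthly = monthly_summary(transactions)
```

However many transactions an account has in a month, it contributes a single summary row, which is why this level can hold years of history in far fewer rows than the operational detail.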


Of course, there is the archival level of data (i.e., the overflow level of data), in which every detailed record is stored. The archival level of data is stored on a medium suited to bulk management of data. Note that not all fields of data are transported to the archival level. Only those fields needed for legal reasons, informational reasons, and so forth are stored. The data that has no further use, even in an archival mode, is purged from the system as data is passed to the archival level.
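The field selection that happens on the way to the archival level might look like the sketch below. The field names and the retained set in `ARCHIVE_FIELDS` are hypothetical; which fields are actually kept depends on the legal and informational requirements of the organization.

```python
# Hypothetical set of fields retained for legal or informational reasons.
ARCHIVE_FIELDS = ("account_id", "txn_date", "amount_cents")

def to_archive_record(detail_record, keep=ARCHIVE_FIELDS):
    """Keep only the fields still needed; everything else is purged."""
    return {field: detail_record[field] for field in keep}

# A detailed operational record with fields that have no archival use.
detail = {
    "account_id": "A-100",
    "txn_date": "2024-01-03",
    "amount_cents": 50000,
    "teller_id": "T-17",
    "terminal_id": "ATM-042",
}

archived = to_archive_record(detail)
```

Every detailed record still reaches the archive, but each one is narrower than its operational source.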
