Building the Data Warehouse



Figure 2.2 The issue of integration. (Panels recovered from the figure: description, for appl A through appl D; conflicting keys: appl A key char(10), appl B key dec fixed(9,2), appl C key pic '9999999', appl D key char(12), integrated as key char(12).)


encoding of gender is concerned, it matters little whether data in the warehouse is encoded as m/f or 1/0. What does matter is that regardless of method or source application, warehouse encoding is done consistently. If application data is encoded as X/Y, it is converted as it is moved to the warehouse. The same consideration of consistency applies to all application design issues, such as naming conventions, key structure, measurement of attributes, and physical characteristics of data.
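
The conversion step described above can be sketched as a lookup applied while data moves into the warehouse. This is a minimal illustration, not a prescribed method: the application names (appl_a, appl_b, appl_c), the choice of m/f as the warehouse encoding, and the assignment of 1 and x to "m" are all assumptions made for the example.

```python
# Hypothetical per-application code tables. Which source value maps to
# which warehouse value is an assumption made for this sketch.
SOURCE_GENDER_CODES = {
    "appl_a": {"m": "m", "f": "f"},
    "appl_b": {"1": "m", "0": "f"},
    "appl_c": {"x": "m", "y": "f"},
}


def to_warehouse_gender(source, raw_value):
    """Convert a source-specific gender code to the warehouse encoding (m/f)."""
    mapping = SOURCE_GENDER_CODES[source]
    # Normalize so that 1, "1", " X " and "x" are all handled uniformly.
    key = str(raw_value).strip().lower()
    if key not in mapping:
        raise ValueError(f"unknown gender code {raw_value!r} from {source}")
    return mapping[key]
```

Whatever encoding the warehouse settles on, the point is that every source passes through the same conversion, so an unknown code is rejected rather than loaded inconsistently.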


The third important characteristic of a data warehouse is that it is nonvolatile. Figure 2.3 illustrates nonvolatility of data and shows that operational data is regularly accessed and manipulated one record at a time. Data is updated in the operational environment as a regular matter of course, but data warehouse data exhibits a very different set of characteristics. Data warehouse data is loaded (usually en masse) and accessed, but it is not updated (in the general sense). Instead, when data in the data warehouse is loaded, it is loaded in a snapshot, static format. When subsequent changes occur, a new snapshot record is written. In doing so a history of data is kept in the data warehouse.

Figure 2.3 The issue of nonvolatility. (Operational: record-by-record manipulation of data. Data warehouse: mass load/access of data.)
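
The load-and-access pattern can be sketched as an append-only store: a change in the source produces a new snapshot row rather than an update in place. The class and field names below (Snapshot, SnapshotStore, balance) are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a snapshot cannot be modified once created
class Snapshot:
    key: str
    snapshot_date: date
    balance: float


class SnapshotStore:
    """Append-only store: rows are mass-loaded and read, never updated."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        # A batch is appended as a whole; existing rows are left untouched.
        self._rows.extend(rows)

    def history(self, key):
        # All snapshots for one key, oldest first.
        matching = [r for r in self._rows if r.key == key]
        return sorted(matching, key=lambda r: r.snapshot_date)


store = SnapshotStore()
store.load([Snapshot("cust-1", date(2024, 1, 31), 100.0)])
# The balance changed in the source: write a new snapshot, keep the old one.
store.load([Snapshot("cust-1", date(2024, 2, 29), 250.0)])
```

Because nothing is overwritten, `store.history("cust-1")` returns both rows, which is the history of data the text refers to.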


The last salient characteristic of the data warehouse is that it is time variant. Time variancy implies that every unit of data in the data warehouse is accurate as of some moment in time. In some cases, a record is time-stamped. In other cases, a record has a date of transaction. But in every case, there is some form of time marking to show the moment at which the record is accurate. Figure 2.4 illustrates how time variancy of data warehouse data can show up in several ways.
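
One way to make use of the time marking is an "as of" lookup that returns the record accurate at a given moment. The sketch below assumes each snapshot is a (date, value) pair and that a snapshot stays accurate until the next one is written. Both are assumptions for this example, as is the function name as_of.

```python
from bisect import bisect_right
from datetime import date


def as_of(snapshots, moment):
    """Return the value of the latest snapshot dated on or before `moment`.

    `snapshots` is a list of (snapshot_date, value) pairs sorted by date.
    Returns None when no snapshot exists that early.
    """
    dates = [snapshot_date for snapshot_date, _ in snapshots]
    # Count of snapshots dated on or before `moment`.
    index = bisect_right(dates, moment)
    if index == 0:
        return None
    return snapshots[index - 1][1]


# Illustrative data only.
price_history = [
    (date(2024, 1, 1), 100),
    (date(2024, 3, 1), 120),
]
```

A query for a date between two snapshots returns the earlier one, and a query before the first snapshot returns None instead of guessing.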
