Building the Data Warehouse


existing systems environment



“Best” data to represent the data model:


•    most timely


•    most accurate


•    most complete


•    nearest to the external source


•    most structurally compatible


Figure 9.1 Migration to the architected environment.


Some reasons for excluding derived data and DSS data from the corporate data model and the midlevel model include the following:


■■ Derived data and DSS data change frequently.


■■ These forms of data are created from atomic data.


■■ They frequently are deleted altogether.


■■ There are many variations in the creation of derived data and DSS data.


Because derived data and DSS data are excluded from the corporate data model and the midlevel model, the data model does not take long to build.


After the corporate data model and the midlevel models are in place, the next activity is defining the system of record. The system of record is defined in terms of the corporation’s existing systems. Usually, these older legacy systems are affectionately known as the “mess.”


The system of record is nothing more than the identification of the “best” data the corporation has that resides in the legacy operational environment or in the Web-based ebusiness environment. The data model is used as a benchmark for determining what the best data is. In other words, the data architect starts with the data model and asks what data is in hand that best fulfills the data requirements identified in the data model. It is understood that the fit will be less than perfect. In some cases, there will be no data in the existing systems environment or the Web-based ebusiness environment that exemplifies the data in the data model. In other cases, many sources of data in the existing systems environment contribute data to the system of record, each under different circumstances.
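The selection just described can be pictured as a small scoring exercise. The sketch below is only an illustration, not a method prescribed by the text: it assumes the data architect assigns each candidate source a score between 0 and 1 for each of the five criteria in Figure 9.1, and that the scores are simply added without weights. The system names, field names, and scores are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# The five criteria from Figure 9.1. Scoring them numerically on a 0-1 scale
# is an assumption made for this sketch.
CRITERIA = ("timely", "accurate", "complete", "near_source", "compatible")


@dataclass
class Candidate:
    """A hypothetical legacy field that could supply one data model attribute."""
    system: str
    field: str
    scores: Dict[str, float]  # criterion name -> score assigned by the architect

    def total(self) -> float:
        # Unweighted sum; a criterion with no score counts as 0.
        return sum(self.scores.get(c, 0.0) for c in CRITERIA)


def pick_system_of_record(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the highest-scoring candidate, or None when no source exists."""
    if not candidates:
        return None
    return max(candidates, key=Candidate.total)


# Start from the data model: each attribute lists the sources "in hand".
data_model = {
    "customer.address": [
        Candidate("billing", "CUST_ADDR", {
            "timely": 0.9, "accurate": 0.8, "complete": 0.7,
            "near_source": 0.9, "compatible": 0.6,
        }),
        Candidate("marketing", "ADDR_TXT", {
            "timely": 0.4, "accurate": 0.5, "complete": 0.9,
            "near_source": 0.2, "compatible": 0.3,
        }),
    ],
    # No existing system holds this attribute, so the fit is less than perfect.
    "customer.credit_rating": [],
}

system_of_record = {
    attribute: pick_system_of_record(candidates)
    for attribute, candidates in data_model.items()
}

for attribute, winner in system_of_record.items():
    source = f"{winner.system}.{winner.field}" if winner else "no source in hand"
    print(f"{attribute}: {source}")
```

The sketch picks a single winner per attribute. Where several sources contribute to the same attribute under different circumstances, a rule per source would be needed in place of one overall score.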
