Building the Data Warehouse


■    The input records that must be read have exotic or nonstandard formats. A whole host of input types must be read and then converted on entry into the data warehouse:

■    Fixed-length records

■    Variable-length records

■    Records with a COBOL OCCURS DEPENDING ON clause

■    Records with a COBOL OCCURS clause

These inputs must be converted, but the logic of conversion must be specified, and the mechanics of conversion (what the “before” and “after” look like) can be quite complex. In some cases, the conversion logic becomes very convoluted.
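As a rough illustration of what "before" and "after" can look like, here is a minimal Python sketch that reads a record with a fixed-length header followed by a variable number of repeating items, in the spirit of an OCCURS DEPENDING ON layout. The layout itself (a 5-byte customer ID, a 2-digit item count, and 4-digit amounts) and the function name parse_record are assumptions made up for this example; they do not come from any real system.

```python
import struct

# Hypothetical record layout (an assumption for illustration only):
#   customer_id  5 bytes of text
#   item_count   2 bytes of text digits (plays the role of the
#                OCCURS DEPENDING ON counter)
#   items        item_count occurrences of a 4-digit text amount
HEADER = struct.Struct("5s2s")
ITEM_WIDTH = 4


def parse_record(raw: bytes) -> dict:
    """Convert one raw input record into a plain dictionary."""
    customer_id, count_field = HEADER.unpack_from(raw, 0)
    item_count = int(count_field.decode("ascii"))

    # The record length depends on the counter, so validate it explicitly.
    expected_len = HEADER.size + item_count * ITEM_WIDTH
    if len(raw) != expected_len:
        raise ValueError(
            f"expected {expected_len} bytes for {item_count} items, got {len(raw)}"
        )

    items = [
        int(raw[offset:offset + ITEM_WIDTH].decode("ascii"))
        for offset in range(HEADER.size, expected_len, ITEM_WIDTH)
    ]
    return {"customer_id": customer_id.decode("ascii"), "items": items}
```

Checking the record length against the counter is one way to catch a layout mismatch early, before bad data reaches the warehouse.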

■    Perhaps the worst of all: Data relationships that have been built into old legacy program logic must be understood and unraveled before those files can be used as input. These relationships are often Byzantine, arcane, and undocumented. But they must patiently be unwound and deciphered as the data moves into the data warehouse. This is especially difficult when there is no documentation or when the documentation that exists is out-of-date. And, unfortunately, on many operational legacy systems, there is no documentation. There is an old saying: Real programmers don’t do documentation.

■    Data format conversion must be done. Any conversion from EBCDIC to ASCII (or vice versa) must be spelled out.
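A character-set conversion of this kind might be sketched in Python as follows. The sketch assumes the source data uses EBCDIC code page 037 (Python's built-in cp037 codec); real systems may use a different code page, and the function names here are hypothetical. It applies to text fields only. Binary or packed-decimal fields should not be run through a text codec.

```python
# Assumption: the source uses EBCDIC code page 037 (US/Canada).
# Other EBCDIC code pages exist, so confirm the right one first.
EBCDIC_CODEC = "cp037"


def ebcdic_to_ascii(raw: bytes) -> bytes:
    """Convert an EBCDIC text field to ASCII bytes.

    Raises UnicodeEncodeError if a character has no ASCII equivalent.
    """
    return raw.decode(EBCDIC_CODEC).encode("ascii")


def ascii_to_ebcdic(raw: bytes) -> bytes:
    """Convert an ASCII text field to EBCDIC bytes (the reverse direction)."""
    return raw.decode("ascii").encode(EBCDIC_CODEC)
```

Letting the conversion fail loudly on characters with no ASCII equivalent is a deliberate choice in this sketch: a silent substitution would hide exactly the cases that need to be spelled out.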

■    Massive volumes of input must be accounted for. Where only a small amount of data is being entered as input, many design options can be accommodated. But where many records are being input, special design options (such as parallel loads and parallel reads) may have to be used.
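One possible shape for a parallel design is to split the input into chunks and process the chunks concurrently. The following Python sketch uses the standard concurrent.futures module; the function names, the chunk size, and the placeholder transformation (trimming and uppercasing each record) are assumptions chosen only to keep the example small.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(records, size):
    """Yield successive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def transform_chunk(chunk):
    """Placeholder transformation (hypothetical): trim and uppercase."""
    return [record.strip().upper() for record in chunk]


def parallel_transform(records, chunk_size=1000, workers=4):
    """Transform records chunk by chunk on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input chunks.
        results = pool.map(transform_chunk, chunked(records, chunk_size))
        return [row for chunk in results for row in chunk]
```

Threads tend to suit work dominated by I/O, such as reading files or writing to a database. For CPU-heavy transformations in Python, a process pool (ProcessPoolExecutor from the same module) may be the better fit.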

■    The design of the data warehouse must conform to a corporate data
