Building the Data Warehouse



Enter the Extract Program


Shortly after the advent of massive OLTP systems, an innocuous program for “extract” processing began to appear (see Figure 1.2).


The extract program is the simplest of all programs. It rummages through a file or database, uses some criteria for selecting data, and, on finding qualified data, transports the data to another file or database.
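
In modern terms, such an extract program can be little more than a query plus a copy step. The sketch below is a minimal, hypothetical illustration in Python: it assumes a SQLite operational store named operational.db containing an orders table, uses an order-date cutoff as the selection criterion, and transports the qualifying rows to a CSV file. None of these names or parameters come from the book; they stand in for whatever operational source and target an extract serves.

    # Minimal extract-program sketch (hypothetical source, criterion, and target).
    import csv
    import sqlite3

    def extract(source_db: str, target_csv: str, cutoff_date: str) -> int:
        """Select qualifying rows from the source and transport them to the target."""
        rows_written = 0
        with sqlite3.connect(source_db) as conn:
            # The selection criterion: only orders on or after the cutoff date.
            cursor = conn.execute(
                "SELECT order_id, customer_id, order_date, amount "
                "FROM orders WHERE order_date >= ?",
                (cutoff_date,),
            )
            with open(target_csv, "w", newline="") as out:
                writer = csv.writer(out)
                writer.writerow([col[0] for col in cursor.description])  # header row
                for row in cursor:
                    writer.writerow(row)  # transport each qualified row
                    rows_written += 1
        return rows_written

    if __name__ == "__main__":
        # Example invocation; paths and the cutoff date are placeholders.
        print(extract("operational.db", "orders_extract.csv", "2024-01-01"))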

Figure 1.2 The nature of extract processing.


The extract program became very popular for at least two reasons:


■■ Because extract processing can move data out of the way of high-performance online processing, there is no conflict in terms of performance when the data needs to be analyzed en masse.


■■ When data is moved out of the operational, transaction-processing domain with an extract program, a shift in control of the data occurs. The end user then owns the data once he or she takes control of it.


For these (and probably a host of other) reasons, extract processing was soon found everywhere.


The Spider Web


As illustrated in Figure 1.3, a “spider web” of extract processing began to form. First, there were extracts; then there were extracts of extracts; then extracts of extracts of extracts; and so forth. It was not unusual for a large company to perform as many as 45,000 extracts per day.


This pattern of out-of-control extract processing across the organization became so commonplace that it was given its own name: the "naturally evolving architecture." It arises when an organization handles the whole process of hardware and software architecture with a laissez-faire attitude. The larger and more mature the organization, the worse the problems of the naturally evolving architecture become.
