Building the Data Warehouse


■    The capture of Web-based data in the Web logs. Once captured, how can the data be freed for use within the data warehouse?

■    A change in basic data formats. For example, data in one environment may be stored in ASCII while data in the data warehouse is stored in EBCDIC.
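
The format change in the last item can be illustrated with a minimal sketch. It assumes the EBCDIC side uses code page 037 (Python's built-in `cp037` codec); real installations may use a different code page, and the function names here are hypothetical. It handles character data only, since packed-decimal or binary fields cannot be converted by character translation.

```python
# Assumption: code page 037 (EBCDIC, US/Canada). Other sites may use
# a different EBCDIC code page; adjust the codec name accordingly.
EBCDIC_CODEC = "cp037"


def ascii_to_ebcdic(raw: bytes) -> bytes:
    """Decode ASCII bytes to text, then re-encode the text as EBCDIC."""
    return raw.decode("ascii").encode(EBCDIC_CODEC)


def ebcdic_to_ascii(raw: bytes) -> bytes:
    """Decode EBCDIC bytes to text, then re-encode the text as ASCII."""
    return raw.decode(EBCDIC_CODEC).encode("ascii")


# Round trip: a character field survives conversion in both directions.
field = b"ACCOUNT 1234"
assert ebcdic_to_ascii(ascii_to_ebcdic(field)) == field
```

Going through an intermediate text value, rather than a hand-built byte table, keeps the conversion tied to a named code page that can be changed in one place.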

Another important technological issue that sometimes must be addressed is the volume of data. In some cases, huge volumes of data will be generated in the legacy environment. Specialized techniques may be needed to enter them into the data warehouse. For example, clickstream data found in the Web logs needs to be preprocessed before it can be used effectively in the data warehouse environment.
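
As a rough illustration of clickstream preprocessing, the sketch below reduces raw Web log lines to the page requests likely to matter to the warehouse. It assumes the logs are in Common Log Format. The filtering rules (drop malformed lines, non-2xx responses, and static assets) and all function names are hypothetical choices, not rules prescribed by the text.

```python
import re
from collections import Counter

# Common Log Format:
#   host ident user [timestamp] "METHOD path PROTOCOL" status size
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# Hypothetical rule: requests for these file types are page furniture,
# not page views, and are dropped before loading.
STATIC_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")


def preprocess(lines):
    """Yield (host, path) for successful page requests, skipping noise."""
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # malformed line
        if not match["status"].startswith("2"):
            continue  # failed or redirected request
        path = match["path"].split("?", 1)[0]  # discard the query string
        if path.lower().endswith(STATIC_SUFFIXES):
            continue  # image, stylesheet, or script
        yield match["host"], path


def page_views_per_host(lines):
    """Condense the surviving requests into one count per host."""
    return Counter(host for host, _ in preprocess(lines))
```

Filtering and condensing before the load is what keeps the volume manageable: only the reduced output, not the raw log, needs to enter the data warehouse.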

There are other issues as well. In some cases, the data flowing into the data warehouse must be cleansed. In other cases, the data must be summarized. A host of issues relate to the mechanics of bringing data from the legacy environment into the data warehouse environment.
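
Cleansing and summarization can be sketched together. The example below assumes legacy records arrive as dictionaries with `account`, `date` (ISO `YYYY-MM-DD`), and `amount` fields. These field names, the cleansing rules, and the monthly grain are all hypothetical.

```python
from collections import defaultdict
from decimal import Decimal, InvalidOperation


def cleanse(record):
    """Return a normalized copy of a record, or None if it is unusable."""
    try:
        account = record["account"].strip().upper()
        month = record["date"].strip()[:7]  # assumes ISO dates: YYYY-MM
        amount = Decimal(record["amount"].strip())
    except (KeyError, AttributeError, InvalidOperation):
        return None  # missing field, non-string value, or bad number
    if not account or len(month) != 7:
        return None
    return {"account": account, "month": month, "amount": amount}


def summarize(records):
    """Total the cleansed amounts by (account, month)."""
    totals = defaultdict(Decimal)  # Decimal() is zero
    for record in records:
        clean = cleanse(record)
        if clean is not None:
            totals[(clean["account"], clean["month"])] += clean["amount"]
    return dict(totals)
```

In this sketch rejected records are silently dropped. A production load would more likely route them to a reject file for review.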

After the system of record is defined and the technological challenges in bringing the data into the data warehouse are identified, the next step is to design the data warehouse, as shown in Figure 9.2.

If the data modeling activity has been done properly, the design of the data warehouse is fairly simple. Only a few elements of the corporate data model and the midlevel model need to be changed to turn the data model into a data warehouse design. Principally, the following needs to be done:

■    An element of time needs to be added to the key structure if one is not already present.
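
Adding an element of time to the key structure can be pictured as follows. This is a minimal sketch with hypothetical names (`OperationalKey`, `WarehouseKey`, `snapshot_date`). It assumes the operational key is a single account identifier and that the unit of time is a calendar date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class OperationalKey:
    """Key as it might appear in the legacy system: no element of time."""
    account_id: str


@dataclass(frozen=True)
class WarehouseKey:
    """The same key extended with an element of time."""
    account_id: str
    snapshot_date: date


def add_time_element(key: OperationalKey, as_of: date) -> WarehouseKey:
    """Build the warehouse key for a snapshot taken on the given date."""
    return WarehouseKey(key.account_id, as_of)


def load_snapshot(warehouse: dict, key: OperationalKey, as_of: date, attributes: dict) -> None:
    """Store a snapshot under its time-variant key, leaving older snapshots in place."""
    warehouse[add_time_element(key, as_of)] = dict(attributes)
```

Because the date is part of the key, loading a later snapshot of the same account adds a new entry instead of overwriting the earlier one.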
