Building the Data Warehouse

Granularity: the level of detail

•    high level of detail = low level of granularity (example: the details of every phone call made by a customer for a month)

•    low level of detail = high level of granularity (example: the summary of phone calls made by a customer for a month)

Partitioning of data

•    the splitting of data into small units

•    done at the application level or the DBMS level

•    large, unpartitioned data is difficult to manage; small, partitioned units are easy to manage

Figure 2.11 Major design issues of the data warehouse: granularity, partitioning, and proper design.

Granularity is the major design issue in the data warehouse environment because it profoundly affects the volume of data that resides in the data warehouse and the type of query that can be answered. The volume of data in a warehouse is traded off against the level of detail of a query.
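The trade-off can be made concrete with the phone-call example from Figure 2.11. The following is a minimal sketch, not taken from the book: the record layout, the sample values, and the function name `summarize_by_month` are all assumptions chosen for illustration. It rolls call-detail rows (low granularity) up to one row per customer per month (high granularity).

```python
from collections import defaultdict

# Hypothetical call-detail records: (customer_id, month, minutes).
# One row per call, i.e. a low level of granularity.
calls = [
    ("C1", "2024-01", 5),
    ("C1", "2024-01", 12),
    ("C1", "2024-01", 3),
    ("C2", "2024-01", 7),
]


def summarize_by_month(records):
    """Roll detail rows up to one row per (customer, month)."""
    totals = defaultdict(lambda: {"call_count": 0, "total_minutes": 0})
    for customer_id, month, minutes in records:
        row = totals[(customer_id, month)]
        row["call_count"] += 1
        row["total_minutes"] += minutes
    return dict(totals)


# High level of granularity: fewer rows, less detail.
summary = summarize_by_month(calls)

# Only the detail rows can answer a question about individual calls,
# for example "did C1 make any call longer than 10 minutes?"
long_calls = [c for c in calls if c[0] == "C1" and c[2] > 10]
```

The summary holds two rows where the detail holds four, but the question about individual call length can no longer be answered from the summary alone. That is the volume-versus-detail trade-off in miniature.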

In almost all cases, data comes into the data warehouse at too high a level of granularity (that is, too summarized). This means that the developer must spend a lot of resources breaking the data apart. Occasionally, though, data enters the warehouse at too low a level of granularity (that is, too detailed). An example is the Web log data generated by the Web-based e-business environment. Web log clickstream data must be edited, filtered, and summarized before its granularity is fit for the data warehouse environment.
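The edit, filter, and summarize steps for clickstream data might look like the sketch below. It is an illustration under stated assumptions rather than a prescribed method: the log layout, the ignored path prefix, the status-code rule, and the function name `summarize_clickstream` are all hypothetical.

```python
from collections import Counter

# Hypothetical raw Web log entries: (session_id, url, http_status).
raw_log = [
    ("s1", "/products/42?ref=ad", 200),
    ("s1", "/static/logo.png", 200),
    ("s1", "/products/42", 200),
    ("s2", "/checkout", 500),
    ("s2", "/products/7", 200),
]

# Assumption: requests for static assets carry no analytical value.
IGNORED_PREFIXES = ("/static/",)


def summarize_clickstream(entries):
    """Edit, filter, and summarize raw clicks into page-view counts."""
    page_views = Counter()
    for session_id, url, status in entries:
        # Edit: drop the query string so variants of one page match.
        page = url.split("?", 1)[0]
        # Filter: skip failed requests and static assets.
        if status != 200 or page.startswith(IGNORED_PREFIXES):
            continue
        # Summarize: one counter per (session, page) instead of one row per click.
        page_views[(session_id, page)] += 1
    return page_views


views = summarize_clickstream(raw_log)
```

Five raw clicks reduce to two summarized rows here. In practice the filtering rules and the summarization key depend on which questions the warehouse has to answer.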

The Benefits of Granularity

Many organizations are surprised to find that data warehousing provides an invaluable foundation for many different types of DSS processing. Organizations may build a data warehouse for one purpose, but they discover that it can be used for many other kinds of DSS processing. Although infrastructure for the data warehouse is expensive and difficult to build, it has to be built only once. After the data warehouse has been properly constructed, it provides the organization with a foundation that is extremely flexible and reusable.
