Building the Data Warehouse



ISSUE: It is absolutely normal for the data warehouse environment to have some amount of repetitive processing done against it. If only repetitive processing is done, however, or if no repetitive processing is planned, the designer should question why.


49.    How will major subjects be partitioned? (By year? By geography? By functional unit? By product line?) Just how finely does the partitioning of the data break the data up?


ISSUE: Given the volume of data that is inherent to the data warehouse environment and the unpredictable usage of the data, it is mandatory that data warehouse data be partitioned into physically small units that can be managed independently. The design issue is not whether partitioning is to be done. Instead, the design issue is how partitioning is to be accomplished. In general, partitioning is done at the application level rather than the system level.
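As a minimal sketch of what partitioning at the application level can look like, the following Python routes each record to a partition chosen by application code rather than by the DBMS. The record layout (sale_date, region, amount) and the choice of year plus geography as the partitioning key are illustrative assumptions, not something prescribed by the text.

```python
from collections import defaultdict
from datetime import date


def partition_key(record):
    """Return the (year, region) partition a record belongs to.

    The key is a hypothetical choice; product line or functional
    unit could be used in the same way.
    """
    return (record["sale_date"].year, record["region"])


def partition_records(records):
    """Group records into independent partitions keyed by partition_key."""
    partitions = defaultdict(list)
    for record in records:
        partitions[partition_key(record)].append(record)
    return dict(partitions)


# Hypothetical sample data for illustration only.
sales = [
    {"sale_date": date(2001, 3, 14), "region": "north", "amount": 120.0},
    {"sale_date": date(2001, 7, 2), "region": "south", "amount": 80.0},
    {"sale_date": date(2002, 1, 9), "region": "north", "amount": 45.5},
]

partitions = partition_records(sales)
for key in sorted(partitions):
    print(key, len(partitions[key]))
```

Because the application owns the key function, each partition can be loaded, indexed, archived, or restructured on its own, which is the kind of independent management the issue calls for. A finer key produces more, smaller partitions; a coarser key produces fewer, larger ones.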


The partitioning strategy should be reviewed with the following in mind:


■■ Current volume of data
■■ Future volume of data
■■ Current usage of data
■■ Future usage of data
■■ Partitioning of other data in the warehouse
■■ Use of other data
■■ Volatility of the structure of data


50.    Will sparse indexes be created? Would they be useful?


ISSUE: Sparse indexes created in the right place can save huge amounts of processing. By the same token, sparse indexes require a fair amount of overhead in their creation and maintenance. The designer of the data warehouse environment should consider their use.
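The trade-off can be sketched in Python, taking "sparse index" in the common database sense of one index entry per block of sorted records rather than one entry per record. That reading, the function names, and the block size are assumptions made for illustration; the text itself does not define the structure.

```python
from bisect import bisect_right


def build_sparse_index(sorted_keys, block_size):
    """Keep the first key of every block, paired with the block's start offset."""
    return [(sorted_keys[i], i) for i in range(0, len(sorted_keys), block_size)]


def lookup(sorted_keys, sparse_index, block_size, key):
    """Return the position of key in sorted_keys, or None if it is absent."""
    index_keys = [k for k, _ in sparse_index]
    # Find the last block whose first key is <= the search key.
    slot = bisect_right(index_keys, key) - 1
    if slot < 0:
        return None
    start = sparse_index[slot][1]
    # Scan only that one block instead of the whole data set.
    for pos in range(start, min(start + block_size, len(sorted_keys))):
        if sorted_keys[pos] == key:
            return pos
    return None


# Hypothetical data: 200 sorted keys, indexed in blocks of 20.
keys = list(range(0, 1000, 5))
index = build_sparse_index(keys, block_size=20)

print(len(keys), len(index))
print(lookup(keys, index, 20, 505))
```

The index here holds one entry for every 20 records, so it is cheap to store and search, but it must be rebuilt or adjusted whenever the underlying data is reloaded or reordered. That maintenance is the overhead the designer weighs against the processing saved.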


51.    What temporary indexes will be created? How long will they be kept? How large will they be?
