Describe Multi-Way Array Aggregation (Multi-Way) Method for Computing Full Cubes with suitable example.
The Multi-Way Array Aggregation (Multi-Way) Method for Computing Full Cubes
• It is an array-based "bottom-up" algorithm. The Multi-way Array Aggregation (or simply Multi-Way) method computes a full data cube by using a multidimensional array as its basic data structure. It is a typical MOLAP approach that uses direct array addressing, where dimension values are accessed via the position or index of their corresponding array locations. Hence, Multi-Way cannot perform any value-based reordering as an optimization technique. A different approach is developed for the array-based cube construction, as follows:
• Partition the array into chunks: A chunk is a sub-cube that is small enough to fit into the memory available for cube computation. Chunking is a method for dividing an n-dimensional array into small n-dimensional chunks, where each chunk is stored as an object on a disk. The chunks are compressed so as to remove wasted space resulting from empty array cells.
• Compute aggregates by visiting (i.e., accessing the values at) cube cells: The order in which cells are visited can be optimized so as to minimize the number of times that each cell must be revisited, thereby reducing memory access and storage costs.
Aggregation Strategy
1. Partitions array into chunks
2. Chunk: a small sub-cube that fits in memory
3. Data addressing
4. Uses chunk id and offset
5. Multi-way Aggregation
6. Computes aggregates in multi-way
7. Visits chunks in the order
a. to minimize memory access
b. to minimize memory space
The best traversing order to do multi-way aggregation:
• Method: the planes should be sorted and computed according to their size in ascending order.
• Idea: keep the smallest planes in the main memory, fetch and compute only one chunk at a time for the largest plane
• Limitation of the method: computing well only for a small number of dimensions
• If there are a large number of dimensions, "bottom-up computation” and iceberg computation methods can be explored.
Example: Consider a 3-D data array containing the three dimensions A, B, and C. The 3-D array is partitioned into small chunks. Here, the array is partitioned into 64 chunks as shown in figure Dimension A is organized into four equal-sized partitions a0, a1, a2, and a3. B is organized into equal-sized partitions and Dimensions C is similarly organized into four partitions each. Chunk 2...... 64 corresponds to the sub-cubes a0b0c0, a1b0c0..., a3b3bc3, respectively.Suppose that the cardinality and size of the array of A, B, and C is 40, 400, and 4000, respectively. Thus, the size of each partition is 10, 100, and 1000, respectively. Full materialization of the corresponding data cube involves the computation of all of the cuboids defining this cube i.e., (apex A, B, C, AB, AC, BC, ABC).
Comments
Post a Comment