From Discrepancy to Declustering:Near-optimal multidimensional declustering strategies for range queries
Description
Abstract
With improvements in computer processing speed and storage capabilities, disk I/O (input/output) has become the bottleneck for many modern data-intensive applications. As such, the use of multi-disk systems together with declustering schemes has been suggested. Declustering schemes allocate data blocks among multiple disks to enable parallel retrieval.
Given a declustering scheme D, its response time with respect to a query Q, rt(Q), is defined to be the maximum number of disk blocks of the query stored by the scheme in any one of the disks. If |Q| is the number of tiles in Q and M is the number of disks then rt(Q) is at least |Q|/M. One way to evaluate the performance of D with respect to range queries is to measure its additive error - the maximum difference between rt(Q) and |Q|/M over all range queries Q.
In this talk, I will present two declustering schemes for uniform multidimensional data arranged in a d-dimensional grid. These schemes are the first known declustering schemes whose asymptotic additive errors with respect to range queries are near optimal.