
On the analysis of indexing schemes

This article explores a framework for measuring indexing scheme efficiency based on access overhead and storage redundancy. It discusses the impact of range and set queries, emphasizing the trade-offs between redundancy and access overhead. The study focuses on lower bounds and the challenges of optimizing efficiency for different workloads. It also touches upon the theory of indexability and open problems in the field.


Presentation Transcript


  1. On the analysis of indexing schemes. Written by: Joseph M. Hellerstein, Elias Koutsoupias, Christos H. Papadimitriou. Presented by: Tali Kaufman.

  2. Presentation layout. Problem definition: define a framework to measure the efficiency of an index. Performance factors: access overhead and storage redundancy. Range queries: access overhead upper bound; access overhead lower bound (r = 1); access overhead lower bound (r >= 1). Set queries: worst-case access overhead. Conclusions. Open problems.

  3. The problem. Problem: define a framework for measuring the efficiency of an indexing scheme for a given workload, based on two performance factors: storage redundancy and access overhead. Workload: a data set together with a set of potential queries. Indexing scheme: a collection of blocks that store an actual data set instance.
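The two performance factors can be illustrated with a small sketch. The transcript does not reproduce the formal definitions, so the code below assumes the usual reading: redundancy is the average number of blocks that store each data item, and the access overhead of a query is the minimum number of blocks needed to cover it, divided by the ideal number ceil(|Q| / B) for block size B. All function names, and the 4 x 4 grid with 2 x 2 tile blocks, are hypothetical choices made for illustration.

```python
from itertools import combinations
from math import ceil


def redundancy(instance, blocks):
    """Average number of blocks in which each data item is stored."""
    return sum(len(b) for b in blocks) / len(instance)


def min_cover_size(query, blocks):
    """Smallest number of blocks whose union contains the query.

    Brute force over block subsets; only suitable for toy examples.
    """
    useful = [b for b in blocks if b & query]
    for k in range(len(useful) + 1):
        for combo in combinations(useful, k):
            if query <= set().union(*combo):
                return k
    raise ValueError("blocks do not cover the query")


def access_overhead(query, blocks, block_size):
    """Blocks actually needed, relative to the ideal ceil(|Q| / B).

    Assumes a non-empty query.
    """
    ideal = ceil(len(query) / block_size)
    return min_cover_size(query, blocks) / ideal


def scheme_access_overhead(queries, blocks, block_size):
    """Worst-case access overhead over all queries in the workload."""
    return max(access_overhead(q, blocks, block_size) for q in queries)


def demo():
    """4 x 4 grid stored in four 2 x 2 tiles, queried by full rows and columns."""
    n, block_size = 4, 4
    instance = {(x, y) for x in range(n) for y in range(n)}
    blocks = [
        frozenset((x + dx, y + dy) for dx in range(2) for dy in range(2))
        for x in range(0, n, 2)
        for y in range(0, n, 2)
    ]
    rows = [frozenset((x, y) for x in range(n)) for y in range(n)]
    cols = [frozenset((x, y) for y in range(n)) for x in range(n)]
    return (
        redundancy(instance, blocks),
        scheme_access_overhead(rows + cols, blocks, block_size),
    )


if __name__ == "__main__":
    print(demo())
```

In this toy scheme every item is stored exactly once, and each row or column query fits in one block in the ideal case but needs two tiles, so tiling trades a higher access overhead for minimal redundancy.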

  4. Workload definition

  5. Example - a workload with two dimensional range queries

  6. Indexing scheme definition

  7. Performance factors definition

  8. Access overhead upper bound for two dimensional range queries

  9. Access overhead lower bound (redundancy = 1)

  10. Access overhead lower bound (redundancy = 1) [cont]

  11. Access overhead lower bound (redundancy >= 1)

  12. Access overhead lower bound (redundancy >= 1) [cont]

  13. Access overhead lower bound (redundancy >= 1) [cont]

  14. Access overhead lower bound (redundancy >= 1) [cont]

  15. Access overhead lower bound (redundancy >= 1) [cont]

  16. Access overhead lower bound (redundancy >= 1) [cont]

  17. Example - Set inclusion workloads

  18. Set inclusion workloads worst-case access overhead

  19. Conclusions. Theory of indexability: the article presents a framework for studying indexability. Workload and indexing scheme play the role in indexability theory that language and algorithm play in complexity theory. The framework emphasizes the secondary-storage nature of indexing schemes by examining storage utilization (redundancy) and disk accesses (access overhead). It considers range queries and set queries, focusing on lower bounds and on the trade-off between redundancy and access overhead. The trade-off is worse for workloads with a large number of queries (set queries: exponential; range queries: polynomial). Algorithms for finding the best access method (search algorithms) and the best partition into blocks are not considered. The size of the instance does not affect the results.
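The contrast between polynomial and exponential query counts can be made concrete with a rough count. This is only a sketch under assumptions: it takes range queries to be axis-parallel rectangles on an n x n grid and set queries to be indexed by the subsets of an n-element ground set, which may differ from the exact workloads in the presentation. The function names are hypothetical.

```python
def count_range_queries(n):
    """Number of axis-parallel rectangles on an n x n grid.

    Each rectangle is a pair of intervals, one per axis, and an axis
    of length n has n * (n + 1) / 2 non-empty intervals.
    """
    intervals = n * (n + 1) // 2
    return intervals ** 2


def count_set_queries(n):
    """Number of subsets of an n-element ground set."""
    return 2 ** n


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, count_range_queries(n), count_set_queries(n))
```

The first count grows like n^4, the second like 2^n, which matches the polynomial versus exponential distinction drawn in the conclusions.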

  20. Open problems
