January 5, 2020

ISDA'2003: Data Mining Techniques in Index Techniques. Ying Wah Teh and Abu Bakar Zaitun (tehyw@um.edu.my, zab@um.edu.my), University of Malaya, Faculty of Computer Science and Information Technology.

Presentation Transcript


  1. ISDA'2003: Data Mining Techniques in Index Techniques. Ying Wah Teh and Abu Bakar Zaitun (tehyw@um.edu.my, zab@um.edu.my), University of Malaya, Faculty of Computer Science and Information Technology

  2. Contents
     - Introduction
     - Query Processing Techniques
     - Evaluation of Data Mining Prototypes
     - Conclusion

  3. Introduction
     - What data to gather, how to model the data conceptually, and how to manage its storage
     - Logical database design
     - Physical database design
     - Very large data stores are common nowadays
     - Redundant data structures: the intelligent way of managing storage
     - Fast access to data
     - Selecting the right elements to build redundant data structures
     - Only a few data warehouse administrators can do justice to the task of picking the right redundant data structures.

  4. Query Processing Techniques
     - Historical Perspectives
       - File Processing / Full Scan / Sequential Scan
       - Simple index
       - B-Tree index
     - Present Scenarios of Query Processing Techniques
       - BitMap Index
       - Single-column indexes

  5. File Processing
     - A programmer needs to know at least one third-generation language (3GL) to write a data retrieval program that accesses the relevant information in a file system.
     - Query processing technique: sequential scan (full scan).
     - It is best suited to environments with small data volumes.
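
A sequential scan can be sketched as follows. This is a minimal illustration, not code from the presentation: the record layout, the `RECORDS` sample, and the `full_scan` helper are all hypothetical. It shows why the approach only suits small data volumes: every record is read, whether or not it matches.

```python
# Hypothetical flat-file contents: one comma-separated record per line.
RECORDS = [
    "1001,Aminah,Female",
    "1002,Boon,Male",
    "1003,Chitra,Female",
    "1004,Daud,Male",
]


def full_scan(lines, field, value):
    """Return (matches, records_read) after examining every record in order."""
    matches = []
    records_read = 0
    for line in lines:
        records_read += 1
        fields = line.split(",")
        if fields[field] == value:
            matches.append(fields)
    return matches, records_read


matches, records_read = full_scan(RECORDS, 2, "Female")
print(matches, records_read)
```

The number of records read always equals the size of the file, so the cost grows linearly with the data volume.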

  7. Simple Indexes / Hashed Key
     - DBMSs were developed that included simple indexes.
     - A simple index lets users access information very quickly by a unique value.
     - It keeps a list of record identifiers (RIDs) that act as pointers to records.
     - Access requires the exact key value.
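
A simple index of this kind can be sketched with a hash map from a unique key to a RID. The `TABLE` sample and the helper names are assumptions made for illustration; the list position stands in for the RID.

```python
# Hypothetical table; the list position of a record serves as its RID.
TABLE = [
    ("1001", "Aminah", "Female"),
    ("1002", "Boon", "Male"),
    ("1003", "Chitra", "Female"),
]


def build_simple_index(table, key_field=0):
    """Map each unique key value to the RID of its record."""
    index = {}
    for rid, record in enumerate(table):
        key = record[key_field]
        if key in index:
            raise ValueError(f"duplicate key {key!r}")
        index[key] = rid
    return index


def lookup(index, table, key):
    """Fetch a record by its exact key; return None when the key is absent."""
    rid = index.get(key)
    return None if rid is None else table[rid]


INDEX = build_simple_index(TABLE)
print(lookup(INDEX, TABLE, "1002"))
print(lookup(INDEX, TABLE, "100"))  # a partial key finds nothing
```

A hashed key gives fast access for an exact value, but a partial key such as `"100"` cannot be resolved without scanning.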

  9. B-tree indexes
     - Support both partial-key and exact-key lookups.
     - It is very costly to create one for every query.
     - The vital issue is an intelligent way of handling the B-tree index.
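
The difference between exact-key and partial-key lookups can be sketched with an ordered index. Note the substitution: this is not a B-tree but a sorted list searched with `bisect`, which mimics only the ordered leaf level of a B-tree. The `NAMES` sample and the helper names are hypothetical.

```python
from bisect import bisect_left

# Hypothetical indexed column; the list position is the RID.
NAMES = ["Wong", "Teh", "Zaitun", "Tan", "Teo", "Zainal"]


def build_ordered_index(values):
    """Return (key, rid) pairs sorted by key, like the leaf level of a B-tree."""
    return sorted((value, rid) for rid, value in enumerate(values))


def exact_lookup(index, key):
    """RIDs of entries whose key equals `key`."""
    i = bisect_left(index, (key, -1))
    rids = []
    while i < len(index) and index[i][0] == key:
        rids.append(index[i][1])
        i += 1
    return rids


def prefix_lookup(index, prefix):
    """RIDs of entries whose key starts with `prefix` (a partial-key lookup)."""
    i = bisect_left(index, (prefix, -1))
    rids = []
    while i < len(index) and index[i][0].startswith(prefix):
        rids.append(index[i][1])
        i += 1
    return rids


INDEX = build_ordered_index(NAMES)
print(exact_lookup(INDEX, "Teh"))
print(prefix_lookup(INDEX, "Te"))
```

Because the keys are kept in order, all entries sharing a prefix are adjacent, which is what makes the partial-key lookup possible without a full scan.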

  11. Present Scenario
     - A query requires only a small portion of a relation, and its predicate is on a non-primary-key column.
     - Only one RID index can be used at a time.

  12. BitMap Index
     - Bit-vector approach
     - A RID occupies at least 8 bits, while a BitMap index uses only a 1-bit pointer to each tuple of the relation.
     - Works well only with low-cardinality data (e.g., Female, Male).
     - The intelligent way of handling the BitMap index is the vital issue.
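
The bit-vector approach can be sketched as below, using Python integers as bit vectors. The `GENDER` column and the helper names are assumptions made for illustration. One bit vector is kept per distinct value, so the total storage is (number of distinct values) x (number of rows) bits, which is why the technique favours low-cardinality columns.

```python
def build_bitmap_index(column):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    bitmaps = {}
    for rid, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << rid)
    return bitmaps


def rids_of(bitmap):
    """Decode a bit vector back into the list of row positions it marks."""
    rids = []
    rid = 0
    while bitmap:
        if bitmap & 1:
            rids.append(rid)
        bitmap >>= 1
        rid += 1
    return rids


# Hypothetical low-cardinality column; the list position is the row number.
GENDER = ["Female", "Male", "Female", "Male", "Male"]
BITMAPS = build_bitmap_index(GENDER)

# One bit per row for every distinct value in the column.
total_bits = len(BITMAPS) * len(GENDER)

print(bin(BITMAPS["Female"]), rids_of(BITMAPS["Male"]), total_bits)
```

With two distinct values the index costs two bits per row; a column with thousands of distinct values would need thousands of mostly empty vectors.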

  13. Single-column indexes
     - Index intersection offers greater flexibility.
     - A good strategy would be to define single-column indexes on all columns that will be frequently queried and let index intersection handle the situation.
     - The intelligent way of handling the single-column indexes is the vital issue.
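
Index intersection over single-column indexes can be sketched as follows. The `ROWS` sample, the column names, and the helpers are hypothetical; each index maps a column value to the set of RIDs holding it, and a multi-column predicate is answered by intersecting those sets.

```python
from collections import defaultdict

# Hypothetical relation; the list position is the RID.
ROWS = [
    {"gender": "Female", "state": "Selangor", "dept": "CS"},
    {"gender": "Male", "state": "Selangor", "dept": "IT"},
    {"gender": "Female", "state": "Johor", "dept": "CS"},
    {"gender": "Female", "state": "Selangor", "dept": "IT"},
    {"gender": "Male", "state": "Johor", "dept": "CS"},
]


def build_index(rows, column):
    """Single-column index: value -> set of RIDs with that value."""
    index = defaultdict(set)
    for rid, row in enumerate(rows):
        index[row[column]].add(rid)
    return index


INDEXES = {col: build_index(ROWS, col) for col in ("gender", "state", "dept")}


def intersect(predicates):
    """Return sorted RIDs satisfying every column == value predicate."""
    rid_sets = [INDEXES[col].get(value, set()) for col, value in predicates.items()]
    if not rid_sets:
        return list(range(len(ROWS)))
    return sorted(set.intersection(*rid_sets))


print(intersect({"gender": "Female", "state": "Selangor"}))
print(intersect({"gender": "Female", "state": "Selangor", "dept": "CS"}))
```

Three single-column indexes serve any combination of the three columns, whereas composite indexes would need one index per combination.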

  14. Our Research Perspective
     - Most researchers apply data mining at the application level of the data warehouse.
     - We applied data mining in the physical design of data warehouses to optimise the base relation.

  15. Architecture of One-column Index Selection

  16. Evaluation of Data Mining Prototypes

  17. Conclusion
     - It is necessary to have an intelligent way of handling the various query processing techniques (such as indexes).
     - Data mining techniques can be used in the physical design of a data warehouse to generate single-column indexes.
     - The positive results from the study should motivate further efforts to make the prototype into a fully functional SQL engine.

  18. Thank You. Questions? tehyw@um.edu.my
