1 / 21

Query Optimization

Query Optimization. Ex.: SELECT DISTINCT Orders.Customer FROM Orders, Contains WHERE Orders.O_No = Contains.O_No AND Contains.Product = 'Brie' Assumptions: 100,000 tuples in Orders, 1000 bytes each

Download Presentation

Query Optimization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Query Optimization • Ex.: • SELECT DISTINCT Orders.Customer • FROM Orders, Contains • WHERE Orders.O_No = Contains.O_No • AND Contains.Product = 'Brie' • Assumptions: • 100,000 tuples in Orders, 1000 bytes each • 1,000,000 tuples in Contains, 100 bytes each • 1,000 tuples in Contains concern Brie • 100 MB main memory O. Günther: Database Management Systems

  2. Query Optimization (cont.) • Strategy 1: • 1) Compute cartesian product Orders  Contains • 2) Select all tuples with Orders.O_No = Contains.O_No • 3) Select all tuples with Contains.Product = 'Brie' • 4) Project to Customer • Strategy 2: • 1) Select all tuples from Contains with Product = 'Brie' • 2) Compute cartesian product with Orders • 3) Select all tuples with Orders.O_No = Contains.O_No • 4) Project to Customer O. Günther: Database Management Systems

  3. Query Optimization (cont.) • Analysis Strategy 1: • (1)+(2): Tuple-I/Os for Orders: • Tuple-I/Os for Contains: • (3)+(4): Tuple-I/Os: • Tuple-I/Os in total: • Analysis Strategy 2: • (1): Tuple-I/Os for Contains: • (2)-(4): Tuple-I/Os: • Tuple-I/Os in total: • Strategy 2 is clearly superior (Factor?) O. Günther: Database Management Systems

  4. Query Optimization (cont.) • Which (meta)data should be stored? (Statistics) • - number of tuples for each relation • - number of columns for each relation • - number of different values per column • - occurence frequencies of particular values • More information facilitates query optimization but slows down updates • Automatical optimization preferable because • - statistics always up-to-date • - more cost-efficient • - dynamic • Important strength of relational systems O. Günther: Database Management Systems

  5. Transaction Processing • Transaction (TA) • - logical unit of work • - should be executed either completely or not at all • - atomic, i.e., not decomposable • Recovery • Concurrency O. Günther: Database Management Systems

  6. Recovery • Recovery: restart after system fault • System faults • - program crash • - arithmetic mistakes (e.g. overflow) • - disk crash • - power failure • Ex.: • DELETE • FROM Contains • WHERE O_No = 1024 • What happens in case of system fault „in the middle“ O. Günther: Database Management Systems

  7. Recovery (cont.) • COMMIT • - operation to terminate a TA successfully • - all updates are stored in the database permanently • - storage on „safe“ storage medium • - transaktion is finalized • - bundling of several COMMIT operations in checkpoints • ROLLBACK • - operation to abort a TA in case of system fault • - changes in CPU registers and storage are reversed • Important for ROLLBACK • - logging each single modification • - storing the log on a „safe“ medium O. Günther: Database Management Systems

  8. recovery checkpoint checkpoint checkpoint error Recovery (cont.) (Updates are stored on some “safe” medium) O. Günther: Database Management Systems

  9. Recovery (cont.) • 3 types of transactions • - transactions that already completed and whose results have been made • permanent: T1 • - transactions that have already completed but whose results have not • yet been made permanent: T2, T4  REDO (i.e. re-run, after recovery • these transactions will have completed) • - transactions that started but that did not finish: T3, T5  UNDO (i.e. • reversal of all modifications, ROLLBACK of each transaction concerned; • after recovery these transactions will not have completed) O. Günther: Database Management Systems

  10. transaction A transaction B update R.X read from R.X action on basis of R.X commit B Problem! ROLLBACK A R .. relation R.X .. attributes of R Concurrency: Dirty Read Problem O. Günther: Database Management Systems

  11. transaction A transaction B Concurrency: Lost Update Problem A reads R.X B reads R.X double R.X B adds 2 to R.X A writes new value of R.X B writes new value of R.X Commit A Commit B transaction A transaction B A changes R.X A reads R.X B changes R.X A Rollback B Commit O. Günther: Database Management Systems

  12. Concurrency: Possible Solutions • Timestamps to coordinate transactions • Locks: temporary blocking of parts of the database • - exclusive lock (X-Lock): read/write lock, i.e. no other TA • is allowed to read or write the blocked data • - shared lock (S-Lock): write lock, i.e., others can read but not write • If a TA wants to read, it first has to ask for an S-lock for the required data • If a TA wants to write, it first has to ask for an X-lock for the required data • compatibility of locks • S+S ... OK • S+X ... Not OK • X+X ... Not OK O. Günther: Database Management Systems

  13. Locks: Application to Dirty Read N N Yes N Yes Yes Yes Yes Yes O. Günther: Database Management Systems

  14. Locks: Application to Dirty Read (cont.) TA A obtains an X-lock for the field R.X to prepare for the planned update TA B asks for an S-lock to prepare for the planned read operation  REJECTED ROLLBACK A  locks are released TA A obtains S-lock TA B performs read operation + COMMIT restart TA A • Ex. 1: O. Günther: Database Management Systems

  15. Locks: Application to Dirty Read (cont.) TA A requests X-Lock for R.X TA A obtains X-Lock, updates R.X TA B requests S-Lock  REJECTED, TA B waits TA A ROLLBACK TA B obtains S-Lock, reads R.X TA B COMMIT restart TA A, re-obtains X-Lock • Ex. 2: O. Günther: Database Management Systems

  16. TA A wants to read R.X, asks for S-lock TA A obtains S-lock, reads R.X TA B also wants to read R.X, asks for S-Lock TA B obtains S-Lock, reads R.X TA A wants to update R.X, asks for X-Lock TA A does not obtain X-Lock because TA B holds an S-Lock  A waits TA B wants to update R.X, asks for X-Lock TA B does NOT obtain X-Lock  B waits Locks: Application to Lost Update DEADLOCK  break via Rollback of some TA O. Günther: Database Management Systems

  17. Deadlocks • Problem: How to recognize deadlocks? • How to treat deadlocks involving several TAs? • Searching for cycles in the WAIT-FOR graph wait for O. Günther: Database Management Systems

  18. Serializability • Given a set of TAs, which possible events should be considered correct? • Convention: a schedule is considered correct if it is serializable • Serializability means that the result of the schedule is identical to the result • of some serial schedule • Ex.: (TA1) A := A + 1  read A into main memory add 1 write A back into the DB (TA2) A := 2 * A  read A into main memory multiply by 2 write A back into the DB (TA3) write A  read A into main memory display A on the screen set A to 1 in the DB O. Günther: Database Management Systems

  19. Serializability - An Example • Assumption: A = 1 • TA1, TA2, TA3: • TA1, TA3, TA2: • TA2, TA3, TA1: • TA2, TA1, TA3: • TA3, TA1, TA2: • TA3, TA2, TA1: O. Günther: Database Management Systems

  20. Concurrency: 2-Phase Locking • 2-Phase locking protocol • for each transaction one first asks for all required locks (phase I) • processing ... • then all locks are (gradually) released (phase II) number of locks TA2: no 2-phase-locking O. Günther: Database Management Systems

  21. Concurrency and 2-Phase Locking Theorem: 2-Phase Locking Protokoll for each transaction Serializability of the schedule all „reasonable“ possibilities serializable 2-phase-locking equivalent to FIFO serial O. Günther: Database Management Systems

More Related