1 / 42

Julian Dyke Independent Consultant

Julian Dyke Independent Consultant. Logical I/O. Web Version. juliandyke.com. © 2005 Julian Dyke. Agenda. Introduction Logical I/Os Buffer Cache Behaviour Statistics Conclusion. Logical I/Os. Logical I/Os are read operations Buffers are cached in shared memory

nikkos
Download Presentation

Julian Dyke Independent Consultant

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Julian Dyke Independent Consultant Logical I/O Web Version juliandyke.com ©2005 Julian Dyke

  2. Agenda • Introduction • Logical I/Os • Buffer Cache Behaviour • Statistics • Conclusion

  3. Logical I/Os • Logical I/Os are read operations • Buffers are cached in shared memory • Most logical I/Os can be satisfied from cache • The remainder will result in physical I/Os • Logical I/Os include • current reads • consistent reads

  4. Current Reads • Current reads • Current version of block • Can be updated • Can be dirty • Includes all changes • Only one current version of block in buffer cache • Only one current version of block across all instances • Can be used to construct consistent versions

  5. Consistent Reads • Consistent reads • Potentially historic version of block • Consistent to a specific System Change Number (SCN) • Cannot be updated • Cannot be dirty • Can be used to construct consistent versions • Can have multiple versions of same block in buffer cache • Can be • single block (sequential reads) • multi block (scattered reads) • Can be traced using events 10200 / 10201

  6. Logical I/O statistics • session logical reads statistic • Total number of logical reads in session • Unreliable at system level • At session level session logical reads = db block gets + consistent gets • db block gets statistic • Number of current reads • consistent gets statistic • Number of consistent reads

  7. Buffer Pools • There are up to eight buffer pools • DEFAULT • KEEP, RECYCLE Oracle 8.0 and above • 2K, 4K, 8K, 16K and 32K Oracle 9.0 and above • 32K not available on all platforms • Cannot have non-standard block size same as DEFAULT block size

  8. Buffer Pool Headers • One for each buffer pool (usable or unusable) • Externalized in • V$BUFFER_POOL • V$BUFFER_POOL_STATISTICS • Based on X$KCBWBPD • Created in shared pool permanent memory when instance is started • Contain one or more working sets

  9. X$KCBWBPD Buffer Pool Name • Externalises buffer pool header Buffer Pool ID Block Size Granule Size Buffers per Granule Minimum Working Set ID Maximum Working Set ID Number of Working Sets Number of Buffers

  10. Hash Buckets • Hash value of each block calculated from • Data Block Address (DBA) • Block Class • Number of hash buckets dependent on number of buffers in cache e.g. • Each hash bucket contains • Cache Buffers Chains latch • Pointer to array of double linked lists

  11. BH BH BH BH # hash chains cachebufferschainlatch Hash Buckets

  12. Buffer Headers • Each buffer header describes contents of one buffer • All buffers accessed via buffer header • Buffer header contains pointers to • Buffer • Cache Buffers Chains latch • Buffer header includes double linked lists for • Cache Buffers Chain list • Replacement list • Users list • Waiters list

  13. X$BH • Externalises buffer headers Hash List Address Block Size Hash List Replacement List Tablespace# Absolute File Number Relative File Number Block Number Object ID Buffer Address

  14. Working Sets • Introduced in Oracle 8.1.5 • Each buffer pool contains one or more working sets • Working set header • created in shared pool permanent memory • associated with one DBWn process • protected by cache buffers lru chain latch • Each working set maintains separate set of LRU lists

  15. LRU Lists • In Oracle 9.2 each working set maintains 4 LRU lists • LRU - replacement list - normal blocks • LRU-W - write list - dirty blocks • LRU-XO - object list - buffers involved in • DROP • TRUNCATE • LRU-XR - range list - buffers involved in • ALTER TABLESPACE BEGIN BACKUP • ALTER TABLESPACE END BACKUP • ALTER TABLESPACE OFFLINE • ALTER TABLESPACE READ ONLY

  16. Main and Auxiliary Lists • Each LRU contains • main list • auxiliary list • Auxiliary list includes • dirty buffers identified by DBWn processes • buffers being written • Buffers are moved from main to auxiliary list by DBWn processes to avoid unnecessary scans • Processes scan auxiliary lists first for free buffers • Buffers also allocated to auxiliary list • at startup • after FLUSH_CACHE

  17. MAIN AUX MAIN AUX MAIN AUX MAIN AUX Working Set Lists Working SetHeader Hot Cold ReplacementList BufferHeader WriteList ObjectList RangeList

  18. Replacement List • In Oracle 8.1.5 and above a mid-point insertion algorithm is used • Buffer cache has a hot end and a cold end • Buffers are inserted at mid-point • Mid-point is head of cold end • Starts at hot end - moves down cache • Maximum mid-point determined by _db_percent_hot_default • Default value is 50% Head of Hot End Head of Cold End Hot End Replacement List Cold End

  19. X$KCBWDS • Externalises working set header Working Set ID Database Writer Number MAIN Replacement List AUX Replacement List Number of buffers on MAIN Replacement List Number of buffers on AUX Replacement List Insertion Point Maximum number of Hot Buffers Number of Hot Buffers

  20. Touch Count • Each buffer header maintains • touch count • timestamp • Touch count represents number of 3 second intervals in which buffer has been accessed since • buffer last read into cache • touch count last reset • Each time buffer is accessed • if timestamp more than 3 seconds ago • increment touch count • set timestamp to current time

  21. Touch Count • When buffer reaches tail of cold end • If touch count >= 2 then buffer is moved to hot end • Otherwise used as next free buffer • Hot criteria determined by • _db_aging_hot_criteria • default value is 2 touches • Time interval determined by • _db_aging_touch_time • default value is 3 seconds

  22. Single versus Multi-Block Reads • Single block reads • Used with current reads • Can be used with consistent reads • Waits recorded by db file sequential read • Multi block reads • Frequently used with consistent reads • Maximum number of physical blocks read specified by DB_FILE_MULTIBLOCK_READ_COUNT • Waits recorded by db file scattered read • Blocks moved to cold end of buffer cache

  23. 45 33 42 11 52 72 71 66 34 66 42 49 92 87 45 45 52 71 71 42 72 33 33 87 42 33 11 42 52 71 11 11 45 42 71 42 52 92 34 34 72 45 92 11 72 34 2 1 1 1 1 2 1 1 0 1 2 0 0 3 4 2 1 1 2 1 0 2 0 4 1 1 1 2 2 4 4 0 1 4 1 2 0 1 0 1 2 4 0 3 1 2 Read Block 87 Read Block 34 Read Block 33 Read Block 42 Read Block 11 Update touch count for block 42 Update buffer contents Get first available buffer from cold end Update buffer contents Insert buffer at head of cold end Get first available buffer from cold end Insert buffer at head of cold end Insert buffer at head of coldend Update buffer contents Move block 71 to head of hot end Set touch count on block 71 to zero Get first available bufferfrom cold end Update buffer contents Set touch counton block 42 to zero Insert buffer at head of coldend Move block 42 to headof hot end Update touch countfor block 34 Get first available bufferfrom cold end STOP Single-Block Reads Block Number Head of Hot End Head of Cold End Touch Count Read Block 42

  24. 27 27 56 27 17 34 27 27 27 95 27 85 27 40 17 27 95 17 34 33 34 27 27 33 95 27 132 150 132 150 128 132 132 132 150 128 150 Read Block 27 - SCN 132 Read Block 27 - SCN 128 Get first available buffer from cold end Apply undo to rollback block to SCN 128 Insert buffer at head of cold end Insert buffer at head of cold end Read current version of block 27 into buffer Get first available bufferfrom cold end Read consistent version of block 27 into buffer Apply undo to rollback block to SCN 132 150 STOP Consistent Reads Current Block Head of Hot End Head of Cold End BlockNumber Consistent Block SystemChangeNumber

  25. 1 3 6 2 8 2 3 4 7 1 2 1 2 1 4 1 2 1 6 1 2 3 1 5 5 5 3 7 8 2 1 3 4 7 6 5 6 5 4 2 3 1 5 5 5 6 7 6 7 6 5 8 Read Block 7 Read Block 5 Read Block 2 Read Block 4 Read Block 8 Read Block 6 Read Block 1 Read Block 3 Move block 7 to cold end Move block 5 to cold end Move block 8 to cold end Get first four available buffers from cold end Read next four blocks into buffers Insert buffers at head of cold end Move block 1 to cold end Move block 3 to cold end Read next four blocks into buffers Insert buffers at head of cold end Move block 4 to cold end Get next four available buffers from cold end Move block 6 to cold end Move block 2 to cold end STOP Multi-Block Reads DB_FILE_MULTIBLOCK_READ_COUNT = 4 Head of Hot End Head of Cold End

  26. Dirty Blocks • When blocks are updated they are marked dirty • Changes immediately written to redo buffer • Changes written back to disk asynchronously by DBWn process • DBWn process • scans from cold end of MAIN replacement list • moves dirty blocks to auxiliary list • writes dirty blocks back to disk • Written blocks remain on auxiliary list until re-used

  27. Buffer Pinning • In Oracle 8.0 and above, Oracle uses pinning to reduce number of logical I/Os • If buffer will be accessed again by the statement, it is pinned in the buffer cache • Frequently used with index scans • Only appears to be used with consistent gets • not observed with current gets • If pinning was not implemented, number of logical I/Os would significantly increase

  28. Buffer Pinning Statistics • buffer is not pinned count statistic • Number of pin-able buffers not pinned by this session when visited • Equivalent to number of logical I/Os (for that part of statement) • buffer is pinned count statistic • Number of buffers already pinned by this session when visited • Number of buffers visited = • buffer is not pinned count + buffer is pinned count

  29. Consistent Gets Statistics • consistent gets - examination statistic • Number of consistent gets that could be immediately performed without pinning the buffer • Generally apply to indexes • Require one latch get • Included in consistent gets statistic • no work - consistent read gets statistic • Number of consistent gets that could be performed without requiring rollback or cleanout • Generally apply to tables • Require two latch gets • Included in consistent gets statistic

  30. 13 21 20 19 18 17 16 12 23 15 4 14 9 10 11 22 3 7 8 5 6 9 5 17 11 16 15 13 10 12 14 18 20 21 22 19 4 23 8 7 3 6 6 2 4 18 13 14 15 9 16 18 17 12 7 1 8 10 11 18 5 3 6 5 2 4 8 3 7 9 13 10 11 12 14 18 17 18 15 18 16 1 1 1 1 1 1 1 1 36 32 48 28 40 24 20 56 44 52 16 12 8 64 4 68 72 72 72 60 13 8 9 11 6 14 10 7 12 2 5 16 3 4 17 18 15 18 18 1 Read Block 1 Read Block 1 Read Block 1 Segment Header 1 Segment Header 3 Segment Header 2 Segment Header Data Blocks EmptyBlocks UnusedBlocks Table T1 High Water Mark Read Block 3 Read Block 16 Read Block 18 Read Block 21 Read Block 20 Read Block 2 Read Block 17 Read Block 4 Read Block 12 Read Block 5 Read Block 15 Read Block 11 Read Block 13 Read Block 10 Read Block 9 Read Block 19 Read Block 14 Read Block 8 Read Block 7 Read Block 6 Data Block Data Block Data Block Data Block Data Block Empty Block Empty Block Data Block Data Block Data Block Data Block Data Block Data Block Data Block Data Block Data Block Data Block Data Block Data Block Data Block STOP Full Table Scan SELECT SUM(c2) FROM t1; 2 1 1 2 0 SELECT STATEMENT1 0 TABLE ACCESS (FULL) OF 'T1'

  31. Full Table Scan - Summary • In Oracle 9.2 • segment header initially read 3 times • segment header read again every 10 extents • All blocks are read up to high water mark • For longer tables blocks can be prefetched • Algorithm differs for Automatic Segment Space Managed tablespaces

  32. 2 3 1 3 2 1 1 3 2 2 1 1 1 1 Read Index Block 1 Read Index Block 4 Read Table Block 3 Data Block Branch Block Leaf Block STOP Unique Scan SELECT c2 FROM t1 WHERE c1 = 42; 0 SELECT STATEMENT1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1'2 1 INDEX (UNIQUE SCAN) OF 'I1' Index I1 Branch Block Leaf Blocks Table T1

  33. 2 1 2 1 1 2 1 Read Index Block 1 Read Index Block 3 Leaf Block Branch Block STOP Index Organised Table SELECT c2 FROM t1 WHERE c1 = 42; 0 SELECT STATEMENT1 0 INDEX (UNIQUE SCAN) OF 'I1' Index I1 Branch Block Leaf Blocks

  34. 1 1 1 1 1 1 1 1 1 Read Table Block 7 Data Block STOP Single Table Hash Cluster SELECT c2 FROM t1 WHERE c1 = 42; 0 SELECT STATEMENT1 0 TABLE ACCESS (HASH) OF 'T1' Table T1 Leaf Blocks

  35. Clustering Factor • Measures relationship between index entries and corresponding data blocks • Used by CBO to calculate cost of using index • Good clustering factor approaches number of blocks in table; • Bad clustering factor approaches number of rows in table • CBO will favour indexes with a better clustering factor Bad Clustering Factor Good Clustering Factor

  36. 4 5 3 8 6 3 6 7 4 2 7 5 1 5 1 4 8 7 7 5 6 6 3 2 4 3 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 5 6 5 4 4 1 3 3 2 1 1 1 1 1 1 1 1 1 1 1 1 1 2 3 4 4 7 2 6 3 1 5 5 6 4 5 2 5 3 1 1 3 2 4 1 4 2 1 2 5 3 4 5 6 3 Read Index Block 3 Read Table Block 18 Read Table Block 14 Read Table Block 6 Read Index Block 3 Read Index Block 3 Read Table Block 2 Read Index Block 1 Read Index Block 3 Read Index Block 3 Read Index Block 3 Read Table Block 10 Read Table Block 22 Data Block Data Block Leaf Block (Pinned) Data Block Leaf Block (Pinned) Leaf Block Data Block Leaf Block (Pinned) Data Block Data Block Leaf Block (Pinned) Leaf Block (Pinned) Branch Block STOP Range Scan - Bad Clustering Factor SELECT c2 FROM t1 WHERE c3 = 42; 0 SELECT STATEMENT1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1'2 1 INDEX (RANGE SCAN) OF 'I2' Index I2 Branch Block Leaf Blocks Table T1

  37. 4 4 4 3 3 3 3 1 3 2 3 3 3 3 3 3 3 4 3 1 3 4 2 3 4 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 1 2 2 2 3 3 2 3 2 3 4 7 8 1 6 2 9 7 5 5 3 1 6 2 5 4 2 1 4 3 Read Table Block 9 Read Table Block 8 Read Index Block 4 Read Index Block 3 Read Table Block 8 Read Index Block 3 Read Index Block 3 Read Table Block 8 Read Table Block 9 Read Index Block 3 Read Index Block 3 Read Index Block 1 Read Table Block 8 Leaf Block (Pinned) Data Block (Pinned) Data Block (Pinned) Leaf Block (Pinned) Data Block (Pinned) Leaf Block (Pinned) Data Block Data Block Leaf Block (Pinned) Leaf Block Branch Block Leaf Block (Pinned) Data Block (Pinned) STOP Range Scan - Good Clustering Factor SELECT c2 FROM t1 WHERE c4 = 42; 0 SELECT STATEMENT1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1'2 1 INDEX (RANGE SCAN) OF 'I3' Index I3 Branch Block Leaf Blocks Table T1

  38. Clustering Factor - Summary • Higher clustering factor • Reduces number of logical I/Os required • Increases number of buffers that can be pinned

  39. Row Prefetching • For queries returning more than one row specify maximum number of rows per round trip • If prefetch size too small • Increased number of round trips • Degrades performance • If prefetch size too large • Increased number of packets • May degrade performance

  40. Row Prefetching • Applies to • OCI OCI_ATTR_PREFETCH_ROWS • Pro*C Host Array • JDBC setRowPrefetch () • PL/SQL BULK COLLECT • SQL*Plus SET ARRAYSIZE • OCI default prefetch value is 1 (returns 2 rows per fetch) res = OCIAttrSet( (dvoid *)stmt, (ub4)OCI_HTYPE_STMT, (dvoid *)&prefetchRows, (ub4)0, (ub4)OCI_ATTR_PREFETCH_ROWS, (OCIError *)err);

  41. Row Prefetching • Example - full table scan • 1000 row table • 31 blocks (+ segment header) Consistent Gets Prefetch Size

  42. Thank you for your interest For more information and to provide feedback please contact me My e-mail address is: info@juliandyke.com My website address is: www.juliandyke.com

More Related