
A Closer Look inside Oracle ASM

Inside Oracle ASM, UKOUG Dec 2007. Outline: Oracle ASM for DBAs (introduction and motivations); ASM is not a black box (investigation of ASM internals, focus on practical methods and troubleshooting); ASM and VLDB (metadata, rebalancing and performance); lessons learned from CERN's production DB services.


Presentation Transcript


    1. A Closer Look inside Oracle ASM UKOUG Conference 2007 Luca Canali, CERN IT

    2. Inside Oracle ASM, UKOUG Dec 2007 - 2

    3. ASM
       - Oracle Automatic Storage Management
         - Provides the functionality of a volume manager and filesystem for Oracle (DB) files
         - Works with RAC
       - Oracle 10g feature aimed at simplifying storage management
         - Together with Oracle Managed Files and the Flash Recovery Area
       - An implementation of the S.A.M.E. methodology
         - Goal of increasing performance and reducing cost

    4. ASM for a Clustered Architecture
       This is the architecture deployed at CERN for the Physics DBs; more at https://twiki.cern.ch/twiki/pub/PSSGroup/HAandPerf/Architecture_description.pdf

    5. ASM Disk Groups

    6. Files, Extents, and Failure Groups
       - Files and extent pointers
       - Failgroups and ASM mirroring

    7. ASM Is not a Black Box
       - ASM is implemented as an Oracle instance
       - Familiar operations for the DBA
         - Configured with SQL commands
         - Info in V$ views
         - Logs in udump and bdump
       - Some 'secret' details are hidden in X$ tables and 'underscore' parameters

    8. Selected V$ Views and X$ Tables
       A complete list is at https://twiki.cern.ch/twiki/bin/view/PSSGroup/ASM_Internals

       | View Name | X$ Table Name | Description |
       |---|---|---|
       | V$ASM_DISKGROUP | X$KFGRP | performs disk discovery and lists diskgroups |
       | V$ASM_DISKGROUP_STAT | X$KFGRP_STAT | diskgroup stats without disk discovery |
       | V$ASM_DISK | X$KFDSK, X$KFKID | performs disk discovery, lists disks and their usage metrics |
       | V$ASM_DISK_STAT | X$KFDSK_STAT, X$KFKID | lists disks and their usage metrics |
       | V$ASM_FILE | X$KFFIL | lists ASM files, including metadata/asmdisk files |
       | V$ASM_ALIAS | X$KFALS | lists ASM aliases, files and directories |
       | V$ASM_TEMPLATE | X$KFTMTA | lists the available templates and their properties |
       | V$ASM_CLIENT | X$KFNCL | lists DB instances connected to ASM |
       | V$ASM_OPERATION | X$KFGMG | lists rebalancing operations |
       | N.A. | X$KFKLIB | available libraries, includes asmlib path |
       | N.A. | X$KFDPARTNER | lists disk-to-partner relationships |
       | N.A. | X$KFFXP | extent map table for all ASM files |
       | N.A. | X$KFDAT | extent list for all ASM disks |
       | N.A. | X$KFBH | describes the ASM cache (buffer cache of ASM, in blocks of 4K (_asm_blksize)) |
       | N.A. | X$KFCCE | a linked list of ASM blocks; to be further investigated |

       Additional in 11g:

       | View Name | X$ Table Name | Description |
       |---|---|---|
       | V$ASM_ATTRIBUTE | X$KFENV | ASM attributes; the X$ table also shows 'hidden' attributes |
       | V$ASM_DISK_IOSTAT | X$KFNSDSKIOST | I/O statistics |
       | N.A. | X$KFDFS | |
       | N.A. | X$KFDDD | |
       | N.A. | X$KFGBRB | |
       | N.A. | X$KFMDGRP | |
       | N.A. | X$KFCLLE | |
       | N.A. | X$KFVOL | |
       | N.A. | X$KFVOLSTAT | |
       | N.A. | X$KFVOFS | |
       | N.A. | X$KFVOFSV | |

    9. ASM Parameters
       Notable ASM instance parameters:
           *.asm_diskgroups='TEST1_DATADG1','TEST1_RECODG1'
           *.asm_diskstring='/dev/mpath/itstor*p*'
           *.asm_power_limit=5
           *.shared_pool_size=70M
           *.db_cache_size=50M
           *.large_pool_size=50M
           *.processes=100
       A longer list:
           *.asm_diskgroups='TEST1_DATADG1','TEST1_RECODG1'
           *.asm_diskstring='/dev/mpath/itstor*p*'
           *.asm_power_limit=5
           *.cluster_database=true
           *.cluster_database_instances=6
           *.shared_pool_size=70M
           *.db_cache_size=50M
           *.large_pool_size=50M
           *.processes=100
           +ASM1.instance_number=1 (..repeat for each instance)
           *.instance_type='asm'
           +ASM1.local_listener='LISTENER_+ASM1' (..repeat for each instance)
           *.remote_login_passwordfile='exclusive'
           *.user_dump_dest='/ORA/dbs00/oracle/admin/+ASM/udump'
           *.core_dump_dest='/ORA/dbs00/oracle/admin/+ASM/cdump'
           *.background_dump_dest='/ORA/dbs00/oracle/admin/+ASM/bdump'

    10. More ASM Parameters
       - Underscore parameters
         - Several undocumented parameters
         - Typically don't need tuning
         - Exception: _asm_ausize and _asm_stripesize, which may need tuning for VLDB in 10g
       - New in 11g: diskgroup attributes
         - V$ASM_ATTRIBUTE; the most notable are disk_repair_time and au_size
         - X$KFENV shows 'underscore' attributes
       Query to display underscore parameters:
           select a.ksppinm "Parameter", c.ksppstvl "Instance Value"
           from x$ksppi a, x$ksppcv b, x$ksppsv c
           where a.indx = b.indx and a.indx = c.indx
           and ksppinm like '%asm%'
           order by a.ksppinm;

    11. ASM Storage Internals
       - ASM disks are divided into Allocation Units (AU)
         - Default size 1 MB (_asm_ausize)
         - Tunable diskgroup attribute in 11g
       - ASM files are built as a series of extents
       - Extents are mapped to AUs using a file extent map
       - When using 'normal redundancy', 2 mirrored extents are allocated, each on a different failgroup
       - RDBMS read operations access only the primary extent of a mirrored pair (unless there is an I/O error)
       - In 10g the ASM extent size = AU size
       11g implements variable extent size for extent# > 20000 (similar to the automatic extent size management in the DB)

    12. ASM Metadata Walkthrough
       - Three examples follow of how to read data directly from ASM
       - Motivations:
         - Build confidence in the technology, i.e. 'get a feeling' for how ASM works
         - It may turn out to be useful one day when troubleshooting a production issue

    13. Example 1: Direct File Access 1/2
       Goal: reading ASM files with OS tools, using metadata information from X$ tables
       Example: find the 2 mirrored extents of the RDBMS spfile
           sys@+ASM1> select GROUP_KFFXP Group#, DISK_KFFXP Disk#, AU_KFFXP AU#, XNUM_KFFXP Extent#
                      from X$KFFXP
                      where number_kffxp=(select file_number from v$asm_alias where name='spfiletest1.ora');

               GROUP#      DISK#        AU#    EXTENT#
           ---------- ---------- ---------- ----------
                    1         16      17528          0
                    1          4      14838          0

    14. Example 1: Direct File Access 2/2
       Find the disk path:
           sys@+ASM1> select disk_number,path from v$asm_disk
                      where GROUP_NUMBER=1 and disk_number in (16,4);

           DISK_NUMBER PATH
           ----------- ------------------------------------
                     4 /dev/mpath/itstor417_1p1
                    16 /dev/mpath/itstor419_6p1
       Read data from disk using 'dd':
           dd if=/dev/mpath/itstor419_6p1 bs=1024k count=1 skip=17528 |strings
       Notes:
       - Device mapper multipathing is used under Linux
       - Block devices are given a name /dev/mpath/itstor.. to reflect the storage array name
       - The name is a symbolic link (handled by device mapper) to /dev/dm-xxx (device mapper block device)
       - Device names and device mapper configurations are set up with /etc/multipath.conf
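The `skip` value passed to `dd` in Example 1 is simply the AU number from X$KFFXP, because the block size equals the allocation unit size. A minimal Python sketch of that arithmetic follows; the helper names `au_byte_offset` and `dd_command` are illustrative and not part of any Oracle tool, and the 1 MB AU size is the 10g default quoted in the slides.

```python
AU_SIZE = 1024 * 1024  # default 10g allocation unit size in bytes (from the slides)


def au_byte_offset(au_number, au_size=AU_SIZE):
    """Byte offset of an allocation unit from the start of the ASM disk."""
    return au_number * au_size


def dd_command(path, au_number, au_size=AU_SIZE):
    """Build a dd command line that reads exactly one allocation unit.

    bs is written in bytes; bs=1048576 is equivalent to the bs=1024k
    used on the slide.
    """
    return f"dd if={path} bs={au_size} count=1 skip={au_number}"


# The two mirrored copies of the spfile extent found in Example 1.
print(dd_command("/dev/mpath/itstor419_6p1", 17528))
print(dd_command("/dev/mpath/itstor417_1p1", 14838))
```

Both commands should return the same data, since the two AUs are mirror copies of extent 0 of the same file.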

    15. X$KFFXP
       'File extent pointer' X$ table. This is the author's interpretation of the table, based on investigation of actual instances.

       | Column Name | Description |
       |---|---|
       | ADDR | x$ table address/identifier |
       | INDX | row unique identifier |
       | INST_ID | instance number (RAC) |
       | NUMBER_KFFXP | ASM file number. Join with v$asm_file and v$asm_alias |
       | COMPOUND_KFFXP | File identifier. Join with compound_index in v$asm_file |
       | INCARN_KFFXP | File incarnation id. Join with incarnation in v$asm_file |
       | PXN_KFFXP | Progressive file extent number |
       | XNUM_KFFXP | ASM file extent number (mirrored AUs have the same extent value) |
       | GROUP_KFFXP | ASM disk group number. Join with v$asm_disk and v$asm_diskgroup |
       | DISK_KFFXP | Disk number where the extent is allocated. Join with v$asm_disk |
       | AU_KFFXP | Relative position of the allocation unit from the beginning of the disk. The allocation unit size (1 MB) is in v$asm_diskgroup |
       | LXN_KFFXP | 0->primary extent, 1->mirror extent, 2->2nd mirror copy (high redundancy and metadata) |
       | FLAGS_KFFXP | N.K. |
       | CHK_KFFXP | N.K. |

    16. Example 2: A Different Way
       A different metadata table to reach the same goal of reading ASM files directly from the OS:
           sys@+ASM1> select GROUP_KFDAT Group#, NUMBER_KFDAT Disk#, AUNUM_KFDAT AU#
                      from X$KFDAT
                      where fnum_kfdat=(select file_number from v$asm_alias where name='spfiletest1.ora');

               GROUP#      DISK#        AU#
           ---------- ---------- ----------
                    1          4      14838
                    1         16      17528
       This is much slower than reading X$KFFXP, since it scans all the disks' allocation tables.

    17. X$KFDAT
       'Disk allocation table' X$ table. This is the author's interpretation of the table, based on investigation of actual instances.

       | Column Name | Description |
       |---|---|
       | ADDR | x$ table address/identifier |
       | INDX | row unique identifier |
       | INST_ID | instance number (RAC) |
       | GROUP_KFDAT | diskgroup number, join with v$asm_diskgroup |
       | NUMBER_KFDAT | disk number, join with v$asm_disk |
       | COMPOUND_KFDAT | disk compound_index, join with v$asm_disk |
       | AUNUM_KFDAT | Disk allocation unit (relative position from the beginning of the disk), join with x$kffxp.au_kffxp |
       | V_KFDAT | V=this Allocation Unit is used; F=AU is free |
       | FNUM_KFDAT | file number, join with v$asm_file |
       | I_KFDAT | N.K. |
       | XNUM_KFDAT | progressive file extent number, join with x$kffxp.pxn_kffxp |
       | RAW_KFDAT | raw format encoding of the disk and file extent information |

    18. Example 3: Yet Another Way
       Using the internal package dbms_diskgroup:
           declare
             fileType varchar2(50);
             fileName varchar2(50);
             fileSz   number;
             blkSz    number;
             hdl      number;
             plkSz    number;
             data_buf raw(4096);
           begin
             fileName := '+TEST1_DATADG1/TEST1/spfiletest1.ora';
             dbms_diskgroup.getfileattr(fileName,fileType,fileSz, blkSz);
             dbms_diskgroup.open(fileName,'r',fileType,blkSz, hdl,plkSz, fileSz);
             dbms_diskgroup.read(hdl,1,blkSz,data_buf);
             dbms_output.put_line(data_buf);
           end;
           /

    19. DBMS_DISKGROUP
       - Can be used to read/write ASM files directly
       - It's an Oracle internal package
       - Does not require an RDBMS instance
       - 11g's asmcmd cp command uses dbms_diskgroup

    20. File Transfer Between OS and ASM
       - The supported tools (10g):
         - RMAN
         - DBMS_FILE_TRANSFER
         - FTP (XDB)
         - WebDAV (XDB)
         - They all require an RDBMS instance
       - In 11g, all the above plus the asmcmd cp command
         - Works directly with the ASM instance
       Notes:
       - FTP into ASM diskgroups can be used together with transportable tablespaces. CERN's experience with this is the migration and consolidation of 10 DBs from 9i to a multi-TB RAC DB on 10g
       - 11g asmcmd is improved over the 10g version: look in $OH/bin and $OH/rdbms/lib (find $ORACLE_HOME -name '*asmcmd*')

    21. Strace and ASM 1/3
       Goal: understand strace output when using ASM storage
       Example:
           read64(15,"#33\0@\"..., 8192, 473128960)=8192
       - This is a read operation of 8 KB from FD 15 at offset 473128960
       - What are the segment name, type, file# and block#?

    22. Strace and ASM 2/3
       - From /proc/<pid>/fd I find that FD=15 is /dev/mpath/itstor420_1p1
       - This is disk 20 of D.G.=1 (from v$asm_disk)
       - From x$kffxp I find the ASM file# and extent#
       - Note: offset 473128960 = 451 MB + 27 * 8 KB
           sys@+ASM1> select number_kffxp, xnum_kffxp from x$kffxp
                      where group_kffxp=1 and disk_kffxp=20 and au_kffxp=451;

           NUMBER_KFFXP XNUM_KFFXP
           ------------ ----------
                    268         17
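The decomposition of the strace offset into an AU number and a block within that AU can be checked with two integer divisions. The sketch below assumes the 1 MB allocation unit and the 8 KB read size shown in the slides; `split_offset` is an illustrative name.

```python
AU_SIZE = 1024 * 1024   # allocation unit size in bytes (10g default)
BLOCK_SIZE = 8192       # size of the read seen in the strace output


def split_offset(offset, au_size=AU_SIZE, block_size=BLOCK_SIZE):
    """Split a byte offset on an ASM disk into (AU number, block within AU, byte within block)."""
    au_number, byte_in_au = divmod(offset, au_size)
    block_in_au, byte_in_block = divmod(byte_in_au, block_size)
    return au_number, block_in_au, byte_in_block


print(split_offset(473128960))  # (451, 27, 0)
```

The AU number 451 is the value used for `au_kffxp` in the X$KFFXP query, and 27 is the block position inside that allocation unit.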

    23. Strace and ASM 3/3
       - From v$asm_alias I find the file alias for file 268: USERS.268.612033477
       - From the v$datafile view I find the RDBMS file#: 9
       - From dba_extents I finally find the owner and segment name relative to the original I/O operation (extent 17 of 1 MB each, converted to 8 KB blocks, plus block 27 within the AU):
           sys@TEST1> select owner,segment_name,segment_type from dba_extents
                      where FILE_ID=9 and 27+17*1024*1024/8192 between block_id and block_id+blocks;

           OWNER SEGMENT_NAME SEGMENT_TYPE
           ----- ------------ ------------
           SCOTT EMP          TABLE
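The block number looked up in dba_extents combines the ASM extent number with the block position inside the allocation unit. A minimal sketch, assuming the 10g case where the extent size equals the 1 MB AU size and an 8 KB database block size (so each extent holds 128 blocks); `datafile_block` is an illustrative name.

```python
def datafile_block(extent_number, block_in_au, au_size=1024 * 1024, block_size=8192):
    """Block number within the datafile for a block found in a given ASM extent.

    Assumes extent size == AU size (true in 10g according to the slides).
    """
    blocks_per_extent = au_size // block_size
    return extent_number * blocks_per_extent + block_in_au


# Extent 17 and block 27 come from the strace example.
print(datafile_block(17, 27))  # 2203
```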

    24. Investigation of Fine Striping
       - An application: finding the layout of fine-striped files
       - Explored using strace of an Oracle session executing 'alter system dump logfile ..'
       - Result: round-robin distribution over 8 x 1 MB extents
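A round-robin layout over 8 extents can be modelled as follows. This is only a sketch: the slide states the 8 x 1 MB result, while the 128 KB stripe chunk size is an assumption here (it is the commonly cited default for `_asm_stripesize`, which the slides name without giving a value). The function name is illustrative.

```python
AU_SIZE = 1024 * 1024      # 1 MB extents, as on the slide
STRIPE_SIZE = 128 * 1024   # ASSUMPTION: 128 KB fine-stripe chunk
STRIPE_WIDTH = 8           # round robin over 8 extents, as on the slide


def fine_stripe_location(file_offset):
    """Map a byte offset in a fine-striped file to (extent number, byte offset in extent)."""
    group_bytes = STRIPE_WIDTH * AU_SIZE
    group, rest = divmod(file_offset, group_bytes)    # which set of 8 extents
    chunk, byte_in_chunk = divmod(rest, STRIPE_SIZE)  # which chunk in the set
    row, column = divmod(chunk, STRIPE_WIDTH)         # round robin across extents
    extent = group * STRIPE_WIDTH + column
    return extent, row * STRIPE_SIZE + byte_in_chunk


for offset in (0, STRIPE_SIZE, 8 * STRIPE_SIZE, 8 * AU_SIZE):
    print(offset, fine_stripe_location(offset))
```

Under these assumptions consecutive 128 KB chunks land on consecutive extents, and the ninth chunk wraps back to the first extent of the set.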

    25. Metadata Files
       - ASM diskgroups contain 'hidden files'
         - Not listed in V$ASM_FILE (file# < 256)
         - Details are available in X$KFFIL
         - In addition, the first 2 AUs of each disk are marked as file#=0 in X$KFDAT
       Example (10g):
           select group_kffil group#, number_kffil file#, filsiz_kffil filesize_after_mirr, filspc_kffil raw_file_size
           from x$kffil
           where number_kffil < 256;

               GROUP#      FILE# FILESIZE_AFTER_MIRR RAW_FILE_SIZE
           ---------- ---------- ------------------- -------------
                    1          1             2097152       6291456
                    1          2             1048576       3145728
                    1          3           264241152     795869184
                    1          4             1392640       6291456
                    1          5             1048576       3145728
                    1          6             1048576       3145728
       Metadata files have 3-way mirroring.
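The 3-way mirroring can be checked against the example output: rounding each file size up to whole 1 MB allocation units and multiplying by 3 reproduces the raw size for every file except file#3, which occupies 3 AUs more. The sketch below does that arithmetic. Reading the 3 extra AUs as one triple-mirrored indirect-pointer extent is an interpretation, not something the slide states.

```python
AU = 1024 * 1024  # 1 MB allocation unit

# (file size after mirroring, raw size) per file#, copied from the 10g example output.
ROWS = {
    1: (2097152, 6291456),
    2: (1048576, 3145728),
    3: (264241152, 795869184),
    4: (1392640, 6291456),
    5: (1048576, 3145728),
    6: (1048576, 3145728),
}


def mirrored_space(file_size, copies=3, au=AU):
    """Space needed to store `copies` copies of a file, rounded up to whole AUs."""
    whole_aus = -(-file_size // au)  # ceiling division
    return whole_aus * au * copies


# Number of AUs in the raw size that 3-way mirroring alone does not explain.
extra_aus = {f: (raw - mirrored_space(size)) // AU for f, (size, raw) in ROWS.items()}
print(extra_aus)
```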

    26. ASM Metadata 1/2
       - File#0, AU=0: disk header (disk name, etc), Allocation Table (AT) and Free Space Table (FST)
       - File#0, AU=1: Partner Status Table (PST)
       - File#1: File Directory (files and their extent pointers)
       - File#2: Disk Directory
       - File#3: Active Change Directory (ACD)
         - The ACD is analogous to a redo log, where changes to the metadata are logged
         - Size = 42 MB * number of instances
       Source: Oracle Automatic Storage Management, Oracle Press, Nov 2007, N. Vengurlekar, M. Vallath, R. Long
       A note on File#1 and file extent pointers:
       - The extent information available in X$KFFXP is in File#1 for the first 60 extents (short files)
       - Extent# > 60 makes use of indirect pointers
       - Indirect pointers are allocated in files
       - Direct investigation: using X$KFFXP, count the extents allocated for:
         - create tablespace test1 datafile size 30712k; gives 60 extents (normal redundancy)
         - create tablespace test2 datafile size 30713k; gives 65 extents (normal redundancy)
         - The extra extents are indirect pointers
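The jump from 60 to 65 extents for one extra kilobyte can be reproduced with a small model. This is a sketch under several assumptions that the slide does not state: an 8 KB database block size, one extra header block added to the requested datafile size, 1 MB extents, and a single indirect-pointer extent that is triple-mirrored like other metadata. The function name `kffxp_rows` is illustrative.

```python
def kffxp_rows(datafile_kb, block_kb=8, au_kb=1024, copies=2,
               direct_pointers=60, indirect_copies=3):
    """Estimate the number of extents listed in X$KFFXP for a datafile.

    ASSUMPTIONS (not from the slide): one header block is added to the
    requested size, and files needing more than `direct_pointers` extent
    pointers get one indirect extent with `indirect_copies` copies.
    """
    total_kb = datafile_kb + block_kb      # requested size plus header block
    data_extents = -(-total_kb // au_kb)   # ceiling division to whole AUs
    rows = data_extents * copies           # normal redundancy: 2 copies
    if rows > direct_pointers:
        rows += indirect_copies            # extra extents for indirect pointers
    return rows


print(kffxp_rows(30712))  # 60
print(kffxp_rows(30713))  # 65
```

Under this model 30712k plus one block is exactly 30 MB (30 extents, 60 with mirroring), while 30713k needs a 31st extent and crosses the 60-pointer limit.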

    27. ASM Metadata 2/2
       - File#4: Continuing Operation Directory (COD)
         - The COD is analogous to an undo tablespace. It maintains the state of active ASM operations such as disk or datafile drop/add. The COD log record is either committed or rolled back based on the success of the operation
       - File#5: Template directory
       - File#6: Alias directory
       - 11g, File#9: Attribute Directory
       - 11g, File#12: Staleness registry, created when needed to track offline disks

    28. ASM Rebalancing
       - Rebalancing is performed (and mandatory) after space management operations
         - Goal: balanced space allocation across disks
         - Not based on performance or utilization
         - ASM spreads every file across all disks in a diskgroup
       - ASM instances are in charge of rebalancing
         - Extent pointer changes are communicated to the RDBMS
         - The RDBMS' ASMB process keeps an open connection to ASM
         - This can be observed by running strace against ASMB
       - In RAC, extra messages are passed between the cluster ASM instances
         - LMD0 of the ASM instances is very active during rebalance

    29. ASM Rebalancing and VLDB
       - Performance of rebalancing is important for VLDB
       - An ASM instance can use parallel slaves
         - RBAL coordinates the rebalancing operations
         - ARBx processes pick up 'chunks' of work. By default they log their activity in udump
       - Does it scale?
         - In 10g, serialization wait events can limit scalability
         - Even at maximum speed, rebalancing is not always I/O bound
       GMON is also involved at the beginning and the end of the rebalancing operation, to take and release the diskgroup lock.

    30. ASM Rebalancing Performance
       - Tracing ASM rebalancing operations
         - 10046 trace of the +arbx processes
             oradebug setospid …
             oradebug event 10046 trace name context forever, level 12
         - Process log files (in bdump) with orasrp (tkprof will not work)
       - Main wait events from my tests with RAC (6 nodes):
         - DFS lock handle
           - Waiting for CI level 5 (cross instance lock)
         - Buffer busy wait
         - 'unaccounted for'
         - enq: AD - allocate AU
         - enq: AD - deallocate AU
         - log write(even)
         - log write(odd)

    31. ASM Single Instance Rebalancing
       - Single instance rebalance
         - In RAC, rebalancing is faster if you can run it with only 1 node up (I have observed a 20% to 100% speed improvement)
       - Buffer busy wait can be the main event
         - It seems to depend on the number of files in the diskgroup
         - Diskgroups with a small number of (large) files have more contention (+arbx processes operate concurrently on the same file)
         - Only seen in tests with 10g
         - 11g has improvements regarding rebalancing contention

    32. Rebalancing, an Example
       - Rebalancing speed is measured in MB/minute to conform to V$ASM_OPERATION units
       - Test conditions (OS, storage, Oracle version, number of ASM files, etc) can change the results
       - It's a good idea to repeat the measurements whenever several parameters of the environment change, to get meaningful results

    33. Rebalancing Workload
       - When ASM mirroring is used (e.g. with normal redundancy), rebalancing operations can move more data than expected
       - Example:
         - 5 TB (allocated): ~100 disks, 200 GB each
         - A disk is replaced (diskgroup rebalance)
         - The total I/O workload is 1.6 TB (8x the disk size!)
         - How to see this: query v$asm_operation; the column EST_WORK keeps growing during rebalance
       - The issue: excessive repartnering
       Notes:
       - Rebalancing in RAC is failed over when an instance crashes, but it does not restart if all instances are down (typical of single instance)
       - There is no obvious way to tell if a diskgroup has a pending rebalance operation
       - A partial workaround is to query V$ASM_DISK (total_mb and free_mb) to see if there are disk occupation imbalances
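The workaround of comparing total_mb and free_mb across disks amounts to checking how far the used fraction varies between disks of the same diskgroup. A minimal sketch of that check follows; the disk figures are made-up illustrative values, not measurements from the talk, and the helper names are hypothetical. In practice the tuples would come from a query on V$ASM_DISK.

```python
def used_fraction(total_mb, free_mb):
    """Fraction of a disk that is occupied."""
    return (total_mb - free_mb) / total_mb


def occupation_spread(disks):
    """Difference between the most and least occupied disk.

    `disks` is a list of (disk_number, total_mb, free_mb) tuples.
    """
    fractions = [used_fraction(total, free) for _, total, free in disks]
    return max(fractions) - min(fractions)


# Hypothetical diskgroup: disk 2 was just added and is still almost empty.
sample = [(0, 200000, 100000), (1, 200000, 100000), (2, 200000, 180000)]
print(f"spread in used fraction: {occupation_spread(sample):.2f}")
```

A spread close to zero suggests the diskgroup is balanced; a large spread is a hint that a rebalance is pending or was interrupted. Any threshold for "large" is a local choice.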

    34. ASM Disk Partners
       - ASM diskgroup with normal redundancy
         - Two copies of each extent are written to different 'failgroups'
       - Two ASM disks are partners when they have at least one extent set in common (they are the 2 sides of a mirror for some data)
       - Each ASM disk has a limited number of partners
         - Typically 10 disk partners: see X$KFDPARTNER
         - This helps to reduce the risk associated with 2 simultaneous disk failures

    35. Free and Usable Space
       - When 'ASM mirroring' is used, not all the free space should be occupied
       - V$ASM_DISKGROUP.USABLE_FILE_MB:
         - Amount of free space that can be safely utilized taking mirroring into account, while still being able to restore redundancy after a disk failure
         - It is calculated for the worst-case scenario; in any case, it is a best practice not to let it go negative (it can)
       - This can be a problem when deploying a small number of large LUNs and/or failgroups
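The relationship between free space and usable space can be written down explicitly. The sketch below follows the formula given in Oracle's documentation for V$ASM_DISKGROUP, where the space reserved to restore redundancy (REQUIRED_MIRROR_FREE_MB) is subtracted from FREE_MB and the result is divided by the number of mirror copies. The input figures are made-up illustrative values and the function name is hypothetical.

```python
# Number of copies kept for each redundancy type reported in V$ASM_DISKGROUP.TYPE.
COPIES = {"EXTERN": 1, "NORMAL": 2, "HIGH": 3}


def usable_file_mb(free_mb, required_mirror_free_mb, redundancy="NORMAL"):
    """Free space that can be used for files while still allowing redundancy to be restored."""
    return (free_mb - required_mirror_free_mb) / COPIES[redundancy]


# Hypothetical diskgroups (illustrative numbers only).
print(usable_file_mb(1_000_000, 200_000))  # plenty of room left
print(usable_file_mb(100_000, 200_000))    # negative: reserve already consumed
```

A negative result means a disk or failgroup failure could not be followed by a full restore of redundancy, which is why the slide recommends keeping the value positive.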

    36. Fast Mirror Resync
       - ASM 10g with normal redundancy does not allow part of the storage to be taken offline
         - A transient error in a storage array can cause several hours of rebalancing to drop and add disks
         - It is a limiting factor for scheduled maintenance
       - 11g has the new feature 'fast mirror resync'
         - Redundant storage can be put offline for maintenance
         - Changes are accumulated in the staleness registry (file#12)
         - Changes are applied when the storage is back online

    37. Read Performance, Random I/O
       - IOPS measured with SQL (synthetic test)
       - From other tests with less ad-hoc tuning and from application deployment: ~100 IOPS random read operations per disk (SATA 7200 rpm)
       - More info here: https://twiki.cern.ch/twiki/bin/view/PSSGroup/HAandPerf

    38. Read Performance, Sequential I/O

    39. Implementation Details
       - Multipathing
         - Linux Device Mapper (2.6 kernel)
       - Block devices
         - RHEL4 and 10gR2 make it possible to skip the raw devices mapping
         - External half of the disk for data disk groups
       - JBOD config
         - No HW RAID
         - ASM used to mirror across disk arrays
       - HW:
         - Storage arrays (Infortrend): FC controller, SATA disks
         - FC (Qlogic): 4Gb switch and HBAs (2Gb in older HW)
         - Servers are 2x CPUs, 4GB RAM, 10.2.0.3 on RHEL4, RAC of 4 to 8 nodes

    40. Conclusions
       - CERN deploys RAC and ASM on Linux on commodity HW
         - 2.5 years of production, 110 Oracle 10g RAC nodes and 300TB of raw disk space (Dec 2007)
       - ASM metadata
         - Most critical part, especially rebalancing
         - Knowledge of some ASM internals helps troubleshooting
       - ASM on VLDB
         - Know and work around pitfalls in 10g
         - 11g has important manageability and performance improvements

    41. Q&A
       Links:
       - http://cern.ch/phydb
       - http://twiki.cern.ch/twiki/bin/view/PSSGroup/ASM_Internals
       - http://www.cern.ch/canali
