1.09k likes | 1.7k Views
SQL Server: Performance & Balanced System Design. By Frank McBath frankmcb@microsoft.com. Agenda. Introduction Benchmarking Tools & Monitoring Siebel Database Siebel Queries Siebel Query Repro Siebel & Statistics Wait Types Locating Long Running Queries Optimizing Data Loading
E N D
SQL Server: Performance & Balanced System Design By Frank McBath frankmcb@microsoft.com
Agenda Introduction Benchmarking Tools & Monitoring Siebel Database Siebel Queries Siebel Query Repro Siebel & Statistics Wait Types Locating Long Running Queries Optimizing Data Loading Balanced System Design - Network Performance - IO - Memory - File Management Summary
Dealing with Complexity • Key Theme – “Simple but NOT simplistic” • Static configuration parameters replaced with dynamic algorithms employing adaptive feedback • Administrative control retained to manage system wide resources • You can still constrain the amount of memory used by SQL Server – if you want to
Dynamic control algorithms maintain “near optimal” values in response to workload If you tune at this point in time you pick this value. Maybe you pick 2 values and reconfigure system for each “use” Dynamic Control Instantaneous optimal value of control value Control Value Time
Dynamic Control Analogy • Ignition Timing: • user controlled by lever Classic “knob” • Vacuum Advance: • Uses simple feedback • Major advance • Full Feedback: • Continuously adjusted • Monitors: temp, speed, • engine response, etc.
Memory, CPU & Clustering LimitsWindows Server 2003 and SQL Server 2000 1 With Address Windowing Extensions (AWE) in SQL Server 2000
Cost Savings Scalability Manageability • Optimized for Windows Server 2003 and Itanium • Great performance • Large memory addressability (up to 32 TB) • Nearly unlimited virtual memory (up to 8 TB) • I/O savings due to larger memory buffer pools • T-SQL code-compatibility with SQL Server 2000 • 8 node clustering support • Same on-disk format as 32-bit for easy migration • One setup for database & OLAP based on Windows Installer technology • Compelling alternative to expensive Unix solutions The highly scalable database platform for memory intensive, performance-critical business applications
Agenda Introduction Benchmarking Tools & Monitoring Siebel Database Siebel Queries Siebel Query Repro Siebel & Statistics Wait Types Locating Long Running Queries Optimizing Data Loading Balanced System Design - Network Performance - IO - Memory - File Management Summary
PSPP Results as of Oct 26, 2003 • 10/08/03 - 5,000 Concurrent User on Unisys ES7000 server and Microsoft SQL Server 2000 • 10/07/03 - 32,000 Concurrent User on HP-UX servers • 10/07/03 -12,000 Concurrent User on HP ProLiant and Integrity servers and Microsoft SQL Server 2000 (64-bit) • 06/03/03 - 10,000 Concurrent User on HP-UX servers • 04/24/03 - 30,000 Concurrent User on Unisys ES7000 / 2000 Series of servers and Microsoft SQL Server 2000 (64-bit) • 02/05/03 - 5,000 Concurrent User on IBM eServer xSeries and Microsoft SQL Server 2000 • 10/21/02 - 4,500 Concurrent User on IBM eServer xSeries and IBM DB2 UDB • 06/24/02 - 30,000 Concurrent User on IBM eServer pSeries and IBM DB2 UDB
World-class performance 30,000 concurrent Users
12,000 User Benchmark on HP/Windows/SQL64 – resource utilization
12,000 User Benchmark on HP/Windows/SQL64 • Concurrent Users • Server Component Throughput SQL64 on a 4x 1.5 GHz Itanium2 HP Integrity used 47% CPU and 13.3 GB memory proving unprecedented price/performance for Siebel
Oracle 10K vs SQL Server 12K on the DB-tier Oracle supported 10K users on rp8400 with 16x CPU 875Mhz with Oracle 9.x/Hp-ux posting 35% CPU and 18.2GB memory. MSSQL supported 12K users on rx5670 with 4x CPU 1.5Ghz with sql2K/windows2003 posting 47% CPU and 13.3GB memory. Result: 17.95% less CPU** 26% less memory 60% less cost 20% more users (12K vs 10K) SQL Server 2000 64-bit did more with less. *HP rx5670 is around $50K on HP web if you pay the full price. *HP rp8400 base price $124K ** rp8400 16 CPU SpecInt 98.2, rx5600 4 CPU SpecInt 60 ** (cont) (98.2 * 35%) = 34.37 , (60 * 47%) = 28.2 ** (cont) 1 – (28.2/34.37) = 17.95% ** www.spec.org ** Scott Hall Slide
Agenda Introduction Benchmarking Tools & Monitoring Siebel Database Siebel Queries Siebel Query Repro Siebel & Statistics Wait Types Locating Long Running Queries Optimizing Data Loading Balanced System Design - Network Performance - IO - Memory - File Management Summary
Tuning Tools • Windows System Monitor – PERFMON.EXE • System Monitor in the Performance console, started from Administrative Tools • Query Analyzer – ISQLW.EXE • Graphical showplan • Statistics profile • Profiler- PROFILER.EXE • Spot problematic queries • Use the Tuning or Duration templates • Monitor the overhead carefully on your system • Index Tuning Wizard – Particularly for EIM initial
Exploring I/O with System Monitor • Make sure you set DISKPERF –Y on command line to get counters dumped • Performance Object: Physical Disk • Counters: • %Disk Time • Number for entire logical drive should be less than 100% • Avg Disk Queue Length (system requests on avg. waiting for disk access) • Want 0 • If it is above 2 (def. above 3), look into it • See if it’s sustained queueing or temporary • Avg. Disk Read/Write /sec (diff counters, and remember Logical vs. Physical) • Nice to have: 5 – 7 ms (might be optimistic) • Realistic (today’s technology): 20 – 25 ms on a moderately loaded system • Log device service write times should be below 20 ms • Technology dependent … • See BOL (index “Bottlenecks” then “Monitoring Disk Activity”) for some more tips
System Monitor • Useful Counters • Processor - % Processor Time • Physical Disk - %Disk Time, Avg. Disk Queue Length • Memory – Available MBytes • System – Context Switches / sec • SQL Server Locks – Lock Waits/sec, Number of Deadlocks/sec • SQLServer: Access Methods • Full Scans/sec, Page Splits/sec, Table Lock Escalation/sec • SQLServer: Buffer Manager • Buffer Cache Hit Ratio, Lazy Writes/sec, Page Reads/sec, Page Writes/sec, ReadAhead Pages/sec • SQLServer: Databases - Transactions/sec • SQLServer: General Statistics - User Connections • Q150934 – How to Create a Performance Monitor Log
Profiler Terminology • Template • Defines criteria for what to monitor • Saved in .tdf file • Trace • Captures data based upon selected events, data columns, and filters • Filter • Limits the results (Equal, Not like, etc…) • Event Category • Defines the way events are grouped • Event • Action generated within SQL engine
Profiler • Use Built-in templates • Find the worst-performing queries • Filter by duration • Identify the cause of a deadlock • Monitor stored procedure performance • Audit activity – C2 audits • Reorder your output columns by Duration, CPU, read, writes, textdata, etc.
Profiler in Production • Can be very CPU intensive • My experience in production: 8x at 100% • Filter! Filter! Filter! • Let the computer tell you what’s going foul • To proactively find out what is going wrong • Filter on Duration > 30,000ms • Run for 24 hours • This will show you all the poor running queries on your system. • Can’t make it run faster? • Look at IO’s in loops or running all the time. • Repack to higher fill factor. • 5 to 4 IO’s is a 20% increase. • McBath’s Oil Argument • Do you notice your car running better when you change your oil? No. But your car sure does. • Make your database server run as efficient as possible. Makes it more scalable. More with less.
Agenda Introduction Benchmarking Tools & Monitoring Siebel Database Siebel Queries Siebel Query Repro Siebel & Statistics Wait Types Locating Long Running Queries Optimizing Data Loading Balanced System Design - Network Performance - IO - Memory - File Management Summary
Siebel Database • Over 2,300 Tables • Many with over 120 columns per table • 2,200 Clustered Indexes (ROW_ID) • 10,500 Non-Clustered Indexes • Over Indexed, many NULL indexes • 10,000+ Default Constraints • 25,000+ Check Constraints • Very few Stored Procedures • Some Triggers • used by workflow / assignment manager
Siebel Logical Schema vs. Physical Schema • Siebel database schema is designed as a cross platform logical schema, which is managed by Siebel Tools • The logical schema is translated to a physical schema by using Siebel database utilities, such as DDLIMP • The logical schema may be altered and mapped to a compatible physical data type depending on the DB platform and code page
Don’t Be Afraid to Add/Change Indexes… • You almost always have to. • Work closely with Siebel Expert Services • Make sure your Siebel meta data is in sync with SQL Server meta data… or bad things can happen next time you DDLSYNC… • See examples in the next slides
Out of Box… sp_helpindex EIM_CONTACT3 EIM_CONTACT3_T01 EIM_CONTACT3_T02 EIM_CONTACT3_T03 EIM_CONTACT3_T04 EIM_CONTACT3_T05 EIM_CONTACT3_T06 EIM_CONTACT3_T07 EIM_CONTACT3_T08 EIM_CONTACT3_T09 EIM_CONTACT3_T10 EIM_CONTACT3_T11 EIM_CONTACT3_T12 EIM_CONTACT3_T13 EIM_CONTACT3_T14 EIM_CONTACT3_T15 EIM_CONTACT3_T16 EIM_CONTACT3_T17 EIM_CONTACT3_T18 EIM_CONTACT3_T19 EIM_CONTACT3_T20 EIM_CONTACT3_U1 A Large Customer… sp_helpindex EIM_CONTACT3 EIM_CONTACT3_T01 EIM_CONTACT3_U1 Reality (I)
S_CONTACT_EI S_CONTACT_F10 S_CONTACT_F11 S_CONTACT_F12 S_CONTACT_F13 S_CONTACT_F15 S_CONTACT_F2 S_CONTACT_F3 S_CONTACT_F4 S_CONTACT_F5 S_CONTACT_F6 S_CONTACT_F7 S_CONTACT_F8 S_CONTACT_II S_CONTACT_M1 S_CONTACT_M11 S_CONTACT_M12 S_CONTACT_M13 S_CONTACT_M14 S_CONTACT_M15 S_CONTACT_M16 S_CONTACT_M17 S_CONTACT_M18 S_CONTACT_M19 S_CONTACT_M2 S_CONTACT_M20 S_CONTACT_M21 S_CONTACT_M22 S_CONTACT_M3 S_CONTACT_M4 S_CONTACT_M6 S_CONTACT_M8 S_CONTACT_M9 S_CONTACT_P1 S_CONTACT_U1 S_CONTACT_U2 S_CONTACT_V1 S_CONTACT_V2 S_CONTACT_V3 S_CONTACT_V5 S_CONTACT_EI S_CONTACT_F6_X S_CONTACT_II S_CONTACT_M1 S_CONTACT_M50 S_CONTACT_M8 S_CONTACT_ML1_X S_CONTACT_ML2_X S_CONTACT_ML3_X S_CONTACT_ML4_X S_CONTACT_ML5_X S_CONTACT_ML6_X S_CONTACT_P1 S_CONTACT_PREM01_X S_CONTACT_PREM02_X S_CONTACT_U1 S_CONTACT_U2 S_CONTACT_V3 Indexes in RED were custom. Reality (II)
Poor Indexes • Get rid of 100% NULL indexes • Cost a lot on INSERTS/UPDATES/DELETES, ex. EIM • Cost disk space • Cost tape space when you back them up • Only time one might be used: on an aggregate. It’s cheaper than a full scan. Rare, though. • Stored Procedure below will examine the indexes on the top 100 tables for indexes that probably will not be used.
Agenda Introduction Benchmarking Tools & Monitoring Siebel Database Siebel Queries Siebel Query Repro Siebel & Statistics Wait Types Locating Long Running Queries Optimizing Data Loading Balanced System Design - Network Performance - IO - Memory - File Management Summary
Siebel OM Query Execution and Fetching Mechanism • 20 tables in a join… • Siebel OM uses Server side API cursors • For List applet functionality i.e. to maintain user state and support pending result sets • To support multiple active statements per connections • Fast Forward cursor with auto-fetch or Dynamic cursor when accessing text columns • Sometimes there is an implicit conversions to Keyset (order by not covered by index ) • Average fetch size is 3 or 4 rows – this is computed by dividing the ODBC buffer size by row size • Siebel uses ODBC Prepare/Execute • Example: Select * from table where x = ? • What it looks like…
SQL Fetch Mechanism sp_cursorprepex (….., 3) Siebel OM 3 rows sp_cursorfetch (…,3) 3 rows
Agenda Introduction Benchmarking Tools & Monitoring Siebel Database Siebel Queries Siebel Query Repro Siebel & Statistics Wait Types Locating Long Running Queries Optimizing Data Loading Balanced System Design - Network Performance - IO - Memory - File Management Summary
Typical Siebel OM Query declare @P1 int set @P1=-1 declare @P2 int set @P2=0 declare @P3 int set @P3=28688 Fast Forward, Parameterized, Auto Fetch, Auto Close (undocumented and subject to change) declare @P4 int set @P4=8193 declare @P5 int set @P5=10 exec sp_cursorprepexec @P1 output, @P2 output, N'',N' SELECT T1.LAST_UPD_BY, T1.ROW_ID, T18.PRTNR_TYPE, T13.CREATED_BY, T2.ASGN_USR_EXCLD_FLG, . . . T2.PAR_OU_ID FROM dbo.S_PARTY T1 INNER JOIN dbo.S_ORG_EXT T2 ON T1.ROW_ID = T2.PAR_ROW_ID INNER JOIN dbo.S_ACCNT_POSTN T3 ON T2.PR_POSTN_ID = T3.POSITION_ID AND T2.ROW_ID = T3.OU_EXT_ID INNER JOIN dbo.S_PARTY T4 ON T3.POSITION_ID = T4.ROW_ID LEFT OUTER JOIN dbo.S_ORG_EXT T5 ON T2.PAR_OU_ID = T5.PAR_ROW_ID LEFT OUTER JOIN dbo.S_PRI_LST T6 ON T2.CURR_PRI_LST_ID = T6.ROW_ID LEFT OUTER JOIN dbo.S_POSTN T7 ON T2.PR_MGR_POSTN_ID = T7.ROW_ID LEFT OUTER JOIN dbo.S_USER T8 ON T7.PR_EMP_ID = T8.ROW_ID LEFT OUTER JOIN dbo.S_ORG_EXT T9 ON T2.PAR_BU_ID = T9.PAR_ROW_ID LEFT OUTER JOIN dbo.S_ORG_EXT T10 ON T1.PAR_PARTY_ID = T10.PAR_ROW_ID LEFT OUTER JOIN dbo.S_ORG_PRTNR T11 ON T1.ROW_ID = T11.PAR_ROW_ID LEFT OUTER JOIN dbo.S_ORG_EXT_SS T12 ON T1.ROW_ID = T12.PAR_ROW_ID LEFT OUTER JOIN dbo.S_BU T13 ON T1.ROW_ID = T13.PAR_ROW_ID LEFT OUTER JOIN dbo.S_OU_PRTNR_TIER T14 ON T2.PR_PRTNR_TIER_ID = T14.ROW_ID LEFT OUTER JOIN dbo.S_ASGN_GRP T15 ON T2.PR_TERR_ID = T15.ROW_ID LEFT OUTER JOIN dbo.S_INDUST T16 ON T2.PR_INDUST_ID = T16.ROW_ID LEFT OUTER JOIN dbo.S_ADDR_ORG T17 ON T2.PR_ADDR_ID = T17.ROW_ID LEFT OUTER JOIN dbo.S_OU_PRTNR_TYPE T18 ON T2.PR_PRTNR_TYPE_ID = T18.ROW_ID LEFT OUTER JOIN dbo.S_POSTN T19 ON T3.POSITION_ID = T19.PAR_ROW_ID LEFT OUTER JOIN dbo.S_USER T20 ON T19.PR_EMP_ID = T20.PAR_ROW_ID LEFT OUTER JOIN dbo.S_ORG_SYN T21 ON T2.PR_SYN_ID = T21.ROW_ID LEFT OUTER JOIN dbo.S_ORG_BU T22 ON T2.BU_ID = T22.BU_ID AND T2.ROW_ID = T22.ORG_ID LEFT OUTER JOIN dbo.S_PARTY T23 ON T22.BU_ID = T23.ROW_ID LEFT OUTER JOIN dbo.S_ORG_EXT T24 ON T22.BU_ID = T24.PAR_ROW_ID WHERE ((T2.PRTNR_FLG != ''N'') AND ((T13.BU_FLG = ''N'' OR T13.BU_FLG IS NULL) AND T2.PRTNR_FLG = ''Y'')) OPTION(FAST 40)', @P3 output, @P4 output, @P5 output Build plan and return 40 rows ASAP
Why do I get a different query plan in the Query Analyzer ? • Bind value (Prepare Execute model) • Hard Coding Values instead of binding at Run Time • Cursor (SQL Server API cursor) • Not putting the “ODBC Wrapper” • SQL hint (Fast 40) • Not including compiler options • Text column • Implicit Cursor Conversion • Table Spools in one plan, but not the other • Capture on Implicit Cursor Event in Profiler • Also capture on “integer data” column
(N)TEXT Columns • (N)TEXT column may cause performance problems • In the Siebel database schema • A logical TEXT data type is always translated to a physical (N)TEXT column • A VARCHAR data type can be translated to either a (N)VARCHAR column or a (N)TEXT column • VARCHAR(2000+) is translated to a (N)TEXT column
One size fits all: Implicit Cursor Conversions • One type of cursor is requested, but it cannot be fulfilled in it’s native call. • Rather than fail, SQL Server converts internally. • Performance problems. • For example, Siebel uses an “option fast 40” • Fast forward, read only requested with an ORDER BY, yet no index on the WHERE clause that is ordered. SQL Server converts to a KEYSET cursor which spools off to TEMPDB for the sort. • Fix: make an index that matches the ORDER BY. • KEYSET conversion goes away. • SQL Profiler: Event -> Cursors -> CursorsImplicitConversion
Query Repro: Quick and Dirty • Capture on RPC: Starting Event on Profiler • Cut and paste it into Query Analyzer • 99% of the time it will give you the same plan that is coming out of Siebel. • DON’T: Spool Siebel out at the client, hard code the values and put into Query Analyzer. • Probably won’t work • For example, it won’t have the OPTION FAST 40
How to make the Query in QA • print 'declaring variables' • declare @P1 int • declare @P5 int • declare @P6 int • set @P1=NULL • -- set @P5=28688 • -- SCROLLOPT 28676 = 16384 (AutoClose) + 8192 (AutoFetch) + 4096 (Parameterized) + 4 (Forward Only) • set @P5=28676 • set @P6=8193 • print 'running sp_cursorprepare' • exec sp_cursorprepare @P1 output • , N'@P1 varchar(30)' • -- , N'SELECT * FROM authors WHERE au_lname like @P1 OPTION (FAST 1)' • , N'SELECT * FROM authors WHERE au_lname like @P1' • , 1 • , @P5 output • , @P6 output • print 'declaring more variables' • declare @P2 int • declare @P3 int • declare @P4 int • set @P3 = 1 • set @P2=NULL • set @P3=24592 • -- SCROLLOPT - 24592 = 16384 (AutoClose) + 8192 (AutoFetch) + 4 (Forward Only) • set @P4=8193 • set @P5=15 • print 'executing cursor' • exec sp_cursorexecute @P1, @P2 output, @P3 output, @P4 output, @P5 output, 'R%' • print 'select the results from the cusor execute' • select @P1,@P2,@P3,@P4, @P5, @P6
Agenda Introduction Benchmarking Tools & Monitoring Siebel Database Siebel Queries Siebel Query Repro Siebel & Statistics Wait Types Locating Long Running Queries Optimizing Data Loading Balanced System Design - Network Performance - IO - Memory - File Management Summary
Siebel and Update Statistics • Problem: • Siebel queries join a lot of tables • A lot of the tables joined might be smaller than the threshold to execute an automatic update statistics • Result: Bad plans for the join • Stale statistics and EIM • A Plan that tips over. 98 jobs run in 5 minutes. 2 run in 1 hour or… a lot longer. • EIM running faster than Auto Update Stats can kick off. • Solution run a manual update statistics on those tables • Check rowmodctr in sysindexes for indid=1 • See Appendix for code on auto update stats
Example of Stale Statistics • set statistics io on • Table ‘S_CONTACT'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0. • Table ‘S_OPPTY'. Scan count 4382, logical reads 21439, physical reads 650, read-ahead reads 877. • Table ‘S_PARTY'. Scan count 2, logical reads 17573, physical reads 199, read-ahead reads 0. • After UPDATE STATISTICS with sample of 10%: • Table ‘S_CONTACT'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0. • Table ‘S_OPPTY'. Scan count 192, logical reads 1440, physical reads 0, read-ahead reads 11. • Table ‘S_PARTY'. Scan count 4, logical reads 1507, physical reads 3, read-ahead reads 59.
Agenda Introduction Benchmarking Tools & Monitoring Siebel Database Siebel Queries Siebel Query Repro Siebel & Statistics Wait Types Locating Long Running Queries Optimizing Data Loading Balanced System Design - Network Performance - IO - Memory - File Management Summary
Why Wait Types? • Wait types will help define where your bottleneck is. • They are seen in the master..sysprocesses table in a column called “waittype”. • select waittype, * from master..sysprocesses • There are all kinds of waittypes. For example, blocking due to database locks, network io, disk queueing, etc… • The key to solving throughput is understanding what’s damming up the river.
Wait Types • If a thread goes into a sleep status, a wait state is set. • The wait state is contained in master..sysprocesses in the columns waittype, and lastwaittype. • Lastwaittype is a character description of the last wait state for this thread. It is not reset until another wait state occurs. • Waittype is a varbinary wait state that is the current wait state. A wait time of 0 means the thread is currently running. • See SQL_PERF.DOC for detailed information:
Track_waitstats stored procedure • Track_waitstats is a stored procedure that will capture waitstats from DBCC SQLPERF, and provide a ranking of descending order based on percentage. • This is useful in identifying the greatest opportunites for performance improvements. • See Appendix for Stored Procedure on Perf • See the sample output below:
Commentary • The above sample shows the majority share of wait time, 48%, being due to network IO waits. Improving network IO is the single largest opportunity for improving application performance. • Other lesser opportunities in the above example include LCK_M_X (exlusive locks) and WRITELOG (transaction log). Exclusive lock waits account for almost 13% of total wait time. An examination of transaction management may offer clues as to whether improvements can be made here. • WRITELOG means threads are waiting for physical writes to complete to the transaction log. Given the 11% writelog waits, a further analysis of PERFMON disk queues for the transaction log will confirm whether the IO capacity of the transaction log drives have trouble keeping up with write requests as shown by steady and high disk queues.