1 / 41

SQL on Fire! Part 1

SQL on Fire! Part 1. Tips and Tricks around SQL. Agenda. Part I SQL vs. SQL PL example Error handling Tables out of nowhere Pivoting Aggregation Deleting duplicate rows Alternatives to Min and Max Part II Mass deletes Order processing Moving rows between tables Recursion Merge

Gabriel
Download Presentation

SQL on Fire! Part 1

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SQL on Fire! Part 1 Tips and Tricks around SQL

  2. Agenda • Part I • SQL vs. SQL PL example • Error handling • Tables out of nowhere • Pivoting • Aggregation • Deleting duplicate rows • Alternatives to Min and Max • Part II • Mass deletes • Order processing • Moving rows between tables • Recursion • Merge • Calculating nesting levels Easy Difficult

  3. Motivation – The OLTP Mantra • Reduce Codepath • Reduce Logical and Physical I/O • Reduce Network Traffic • Minimize Lock Contention • Avoid Deadlocks • High performance starts with the application.

  4. SQL vs. SQL PL - Example • DDL • CREATE TABLE emp(empID INTEGER NOT NULLsalary INTEGER,name VARCHAR(20),deptID INTEGER); • CREATE UNIQUE INDEX empIdxPKON emp(empID) INCLUDE (deptID); • ALTER TABLE emp ADD PRIMARY KEY (empID);

  5. SQL vs. SQL PL

  6. SQL vs. SQL PL • Business request:Give everyone in deptID 100 a 5% raise.

  7. SQL vs. SQL PL – First Attempt DECLARE empcur CURSOR FORSELECT * FROM emp FOR UPDATE; OPEN empcur;emploop: LOOP FETCH empcur INTO vempid, vsalary, vname, vdeptid; IF SQLSTATE = ‘02000’ THEN LEAVE emploop; END IF; IF deptid = 100 THEN SET vsalary = vsalary * 1.05; UPDATE emp SET salary = vsalaryWHERECURRENTOF empcur; END IF;END LOOP emploop; CLOSE empcur; Read whole table Only need few rows

  8. SQL vs. SQL PL - Second Attempt DECLARE empcur CURSOR FORSELECT * FROM emp WHERE deptID = 100 FOR UPDATE; OPEN empcur;emploop: LOOP FETCH empcur INTO vempid, vsalary, vname, vdeptid; SET vsalary = vsalary * 1.05; IF SQLSTATE = ‘02000’ THEN LEAVE emploop; END IF; UPDATE emp SET salary = vsalaryWHERECURRENTOF empcur;END LOOP emploop; CLOSE empcur; Retrieve all columns Use only one column

  9. SQL vs. SQL PL - Third Attempt DECLARE empcur CURSOR FORSELECT salary FROM emp WHERE deptID = 100 FOR UPDATE; OPEN empcur;emploop: LOOP FETCH empcur INTO vsalary; SET vsalary = vsalary * 1.05; IF SQLSTATE = ‘02000’ THEN LEAVE emploop; END IF; UPDATE emp SET salary = vsalaryWHERECURRENTOF empcur;END LOOP emploop; CLOSE empcur; Use cursor Trivial logic

  10. SQL vs. SQL PL - Final Attempt UPDATE emp SET salary = salary * 1.05WHERE deptID = 100; Tips:WHERE deptID IN (100, 200)WHERE deptID IN (SELECT deptID FROM dept WHERE deptlocation = ‘Shenzhen’)SET salary = salary * CASE WHEN name = ‘Zhang’ THEN 1.05 ELSE 1.03 END

  11. SQL vs. SQL PL – Aggregation • Return all departments with more than 5 employees:SELECT deptID FROM empGROUPBY deptIDHAVINGCOUNT(*) > 5 • Return the salary of the most highly paid employee by department for departments with more than 5 employees:SELECT deptID, MAX(salary) FROM empGROUPBY deptIDHAVINGCOUNT(*) > 5

  12. Create table if it does not exist BEGIN DECLARE CONTINUE HANDLERFOR SQLSTATE '42710'BEGIN /* Already exists?Ignore */ END;CREATE TABLE T(c1 INT); END Tip 1:There is no need to avoid SQL errorsas long as they are handled… Tip 2:.. except when statement rollback implies undo operation from the log

  13. SQL PL – DDL cleanup BEGINDECLARE CONTINUE HANDLERFOR SQLSTATE '42704'BEGIN /* Does not exist? Ignore */ END;DROP TABLE T1;DROP TABLE T2;DROP SEQUENCE S1; END Tip: Lost errors are very hard to find.So be specific and never ignore generic SQLEXCEPTION.

  14. SQL PL – Is this table empty? IF EXISTS(SELECT 1 FROM emp WHERE dept = 100)THEN .. END IF; Instead of:IF EXISTS(SELECT 1 FROM emp WHERE empid = 5) THEN UPDATE emp SET salary = salary * 1.05 WHERE empID = 5;END IF; Use:UPDATE emp SET salary = salary * 1.05 WHERE empID = 5; Tip:Don’t be afraid of NOTFOUND (SQLSTATE ‘02000’). NOT FOUND not an error

  15. Tables from nowhere • Create tables without INSERT:VALUES (1, 2), (3, 4), …. (10, 11)orVALUES (CAST(? ASINTEGER),CAST(? ASDOUBLE)), (?, ?), … (?, ?) • Use anywhere likeINSERTINTO T VALUES (1, 2), (3, 4), (5, 6)

  16. VALUES for multi row INSERT CREATETABLE T(c1 INT, c2 VARCHAR(10)); INSERTINTO T VALUES (?, ?), (?, ?), (?, ?), … (?, ?); Tip:For mass inserts prepare INSERT statements with: 1, 10, 100, 1000, 10000 rows of parameter markers each. Execute the biggest that fits the remaining load in a loop.

  17. PIVOT and UNPIVOT CREATE TABLE Sales(Year INTEGER, Quarter INTEGER, Results INTEGER);

  18. PIVOT and UNPIVOT CREATE TABLE SalesAgg(year INTEGER,q1 INTEGER,q2 INTEGER,q3 INTEGER,q4 INTEGER);

  19. PIVOT SELECT Year, MAX(CASEWHENQuarter = 1 THEN Results END) AS Q1, MAX(CASEWHEN Quarter = 2 THEN Results END) AS Q2, MAX(CASEWHEN Quarter = 3 THEN Results END) AS Q3, MAX(CASEWHEN Quarter = 4 THEN Results END) AS Q4FROM SalesGROUPBY Year Tip: Use MAX() because it is supported for all comparable types including strings

  20. PIVOT Access Plan: ----------- RETURN ( 1) | GRPBY ( 2) | FETCH ( 3) /----+---\ IXSCAN TABLE: SALES ( 4) | INDEX: SALESIDX

  21. UNPIVOT SELECT Year, Quarter, ResultsFROM SalesAgg AS S,LATERAL(VALUES(1, S.q1), (2, S.q2), (3, S.q3), (4, S.q4)) AS Q(Quarter, Results); Tip:Use LATERAL (or TABLE) to allow correlation of S.q* to the left.

  22. UNPIVOT Access Plan: ----------- RETURN ( 1) | NLJOIN ( 2) /------+-----\ TBSCAN TBSCAN ( 3) ( 4) | | TABLE: SALESAGG TABFNC: GENROW

  23. Aggregation • Problem:DB2 does not support user defined aggregates • Thoughts: • XMLAGG() provides aggregation without loss of data • Use mathematical rules • Examples: • Aggregate concatenation • Aggregate multiplication

  24. XML functions – a primer • XMLAGG() • Aggregates XML values into an XML sequence • XMLELEMENT() • Tags a scalar value and returns XMLe.g. XMLELEMENT(NAME ‘x’ 5) => <x>5</x> • XMLSERIALIZE() • Casts an XML value into a string

  25. Aggregate concatenation CREATETABLE Employee(name VARCHAR(15), dept VARCHAR(15));

  26. Aggregate concatenation

  27. Aggregate concatenation Strip last comma SELECT Dept,SUBSTR(Names, 1, LENGTH(names) -1) FROM(SELECT Dept,REPLACE (REPLACE (XMLSERIALIZE (CONTENTXMLAGG(XMLELEMENT(NAME a, name)ORDERBY name) ASVARCHAR(60)), '<A>', ''), '</A>', ',') AS NamesFROM Employee GROUPBY Dept) AS X; Replace end tags with commas Strip start tags XML to VARCHAR Aggregate in order of names Tag a name to become XML

  28. Aggregate multiplication CREATETABLE probabilities( model VARCHAR(10), event VARCHAR(10), percent FLOAT);

  29. Aggregate multiplication • a = EXP(LOG(a))X* Y = EXP(LOG(X * Y))X * Y = EXP(LOG(X) + LOG(Y)) • PROD(Xi)i=1..n = EXP(SUM(LOG(Xi)i=1..n)

  30. Aggregate multiplication SELECTmodel, DEC(EXP(SUM(LOG(percent))), 8, 7) AS percentFROM probabilitiesWHERE event IN (‘Engine 1’, ‘Fin’)GROUPBY model

  31. Retrieving MAXimum row CREATE TABLE emp(name VARCHAR(10),dept VARCHAR(10),salary INTEGER); CREATEINDEX emp_ind ON emp(dept, salary DESC); • Standard using selfjoin: • SELECT name, salary FROM empWHERE salary = (SELECT MAX(salary) FROM emp WHERE dept = ‘SQL Compiler’AND dept = ‘SQL Compiler’);

  32. Retrieving MAXimum row SELECT name, salary FROM empWHERE dept='SQL Compiler'ORDER BYsalaryDESCFETCH FIRST ROW ONLY;

  33. Retrieving MAXimum row Access Plan: ------------ RETURN ( 1) | FETCH ( 2) /----+---\ IXSCAN TABLE: EMP ( 3) | INDEX: EMP_IND

  34. Delete duplicate rows CREATETABLE Inventory(Item INTEGER, Quantity INTEGER, InvDate DATE); CREATEUNIQUEINDEX InvIdx ON Inventory(Item ASC, InvDate DESC);

  35. Delete duplicate rows

  36. Delete duplicate rows

  37. Delete duplicate rows - classic • DELETEFROMInventoryASD WHERE (Item, InvDate)NOTIN (SELECT Item, InvDate FROMInventory IWHERED.Item = I.Item ORDER BY InvDate DESCFETCH FIRST ROW ONLY);

  38. Delete duplicate rows - improved • Using OLAPDELETE FROM (SELECT row_number() OVER(PARTITIONBYItemORDERBYInvDateDESC) AS rn FROM Inventory) WHERE rn > 1; Tip: Remove ORDER BY clause if it doesn’t matter which duplicates get eliminated.

  39. Delete duplicate rows Access Plan: ------------ RETURN ( 1) | DELETE ( 2) /---+---\ FETCH TABLE: INVENTORY ( 3) /---+---\ FILTER TABLE: INVENTORY ( 4) | IXSCAN ( 5) | INDEX: INVIDX

  40. Conclusion • Exploit SQL to: • Increase concurrency • Reduce I/O • Reduce code-path • Make the application more readable • SQL provides powerful support • Tip: • Part II with even meaner examples after the break

  41. Serge Rielau SQL on Fire! Part 1 IBM srielau@ca.ibm.com

More Related