1 / 31

New Optimizer Features in MySQL 5.6 and Beyond

New Optimizer Features in MySQL 5.6 and Beyond. Manyi Lu Senior Engineering Manager MySQL Optimizer Team. Safe Harbor Statement.

kerri
Download Presentation

New Optimizer Features in MySQL 5.6 and Beyond

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. New Optimizer Features in MySQL 5.6 and Beyond Manyi LuSenior Engineering Manager MySQL Optimizer Team

  2. Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decision. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

  3. MySQL Optimizer • Find an optimal query execution plan: • Index selection • Access method • Join order • Query transformations • Cost-based optimization • Find cost of different plans, choose the best

  4. MySQL 5.6 Optimizer Improvements • Improved subqueryexecution • File sort optimizations with small limit • Index condition pushdown • Batched key access and multi range read • Postponed materialization • Many table join

  5. MySQL 5.6 Optimizer Improvements • IN() query with large number of values • Utilizing extended secondary keys • Persistent optimizer statistics • Explain for insert, delete, and update • Optimizer traces • Structured explain in JSON format

  6. Subquery Optimizations • Prior to 5.6: Users try to avoid subqueries due to poor performance • 5.6: optimized IN subqueries SELECT title FROM film WHERE film_id IN (SELECT film_id FROM film_actor); • Algorithms: • Table pullout • Semi-join • Materialization • Example: DBT3 Query #18 • Execution time reduced from days to seconds

  7. File Sort Optimization with Small Limit CREATE TABLE products( productidintauto_increment PRIMARY KEY, product_namevarchar(200)); SELECT * FROM products ORDER BY productname LIMIT 100; • Web use case – list first 100 products sorted by name • Avoid intermediate sorted files • Produce ordered result set using a single table scan • Example above: 20 million rows, default sort buffer =>3X better execution time, drops from 40s to 10s

  8. Multi Range Read (MRR) SELECT * FROM t WHERE key BETWEEN 1 and 20; Non-MRR: • Random disk access to base table data MRR: • More sequential scan of the base table data

  9. MySQL 5.5: Data Access without DS-MRR Index Table Random access Index scan

  10. MySQL 5.6: Data Access with DS-MRR Example with InnoDB Collect PKs in buffer Sweep-read rows Table Index Sort Index scan PKs in index order PKs inPK order

  11. MySQL 5.6: Regular Nested Loop Join Without join buffering SELECT * FROM t1 JOIN t2 ON t1.col1 = t2.col2; Table2 Table1 Index Table scan

  12. MySQL 5.6: Batched Key Access (BKA) MRR Applied to Join Buffering SELECT * FROM t1 JOIN t2 ON t1.col1 = t2.col2; Collect PKs in buffer Sweep-read rows Table2 Table1 Index Sort PKs inPK order Join buffer PKs in join buffer order

  13. Batched Key Access & Multi Range Read Improves performance of disk bound queries Query execution time without BKA and MRR DBT3 Q13 Customer distribution query Query execution time with BKA and MRR

  14. Index Condition Pushdown CREATE TABLE person ( personid INTEGER PRIMARY KEY, firstname CHAR (20), lastname CHAR(20), postalcode INTEGER, age INTEGER, address CHAR(50), KEY k1(postcode, age) )ENGINE=innodb; • With ICP disabled • 15s (buffer pool 128Mb) • 1.4s (buffer pool 1.5Gb) SELECT firstname, lastname from person where postalcode BETWEEN 5000 and 5500 AND age BETWEEN 21 and 22; • With ICP enabled • Execution time drops to 90 ms for both

  15. Postpone Materialization of Views/Subqueries in FROM EXPLAIN SELECT * FROM (SELECT * FROM a_big_table); • Late materialization • Fast EXPLAINs for views/subqueries • Avoid materialization when possible, faster bail out

  16. Postpone Materialization of Views/Subqueries in FROM • Adding index for derived table • Allow ref access to derived tables • SELECT … FROM derived_table AS dt • JOINtable AS t WHERE dt.fld = t.dlf => 240X better execution time (drops from ~8 min to ~2 sec)

  17. Improved Performance for Many-Table Joins A query joining 24 tables • (Number of tables)! possible combinations • Drastically reduce cost of finding the optimal query plan • More optimal final chosen plan • 5.5: tables ordered in increasing number of rows • 5.6: take key dependency into account

  18. Queries with Many Values in the IN clause SELECT * FROM t1 WHERE col1 IN (large number of value); • Before • Two index dive for each value to estimate the number of rows • Takes longer to optimize the query than to execute it • Now • Use average number of rows per value from the statistics • Significantly reduce the optimization time • eq_range_index_dive_limit = 10 by default

  19. Utilizing Extended Secondary Keys CREATE TABLE orders( orderkey INT PRIMARY KEY, custkey INT, ….INDEX (custkey) ) ENGINE = InnoDB; INSERT 260K rows Rows with custkey=27: 5347 rows Rows with orderkey<10000: 10.000 rows • InnoDB secondary keys contain both user defined columns and PK • 5.6: Hidden keys are now fully used for optimizations SELECT custkey, orderkey FROM orders WHERE custkey=27 AND orderkey < 10000; • Use_index_extensions=off • Handler_read_next: 5347 • Response time: 0.04 sec • use_index_extensions=on • Handler_read_next: 219 • Response time: 0.00 sec

  20. Persistent Optimizer Statistics (InnoDB) • More accurate statistics • More stable statistics • Turned on by default • Automatically recalculates statistics by default • ANALYZE TABLE • Can be manually updated (Good for testing purpose)

  21. Explain for Data Modifying Statements

  22. Explain for Data Modifying Statements, cont

  23. Mysql> EXPLAIN FORMAT=JSON SELECT * FROM t2 WHERE I > 1 AND J < 3; { “query_block”: { “select_id“:1, ”table” : { “table_name”: “t2”, “access_type”: ”range”, “possible_keys”: [ “PRIMARY” ], ”key”: “PRIMARY”, ”key_length”: “4”, “rows”: “2”, “filtered”: 100, “index_condition”: ”(‘test’.’t2’.’I’ > 1)”, ”attached_condition”: “(‘test’.’t2’.’J’ < 3)” } } } Structured Explain • EXPLAIN FORMAT=JSON • More precise than traditional explain • More information • Visualized in MySQL workbench

  24. Optimizer Traces "rows_estimation": [ { "table": "`t1`", "range_analysis": { "table_scan": { "rows": 5, "cost": 4.1 }, "potential_range_indices": [ { "index": "v_idx", "usable": true, "key_parts": [ "v", "i1” ] } ], "best_covering_index_scan": { "index": "v_idx", "cost": 2.0063, "chosen": true }, SET SESSION.OPTIMIZER_TRACE=‘enabled=on’; SELECT (SELECT 1 FROM t6 WHERE d = c) AS RESULT FROM t5; SELECT * from INFORMATION_SCHEMA.OPTIMIZER_TRACE; • Explain shows the generated plan • Trace shows how the plan was generated, decision points, cost etc • Developers, support, advanced customers • First step in 5.6, more tracing to come

  25. MySQL 5.7 Explain on a Running Query EXPLAIN [FORMAT=(JSON|TRADITIONAL)] [EXTENDED] FOR CONNECTION <id>; SHOW PROCESSLIST INFORMATION SCHEMA PROCESSLIST • Shows query plan on connection <id> • Useful for diagnostic on long running queries • Plan isn’t available when query plan is under creation • Applicable to SELECT/INSERT/DELETE/UPDATE

  26. MySQL 5.7 Additional Cost Data in Explain mysql> EXPLAIN FORMAT=JSON SELECT SUM(o_totalprice) FROM orders WHERE o_orderdate BETWEEN '1994-01-01' AND '1994-12-31'; {   "query_block": {     "select_id": 1,     "cost_info": {       "query_cost": "3118848.00"     },     "table": {       "table_name": "orders",       "access_type": "ALL",       "possible_keys": [         "i_o_orderdate" ],       "rows_examined_per_scan": 15000000,       "rows_produced_per_join": 4489990,       "filtered": 29.933,       "cost_info": {         "read_cost": "2220850.00",         "eval_cost": "897998.00",         "prefix_cost": "3118848.00",         "data_read_per_join": "582M"       },       "used_columns": [         "o_totalprice",         "o_orderDATE"       ],       "attached_condition": "(`dbt3`.`orders`.`o_orderDATE` between '1994-01-01' and '1994-12-31')" } } } • Total query cost of a query block • Cost per table • Cost of sorting operation • Cost of reading data • Cost of evaluating conditions • Cost of prefix join • Rows examined/produced per join • Used columns • Data read per join – (# of rows)*(record width) in byte

  27. Visual Explain in MySQL Workbench

  28. What is on Our Roadmap? • Improved prepared statement performance • Support functional index • Redesign cost model, add histogram • Rewrite to bottom-up parser • Continued optimizer refactoring

  29. More Optimizer Sessions • MySQL's EXPLAIN Command New Features Saturday, 11:30 - 12:30, Hilton • When and How to Take Advantage of New Optimizer Features in MySQL 5.6 Sunday, 4:00 PM - 5:00 PM - Hilton • Evolving the MySQL Server ArchitectureSunday, 5:30 PM - 6:30 PM – Hilton • How to Analyze and Tune SQL Queries for Better Performance, Monday, Sep 23, 10:15 AM - 12:30 PM Moscone South

  30. Graphic Section Divider

  31. MySQL Optimizer Understanding the optimizeris crucial in designing efficient schemas and queries. • Many needs for workarounds are going away • Know where to look to see what the optimizer is doing • Joins just got a lot faster - know how to make the most of it

More Related