New Optimizer Features in MySQL 5.6 and Beyond

New Optimizer Features in MySQL 5.6 and Beyond Manyi LuSenior Engineering Manager MySQL Optimizer Team

Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decision. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

MySQL Optimizer • Find an optimal query execution plan: • Index selection • Access method • Join order • Query transformations • Cost-based optimization • Find cost of different plans, choose the best

MySQL 5.6 Optimizer Improvements • Improved subqueryexecution • File sort optimizations with small limit • Index condition pushdown • Batched key access and multi range read • Postponed materialization • Many table join

MySQL 5.6 Optimizer Improvements • IN() query with large number of values • Utilizing extended secondary keys • Persistent optimizer statistics • Explain for insert, delete, and update • Optimizer traces • Structured explain in JSON format

Subquery Optimizations • Prior to 5.6: Users try to avoid subqueries due to poor performance • 5.6: optimized IN subqueries SELECT title FROM film WHERE film_id IN (SELECT film_id FROM film_actor); • Algorithms: • Table pullout • Semi-join • Materialization • Example: DBT3 Query #18 • Execution time reduced from days to seconds

File Sort Optimization with Small Limit CREATE TABLE products( productidintauto_increment PRIMARY KEY, product_namevarchar(200)); SELECT * FROM products ORDER BY productname LIMIT 100; • Web use case – list first 100 products sorted by name • Avoid intermediate sorted files • Produce ordered result set using a single table scan • Example above: 20 million rows, default sort buffer =>3X better execution time, drops from 40s to 10s

Multi Range Read (MRR) SELECT * FROM t WHERE key BETWEEN 1 and 20; Non-MRR: • Random disk access to base table data MRR: • More sequential scan of the base table data

MySQL 5.5: Data Access without DS-MRR Index Table Random access Index scan

MySQL 5.6: Data Access with DS-MRR Example with InnoDB Collect PKs in buffer Sweep-read rows Table Index Sort Index scan PKs in index order PKs inPK order

MySQL 5.6: Regular Nested Loop Join Without join buffering SELECT * FROM t1 JOIN t2 ON t1.col1 = t2.col2; Table2 Table1 Index Table scan

MySQL 5.6: Batched Key Access (BKA) MRR Applied to Join Buffering SELECT * FROM t1 JOIN t2 ON t1.col1 = t2.col2; Collect PKs in buffer Sweep-read rows Table2 Table1 Index Sort PKs inPK order Join buffer PKs in join buffer order

Batched Key Access & Multi Range Read Improves performance of disk bound queries Query execution time without BKA and MRR DBT3 Q13 Customer distribution query Query execution time with BKA and MRR

Index Condition Pushdown CREATE TABLE person ( personid INTEGER PRIMARY KEY, firstname CHAR (20), lastname CHAR(20), postalcode INTEGER, age INTEGER, address CHAR(50), KEY k1(postcode, age) )ENGINE=innodb; • With ICP disabled • 15s (buffer pool 128Mb) • 1.4s (buffer pool 1.5Gb) SELECT firstname, lastname from person where postalcode BETWEEN 5000 and 5500 AND age BETWEEN 21 and 22; • With ICP enabled • Execution time drops to 90 ms for both

Postpone Materialization of Views/Subqueries in FROM EXPLAIN SELECT * FROM (SELECT * FROM a_big_table); • Late materialization • Fast EXPLAINs for views/subqueries • Avoid materialization when possible, faster bail out

Postpone Materialization of Views/Subqueries in FROM • Adding index for derived table • Allow ref access to derived tables • SELECT … FROM derived_table AS dt • JOINtable AS t WHERE dt.fld = t.dlf => 240X better execution time (drops from ~8 min to ~2 sec)

Improved Performance for Many-Table Joins A query joining 24 tables • (Number of tables)! possible combinations • Drastically reduce cost of finding the optimal query plan • More optimal final chosen plan • 5.5: tables ordered in increasing number of rows • 5.6: take key dependency into account

Queries with Many Values in the IN clause SELECT * FROM t1 WHERE col1 IN (large number of value); • Before • Two index dive for each value to estimate the number of rows • Takes longer to optimize the query than to execute it • Now • Use average number of rows per value from the statistics • Significantly reduce the optimization time • eq_range_index_dive_limit = 10 by default

Utilizing Extended Secondary Keys CREATE TABLE orders( orderkey INT PRIMARY KEY, custkey INT, ….INDEX (custkey) ) ENGINE = InnoDB; INSERT 260K rows Rows with custkey=27: 5347 rows Rows with orderkey<10000: 10.000 rows • InnoDB secondary keys contain both user defined columns and PK • 5.6: Hidden keys are now fully used for optimizations SELECT custkey, orderkey FROM orders WHERE custkey=27 AND orderkey < 10000; • Use_index_extensions=off • Handler_read_next: 5347 • Response time: 0.04 sec • use_index_extensions=on • Handler_read_next: 219 • Response time: 0.00 sec

Persistent Optimizer Statistics (InnoDB) • More accurate statistics • More stable statistics • Turned on by default • Automatically recalculates statistics by default • ANALYZE TABLE • Can be manually updated (Good for testing purpose)

Explain for Data Modifying Statements

Explain for Data Modifying Statements, cont

Mysql> EXPLAIN FORMAT=JSON SELECT * FROM t2 WHERE I > 1 AND J < 3; { “query_block”: { “select_id“:1, ”table” : { “table_name”: “t2”, “access_type”: ”range”, “possible_keys”: [ “PRIMARY” ], ”key”: “PRIMARY”, ”key_length”: “4”, “rows”: “2”, “filtered”: 100, “index_condition”: ”(‘test’.’t2’.’I’ > 1)”, ”attached_condition”: “(‘test’.’t2’.’J’ < 3)” } } } Structured Explain • EXPLAIN FORMAT=JSON • More precise than traditional explain • More information • Visualized in MySQL workbench

Optimizer Traces "rows_estimation": [ { "table": "`t1`", "range_analysis": { "table_scan": { "rows": 5, "cost": 4.1 }, "potential_range_indices": [ { "index": "v_idx", "usable": true, "key_parts": [ "v", "i1” ] } ], "best_covering_index_scan": { "index": "v_idx", "cost": 2.0063, "chosen": true }, SET SESSION.OPTIMIZER_TRACE=‘enabled=on’; SELECT (SELECT 1 FROM t6 WHERE d = c) AS RESULT FROM t5; SELECT * from INFORMATION_SCHEMA.OPTIMIZER_TRACE; • Explain shows the generated plan • Trace shows how the plan was generated, decision points, cost etc • Developers, support, advanced customers • First step in 5.6, more tracing to come

MySQL 5.7 Explain on a Running Query EXPLAIN [FORMAT=(JSON|TRADITIONAL)] [EXTENDED] FOR CONNECTION <id>; SHOW PROCESSLIST INFORMATION SCHEMA PROCESSLIST • Shows query plan on connection <id> • Useful for diagnostic on long running queries • Plan isn’t available when query plan is under creation • Applicable to SELECT/INSERT/DELETE/UPDATE

MySQL 5.7 Additional Cost Data in Explain mysql> EXPLAIN FORMAT=JSON SELECT SUM(o_totalprice) FROM orders WHERE o_orderdate BETWEEN '1994-01-01' AND '1994-12-31'; { "query_block": { "select_id": 1, "cost_info": { "query_cost": "3118848.00" }, "table": { "table_name": "orders", "access_type": "ALL", "possible_keys": [ "i_o_orderdate" ], "rows_examined_per_scan": 15000000, "rows_produced_per_join": 4489990, "filtered": 29.933, "cost_info": { "read_cost": "2220850.00", "eval_cost": "897998.00", "prefix_cost": "3118848.00", "data_read_per_join": "582M" }, "used_columns": [ "o_totalprice", "o_orderDATE" ], "attached_condition": "(`dbt3`.`orders`.`o_orderDATE` between '1994-01-01' and '1994-12-31')" } } } • Total query cost of a query block • Cost per table • Cost of sorting operation • Cost of reading data • Cost of evaluating conditions • Cost of prefix join • Rows examined/produced per join • Used columns • Data read per join – (# of rows)*(record width) in byte

Visual Explain in MySQL Workbench

What is on Our Roadmap? • Improved prepared statement performance • Support functional index • Redesign cost model, add histogram • Rewrite to bottom-up parser • Continued optimizer refactoring

More Optimizer Sessions • MySQL's EXPLAIN Command New Features Saturday, 11:30 - 12:30, Hilton • When and How to Take Advantage of New Optimizer Features in MySQL 5.6 Sunday, 4:00 PM - 5:00 PM - Hilton • Evolving the MySQL Server ArchitectureSunday, 5:30 PM - 6:30 PM – Hilton • How to Analyze and Tune SQL Queries for Better Performance, Monday, Sep 23, 10:15 AM - 12:30 PM Moscone South

Graphic Section Divider

MySQL Optimizer Understanding the optimizeris crucial in designing efficient schemas and queries. • Many needs for workarounds are going away • Know where to look to see what the optimizer is doing • Joins just got a lot faster - know how to make the most of it

New Optimizer Features in MySQL 5.6 and Beyond

New Optimizer Features in MySQL 5.6 and Beyond

Presentation Transcript

New Features in 8.4

New Features in ML

New Features in 8.4

New Features in TerraScan

New Features and Upcoming Features in ARIN Online

New Features in Flack

What’s New in MySQL Cluster 7.3

New features in McStas

New Features in DiFX2.0

Oracle MySQL Released Version 5.6

New Features in Leave

New features in KE EMu 3.1 and beyond

MySQL 5.6 Database System

New Features in V6

New Features in V6

DM213 - The Adaptive Optimizer - Evolution of the Optimizer from 11.9.2 to 12.0 and Beyond

Full Text Search in MySQL 5.1 New Features and HowTo

MySQL 5.6 1Z0-883 dumps

What’s New in Laravel 5.6 ? [ Complete Guide ]

[PDF] 1Z0-882 - MySQL Test 5.6 Developer Exam Questions

MySQL Governor and its features

Full Text Search in MySQL 5.1 New Features and HowTo