300 likes | 378 Views
Making Fast Databases. FASTER. @ andy_pavlo. Fast. +. Cheap. Legacy Systems. TPC-C NewOrder. 12.3%. 29.6%. 10.2%. CPU Cycles. 18.7%. 21.1%. 8.1%. OLTP Through the Looking Glass, and What We Found There SIGMOD 2008. OLTP Transactions. Fast. Repetitive. Small.
E N D
Making Fast Databases FASTER @andy_pavlo
Fast + Cheap
Legacy Systems TPC-C NewOrder 12.3% 29.6% 10.2% CPU Cycles 18.7% 21.1% 8.1% OLTP Through the Looking Glass,and What We Found There SIGMOD 2008
OLTP Transactions Fast Repetitive Small
Main Memory • Parallel • Shared-Nothing Transaction Processing H-Store: A High-Performance, DistributedMain Memory Transaction Processing SystemVLDB vol. 1, issue 2, 2008
Stored Procedure Execution Procedure Name Input Parameters Transaction Result Client Application Database Cluster Database Cluster
Optimization #1: Partition database to reduce the number of distributed txns. Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP SystemsSIGMOD 2012
CUSTOMER ITEM CUSTOMER ORDERS CUSTOMER CUSTOMER CUSTOMER ORDERS ORDERS ORDERS ITEM ITEM ITEM
? ? ? NewOrder(5, “Method Man”, 1234) ? ? ? Client Application CUSTOMER CUSTOMER CUSTOMER ORDERS ORDERS ORDERS ITEM ITEM ITEM
DDL CUSTOMER NewOrder SELECT * FROM WAREHOUSE WHERE W_ID = 10; SELECT * FROM DISTRICT WHERE D_W_ID = 10 AND D_ID =9; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID,…) VALUES(10, 9, 12345,…); ⋮ SELECT * FROM WAREHOUSE WHERE W_ID = 10; SELECT * FROM DISTRICT D_W_ID = 10 AND D_ID =9; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345); ⋮ SELECT * FROM WAREHOUSE WHERE W_ID = 10; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345); ⋮ ITEM DDL SELECT * FROM WAREHOUSE WHERE W_ID = 10; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345); ⋮ Schema ORDERS DTxn Estimator Skew Estimator ------- -------------- Workload Large-Neighorhood Search Algorithm … CUSTOMER CUSTOMER CUSTOMER CUSTOMER ORDERS ORDERS ORDERS ORDERS ITEM ITEM ITEM ITEM
Large-Neighborhood Search Restart DDL Schema ------- -------------- Workload Initial Design Relaxation Local Search
Throughput Horticulture State-of-the-Art (txn/s) TATP TPC-C TPC-C Skewed +88% +16% +183%
Search Times TATP SEATS TPC-C TPC-C Skewed % Single-Partitioned Transactions TPC-E AuctionMark
? ? Undo Log ? Client Application ? Database Cluster Database Cluster
Optimization #2: Predict what txns will do before they execute. On Predictive Modeling for OptimizingTransaction Execution in Parallel OLTP SystemsVLDB, vol 5. issue 2, October 2011
» Partitions Touched? » Undo Log? » Done with Partitions? Client Application Database Cluster Database Cluster
Current State: Input Parameters: w_id=0 i_w_ids=[0,1] i_ids=[1001,1002] ? ? GetWarehouse: SELECT * FROM WAREHOUSEWHERE W_ID = ?
Estimated Execution Path Input Parameters: w_id=0 i_w_ids=[0,1] i_ids=[1001,1002] Transaction Estimate:
Throughput Houdini Assume Single-Partitioned (txn/s) TATP TPC-C AuctionMark +57% +126% +117%
Prediction Overhead TATP TPC-C AuctionMark
Conclusion: Achieving fast performance ismore than just using only RAM. Future Work: Reduce distributed txn overhead through creative scheduling.
h-store hstore.cs.brown.edu github.com/apavlo/h-store
Help is Available +1-212-939-7064 Graduate Student Abuse HotlineAvailable24/7 Collect Calls Accepted