Database replication policies for dynamic content applications

Database replication policies for dynamic content applications Gokul Soundararajan, Cristiana Amza, Ashvin Goel University of Toronto Presented by Ahmed Ataullah Wednesday, November 22nd 2006

The plan • Motivation and Introduction • Background • Suggested Technique • Optimization/Feedback ‘loop’ • Summary of Results • Discussion

The Problem • 3-Tier Framework • Recall: The benefits of partial/full database replication in dynamic content (web) applications • We need to address some issues in the framework as presented in ‘Ganymed’ • Problem: • How many replicas do we allocate? • How do we take care of overload situations while maximizing utilization and minimizing the number of replicas? • Solution: • Dynamically (de)allocate replicas as needed

Assumptions • Query Load: • Write queries are short, sweet and simple • Read queries are complex, costly and more frequent • Infrastructure: • Replica addition is time consuming and ‘expensive’ • Machines are flexible in nature • Replica Allocation vs. Replica Mapping • Assume an intelligent middleware is present

Replica Allocation • Full overlapping allocation • All databases replicated across all machines in the cluster • Disjoint allocation (no overlapping) (Diagram: a two-database scenario with databases A and B; global replica pool not shown) RS = Read Sets, WS = Write Sets. Note: the ratio of WS to RS may be misleading in the diagram

Partial Overlapping Allocation • Only share write sets. Read sets do not overlap (Diagram: a two-database scenario with databases A and B) RS = Read Sets, WS = Write Sets
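The three allocation schemes can be told apart by checking which replica sets intersect. The sketch below is illustrative only: the function name `classify_allocation`, the replica ids, and the returned labels are assumptions made for this example and do not come from the slides.

```python
from itertools import combinations


def classify_allocation(read_sets, write_sets):
    """Classify an allocation from per-application replica sets.

    read_sets / write_sets map an application name to a set of replica ids.
    The labels returned are hypothetical names chosen for this sketch.
    """
    pairs = list(combinations(sorted(read_sets), 2))
    reads_overlap = any(read_sets[a] & read_sets[b] for a, b in pairs)
    writes_overlap = any(write_sets[a] & write_sets[b] for a, b in pairs)

    # Every replica that appears anywhere in the allocation.
    cluster = set().union(*read_sets.values(), *write_sets.values())
    everything_everywhere = all(
        read_sets[a] == cluster and write_sets[a] == cluster for a in read_sets
    )

    if everything_everywhere:
        return "full overlap"
    if not reads_overlap and not writes_overlap:
        return "disjoint"
    if not reads_overlap and writes_overlap:
        return "partial overlap"
    return "other"
```

For the two-database scenario, giving A the read set {1, 2} and write set {1, 2, 3}, and B the read set {3, 4} and write set {2, 3, 4}, is classified as partial overlap: only the write sets share replicas.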

Dynamic Replication (Eurosys 2006 Slides) • Assume a cluster hosts 2 applications • App1 (Red) using 2 machines • App2 (Blue) using 2 machines • Assume App1 has a load spike

Dynamic Replication • Choose # of replicas to allocate to App1 • Say, we adapt by allocating one more replica • Then, two options • App2 still uses two replicas (overlap replica sets) • App2 loses one replica (disjoint replica sets)

Challenges • Adding a replica can take time • Bring replica up-to-date • Warm-up memory • Can avoid adaptation with fully-overlapped replica sets

Challenges • However, overlapping applications compete for memory, causing interference • Can avoid interference with disjoint replica sets • Tradeoff between adaptation delay and interference

Our Solution – Partial Overlap • Reads of applications sent to disjoint replica sets • Avoids interference • Read-Set • Set of replicas where reads are sent

Our Solution – Partial Overlap • Writes of apps also sent to overlapping replica sets • Reduces replica addition time • Write-Set • Set of replicas where writes are sent

Optimization • For a given application, • Replicas in Write-Set – Fully Up-to-Date • Other Replicas – Periodic Batch Updates
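The read-set, write-set, and batch-update ideas above can be sketched for a single application as a toy router. This is a minimal illustration, not the authors' middleware: the class `PartialOverlapRouter`, its method names, and the in-memory lists standing in for databases are all hypothetical, and the requirement that the read set be a subset of the write set is an assumption of this sketch.

```python
import random


class PartialOverlapRouter:
    """Toy router for one application under partial overlap (illustrative only)."""

    def __init__(self, read_set, write_set, all_replicas):
        self.read_set = set(read_set)
        self.write_set = set(write_set)
        # Assumption: reads only go to replicas that receive every write.
        if not self.read_set <= self.write_set:
            raise ValueError("read set must be a subset of the write set")
        # Updates already applied on each replica (stand-in for database state).
        self.applied = {r: [] for r in all_replicas}
        # Updates queued for replicas outside the write set.
        self.pending = {r: [] for r in set(all_replicas) - self.write_set}

    def route_read(self):
        """Pick a replica from the read set for a read-only query."""
        return random.choice(sorted(self.read_set))

    def write(self, update):
        """Apply an update at once on the write set; queue it elsewhere."""
        for replica in self.write_set:
            self.applied[replica].append(update)
        for queue in self.pending.values():
            queue.append(update)

    def flush_batches(self):
        """Periodic batch update: bring the other replicas up to date."""
        for replica, queue in self.pending.items():
            self.applied[replica].extend(queue)
            queue.clear()
```

In this sketch, a replica that is in the write set but not the read set is already up to date, so promoting it to serve reads needs no catch-up step, while a replica outside the write set lags until the next `flush_batches()` call.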

Secondary Implementation Details • Scheduler(s): • Separate read-only from read/write queries • One-copy serializability is guaranteed • Optimization: • Scheduler also stores some cached information (queries, write sets, etc.) to reduce warm-up/ready time • Conflict awareness at the scheduler layer

Replica Allocation Logic (Flowchart not shown; surviving label: ‘Stability Delay’) • Measure the average query latency as a weighted average, updated as: WL = (alpha)L + (1 – alpha)WL • L is the current query latency and alpha is a constant • Note: responsiveness and stability both depend on alpha
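The latency update above is an exponentially weighted moving average, and a short sketch makes the role of alpha concrete. The function names and the sample latency values are assumptions chosen for illustration; the slides give only the formula.

```python
def update_weighted_latency(wl, latency, alpha):
    """One step of WL = alpha * L + (1 - alpha) * WL."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * latency + (1.0 - alpha) * wl


def smooth(latencies, alpha, initial_wl=0.0):
    """Return the sequence of WL values after each latency sample."""
    wl = initial_wl
    history = []
    for latency in latencies:
        wl = update_weighted_latency(wl, latency, alpha)
        history.append(wl)
    return history


if __name__ == "__main__":
    # Hypothetical samples: steady latency followed by a spike.
    samples = [100.0, 100.0, 500.0, 500.0, 100.0]
    for alpha in (0.1, 0.9):
        print(alpha, smooth(samples, alpha, initial_wl=100.0))
```

A large alpha tracks the latest sample closely, so WL reacts quickly to a spike but also to noise; a small alpha gives a steadier WL that responds more slowly.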

Results It works…

One last interesting issue • WL = (alpha)L + (1 – alpha)WL • L is the current query latency and alpha is a constant

Discussion • Questionable Assumptions • Are write requests really (always) simple? • Scalability beyond 60 replicas (is it an issue?) • How closely does this represent a real data center situation? • Load contention issues • Overlap assignment • Determination of alpha(s) • Actual cost savings vs. implied cost savings • Depends on SLA, etc. • Depends on hardware leasing agreements • The issue of readiness in secondary replicas • What level of ‘warmth’ is good enough for each application? Can some machines be turned off? • What about contention among many databases trying to stay warm? • Management concerns • Can we truly provide strong guarantees for keeping our end of the promised SLA?
