Our solution offers a specification language and an algorithm that automate the process of mapping application requirements to store consistency levels.
Declarative Programming over Eventually Consistent Data Stores. Gowtham Kaki, Suresh Jagannathan, KC Sivaramakrishnan
[Figure: HTTP clients, stateless AppServers, and caches. Requirements: Consistency, Integrity, Durability, Availability, etc.]
Application invariants • Account balances should be non-negative. • Usernames should be unique. • Only bona fide bids are accepted in an auction. Strong consistency: Linearizability & Serializability
INTERNET ☐ Strongly consistent, but not “always on” ☐ “Always on”, but no strong consistency: Eventual Consistency (convergence as time → ∞)
Session 1: //init balance = 0; deposit(100); get_balance() returns ? Store consistency levels offered by eventually consistent data stores: • Basic eventual • Read-my-writes • Causal • Read committed • Monotonic writes • Parallel Snapshot Isolation • Bounded staleness
Eventual Consistency: Session 1 runs //init balance = 0; deposit(100); get_balance(). The read may be served by a replica that has not yet received the deposit (bal=0), so get_balance() returns 0. Read-my-writes consistency: the same session runs //init balance = 0; deposit(100); get_balance(), and the read returns 100.
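The difference between the two reads can be sketched with a toy two-replica store. This is a minimal model, not how any particular store implements read-my-writes: the `Store` class, the `must_see` parameter, and the idea of handing the session's own effect ids to the read are all assumptions made for illustration.

```python
class Store:
    """Toy store with two replicas and no background replication (assumption)."""

    def __init__(self):
        self.replicas = [set(), set()]   # effect ids applied at each replica
        self.amounts = {}                # effect id -> deposited amount
        self.next_id = 0

    def deposit(self, replica, amount):
        eid = self.next_id
        self.next_id += 1
        self.amounts[eid] = amount
        self.replicas[replica].add(eid)  # applied locally only
        return eid

    def get_balance(self, replica, must_see=frozenset()):
        # Read-my-writes is modeled by forcing the session's own effects
        # (must_see) into the set the read observes.
        visible = self.replicas[replica] | set(must_see)
        return sum(self.amounts[e] for e in visible)


store = Store()
w = store.deposit(0, 100)                   # Session 1 writes at replica 0
eventual = store.get_balance(1)             # basic eventual: stale read at replica 1
rmw = store.get_balance(1, must_see={w})    # read-my-writes: own deposit is visible
```

Here `eventual` is the stale value and `rmw` includes the session's own deposit, matching the 0 versus 100 outcome on the slide.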
Application invariants and operations: deposit(), withdraw(), tweet(), bid(). Can we automate the process of mapping application requirements to store consistency levels?
Our solution … • Specification Language: a common medium to express both application requirements (unique usernames, non-negative balance, bona fide bids) and store consistency guarantees (read-my-writes consistency, causal consistency, read committed isolation level, repeatable read isolation level). • Classification Scheme: an algorithm to map application requirements to store consistency guarantees. The map is sound and optimal.
Prelims - System Model • Replicated data store: Replica 1 … Replica n, each holding effects such as Deposit(200), Withdraw(10), Withdraw(20), …… • Visibility (Vis): relates the effects at a replica to the operations (e.g. getBalance) that observe them. • Session Order (SO): orders the operations of one session (Session 1 … Session n), e.g. v1 = getBalance(); …… v2 = getBalance(); ……
Specification Language • Axiomatically capture the set of valid executions • Associate with each operation a single abstract effect • Express relationships between effects • Primitive relations: Visibility (vis), Session order (so), Same object (sameobj) • Derived relations: Per-object session order, Happens-before
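The derived relations can be computed from the primitive ones once an execution is written down as sets of pairs. The formulas on the slide are not preserved, so the sketch below assumes the usual reading: per-object session order is `so ∩ sameobj`, and per-object happens-before (`hbo`) is the transitive closure of per-object session order together with `vis`. The effect names and the sample execution are hypothetical.

```python
def transitive_closure(rel):
    """Smallest transitive relation containing rel (a set of pairs)."""
    closure = set(rel)
    while True:
        extra = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if extra <= closure:
            return closure
        closure |= extra


# Hypothetical execution over one bank account:
# d1 = deposit, w1 = withdraw, g1 and g2 = getBalance calls.
obj = {"d1": "acct", "w1": "acct", "g1": "acct", "g2": "acct"}
so = {("d1", "g1")}                   # d1 and g1 belong to the same session
vis = {("d1", "w1"), ("w1", "g2")}    # d1 visible to w1, w1 visible to g2

sameobj = {(a, b) for a in obj for b in obj if obj[a] == obj[b]}
soo = so & sameobj                    # assumed: per-object session order
hbo = transitive_closure(soo | vis)   # assumed: per-object happens-before
```

In this execution `hbo` relates `d1` to `g2` through `w1`, even though `vis` does not relate them directly.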
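Replicated Bank Account (1) • Session 1 (Alice): //init balance = 100; withdraw(70), effect a. • Session 2 (Bob): //init balance = 100; withdraw(70), effect b. • If neither withdraw is visible to the other, both succeed and balance >= 0 is violated (100 - 70 - 70 = -40). [Figure: effects a and b with vis edges between them]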
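The anomaly can be replayed in a few lines. This is a sketch under simplifying assumptions: an effect is just a signed amount, and a hypothetical `withdraw` helper decides locally from whatever effects it can see.

```python
def withdraw(visible_effects, amount):
    """Return a withdraw effect (negative amount) if the visible balance
    covers it, else None. Hypothetical helper for illustration."""
    balance = sum(visible_effects)
    return -amount if balance >= amount else None


initial = [100]

# Neither withdraw sees the other: both succeed at their own replica.
a = withdraw(initial, 70)
b = withdraw(initial, 70)
merged = sum(initial + [a, b])           # balance after the replicas converge

# Bob's withdraw sees Alice's: the second withdraw is refused.
a2 = withdraw(initial, 70)
b2 = withdraw(initial + [a2], 70)
ordered = sum(initial + [e for e in (a2, b2) if e is not None])
```

Requiring that one of the two withdraws be visible to the other is what rules out the negative balance in the second run.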
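Bank Account Contracts (2) • Sessions 1-3 (Alice, Bob, Cheryl). • Effect a: deposit(100). Effect b: withdraw(50), with vis(a, b). Effect c: getbalance(), with vis(b, c). • A getbalance() that sees both a and b returns 50. • A getbalance() that sees b but not a returns -50.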
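A small check makes the requirement on getbalance() concrete. The contract encoded here (if a deposit is visible to a withdraw, and that withdraw is visible to the read, then the deposit must be visible to the read too) is one reading of the figure and should be treated as an assumption; the function names are hypothetical.

```python
amounts = {"a": +100, "b": -50}   # a = deposit(100), b = withdraw(50)


def get_balance(vis, reader):
    """Sum the effects that are visible to the reading effect."""
    return sum(amt for eff, amt in amounts.items() if (eff, reader) in vis)


def satisfies_contract(vis, reader):
    """Assumed contract: vis(x, y) and vis(y, reader) imply vis(x, reader)
    for every deposit x and withdraw y."""
    return all(
        (x, reader) in vis
        for (x, y) in vis
        if amounts.get(x, 0) > 0 and amounts.get(y, 0) < 0 and (y, reader) in vis
    )


vis_bad = {("a", "b"), ("b", "c")}                 # c sees the withdraw only
vis_good = {("a", "b"), ("b", "c"), ("a", "c")}    # c also sees the deposit

bad_balance = get_balance(vis_bad, "c")
good_balance = get_balance(vis_good, "c")
```

The execution that returns a negative balance is exactly the one that fails `satisfies_contract`.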
Various consistency levels offered by stores can be specified.
Causal Visibility • Causal Visibility (CV) is the strongest known guarantee that can be ensured under eventual consistency with high availability. • We therefore take the causal visibility formula as the specification for eventual consistency (EC). [Figure: effects a, b, c related by hbo and vis edges]
Causal Consistency • Causal consistency (CC) guarantees that writes from a session are made visible to subsequent reads in the same session. CV doesn’t guarantee this. • Hence CC is stronger than CV. • Therefore it cannot be achieved with high availability under EC. Still, it costs less availability than SC. [Figure: effects a, b related by hbo and vis]
Sequential Consistency • Strong/Sequential Consistency (SC) guarantees that all distinct operations performed on the same object are totally ordered with respect to the visibility relation. • SC is the strongest consistency level we consider. • It is the most expensive in terms of availability. [Figure: effects a, b with a vis edge in either direction]
Capturing Store Consistency Levels: Eventual Consistency, Causal Consistency, Strong Consistency
Classification Scheme • Decidable: automatically discharged with the help of the Z3 SMT solver. • Levels: Eventual Consistency (EC), Causal Consistency (CC), Strong Consistency (SC). • Bank account: deposit → EC, withdraw → SC, getBalance → CC.
Our classification scheme is parametric with respect to the lattice of consistency levels.
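The idea of picking the weakest level whose guarantee implies a contract can be illustrated by brute force over a tiny universe of executions. This is only a bounded sketch, not the Z3-based procedure the slides describe: it enumerates executions with two effects on one object, and the EC, CC and SC formulas below are assumed readings of the figures. The `read_my_writes` contract is a hypothetical example, not the getBalance contract from the talk.

```python
from itertools import combinations, product

ETA = "eta"                          # the effect being classified
EFFECTS = ("a", ETA)                 # bounded universe: two effects, same object
PAIRS = [("a", ETA), (ETA, "a")]


def closure(rel):
    """Transitive closure of a set of pairs."""
    result = set(rel)
    while True:
        extra = {(x, w) for (x, y) in result for (z, w) in result if y == z}
        if extra <= result:
            return result
        result |= extra


def executions():
    """All (so, vis) pairs over EFFECTS whose union is acyclic."""
    subsets = [set(c) for n in range(len(PAIRS) + 1) for c in combinations(PAIRS, n)]
    for so, vis in product(subsets, repeat=2):
        if all((x, x) not in closure(so | vis) for x in EFFECTS):
            yield so, vis


# Assumed level formulas (all effects are on the same object here).
def ec(so, vis):   # hbo(x, y) and vis(y, eta) imply vis(x, eta)
    return all((x, ETA) in vis for (x, y) in closure(so | vis) if (y, ETA) in vis)

def cc(so, vis):   # hbo(x, eta) implies vis(x, eta)
    return all((x, ETA) in vis for (x, y) in closure(so | vis) if y == ETA)

def sc(so, vis):   # every other effect is vis-ordered with eta
    return all((x, ETA) in vis or (ETA, x) in vis for x in EFFECTS if x != ETA)


LEVELS = [("EC", ec), ("CC", cc), ("SC", sc)]   # weakest to strongest


def classify(contract):
    """Weakest level under which the contract holds in every bounded execution."""
    for name, level in LEVELS:
        if all(contract(so, vis) for so, vis in executions() if level(so, vis)):
            return name
    return None


def deposit(so, vis):          # no requirement
    return True

def withdraw(so, vis):         # must be vis-ordered with the other effect
    return ("a", ETA) in vis or (ETA, "a") in vis

def read_my_writes(so, vis):   # hypothetical: earlier same-session effects visible
    return all((x, ETA) in vis for (x, y) in so if y == ETA)
```

Because `classify` only walks the ordered `LEVELS` list, swapping in a different list of levels changes nothing else, which is the sense in which the scheme is parametric. A bounded enumeration like this can miss counterexamples that need more effects, so it illustrates the shape of the check rather than replacing the solver.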
Transactions • Real applications often use transactions: to write to multiple objects atomically, and to read a consistent state of objects in isolation. • Our specification language can be extended to specify isolation requirements. • Add a single primitive relation: sametxn(a,b) • Derived relation: • Full atomicity can be ensured without affecting availability. • Various isolation requirements, on the other hand, affect availability in various ways.
Capturing Store Isolation Levels: Read Committed, Monotonic Atomic View, Repeatable Read
Classification Scheme for Isolation Levels • Levels: Read Committed (RC), Monotonic Atomic View (MAV), Repeatable Read (RR). • BankAccount transactions: “Save” transaction → RC, “totalBal” transaction → RR.
Haskell library for Eventually Consistent Data Stores (ECDS) • Definition language: define operations and transactions on replicated data. • Specification language: specify consistency and isolation requirements. [Figure labels: GHC, DEFS +, Quelea, Data Store]
Conclusion • Quelea: a Haskell library for programming ECDS • Automatic classification of operation and transaction contracts through an SMT solver • Leverages off-the-shelf ECDS • Avoids re-engineering complex systems • Makes it practical! http://gowthamk.github.io/Quelea
Thank you! http://gowthamk.github.io/Quelea
Future Work – Inferring Specifications (1) • Our specification language is based on a low-level axiomatic system model. • Crafting a specification requires • Comprehensive knowledge about the system model, & • Exhaustive reasoning about application semantics under this system model to determine possible anomalies. Can we reduce the programmer effort required to use Quelea by inferring low-level specifications? But, infer specifications based on what?
Future Work – Inferring Specifications (2) • Infer low-level specifications based on application integrity specifications. • E.g., balance ≥ 0 • But, application requirements often cannot be stated as simple integrity specifications. • It’s ok for Alice to not see Bob’s tweet immediately after he posts it. • But, Alice must be able to see her own tweets immediately in her timeline. • Also, if Bob tweeted in reply to Cheryl, then Alice cannot see Bob’s tweet without also seeing Cheryl’s.
Future Work – Inferring Specifications (3) • Infer low-level specifications based on test cases. • Easier to write. Programmers write them all the time. • Execute/simulate the tests to get passing runs (good samples) and failing runs (bad samples). • Specification is the classifier that separates good samples from the bad samples. • Challenge: how to avoid overfitting? • Take programmer assistance? • Ongoing research. Comments are welcome!
Future Work – Inferring Specifications (3), continued • Example: infer the monotonicity property of a monotonically increasing counter.
State Summarization • Summarization is essential to keep the log of effects from growing without bound. • How is summarization done? • Ask the developer for summarization semantics. • Replace (many) original effects with (few) summary effects.
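For the bank account, a developer-supplied summarization could look like the sketch below. The `summarize` function and the encoding of effects as `(kind, amount)` pairs are assumptions for illustration; the slides do not show the actual summarization interface.

```python
def balance(effects):
    """Balance implied by a log of ('deposit' | 'withdraw', amount) effects."""
    return sum(amt if kind == "deposit" else -amt for kind, amt in effects)


def summarize(effects):
    """Replace the whole log with a single effect that has the same net result."""
    net = balance(effects)
    return [("deposit", net)] if net >= 0 else [("withdraw", -net)]


log = [("deposit", 200), ("withdraw", 10), ("withdraw", 20)]
summary = summarize(log)
```

The summary is one effect instead of three, and reading the balance from it gives the same answer as reading it from the original log.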