
Transaction


doane


  1. Transaction All-or-nothing Principles of Computer System (2012 Fall)

  2. Where are we? • System Complexity • Modularity & Naming • Enforced Modularity • Network • Fault Tolerance • Transaction • All-or-nothing • Before-or-after

  3. Review • Goal • Building reliable systems out of unreliable components • Fault, Error, Failure • MTTF, MTBF, MTTR • Bathtub curve • RAID

  4. Review • Disk Failure Tolerance • Use replication • Checksums are used to detect faults • If one disk cannot be read, read the data from the replica(s) instead • Weaker (but still very effective): error-correcting codes on the disk platter • Replication allows masking of component failures
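
The checksum-plus-replica idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the course's code: the in-memory `replicas` list, the `put`/`get` names, and the choice of CRC32 as the checksum are all assumptions made for the example.

```python
import zlib

def put(replicas, data):
    """Store the data together with its CRC32 checksum on every replica."""
    record = (data, zlib.crc32(data))
    for replica in replicas:
        replica["sector"] = record

def get(replicas):
    """Return the data from the first replica whose checksum still verifies."""
    for replica in replicas:
        data, checksum = replica["sector"]
        if zlib.crc32(data) == checksum:
            return data
    raise IOError("all replicas failed checksum verification")

replicas = [{}, {}]                      # two hypothetical disks
put(replicas, b"balance=100")

# Simulate a fault on the first disk: the data decays, the stored checksum does not.
replicas[0]["sector"] = (b"balance=1O0", replicas[0]["sector"][1])

recovered = get(replicas)                # falls through to the second replica
```

The checksum only detects the fault; it is the second copy that masks it.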

  5. All-or-Nothing and Before-or-After Atomicity • An action is atomic • If there is no way for a higher layer to discover the internal structure of its implementation

  6. All-or-Nothing and Before-or-After Atomicity • From the point of view of a procedure that invokes an atomic action • The atomic action always appears either to complete as anticipated, or to do nothing • This consequence is the one that makes atomic actions useful in recovering from failures

  7. All-or-Nothing and Before-or-After Atomicity • From the point of view of a concurrent thread • An atomic action acts as though it occurs either completely before or completely after every other concurrent atomic action • This consequence is the one that makes atomic actions useful for coordinating concurrent threads • Atomicity hides • not just the details of which steps form the atomic action • but the very fact that it has structure

  8. All-or-Nothing and Before-or-After Atomicity • 1. Data abstraction • Hide the internal structure of data • 2. Client/server organization • Hide the internal structure of major subsystems • 3. Atomicity • Hide the internal structure of an action • Enforce industrial-strength modularity • Guarantee absence of unanticipated interactions among components of a complex system • The implementer’s point of view • Painful

  9. Atomic actions’ benevolent side effects • Audit log • Atomic actions that run into trouble record • the nature of the detected failure and • the recovery sequence • for later analysis • Data management system: when inserting a record • it may rearrange the file into a better physical order • Cache • Garbage collection • They are all hidden from upper levels

  10. Overall system fault tolerance model • 1. Error-free operation: • All work goes according to expectations • The user initiates actions and the system confirms the actions by displaying messages to the user • 2. Tolerated error: • The user who initiated an action notices that the system failed before it confirmed completion of the action • and, when the system is operating again, checks whether or not it actually performed that action

  11. Overall system fault tolerance model • 3. Un-tolerated error: • The system fails without the user noticing • so the user does not realize that he or she should check or retry an action that the system may not have completed

  12. Disk Storage System Fault Tolerance Model • A perfect-disk assumption • a disk never decays and has no hard errors • only one thing can go wrong: a system crash at just the wrong time

  13. Disk Storage System Fault Tolerance Model • The fault tolerance model • Error-free operation: • CAREFUL_GET returns the result of the most recent call to CAREFUL_PUT • at sector_number on track, with status = OK. • Detectable error: • The operating system crashes during a CAREFUL_PUT • and corrupts the disk buffer in volatile storage • and CAREFUL_PUT writes corrupted data on one sector of the disk.

  14. ALL_OR_NOTHING_PUT
      procedure ALMOST_ALL_OR_NOTHING_PUT (data, all_or_nothing_sector)
          CAREFUL_PUT (data, all_or_nothing_sector.S1)
          CAREFUL_PUT (data, all_or_nothing_sector.S2)
          CAREFUL_PUT (data, all_or_nothing_sector.S3)
      procedure ALL_OR_NOTHING_GET (reference data, all_or_nothing_sector)
          CAREFUL_GET (data1, all_or_nothing_sector.S1)
          CAREFUL_GET (data2, all_or_nothing_sector.S2)
          CAREFUL_GET (data3, all_or_nothing_sector.S3)
          if (data1 = data2) data ← data1
          else data ← data3

  15. ALL_OR_NOTHING_PUT
      procedure ALL_OR_NOTHING_PUT (data, all_or_nothing_sector)
          CHECK_AND_REPAIR (all_or_nothing_sector)
          ALMOST_ALL_OR_NOTHING_PUT (data, all_or_nothing_sector)
      procedure CHECK_AND_REPAIR (all_or_nothing_sector)    // Ensure copies match
          CAREFUL_GET (data1, all_or_nothing_sector.S1)
          CAREFUL_GET (data2, all_or_nothing_sector.S2)
          CAREFUL_GET (data3, all_or_nothing_sector.S3)

  16. ALL_OR_NOTHING_PUT
          if (data1 = data2) and (data2 = data3) return     // State 1 or 7, no repair
          if (data1 = data2)
              CAREFUL_PUT (data1, all_or_nothing_sector.S3)
              return                                        // State 5 or 6
          if (data2 = data3)
              CAREFUL_PUT (data2, all_or_nothing_sector.S1)
              return                                        // State 2 or 3
          CAREFUL_PUT (data1, all_or_nothing_sector.S2)     // State 4, go to state 5
          CAREFUL_PUT (data1, all_or_nothing_sector.S3)     // State 5, go to state 7
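
The three-copy procedures on slides 14 to 16 can be exercised with a small Python simulation. It is a sketch under stated assumptions: the `Disk` class, the `BAD` marker for a sector corrupted by a crash, and the `crash_on_put` knob are hypothetical test scaffolding, and only a single crash per write is modeled, as in the slides' fault model.

```python
class Crash(Exception):
    """Simulated system crash."""

BAD = object()   # hypothetical marker for a sector corrupted by a crash mid-write

class Disk:
    def __init__(self, initial):
        self.sectors = {"S1": initial, "S2": initial, "S3": initial}
        self.crash_on_put = None   # crash during the n-th CAREFUL_PUT (1-based)
        self.puts = 0

    def careful_put(self, data, name):
        self.puts += 1
        if self.puts == self.crash_on_put:
            self.sectors[name] = BAD   # the write in progress is corrupted
            raise Crash()
        self.sectors[name] = data

    def careful_get(self, name):
        return self.sectors[name]

def read_all(disk):
    return tuple(disk.careful_get(n) for n in ("S1", "S2", "S3"))

def almost_all_or_nothing_put(disk, data):
    for name in ("S1", "S2", "S3"):
        disk.careful_put(data, name)

def all_or_nothing_get(disk):
    d1, d2, d3 = read_all(disk)
    return d1 if d1 == d2 else d3

def check_and_repair(disk):
    d1, d2, d3 = read_all(disk)
    if d1 == d2 and d2 == d3:
        return                           # state 1 or 7, no repair
    if d1 == d2:
        disk.careful_put(d1, "S3")       # state 5 or 6
        return
    if d2 == d3:
        disk.careful_put(d2, "S1")       # state 2 or 3
        return
    disk.careful_put(d1, "S2")           # state 4, go to state 5
    disk.careful_put(d1, "S3")           # state 5, go to state 7

def all_or_nothing_put(disk, data):
    check_and_repair(disk)
    almost_all_or_nothing_put(disk, data)

def outcome_after_crash(n):
    """Crash during the n-th sector write, then read and repair."""
    disk = Disk("old")
    disk.crash_on_put = n
    try:
        all_or_nothing_put(disk, "new")
    except Crash:
        pass
    seen = all_or_nothing_get(disk)
    check_and_repair(disk)
    return seen, set(disk.sectors.values())

results = [outcome_after_crash(n) for n in (1, 2, 3)]
```

Wherever the crash lands, a reader sees either the old or the new value, never the corrupted sector, and after repair all three copies agree.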

  17. Atomicity

  18. Commit

  19. Commit • Pre-commit • identify all the resources needed • establish their availability • maintain the ability to abort at any instant • shared resources, once reserved, cannot be released until the commit point is passed • should not do anything externally visible • Post-commit • release reserved resources that are no longer needed • perform externally visible actions • cannot try to acquire additional resources • Q: where’s commit in ALL_OR_NOTHING_PUT?

  20. Shadow Copy • Pre-commit: • Create a complete duplicate working copy of the file that is to be modified • Make all changes to the working copy • Commit point: • Carefully exchange the working copy with the original • Typically this step is bootstrapped • Post-commit: • Release the space that was occupied by the original • The golden rule of atomicity: Never modify the only copy!

  21. Bank account transfer
      xfer(bank, a, b, amt):
          bank[a] = bank[a] - amt
          bank[b] = bank[b] + amt

  22. Bank account transfer
      xfer(bank, a, b, 50):
      xfer(bank, a, b, amt):                <- a=100, b=100
          bank[a] = bank[a] - amt           <- a=50, b=100
          bank[b] = bank[b] + amt           <- a=50, b=150

  23. Bank account transfer
      xfer(bank, a, b, amt):
          bank[a] = bank[a] - amt
          bank[b] = bank[b] + amt
      audit(bank):
          sum = 0
          for acct in bank:
              sum = sum + bank[acct]
          return sum

  24. Bank account transfer
      audit(bank): the result depends on when it runs relative to xfer
      xfer(bank, a, b, amt):                <- sum=200
          bank[a] = bank[a] - amt           <- sum=150
          bank[b] = bank[b] + amt           <- sum=200
      audit(bank):
          sum = 0
          for acct in bank:
              sum = sum + bank[acct]
          return sum
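
The inconsistent audit can be reproduced deterministically. In this sketch the transfer is written as a Python generator (`xfer_steps`, a hypothetical name) purely so the example can pause it between the debit and the credit; a real system would hit the same window through thread scheduling or a crash.

```python
def xfer_steps(bank, a, b, amt):
    """xfer written as a generator so the caller can pause between the two writes."""
    bank[a] = bank[a] - amt
    yield                      # a concurrent audit could run here
    bank[b] = bank[b] + amt

def audit(bank):
    return sum(bank.values())

bank = {"a": 100, "b": 100}

before = audit(bank)           # transfer has not started
steps = xfer_steps(bank, "a", "b", 50)
next(steps)                    # debit applied, credit still pending
during = audit(bank)           # money appears to be missing
next(steps, None)              # credit applied, transfer finished
after = audit(bank)
```

Only the middle audit sees a state that no complete set of transfers could have produced, which is what before-or-after atomicity is meant to rule out.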

  25. Eventual goal: transactions
      xfer(bank, a, b, amt):
          begin
          bank[a] = bank[a] - amt
          bank[b] = bank[b] + amt
          commit
      audit(bank):
          begin
          sum = 0
          for acct in bank:
              sum = sum + bank[acct]
          commit
          return sum

  26. Strawman implementation
      xfer(bank, a, b, amt):
          bank = read_accounts(bankfile)
          bank[a] = bank[a] - amt
          bank[b] = bank[b] + amt
          write_accounts(bankfile)

  27. Shadow copy
      xfer(bank, a, b, amt):
          bank = read_accounts(bankfile)
          bank[a] = bank[a] - amt
          bank[b] = bank[b] + amt
          write_accounts("#bankfile")
          rename("#bankfile", bankfile)
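
A runnable version of the shadow-copy transfer might look like the following. It is a sketch, not the lecture's implementation: storing accounts as JSON, the `.shadow` suffix (standing in for "#bankfile"), and the temporary directory are assumptions for the example. `os.replace` is used for the rename step because it atomically replaces the destination on POSIX file systems.

```python
import json
import os
import tempfile

def read_accounts(path):
    with open(path) as f:
        return json.load(f)

def xfer(path, a, b, amt):
    bank = read_accounts(path)
    bank[a] -= amt
    bank[b] += amt

    shadow = path + ".shadow"            # plays the role of "#bankfile"
    with open(shadow, "w") as f:         # pre-commit: write only to the shadow copy
        json.dump(bank, f)
        f.flush()
        os.fsync(f.fileno())             # make the shadow durable before switching
    os.replace(shadow, path)             # commit point: atomic rename over the original

workdir = tempfile.mkdtemp()
bankfile = os.path.join(workdir, "bankfile")
with open(bankfile, "w") as f:
    json.dump({"a": 100, "b": 100}, f)

xfer(bankfile, "a", "b", 50)
result = read_accounts(bankfile)
```

A crash before `os.replace` leaves the original file untouched and at worst a stray shadow file to clean up; a crash after it leaves the new contents in place.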

  28. Rename • Rename • unlink(bankname) • link("newfile", bankname) • unlink("newfile")

  29. File system data structures • directory data blocks: • filename “bank” → inode 12 • filename “#bank” → inode 13 • inode 12: • data blocks: 3, 4, 5 • refcount: 1 • inode 13: • data blocks: 6, 7, 8 • refcount: 1

  30. Rename • What needs to happen during rename? • Point "bank" directory entry at inode 13 • Remove "newfile" directory entry • Decrement refcount on inode 12 • First try at rename(x, y): • y's dirent gets x's inode # • decref(y's original inode) • remove x's dirent • Problems • What happens if we crash after modifying y's dirent? • Two names point to inode 13, but its refcount is 1. • What if we increment refcount before, and decrement it afterwards?

  31. Increase ref-count before
      rename(x, y):
          newino = lookup(x)
          oldino = lookup(y)
          incref(newino)
          change y's dirent to newino
          decref(oldino)
          remove x's dirent
          decref(newino)

  32. rename(“#bank”, “bank”) • directory data blocks: • filename “bank” → inode 12 • filename “#bank” → inode 13 • inode 12: • data blocks: 3, 4, 5 • refcount: 1 • inode 13: • data blocks: 6, 7, 8 • refcount: 2

  33. rename(“#bank”, “bank”) • directory data blocks: • filename “bank” → inode 13 • filename “#bank” → inode 13 • inode 12: • data blocks: 3, 4, 5 • refcount: 1 • inode 13: • data blocks: 6, 7, 8 • refcount: 2

  34. rename(“#bank”, “bank”) • directory data blocks: • filename “bank” → inode 13 • filename “#bank” → inode 13 • inode 12: • data blocks: 3, 4, 5 • refcount: 0 • inode 13: • data blocks: 6, 7, 8 • refcount: 2

  35. rename(“#bank”, “bank”) • directory data blocks: • filename “bank” → inode 13 • (“#bank” entry removed) • inode 12: • data blocks: 3, 4, 5 • refcount: 0 • inode 13: • data blocks: 6, 7, 8 • refcount: 2

  36. rename(“#bank”, “bank”) • directory data blocks: • filename “bank” → inode 13 • (“#bank” entry removed) • inode 12: • data blocks: 3, 4, 5 • refcount: 0 • inode 13: • data blocks: 6, 7, 8 • refcount: 1

  37. Recovery after crash
      salvage(disk):
          for inode in disk.inodes:
              inode.refcnt = find_all_refs(disk.root_dir, inode)
          if exists("#bank"):
              unlink("#bank")
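
The rename steps from slide 31 and the salvage procedure above can be checked together with a toy model. Everything here is hypothetical scaffolding: the file system is a pair of Python dicts, and `crash_after` injects a crash after a chosen step so each intermediate state from slides 32 to 36 can be recovered from.

```python
class Crash(Exception):
    """Simulated crash in the middle of rename."""

def make_fs():
    # Directory entries and inode refcounts as on slide 29.
    return {"dir": {"bank": 12, "#bank": 13}, "refcount": {12: 1, 13: 1}}

def rename(fs, x, y, crash_after=None):
    d, rc = fs["dir"], fs["refcount"]
    newino, oldino = d[x], d[y]

    def done(step):
        if step == crash_after:
            raise Crash(step)

    rc[newino] += 1       # incref(newino)
    done(1)
    d[y] = newino         # change y's dirent to newino (the commit point)
    done(2)
    rc[oldino] -= 1       # decref(oldino)
    done(3)
    del d[x]              # remove x's dirent
    done(4)
    rc[newino] -= 1       # decref(newino)
    done(5)

def salvage(fs):
    d, rc = fs["dir"], fs["refcount"]
    for ino in rc:        # recompute every refcount from the directory
        rc[ino] = sum(1 for target in d.values() if target == ino)
    if "#bank" in d:      # unlink the leftover shadow name
        rc[d.pop("#bank")] -= 1

def crash_and_recover(step):
    fs = make_fs()
    try:
        rename(fs, "#bank", "bank", crash_after=step)
    except Crash:
        pass
    salvage(fs)
    return fs

# Which inode "bank" names after a crash at each step, followed by salvage.
outcomes = {step: crash_and_recover(step)["dir"]["bank"] for step in range(1, 6)}
```

In this model a crash before the dirent update recovers to the old file (inode 12) and any later crash recovers to the new one (inode 13), so the single dirent write acts as the commit point.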

  38. Shadow copy • Write to a copy of data, atomically switch to new copy • Switching can be done with one all-or-nothing operation (sector write) • Requires a small amount of all-or-nothing atomicity from lower layer (disk) • Main rule: only make one write to current/live copy of data • In our example, sector write for rename. • Creates a well-defined commit point.

  39. Shadow copy • Does the shadow copy approach work in general? • +: Works well for a single file. • -: Hard to generalize to multiple files or directories. • Might have to place all files in a single directory, or rename subdirs. • -: Requires copying the entire file for any (small) change. • -: Only one operation can happen at a time. • -: Only works for operations that happen on a single computer, single disk.

  40. Summary • Not always possible to use replication to avoid failures. • To recover from failure, need to reason about what happened to the system • Atomicity makes it much easier to reason about possible system states: • All-or-nothing atomicity • Before-or-after atomicity • Shadow copy: one approach to all-or-nothing atomicity • Make all modifications to a shadow copy of data. • Replace old copy with new copy with an all-or-nothing write to one sector. • -> commit point • Rule: never modify live data (unless one sector write is all you need)

  41. Summary • Two kinds of atomicity: all-or-nothing, before-or-after • Shadow copy can provide all-or-nothing atomicity • Golden rule of atomicity: never modify the only copy! • Typical way to achieve all-or-nothing atomicity. • Works because you can fall back to the old copy in case of failure. • Software can also use all-or-nothing atomicity to abort in case of error. • Drawbacks of shadow file approach: • only works for single file (maybe fixable with shadow dirs) • copy the entire file for every all-or-nothing action (harder to avoid) • Still, shadow copy is a simple and effective design when it suffices. • Many Unix applications (e.g., text editors) use it, owing to rename.

  42. All-or-nothing: Logging • A more general technique for achieving all-or-nothing atomicity • Idea: keep a log of all changes, and whether each change commits or aborts • We will start out with a simple scheme that's all-or-nothing but slow • Then, we will optimize its performance while preserving atomicity

  43. Journal Storage • A store operation • does not overwrite old data • creates a new, tentative version of the data • which remains invisible to any reader outside this all-or-nothing action • until the action commits

  44. Journal Storage

  45. Journal Storage

  46. Journal Storage
      procedure NEW_ACTION ()
          id ← NEW_OUTCOME_RECORD ()
          id.outcome_record.state ← PENDING
          return id
      procedure COMMIT (reference id)
          id.outcome_record.state ← COMMITTED
      procedure ABORT (reference id)
          id.outcome_record.state ← ABORTED

  47. Journal Storage
      procedure READ_CURRENT_VALUE (data_id, caller_id)
          starting at end of data_id repeat until beginning
              v ← previous version of data_id     // Get next older version
              a ← v.action_id                     // Identify the action a that created it
              s ← a.outcome_record.state          // Check action a’s outcome record
              if s = COMMITTED then
                  return v.value
              else skip v                         // Continue backward search
          signal (“Tried to read an uninitialized variable!”)

  48. Journal Storage
      procedure WRITE_NEW_VALUE (reference data_id, new_value, caller_id)
          if caller_id.outcome_record.state = PENDING
              append new version v to data_id
              v.value ← new_value
              v.action_id ← caller_id
          else signal (“Tried to write outside of an all-or-nothing action!”)

  49. Journal Storage

  50. Journal Storage
      procedure TRANSFER (reference debit_account, reference credit_account, amount)
          my_id ← NEW_ACTION ()
          xvalue ← READ_CURRENT_VALUE (debit_account, my_id)
          xvalue ← xvalue - amount
          WRITE_NEW_VALUE (debit_account, xvalue, my_id)
          yvalue ← READ_CURRENT_VALUE (credit_account, my_id)
          yvalue ← yvalue + amount
          WRITE_NEW_VALUE (credit_account, yvalue, my_id)
          if xvalue > 0 then
              COMMIT (my_id)
          else
              ABORT (my_id)
              signal (“Negative transfers are not allowed.”)
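
The journal storage procedures from slides 46 to 50 translate fairly directly into Python. This is an in-memory sketch only: the `Journal` class, integer action ids, and the use of Python exceptions in place of `signal` are assumptions, and nothing here is written to stable storage.

```python
PENDING, COMMITTED, ABORTED = "PENDING", "COMMITTED", "ABORTED"

class Journal:
    def __init__(self):
        self.outcome = {}     # action id -> state (the outcome records)
        self.versions = {}    # data id -> list of (value, action id), oldest first
        self.next_id = 0

    def new_action(self):
        self.next_id += 1
        self.outcome[self.next_id] = PENDING
        return self.next_id

    def commit(self, action_id):
        self.outcome[action_id] = COMMITTED

    def abort(self, action_id):
        self.outcome[action_id] = ABORTED

    def read_current_value(self, data_id, caller_id):
        # Search backward for the newest version written by a committed action.
        for value, action_id in reversed(self.versions.get(data_id, [])):
            if self.outcome[action_id] == COMMITTED:
                return value
        raise LookupError("Tried to read an uninitialized variable!")

    def write_new_value(self, data_id, new_value, caller_id):
        if self.outcome.get(caller_id) != PENDING:
            raise RuntimeError("Tried to write outside of an all-or-nothing action!")
        self.versions.setdefault(data_id, []).append((new_value, caller_id))

def transfer(journal, debit_account, credit_account, amount):
    my_id = journal.new_action()
    xvalue = journal.read_current_value(debit_account, my_id) - amount
    journal.write_new_value(debit_account, xvalue, my_id)
    yvalue = journal.read_current_value(credit_account, my_id) + amount
    journal.write_new_value(credit_account, yvalue, my_id)
    if xvalue > 0:
        journal.commit(my_id)
    else:
        journal.abort(my_id)
        raise ValueError("Negative transfers are not allowed.")

j = Journal()
setup = j.new_action()            # initial balances are themselves an atomic action
j.write_new_value("A", 100, setup)
j.write_new_value("B", 100, setup)
j.commit(setup)

transfer(j, "A", "B", 50)         # commits
try:
    transfer(j, "A", "B", 80)     # would overdraw A, so it aborts
except ValueError:
    pass

balances = (j.read_current_value("A", None), j.read_current_value("B", None))
```

The aborted transfer still appended tentative versions to both accounts, but readers skip them because their outcome record never reached COMMITTED.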
