Efficient Software-Based Fault Isolation By Robert Wahbe, Steven Lucco, Thomas E. Anderson, and Susan L. Graham Presented by Pehr Collins
Background: Tanenbaum-Torvalds Debate • Linus Torvalds • Andrew Tanenbaum
Monolithic versus Microkernel • Monolithic OS • Co-locates modules in the same address space • Faults in extension code can bring down the whole OS or corrupt data • Not safe! • Many developers choose performance over safety • Microkernel OS • Core functions handled by the microkernel • Additional functionality added by means of modules in separate address spaces • Faults are isolated • Safe but has a performance cost • Calls between modules require a full context switch • Three orders of magnitude more expensive than a normal procedure call within the same address space
Resolving Conflict Between Safety and Performance • Last week we looked at “Improving IPC by Kernel Design” by J. Liedtke • Optimization techniques could reduce the context-switch penalty for microkernel IPC to two orders of magnitude • Published at the same time as this paper, but still not enough to tip the balance • Enter software-based fault isolation • No more conflict: OS extension code can be both safe and efficient
How to Resolve Conflict? Sandboxing • Fault domains are contiguous memory segments used for untrusted modules • Distinguished by unique identifiers • Protection is handled by software in the same address space for all modules
Isolating the Fault Domain • Distrusted module code in a fault domain is modified so that it cannot write or jump to addresses outside the domain • This prevents a distrusted module from harming other domains • Two ways to accomplish this • Segment matching, which pinpoints fault locations • Address sandboxing, which provides no information about the source of faults
Segment Matching • Most control transfer instructions can be verified statically because the target address is known at compile time • Checks are inserted before all other potentially unsafe instructions • Jumps through a register • Stores through a register • Illegal addresses are caught via segment matching • Check that the unsafe instruction’s target address has the correct segment identifier • If the check fails, trap to a system error routine outside the distrusted module’s fault domain • Figure: segment ID = upper bits of the target address
Segment Matching • Requires four dedicated registers • One holds addresses in the code segment • One holds addresses in the data segment • One holds the segment shift amount • One holds the segment identifier • These registers are used only by the inserted code and are never modified by distrusted module code • The inserted checks on untrusted code run entirely in the dedicated registers • Setting aside a few registers has minimal performance impact on a RISC system
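The check described on the two segment matching slides can be sketched in a few lines. This is only an illustration in Python, not the inserted machine code from the paper: the shift amount, the segment identifier, the dictionary used as memory, and the names `checked_store` and `FaultDomainViolation` are all assumptions made for the example.

```python
SEGMENT_SHIFT = 24        # hypothetical: the low 24 bits are the offset
DATA_SEGMENT_ID = 0x12    # hypothetical identifier of this domain's data segment


class FaultDomainViolation(Exception):
    """Stands in for the trap to the system error routine."""


def checked_store(memory, target_address, value):
    """Store `value` only if `target_address` lies in the data segment."""
    # Inserted check: the upper address bits must equal the segment ID.
    if (target_address >> SEGMENT_SHIFT) != DATA_SEGMENT_ID:
        # The faulting address is known here, which is why segment
        # matching can pinpoint the offending instruction.
        raise FaultDomainViolation(hex(target_address))
    memory[target_address] = value
```

A store to `0x12000010` goes through, while a store to `0x13000010` raises before memory is touched.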
Address Sandboxing • Even better performance than segment matching • Cost: information about the source of a fault is lost • Before each unsafe instruction, insert code that sets the upper bits of the target address to the correct segment identifier • Does not detect illegal addresses • Prevents illegal addresses from affecting any other fault domain • But what happens when there is an illegal address? • The jump or store lands on a garbage location inside the module’s own fault domain • Figure: the segment ID bits of the target address are overwritten
Address Sandboxing • Requires five dedicated registers • One holds the segment mask • One holds the code segment identifier • One holds the data segment identifier • One holds the sandboxed code address • One holds the sandboxed data address
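For comparison with segment matching, here is a minimal sketch of the sandboxing step: mask off the upper bits, then OR in the segment identifier. The constants, the dictionary used as memory, and the function names are assumptions for illustration only; the real technique does this with two inserted instructions on dedicated registers.

```python
SEGMENT_SHIFT = 24                         # hypothetical offset width
SEGMENT_MASK = (1 << SEGMENT_SHIFT) - 1    # keeps only the offset bits
DATA_SEGMENT_BITS = 0x12 << SEGMENT_SHIFT  # hypothetical segment ID, pre-shifted


def sandbox(target_address):
    """Force `target_address` into the data segment without checking it."""
    return (target_address & SEGMENT_MASK) | DATA_SEGMENT_BITS


def sandboxed_store(memory, target_address, value):
    """Store through the sandboxed address; illegal targets are redirected."""
    memory[sandbox(target_address)] = value
```

A legal address such as `0x12000010` is unchanged, while `0x13000010` is silently rewritten to `0x12000010`. No error is raised, which matches the slide's point that the source of the fault is lost.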
Both Techniques Require Dedicated Registers • Segment Matching: 4 dedicated registers • Address Sandboxing: 5 dedicated registers • What happens if all registers are already allocated by the compiler?
Trust/Performance Tradeoff • Only distrusted modules incur performance penalty • Trusted modules can run at full speed • We have covered write and jump, but what about load? • Security can be ramped up to prevent distrusted modules from reading data outside their fault domain • Increases execution time overhead (by quite a bit)!
Resource Protection • Fault domains share the same virtual address space • Problem: if a fault domain makes system calls directly, it can close or delete files needed by other code in the address space • Could cause a crash • Potential solution: modify the OS to know about fault domains • Not portable • Their solution: resource arbitration
Resource Arbitration • Require distrusted modules to access resources through cross-fault-domain RPC • Reserve a fault domain to hold trusted arbitration code • Arbiter decides whether system calls requested by other fault domains are safe • System calls in the object code of distrusted modules are transformed into RPC calls to the arbiter • Trusted modules make system calls as normal and share a fault domain with the arbiter
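The arbitration idea can be illustrated with a toy arbiter. The slides do not say what policy the arbiter applies, so the ownership rule below (a domain may close only handles it opened itself) is purely an assumption, as are the class and method names. No real system calls are made.

```python
class Arbiter:
    """Toy stand-in for the trusted arbitration code (hypothetical policy)."""

    def __init__(self):
        self._owners = {}       # handle -> (owning fault domain, path)
        self._next_handle = 0

    def open(self, domain_id, path):
        """Record that `domain_id` opened `path` and return a handle."""
        handle = self._next_handle
        self._next_handle += 1
        self._owners[handle] = (domain_id, path)
        return handle

    def close(self, domain_id, handle):
        """Allow the close only if the requesting domain owns the handle."""
        owner = self._owners.get(handle)
        if owner is None or owner[0] != domain_id:
            return False        # request refused; the resource is untouched
        del self._owners[handle]
        return True
```

With this rule, domain 2 cannot close a file that domain 1 opened, which is the kind of interference the Resource Protection slide warns about.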
How Do Modules Communicate? Cross-Fault-Domain RPC • Essential, since the whole point of fault domains is faster communication between modules than conventional IPC • Trusted stubs let code in a fault domain call outside its domain • Stubs run unprotected, outside both the caller and callee domains • Stubs copy (marshal) cross-domain arguments and manage machine state • Because the stubs are trusted, caller and callee can communicate via a shared buffer • This resembles LRPC, as only a single shared copy of the data is necessary • Stubs are created manually for now
Cross-fault-domain RPC • Jump Table • Allows the untrusted module to call into a stub outside its fault domain • Each entry in the jump table is a legal entry point to a stub outside the untrusted fault domain • Is read only to untrusted module • Is written to by trusted modules to set the entry point addresses
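As a loose analogy for the jump table, the sketch below uses a read-only mapping of stub entry points. In the actual scheme the protection comes from where the table lives in memory, not from a Python type, and the names `make_jump_table`, `cross_domain_call`, and `trusted_length_stub` are invented for this example.

```python
from types import MappingProxyType


def trusted_length_stub(data):
    """Hypothetical trusted stub: copies its argument, then does the work."""
    copied = bytes(data)   # copy so caller and callee share no mutable state
    return len(copied)


def make_jump_table(stubs):
    """Trusted side: build a table the untrusted module cannot modify."""
    return MappingProxyType(dict(stubs))


def cross_domain_call(jump_table, entry, *args):
    """Untrusted side: may only leave its domain through a table entry."""
    stub = jump_table[entry]   # KeyError if `entry` is not a legal entry point
    return stub(*args)
```

For instance, after `table = make_jump_table({0: trusted_length_stub})`, the call `cross_domain_call(table, 0, b"abc")` returns 3, and an attempt to assign `table[1] = ...` fails with `TypeError`.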
Performance Testing • Prototype running on DEC-MIPS and DEC-ALPHA • Considered: • How much overhead is incurred by software encapsulation? • How fast is cross-fault-domain RPC? • What is the performance impact of software-enforced fault isolation on an application?
RPC and Fault Isolation Costs • Table: Fault Isolation Overhead in POSTGRES • Table: Fault Domain RPC Cost
Results Analysis • Savings can be expressed as a function of four quantities • Percentage of time spent in distrusted code (td) • Percentage of time spent crossing fault domains (tc) • Overhead of encapsulation (h) • Ratio (r) of fault domain crossing time to the crossing time of competing hardware-based RPC
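The formula itself was an image and is missing from this transcript. Assuming it is the one from the paper's analysis, savings = (1 − r)·tc − h·td, it can be evaluated as below. The function name and the sample inputs are illustrative assumptions, not measurements from the paper.

```python
def savings(t_c, t_d, h, r):
    """Fraction of execution time saved versus hardware-based isolation.

    Assumed form: (1 - r) * t_c - h * t_d, where
      t_c: fraction of time spent crossing fault domains
      t_d: fraction of time spent in distrusted code
      h:   encapsulation overhead (for example 0.043 for 4.3%)
      r:   software crossing time / hardware RPC crossing time
    """
    return (1 - r) * t_c - h * t_d


# Hypothetical workload: 30% of time crossing domains, 50% in distrusted
# code, 4.3% overhead, and crossings ten times cheaper than hardware RPC.
example = savings(t_c=0.30, t_d=0.50, h=0.043, r=0.1)
```

With these made-up inputs the result is positive (roughly 0.25), so software isolation wins. The sign flips when little time is spent crossing domains or when r approaches 1.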
Conclusion • Results are impressive at first glance • They suggest that software-based fault isolation is the way to go in many cases where its crossing time is sufficiently faster than standard RPC • However, security, security, security! • When read protection is also desired, overhead rises sharply • from 4.3% on average to 21.8%! • Errors under address sandboxing are difficult to track down • A bad address is silently turned into a garbage address inside the fault domain • Stubs are manually generated • Requirement to dedicate 4 or 5 registers could be problematic • Solution is geared towards RISC architectures • Authors mention that CISC systems like 8086 would suffer performance penalties due to dedicated register requirements
Thanks for your attention! • Diagrams on Monolithic/Microkernel from Wikipedia • Photos of Linus Torvalds and Andrew Tanenbaum from Wikipedia • Segment Matching and Address Sandboxing figures from Tony Bock’s presentation on the same paper (Winter 2006)
Quick Recap: Fault Isolation in Cooperating Modules: What Is the Problem? • Existing schemes place each module in its own address space • This isolates faults • Major context switch overhead for tightly coupled modules
Quick Recap: A Solution in Two Parts • Load code and data for a distrusted module into its own fault domain • Modify the object code of this module to prevent writing or jumping to addresses outside the fault domain • Portable and language-agnostic solution • Cost is a slight increase in execution time for distrusted modules • Yields a significant speedup in communication between fault domains and hence in overall performance