
SQCK: A Declarative File System Checker

This presentation argues that file system checkers must be reliable and presents SQCK, a checker whose checks and repairs are written in a declarative query language. Expressing them as queries takes far fewer lines of code than a traditional checker written in C and keeps each check simple, reliable, and flexible. The evaluation shows SQCK reproducing e2fsck's checks in 150 queries instead of 16 KLOC of C, with performance within 1.5x of e2fsck.


Presentation Transcript


  1. SQCK: A Declarative File System Checker • Haryadi S. Gunawi, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau • University of Wisconsin – Madison • OSDI ’08, December 9th, 2008

  2. Corrupt file systems • File systems • Store massive amounts of data • Must be reliable • Corrupted file system images • Due to hardware errors, file system bugs, etc. • Need to be repaired a.s.a.p.

  3. Who should repair? • Does journaling (write-ahead log) help? • No, only for crashes • Does the file system repair itself online? • No, not enough machinery • Fsck: the last line of defense • It’s a “must have” utility • XFS: “no need fsck ever”, but ended up deploying an fsck • Must be fully reliable

  4. But … fsck is complex • Fsck has a big task • Turn any corrupt image into a consistent image • E.g. check if a data block is shared by two inodes • How are they implemented? • Written in C → hard to reason about • Large and complex • Ext2 fsck: 150 checks in 16 KLOC • XFS fsck: 340 checks in 22 KLOC • Hundreds of cluttered if-check statements • Bottom line: fsck code is “untouchable”

  5. Two Questions • Are current checkers really reliable? • If not, how should we build robust checkers?

  6. e2fsck is unreliable • Analyze e2fsck (ext2 file system checker) • Findings: • Inconsistent repair: the file system becomes unreadable • Consistent but not “correct” repair: fsck deletes valid directory entries and loses a huge number of files

  7. SQCK • Lesson: Complexity is the enemy of reliability • Big task + bad design → complexity → unreliability • Need a higher-level approach for simplicity • SQCK (SQL-based Fsck) • Use a declarative query language to write checks • Put simply: write fewer lines of code • Evaluation • Simple and reliable: e2fsck in 150 queries (vs. 16 KLOC of C) • More: Great flexibility and reasonable performance

  8. Outline • Introduction • Analysis of e2fsck • SQCK Design • SQCK Evaluation • Conclusion

  9. Methodology • E2fsck task: cross-check all ext2 metadata • An indirect pointer should not point to the superblock • A subdir should only be accessible from one directory • Inject single corruption • Observe how e2fsck repairs a single corruption • Only corrupt on-disk pointers • Corrupt an indirect pointer to point to the superblock • Corrupt a directory entry to point to another directory • Usually, a corrupt pointer is simply cleared to zero

  10. Inconsistent (Out-of-order) Repair • [Figure: an inode’s indirect pointer (*ind) has been corrupted to point to the superblock] • Ideal fsck: 1. Check bad indirect pointer, 2. Check indirect content • e2fsck: runs the two checks in the opposite order, so superblock entries are treated as indirect content and zeroed before the bad pointer is detected

  11. Consistent but Incorrect Repair (1) • [Figure: directory trees over /, a1, b1, a2, b2, and lost+found (LF), comparing the repair made by an ideal fsck with the repair made by e2fsck] • Kidnapping problem! • E2fsck does not use all available information

  12. Result Summary • Four problems • Inconsistent • Information-incomplete • Policy-inconsistent • Insecure • E2fsck does not handle all corruptions • “Warning: Programming bug in e2fsck! Or some bonehead (you) is checking a mounted (live) filesystem.” • Not simple implementation bugs • Difficult to combine available information • Difficult to ensure correct ordering

  13. Outline • Introduction • Analysis • SQCK Design • SQCK Evaluation • Conclusion

  14. Fsck Properties • Hundreds of checks • Complex cross-checks • Must be ordered correctly • Taxonomy of checks in e2fsck: [Figure: instances of two structures, A { x, y } and B { m, n }, illustrating the kinds of cross-checks]

  15. A Declarative Approach • Lesson: Complexity is the enemy of reliability • SQCK • Use a declarative query language (e.g. SQL), why? • It is declarative: high-level intent is clear • Fit for cross-checking massive information • Goals achieved • Simple: e2fsck in 150 queries (vs. 16 KLOC of C) • Reliable: Each check/query is easy to understand • Flexible: Plug in/out different queries
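
The "Flexible: plug in/out different queries" point can be illustrated with a small sketch. It is not SQCK's actual code: it uses Python's sqlite3 module instead of MySQL, and the table `BlockPtrTable`, its columns, and both check names are hypothetical. Each check is a named query, so enabling or disabling a check is a matter of adding or dropping a name from a list.

```python
import sqlite3

# Hypothetical schema (not SQCK's): one row per (inode, data block) pointer.
SCHEMA = "CREATE TABLE BlockPtrTable (ino INTEGER, blk INTEGER)"

# Each check is just a named query; plugging a check in or out means
# adding or removing an entry (or a name in the enabled list).
CHECKS = {
    # A data block must not be shared by two inodes.
    "shared_block": """
        SELECT blk, COUNT(DISTINCT ino) AS owners
        FROM BlockPtrTable
        GROUP BY blk
        HAVING COUNT(DISTINCT ino) > 1
        ORDER BY blk
    """,
    # A block pointer must fall inside the file system.
    "block_out_of_range": """
        SELECT ino, blk
        FROM BlockPtrTable
        WHERE blk < :first_block OR blk >= :blocks_count
        ORDER BY ino, blk
    """,
}

def run_checks(db, enabled, params):
    """Run the enabled checks and return {check name: offending rows}."""
    return {name: db.execute(CHECKS[name], params).fetchall()
            for name in enabled}

db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
db.executemany("INSERT INTO BlockPtrTable VALUES (?, ?)",
               [(11, 100), (11, 101), (12, 101), (13, 9000)])

params = {"first_block": 1, "blocks_count": 8192}
report = run_checks(db, ["shared_block", "block_out_of_range"], params)
for name, rows in report.items():
    print(name, rows)
```

In this toy data, block 101 is claimed by inodes 11 and 12, and inode 13 points past the end of the file system, so each check reports one row.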

  16. Using SQCK • Take a file system image • Load metadata to db tables • Temporary tables • Ex: InodeTable, GroupDescTable, DirEntryTable • Run checks and repairs (in the form of queries) • Flush any modification, and delete tables • [Figure: pipeline connecting the file system image, scanner, loader, database tables, checks + repairs, and flush]
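
The four steps on this slide (load, check, repair, flush) can be sketched end to end. This is a minimal sketch under stated assumptions, not SQCK itself: the "image" is a plain Python list standing in for on-disk group descriptors, SQLite replaces MySQL, and the column names (`gdNum`, `startBlock`, `endBlock`, `dirty`) are chosen for illustration.

```python
import sqlite3

def load(db, image):
    """Scanner + loader: copy metadata into a temporary table."""
    db.execute("""CREATE TEMP TABLE GroupDescTable (
                      gdNum INTEGER PRIMARY KEY,
                      startBlock INTEGER,   -- first block of the group
                      endBlock INTEGER,     -- last block of the group (inclusive)
                      blockBitmap INTEGER,
                      dirty INTEGER DEFAULT 0)""")
    db.executemany(
        "INSERT INTO GroupDescTable (gdNum, startBlock, endBlock, blockBitmap) "
        "VALUES (:gdNum, :startBlock, :endBlock, :blockBitmap)", image)

def check_and_repair(db):
    """One check + repair pair, expressed as a single update query:
    a block bitmap located outside its group is cleared to zero."""
    db.execute("""UPDATE GroupDescTable
                  SET blockBitmap = 0, dirty = 1
                  WHERE blockBitmap NOT BETWEEN startBlock AND endBlock""")

def flush(db, image):
    """Write only the modified rows back, then delete the table."""
    rows = db.execute("SELECT gdNum, blockBitmap FROM GroupDescTable "
                      "WHERE dirty = 1").fetchall()
    for gd_num, bitmap in rows:
        image[gd_num]["blockBitmap"] = bitmap  # list position == gdNum here
    db.execute("DROP TABLE GroupDescTable")

# Toy image: group 1's bitmap pointer (5) lies outside blocks 8193..16384.
image = [
    {"gdNum": 0, "startBlock": 1, "endBlock": 8192, "blockBitmap": 3},
    {"gdNum": 1, "startBlock": 8193, "endBlock": 16384, "blockBitmap": 5},
]

db = sqlite3.connect(":memory:")
load(db, image)
check_and_repair(db)
flush(db, image)
print(image)
```

Only rows marked dirty are written back, which mirrors the "flush any modification" step; untouched metadata never leaves the database.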

  17. Declarative check (example 1) • Cross-checking a single instance of a structure • “Find block bitmap that is not located within its block group”

      e2fsck (C):

          first_block = sb->s_first_data_block;
          last_block = first_block + blocks_per_group;
          for (i = 0, gd = fs->group_desc; i < fs->group_desc_count; i++, gd++) {
              /* The last group may be shorter than the others. */
              if (i == fs->group_desc_count - 1)
                  last_block = sb->s_blocks_count;
              if ((gd->bg_block_bitmap < first_block) ||
                  (gd->bg_block_bitmap >= last_block)) {
                  px.blk = gd->bg_block_bitmap;
                  if (fix_problem(BB_NOT_GROUP, ...))
                      gd->bg_block_bitmap = 0;
              }
              ...
          }

      SQCK (SQL):

          SELECT *
          FROM GroupDescTable G
          WHERE G.blockBitmap NOT BETWEEN G.start AND G.end
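
To see that the loop and the query express the same check, the sketch below runs both on the same toy data. It rests on assumptions: Python and SQLite stand in for C and MySQL, the columns are named `startBlock` and `endBlock` (inclusive bounds) rather than `start` and `end`, and the special case for a shorter last group is left out.

```python
import sqlite3

BLOCKS_PER_GROUP = 8192
FIRST_DATA_BLOCK = 1

# (group number, block bitmap location); groups 1 and 3 are corrupted.
bitmaps = [(0, 3), (1, 3), (2, 16387), (3, 99999)]

def imperative_check(bitmaps):
    """Loop-based check, mirroring the structure of the C code."""
    bad = []
    for group, bitmap in bitmaps:
        first = FIRST_DATA_BLOCK + group * BLOCKS_PER_GROUP
        last = first + BLOCKS_PER_GROUP  # exclusive upper bound
        if bitmap < first or bitmap >= last:
            bad.append(group)
    return bad

def declarative_check(bitmaps):
    """The same check as one query over a group descriptor table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE GroupDescTable "
               "(gdNum INTEGER, startBlock INTEGER, "
               "endBlock INTEGER, blockBitmap INTEGER)")
    db.executemany(
        "INSERT INTO GroupDescTable VALUES (?, ?, ?, ?)",
        [(g,
          FIRST_DATA_BLOCK + g * BLOCKS_PER_GROUP,
          FIRST_DATA_BLOCK + (g + 1) * BLOCKS_PER_GROUP - 1,  # inclusive
          b)
         for g, b in bitmaps])
    rows = db.execute("""SELECT G.gdNum
                         FROM GroupDescTable G
                         WHERE G.blockBitmap NOT BETWEEN G.startBlock AND G.endBlock
                         ORDER BY G.gdNum""").fetchall()
    return [row[0] for row in rows]

print(imperative_check(bitmaps), declarative_check(bitmaps))
```

Note the boundary convention: `BETWEEN` is inclusive on both ends, so the table stores the last block of each group, whereas the loop compares against an exclusive upper bound.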

  18. Declarative check (example 2) • Cross-checking multiple instances of the same structure • “Find false parents (i.e. directory entries that point to a subdirectory that already belongs to another directory)” • Must read all directory entries in dir data blocks • Wrong implementation in e2fsck (the kidnapping problem)

  19. Declarative check (example 2)

      e2fsck (C):

          if ((dot_state > 1) &&
              (ext2fs_test_inode_bitmap(ctx->inode_dir_map, dirent->inode))) {
              // e2fsck_get_dir_info is 20 lines long
              subdir = e2fsck_get_dir_info(dirent->inode);
              ...
              if (subdir->parent) {
                  // The subdirectory already has a parent: clear this entry.
                  if (fix_problem(LINK_DIR, ..)) {
                      dirent->inode = 0;
                      goto next;
                  }
              } else {
                  // The first directory seen becomes the parent.
                  subdir->parent = ino;
              }
          }

  20. Declarative check (example 2)

      SQCK (SQL), with P = true parent, C = child, F = false parent:

          SELECT F.*   -- returns the false parent(s)
          FROM DirEntryTable P, DirEntryTable C, DirEntryTable F
          WHERE
              -- P says C is its child
              P.entry_num >= 3 AND P.entry_ino = C.ino AND
              -- and C says P is its parent
              C.entry_num = 2 AND C.entry_ino = P.ino AND
              -- F also says C is its child
              F.entry_num >= 3 AND F.entry_ino = C.ino AND
              F.ino <> P.ino
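
The false-parent query can be tried on a toy directory tree. The sketch below is an illustration under assumptions, not SQCK's implementation: it uses SQLite through Python, the inode numbers are made up, and `DirEntryTable` is reduced to three columns (`ino` is the directory holding the entry, `entry_num` its position, `entry_ino` the inode it points to, with entries 1 and 2 being "." and "..").

```python
import sqlite3

FALSE_PARENTS = """
    SELECT F.ino, F.entry_num, F.entry_ino
    FROM DirEntryTable P, DirEntryTable C, DirEntryTable F
    WHERE P.entry_num >= 3 AND P.entry_ino = C.ino   -- P says C is its child
      AND C.entry_num = 2  AND C.entry_ino = P.ino   -- C's ".." agrees with P
      AND F.entry_num >= 3 AND F.entry_ino = C.ino   -- F also claims C
      AND F.ino <> P.ino
    ORDER BY F.ino, F.entry_num
"""

# Toy tree: / (2) contains a1 (11) and b1 (12); a1 contains a2 (13);
# b1 should contain b2 (14), but its entry is corrupted to point to a2.
entries = [
    (2, 1, 2), (2, 2, 2), (2, 3, 11), (2, 4, 12),   # /
    (11, 1, 11), (11, 2, 2), (11, 3, 13),           # a1 -> a2
    (12, 1, 12), (12, 2, 2), (12, 3, 13),           # b1 -> a2 (corrupt)
    (13, 1, 13), (13, 2, 11),                       # a2, ".." is a1
    (14, 1, 14), (14, 2, 12),                       # b2, ".." is b1
]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE DirEntryTable "
           "(ino INTEGER, entry_num INTEGER, entry_ino INTEGER)")
db.executemany("INSERT INTO DirEntryTable VALUES (?, ?, ?)", entries)

false_parents = db.execute(FALSE_PARENTS).fetchall()
print(false_parents)
```

Because the query consults a2's own ".." entry, it flags b1's entry rather than whichever parent happens to be scanned second, which is the extra information the kidnapping problem ignores.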

  21. Declarative Repairs • Running declarative checks is only part of the problem • Must also perform the declarative repairs • A repair = An update query • Some repairs simply update a few fields • ... SET T.field = newValue, T.dirty = 1 • A repair = A series of queries • Ex: Reconnect an orphan directory to the lost+found directory • Combine a series of queries with C code • All repairs are written in SQL • C code is only used for connecting them
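
The "series of queries" repair can be sketched for the orphan-directory example. Several things here are assumptions rather than SQCK's code: Python plays the role of the C glue, SQLite replaces MySQL, the inode numbers (including lost+found as inode 11) describe a toy image, and an orphan is defined simply as a non-root directory that no directory entry points to.

```python
import sqlite3

ROOT_INO, LOST_FOUND_INO = 2, 11  # assumed inode numbers for this toy image

FIND_ORPHANS = """
    SELECT D.ino
    FROM DirEntryTable D
    WHERE D.entry_num = 1 AND D.ino <> ?
      AND NOT EXISTS (SELECT 1 FROM DirEntryTable P
                      WHERE P.entry_num >= 3 AND P.entry_ino = D.ino)
    ORDER BY D.ino
"""

def find_orphans(db):
    """Check: directories that no parent entry points to."""
    return [row[0] for row in db.execute(FIND_ORPHANS, (ROOT_INO,))]

def reconnect(db, orphan):
    """Repair: three queries, connected by a little glue code."""
    # 1. Pick the next free entry slot in lost+found.
    (slot,) = db.execute("SELECT MAX(entry_num) + 1 FROM DirEntryTable "
                         "WHERE ino = ?", (LOST_FOUND_INO,)).fetchone()
    # 2. Add an entry in lost+found that points to the orphan.
    db.execute("INSERT INTO DirEntryTable VALUES (?, ?, ?, 1)",
               (LOST_FOUND_INO, slot, orphan))
    # 3. Point the orphan's ".." at lost+found and mark the row dirty.
    db.execute("UPDATE DirEntryTable SET entry_ino = ?, dirty = 1 "
               "WHERE ino = ? AND entry_num = 2", (LOST_FOUND_INO, orphan))

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE DirEntryTable (ino INTEGER, entry_num INTEGER, "
           "entry_ino INTEGER, dirty INTEGER DEFAULT 0)")
db.executemany(
    "INSERT INTO DirEntryTable (ino, entry_num, entry_ino) VALUES (?, ?, ?)",
    [(2, 1, 2), (2, 2, 2), (2, 3, 11), (2, 4, 12),  # / -> lost+found, a
     (11, 1, 11), (11, 2, 2),                       # lost+found
     (12, 1, 12), (12, 2, 2),                       # a (lost its entry for 13)
     (13, 1, 13), (13, 2, 12)])                     # orphan directory

orphans = find_orphans(db)
for ino in orphans:
    reconnect(db, ino)

dirty = db.execute("SELECT ino, entry_num, entry_ino FROM DirEntryTable "
                   "WHERE dirty = 1 ORDER BY ino, entry_num").fetchall()
print(orphans, dirty)
```

Every modification sets `dirty = 1`, following the `SET T.field = newValue, T.dirty = 1` pattern on the slide, so a later flush step knows which rows to write back.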

  22. Outline • Introduction • Analysis • SQCK Design • SQCK Evaluation • Conclusion

  23. SQCK Evaluation • Complexity • 150 queries in 1100 lines of SQL statements • (compared to 16,000 lines of C in e2fsck) • Reliability • Pass hundreds of corruption scenarios • Flexibility • Add new checks/repairs • Enable different versions of e2fsck • Performance • Introduce some optimizations

  24. SQCK vs. e2fsck • Reasonable performance • First generation of SQCK (with MySQL) • Within 1.5x of e2fsck • Future optimizations • Hierarchical checks • Concurrent queries

  25. Conclusion • Complexity is the enemy of reliability • Recovery code is complex • SQCK: Build recovery tools with a higher-level approach

  26. Thank you! Questions? • ADvanced Systems Laboratory • www.cs.wisc.edu/adsl
