Peck: Transparent Distributed Backup Using Chirp
Graduate Operating Systems, Fall 2005
Matthew Van Antwerp
December 15, 2005
Outline
• Existing Methods
• Strengths and Weaknesses
• Chirp Overview
• Peck: Storage and Retrieval
• Mapfiles
• Conclusion
Existing Backup Methods
• Dedicated Backup Server
• Portable Media (CD, DVD, etc.)
• Freenet
• USB Thumb Drives
• Chirp
Chirp: Distributed Storage Pool at ND
• Composed mostly of department and lab computers (see the catalog list).
• Each system (about 200 in total) offers up its spare hard drive space.
• Can be accessed through the libchirp API, the command line tools, or Parrot.
• Peck sits on top of Chirp.
Peck Structure
[Diagram: Peck sits on top of libchirp, which connects to multiple chirp servers.]
Peck Function
• Takes a list of files as input.
• Attempts to upload and download a test file on each server to learn its permissions.
• Locates enough usable servers for the upload.
• Writes the filename and server name to the mapfile for each uploaded file.
• Uploads a copy of the mapfile to multiple servers.
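The steps above can be sketched in Python. This is a minimal in-memory illustration, not Peck's actual code: `FakeServer`, `probe`, `back_up`, the round-robin placement, the `peck.mapfile` name, and the tab-separated mapfile layout are all assumptions made for the example, and no real libchirp calls are used.

```python
from dataclasses import dataclass, field


@dataclass
class FakeServer:
    """Hypothetical stand-in for a Chirp server (not the libchirp API)."""
    name: str
    writable: bool = True
    files: dict = field(default_factory=dict)

    def put(self, path, data):
        if not self.writable:
            raise PermissionError(self.name)
        self.files[path] = data

    def get(self, path):
        return self.files[path]

    def delete(self, path):
        self.files.pop(path, None)


def probe(server, test_path=".peck_probe"):
    """Upload and download a small test file to learn whether the server is usable."""
    try:
        server.put(test_path, b"probe")
        ok = server.get(test_path) == b"probe"
        server.delete(test_path)
        return ok
    except (PermissionError, KeyError):
        return False


def back_up(files, servers, mapfile_copies=2):
    """Upload each file to a usable server and record the placement in a mapfile."""
    usable = [s for s in servers if probe(s)]
    if not usable:
        raise RuntimeError("no usable servers")
    entries = []
    for i, (name, data) in enumerate(sorted(files.items())):
        target = usable[i % len(usable)]  # round-robin placement (assumption)
        target.put(name, data)
        entries.append(f"{name}\t{target.name}")
    mapfile = "\n".join(entries).encode()
    # Keep several copies of the mapfile so that one lost server is not fatal.
    for s in usable[:mapfile_copies]:
        s.put("peck.mapfile", mapfile)
    return mapfile
```

Probing every server before each run is the simplest way to learn permissions, but it also means the cost of a run grows with the number of servers in the pool.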
Peck Mapfile
• Hypothetical: your hard drive crashes and you lose all your data (yes, it will happen to you one day).
• Luckily, you have been backing up your files via Peck.
• Peck scours the servers for your mapfile (relatively slow, since it starts with no knowledge of the Chirp servers).
• When it finds a copy, it retrieves the mapfile and opens it.
• One by one, Peck retrieves the files listed in the mapfile from the relevant servers.
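The recovery path described above might look roughly like the following sketch. It assumes, purely for illustration, that each server is modeled as a dictionary of path to bytes, that the mapfile is named `peck.mapfile`, and that each mapfile line has the form `filename<TAB>server`; none of these details come from the slides.

```python
def find_mapfile(servers, mapfile_name="peck.mapfile"):
    """Scan servers one by one until a copy of the mapfile turns up."""
    for _name, store in servers.items():
        if mapfile_name in store:
            return store[mapfile_name]
    return None


def restore(servers, mapfile_name="peck.mapfile"):
    """Fetch every file listed in the mapfile from the server recorded for it."""
    raw = find_mapfile(servers, mapfile_name)
    if raw is None:
        raise FileNotFoundError("no mapfile found on any server")
    restored = {}
    for line in raw.decode().splitlines():
        filename, server_name = line.split("\t")
        restored[filename] = servers[server_name][filename]
    return restored
```

The linear scan in `find_mapfile` is the slow part: with no local state left after a crash, there is nothing to indicate which servers hold a copy.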
Peck Mapfile Versions
• How do we know we are retrieving the proper mapfile when we upload new files or retrieve files?
• Also, how can we keep from filling up the Chirp servers with redundant copies?
• Answer: through careful use of mapfiles.
• After the first run of Peck, there is a mapfile on the servers, which we retrieve on the next run.
• When attempting an upload, we check the mapfile for the given filename and update the entry appropriately.
• Before uploading the new mapfile, we delete all old mapfiles.
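One possible reading of this update procedure is sketched below. The dictionary-based servers, the `peck.mapfile` name, the tab-separated layout, and the choice to delete a stale file copy when its server changes are all assumptions for the example rather than documented Peck behavior.

```python
def parse_mapfile(raw):
    """Turn 'filename<TAB>server' lines into a dict."""
    return dict(line.split("\t") for line in raw.decode().splitlines() if line)


def update_backup(servers, filename, data, target, mapfile_name="peck.mapfile"):
    """Re-upload one file, reusing its mapfile entry so no redundant copy remains."""
    # Retrieve the mapfile left behind by the previous run, if any.
    old = next((s[mapfile_name] for s in servers.values() if mapfile_name in s), b"")
    mapping = parse_mapfile(old)
    previous = mapping.get(filename)
    if previous is not None and previous != target:
        servers[previous].pop(filename, None)  # drop the stale copy
    servers[target][filename] = data
    mapping[filename] = target
    new = "\n".join(f"{f}\t{s}" for f, s in sorted(mapping.items())).encode()
    # Delete every old mapfile before uploading the new one.
    for store in servers.values():
        store.pop(mapfile_name, None)
    # A single copy is stored here for brevity; the slides describe several.
    servers[target][mapfile_name] = new
    return mapping
```

Deleting the old mapfiles first guarantees that a later recovery scan cannot pick up an outdated version, at the cost of a short window in which no mapfile exists on any server.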
Analysis
• Checking permissions: slow (18.15 s on average).
• Upload: similar to FTP times.
• Mapfile retrieval after total system loss: slow.
• Download (after the mapfile is retrieved): similar to FTP times.
• Updating the mapfile and removing old mapfiles: fairly slow.
• Very little overhead besides the obvious bottlenecks.
Conclusion
• Peck achieves many of the strengths of other backup methods while avoiding their weaknesses.
• Easy to use: simply give the application a list of files to upload (easily combined with file age scripts). An ideal cron job.
• Cheap (free), fairly efficient (although not highly scalable), easy to set up and maintain, and transparent.
• All information necessary for retrieval is stored on the Chirp servers.
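As an illustration of the "file age script" idea mentioned above, the following sketch builds a list of recently modified files that could then be handed to a backup tool. The function name `recently_modified` and its parameters are hypothetical, and the slides do not specify how Peck accepts its file list.

```python
import os
import time


def recently_modified(root, max_age_days=1.0, now=None):
    """Return paths under root whose modification time is within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    recent = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            path = os.path.join(dirpath, fname)
            try:
                if os.path.getmtime(path) >= cutoff:
                    recent.append(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return sorted(recent)
```

Run once a day from cron with `max_age_days=1`, a script like this would select only the files changed since the previous run, which keeps each backup small.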