Storage Technology Case Study: NetApp



  1. Storage Technology Case Study: NetApp
     Tyler Bletsch, 16 February 2009

  2. Physical hardware

  3. Protocols
     • NAS: Network-attached storage (files)
       • CIFS: Mostly Windows, works in *NIX
       • NFS: Mostly *NIX, works in Windows
       • HTTP: Read-only, supported everywhere
     • SAN: Storage area network (blocks)
       • iSCSI: SCSI over IP
       • Fibre Channel (FCP): SCSI over a dedicated optical fabric

  4. What does the filer do?
     [Diagram: FCP and iSCSI carry block I/O; CIFS, NFS, and HTTP carry file I/O.
     Both paths lead down to the disks through layers marked "?", which are
     filled in on the next slide.]

  5. What does the filer do?
     [Diagram: the filer's layering. FCP and iSCSI block I/O is served from LUNs;
     CIFS, NFS, and HTTP file I/O is served from the file system. Both live in
     volumes, which are built on aggregates, which are built on disks.]

  6. Disks
     • Disks: dumb arrays of 512-byte sectors
       • Prone to failure
       • Small (1500 GB is small)
       • Slow
         • High seek time
         • Comparatively low throughput
     • Block device: only has two operations
       • Read block
       • Write block
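To make that interface concrete, here is a minimal Python sketch of a block device exposing exactly those two operations (the class and its dimensions are illustrative, not from the slides):

```python
# Minimal block-device sketch: numbered fixed-size sectors and exactly two
# operations. The 512-byte sector matches the slide; the rest is made up.
SECTOR_SIZE = 512

class BlockDevice:
    def __init__(self, num_sectors):
        self.sectors = [bytes(SECTOR_SIZE)] * num_sectors  # all zeroes

    def read_block(self, n):
        return self.sectors[n]

    def write_block(self, n, data):
        assert len(data) == SECTOR_SIZE  # whole sectors only
        self.sectors[n] = data

disk = BlockDevice(1024)
disk.write_block(7, b"x" * SECTOR_SIZE)
print(disk.read_block(7)[:4])  # b'xxxx'
```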

  7. Aggregates (1)
     [Diagram: a RAID-DP group. Each stripe holds data blocks plus two parity
     blocks, e.g. A B C P1(A,B,C) P2(A,B,C), then D E F with P1(D,E,F) and
     P2(D,E,F), and so on.]
     • How to combine disks in a fault-tolerant way?
     • RAID (Redundant Array of Inexpensive Disks)
     • More specifically: "RAID-DP" ("Dual Parity")
       • An implementation of RAID 6
       • For each RAID group with N disks, 2 disks store parity of the other N-2
       • Any two disks can fail without data loss
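As a rough illustration of how parity buys fault tolerance, the sketch below computes a single XOR row parity and rebuilds a lost block from it. This is a simplification: RAID-DP's second parity disk holds a diagonal parity, which is what allows two simultaneous failures, and that construction is not shown here.

```python
# Simplified parity sketch (single row parity only; RAID-DP adds a second,
# diagonal parity to survive two disk failures).
def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

A, B, C = b"\x01" * 4, b"\x02" * 4, b"\x04" * 4
P1 = xor_blocks([A, B, C])           # parity block for the stripe

# The disk holding B dies: XOR the survivors with the parity to rebuild it.
rebuilt_B = xor_blocks([A, C, P1])
assert rebuilt_B == B
```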

  8. Aggregates (2)
     • Aggregate: a set of disks combined into one or more RAID groups
     • Fault-tolerant virtual block device built out of real block devices
     • Analogous to "md" devices in Linux, "RAID groups" in Windows

  9. Volumes
     • An aggregate is a block device; it only has two operations:
       • Read block
       • Write block
     • We want to make this useful by (a) virtually chopping up aggregates, and
       (b) adding the concept of directories, files, and LUNs
     • Answer: volumes
     • Analogous to "LVM" on Linux or "Logical Disk Manager" in Windows

  10. Traditional vs. Flexible Volumes
     • Traditional volumes: slice up the aggregate
       • Example: "Volume X is mapped to the first 30 GB of aggregate A."
       • An empty 30 GB volume uses 30 GB of real disk.
     • Flexible volumes: lazily allocate as needed
       • Example: "Volume X gets blocks from aggregate A as they're needed,
         up to a maximum of 30 GB."
       • An empty 30 GB volume uses almost no space.
     • Thin provisioning: create volumes with more total space than you have
       real disk capacity
       • Why?
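One plausible answer to that "Why?": most volumes never fill up, so promising more logical space than you physically own defers buying disks until data actually lands. A minimal Python sketch of the lazy-allocation idea (class names and sizes are illustrative, not ONTAP's):

```python
# Thin-provisioning sketch: a flexible volume records a size limit but only
# claims aggregate blocks when they are actually written.
class Aggregate:
    def __init__(self, total_blocks):
        self.free = total_blocks

    def allocate(self):
        assert self.free > 0, "aggregate out of real space"
        self.free -= 1

class FlexVolume:
    def __init__(self, aggr, max_blocks):
        self.aggr, self.max_blocks, self.blocks = aggr, max_blocks, {}

    def write_block(self, n, data):
        assert n < self.max_blocks      # logical limit
        if n not in self.blocks:        # lazily claim real space
            self.aggr.allocate()
        self.blocks[n] = data

aggr = Aggregate(total_blocks=100)
vols = [FlexVolume(aggr, max_blocks=60) for _ in range(3)]  # 180 "promised"
print(aggr.free)                 # 100: empty volumes consume nothing
vols[0].write_block(0, b"data")
print(aggr.free)                 # 99: space is used only as data lands
```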

  11. From volumes to file systems
     • Traditional systems: the volume is a block device, and a file system
       gets written on top
       • Strict layering
     • NetApp/ZFS: the concepts of volume and file system are combined
       • So there's no additional "formatting" step
     • We can now do NAS: just export volumes
       • Linux NFS: mount -t nfs myfiler:/vol/myvolume /mnt/stuff
       • Windows CIFS: \\myfiler\myvolume

  12. SAN on NetApp
     • LUN: "Logical Unit Number" (worst name ever)
       • A virtual block device mounted on a client via a storage area network (SAN)
       • Fibre Channel or iSCSI (SCSI over IP)
     • Why a SAN?
       • You can boot from it (with the right hardware)
       • Very fast for block-oriented apps
         • E.g. databases, video-on-demand systems
       • Historical: Fibre Channel used to be faster and more reliable than IP
     • Volumes can also store LUNs
     • NetApp feature: LUNs can be thin provisioned like volumes
       • E.g. capacity = 30 GB, real disk usage = 10 GB
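One way to picture a LUN living inside a volume: it is essentially one big file whose bytes the filer serves to the client as raw blocks. A toy Python sketch of that view (entirely illustrative; this is not NetApp's actual data path):

```python
import io

# Toy LUN: a byte range inside a volume, addressed in fixed-size blocks.
class LUN:
    BLOCK = 512

    def __init__(self, backing_file):
        self.f = backing_file            # stand-in for a file in a volume

    def read_block(self, n):
        self.f.seek(n * self.BLOCK)
        return self.f.read(self.BLOCK)

    def write_block(self, n, data):
        self.f.seek(n * self.BLOCK)
        self.f.write(data)

lun = LUN(io.BytesIO(bytes(8 * 512)))    # 8-block toy LUN
lun.write_block(3, b"q" * 512)
print(lun.read_block(3)[:4])             # b'qqqq'
```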

  13. The complete picture
     • Looking back:
       • A LUN is a virtual block device …
       • in a virtual file system (volume) …
       • in a virtual block device (aggregate) …
       • spread over some real block devices (disks).
     [Diagram repeated from slide 5: FCP/iSCSI block I/O to LUNs, CIFS/NFS/HTTP
     file I/O to the file system, volumes on aggregates on disks.]

  14. Block indirection: NetApp's big secret
     • How does thin provisioning work?
     • Block indirection layer
     [Diagram: in the traditional model, file blocks 0-2 of myfile.txt map
     directly to consecutive volume blocks 6000-6002. In the NetApp WAFL model,
     the file system keeps a block pointer table ("see 6056", "see 6059",
     "see 6002"), so file blocks can point anywhere in the volume.]
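A minimal sketch of the indirection idea in Python (class names are illustrative; real WAFL metadata is far richer than this):

```python
# Block-indirection sketch: file blocks reach data through a pointer table,
# and writes never overwrite in place ("write anywhere").
class Volume:
    def __init__(self, first_free=6056):
        self.blocks = {}                 # physical block number -> data
        self.next_pbn = first_free

    def allocate(self, data):
        pbn, self.next_pbn = self.next_pbn, self.next_pbn + 1
        self.blocks[pbn] = data
        return pbn

class File:
    def __init__(self, volume):
        self.vol = volume
        self.table = {}                  # logical block -> physical block

    def write(self, lbn, data):
        self.table[lbn] = self.vol.allocate(data)  # repoint, don't overwrite

    def read(self, lbn):
        return self.vol.blocks[self.table[lbn]]

f = File(Volume())
f.write(0, b"v1")
f.write(0, b"v2")            # the old block still exists; the table moved on
print(f.read(0), f.table)    # b'v2' {0: 6057}
```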

  15. Block indirection benefits (1)
     • Free snapshots
       • Make a copy-on-write duplicate of the block table
       • Cheap in-place backup that lets you recover from many disasters
         without getting tapes involved
     • Example:
       • rm important.doc
       • cp ~/.snapshot/20080130.130852/important.doc .
     [Diagram: the snapshot's block table still points at blocks 6056, 6059,
     and 6102; after an overwrite of block 2, the real working copy points at
     6056, 6059, and a newly written block 6155.]
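Because of the pointer table, a snapshot really is just a frozen copy of the pointers. A minimal copy-on-write sketch reusing the block numbers from the slide (plain dicts stand in for the real structures):

```python
# Copy-on-write snapshot sketch: a snapshot is a frozen copy of the pointer
# table; overwrites allocate fresh blocks, so old data stays reachable.
blocks = {6056: b"A", 6059: b"B", 6102: b"C"}   # physical blocks
working = {0: 6056, 1: 6059, 2: 6102}           # live block pointer table

snapshot = dict(working)        # "free" snapshot: just copy the pointers

blocks[6155] = b"C-modified"    # an overwrite of logical block 2 goes to a
working[2] = 6155               # new physical block; 6102 is untouched

print(blocks[snapshot[2]])      # b'C': the snapshot still sees the old data
working[2] = snapshot[2]        # recovery is just copying a pointer back
```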

  16. Block indirection benefits (2)
     • Thin clones
       • Copy a snapshot, but make its block table writable
       • Idea: clone a LUN with an OS installed and tell hundreds of machines
         to boot from the clones
     [Diagram: the original and the clone each have their own block table.
     Block 0 (6056) stays shared; the original's modified block 1 points at
     6001, and the clone's modified block 2 points at 6155.]
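A clone works the same way as a snapshot, except the copied table accepts writes. A sketch of the divergence (illustrative dicts again); unmodified blocks stay shared, which is why hundreds of boot clones cost almost nothing:

```python
# Thin-clone sketch: copy a snapshot's pointer table, then let it diverge.
blocks = {6056: b"boot", 6059: b"os", 6102: b"data"}
golden = {0: 6056, 1: 6059, 2: 6102}   # snapshot of a master LUN

clone = dict(golden)                   # writable copy of the pointers

blocks[6155] = b"os-patched"           # the clone patches logical block 1;
clone[1] = 6155                        # only that one block is new storage

shared = [pbn for lbn, pbn in clone.items() if golden[lbn] == pbn]
print(shared)                          # [6056, 6102]: still shared
```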

  17. Block indirection benefits (3)
     • Data deduplication
       • Add a "data hash" field to the block table
       • When a block is written, hash it
       • If the hash is already in the block table, just point this block entry
         at the same underlying data
     [Diagram: one-million-zeroes.txt before and after dedup. Before: blocks
     0-2 point at 6056, 6059, and 6102. After: all three point at 6056.]
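The same table makes dedup cheap: keep a hash of each stored block and route identical writes to the existing copy. A minimal sketch (SHA-256 is my choice here; real systems also need collision verification and reference counting, omitted for brevity):

```python
import hashlib

blocks, by_hash, table = {}, {}, {}    # data, hash index, pointer table
next_pbn = 6056

def write(lbn, data):
    global next_pbn
    h = hashlib.sha256(data).digest()
    if h not in by_hash:               # first time we've seen this content
        by_hash[h] = next_pbn
        blocks[next_pbn] = data
        next_pbn += 1
    table[lbn] = by_hash[h]            # identical blocks share one copy

for i in range(3):
    write(i, b"\x00" * 4096)           # one-million-zeroes.txt, in miniature

print(table)                           # {0: 6056, 1: 6056, 2: 6056}
```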

  18. Demonstration
     • NetApp simulator
       • Runs as a Linux process, same as the real thing
     • Tasks
       • Check out the hardware
       • Make an aggregate
       • Make a volume
       • Mount the volume
       • Make data, snapshot, delete data, recover

  19. Demonstration
     • Even more tasks
       • Volume clone
       • Create LUN
       • Snapshot
       • LUN clone

  20. Additional topics
     • SnapMirror: continuous remote backup with constant Recovery Point
       Objective (RPO) monitoring
       • "You're mirroring filer1:/vol/stuff to filer2; filer2 is 31 seconds behind"
     • Cluster fail-over
       • Two filers connected with InfiniBand, each connected to the other's disks
       • One can go down and the other takes over automatically
       • Zero-downtime failure recovery, hardware upgrades, etc.
     • True clustering
       • Next-generation filer software "GX"
       • Combines tons of filers and shelves into a single namespace with a
         backend IP LAN
       • The next step in storage scalability
