
nfsv4 and linux



Presentation Transcript


  1. nfsv4 and linux • peter honeyman • linux scalability project • center for information technology integration • university of michigan, ann arbor

  2. open source reference implementation • sponsored by sun microsystems • part of citi’s linux scalability project • ietf reference implementation • 257-page spec • linux and openbsd • interoperates with solaris, java, network appliance, hummingbird, emc, ... • september 1 code drop for linux 2.2.14

  3. what’s new? • lots of state • compound rpc • extensible security added to rpc layer • delegation for files - client cache consistency • lease-based, non-blocking, byte-range locks • win32 share locks • mountd: gone • lockd, statd: gone

  4. nfsv4 state • state is new to nfs protocol • nfsv3: lockd manages state • compound rpc - server state • dos share locks - server and client state • delegation - server and client state • server maintains per-thread global state • client and server maintain file, lock, and lock owner state

  5. server state per global thread • compound operations often use the result of a previous operation as arguments • the nfs file handle is the coin of the realm • the current file handle acts like a current working directory • some operations (rename) need two file handles - hence the saved file handle
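
A minimal sketch, with invented names, of the per-compound state a server thread might carry: the current filehandle that putrootfh/putfh set and lookup advances, plus a saved filehandle for two-handle operations such as rename (savefh/restorefh). This is an illustration, not the CITI data structure.

    /* illustrative only: per-compound state carried by one server thread;
     * the real nfsd structures and field names differ */
    struct nfs_fh {
        unsigned int  len;
        unsigned char data[128];       /* nfsv4 filehandles are variable length */
    };

    struct compound_state {
        struct nfs_fh current_fh;      /* set by putrootfh/putfh, advanced by lookup */
        struct nfs_fh save_fh;         /* savefh/restorefh pair, e.g. for rename */
    };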

  6. compound rpc • hope is to reduce traffic • complex calling interface • partial results used • rpc/xdr layering • variable length: kmalloc buffer for args and recv • want to xdr args directly into rpc buffer • want to allow variable length receive buffer

  7. rpc/xdr layering • rpc layer does not interpret compound ops • replay cache: locking vs. regular • have to decode to decide which replay cache to use

  8. example: mount • compound rpc: putrootfh, lookup, getattr, getfh
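
As a rough illustration of the mount example above, the four operations could be carried in a single compound request; the op names follow the protocol, but the structures and the single lookup component below are assumptions made for this sketch.

    /* hypothetical encoding of the mount-time compound shown above */
    enum nfs4_op { OP_PUTROOTFH, OP_LOOKUP, OP_GETATTR, OP_GETFH };

    struct nfs4_op_desc {
        enum nfs4_op op;
        const char  *name;             /* path component, used only by lookup */
    };

    /* putrootfh sets the current filehandle to the pseudofs root, lookup walks
     * one component (a real mount issues one lookup per component), getattr and
     * getfh return the attributes and filehandle of the directory being mounted */
    static const struct nfs4_op_desc mount_compound[] = {
        { OP_PUTROOTFH, 0        },
        { OP_LOOKUP,    "export" },    /* "export" is a placeholder component */
        { OP_GETATTR,   0        },
        { OP_GETFH,     0        },
    };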

  9. nfsv4 mount • server pseudofs joins exported subtrees with a read only virtual file system • any client can mount into the pseudofs • users browse the pseudofs (via lookup)

  10. nfsv4 pseudofs • access into exported subtrees is based on the user’s credentials and permissions • the client /etc/fstab doesn’t change with the server’s export list • the server /etc/exports doesn’t need to maintain an ip-based access list

  11. mounting a pseudo file system • [diagram: a local fs tree and the corresponding pseudo fs tree as seen by an nfsv4 client presenting user creds; the legend distinguishes local fs directories, pseudo fs directories, and exported directories] • the server boots, parses /etc/exports, and creates the pseudo fs, mirroring the local fs up to the exported directories • the local fs exported directories are mounted on their pseudo fs counterparts • the user has read-only access to the pseudo fs and traverses it until encountering an exported directory • the user’s permissions in the negotiated security realm determine access to the exported directory • the client boots and mounts a directory of the pseudo fs with the AUTH_SYS security flavor • the first nfsv4 procedure that acts on the exported directory causes nfsd to return NFS4ERR_WRONGSEC, causing the client to call SECINFO and obtain the list of security flavors on the exported directory • before the first open, the client calls SETCLIENTID to negotiate a per-server unique client identifier
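
As a concrete, made-up example of the sequence above: with the two export lines below, the pseudo fs exposes only the path components leading to the exported directories, read-only, and the exports themselves are grafted onto their pseudo fs counterparts.

    # hypothetical /etc/exports on the server
    /export/home   *(rw)
    /export/proj   *(ro)

    # pseudo fs a browsing client sees (read-only virtual file system):
    #   /               pseudo fs root
    #   /export         pseudo fs directory mirroring the local fs
    #   /export/home    exported directory (access per negotiated security)
    #   /export/proj    exported directory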

  12. rpcsec_gss • mit krb5 gssrpc and sesame are open source, but neither is really rpcsec_gss • sun released their rpcsec_gss, a complete rewrite of onc • gss and sun onc are a tough match • both are transport independent • gss channel bindings / onc xprt • overloading of programs’ null_proc

  13. kernel rpcsec_gss • rpc layering had to be violated • gss implementations are not kernel safe • security service code not kernel safe (kerberos 5) • kernel security services implemented as rpc upcalls to a user-level daemon, gssd • but only some services - e.g. encryption - need to be in the kernel
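
One way to picture the split: the kernel never runs the gss context-establishment code; it hands the security token to gssd through an upcall and gets back a context handle and session key, and only the per-message work (integrity, encryption) stays in the kernel. The message layout below is invented for illustration and is not the actual kernel/gssd interface.

    /* purely illustrative upcall messages; names and fields are assumptions */
    struct gssd_upcall {
        unsigned int  uid;             /* user whose credentials are needed */
        unsigned int  token_len;
        unsigned char token[2048];     /* gss token processed in user space by gssd */
    };

    struct gssd_downcall {
        unsigned int  ctx_handle;      /* opaque handle the kernel uses afterwards */
        unsigned int  key_len;
        unsigned char session_key[32]; /* only per-message crypto stays in-kernel */
    };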

  14. rpcsec_gss: where are we now? • (mostly) complete user-level kerberos 5 implementation • linux kernel implementation with kerberos 5 • mutual authentication • session key setup • no encryption • gssd

  15. kerberos 5 security initialization • [diagram: numbered message flow among the kerberos 5 kdc, the user-level gssd daemons on client and server, and the in-kernel nfs client and nfsd] • steps 2,3: kerberos 5 over tcp/ip • steps 1,4,6,7: gssd rpc interface • steps 5,8: nfsv4 overloaded null procedure • steps 9,10: nfsv4 compound procedure

  16. locking • lease-based locks • no byte range callback mechanism • the server defines a lease for per-client lock state • the server can reclaim client state if the lease is not renewed • open sets lock state, including the lock owner (clientid, pid) • the server returns a lock stateid

  17. locking • stateid mutating operations are ordered (open, close, lock, locku, open_downgrade) • lock owner can request a byte range lock and then: • upgrade the initial lock • unlock a sub-range of the initial lock • server is not required to support sub-range lock semantics
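
The ordering of stateid-mutating operations can be pictured as a check against the sequence number kept per lock owner; a minimal sketch with invented names follows (the real code would also consult a replay cache on a repeated seqid).

    /* illustrative seqid check for open, close, lock, locku, open_downgrade */
    struct lock_owner {
        unsigned long long clientid;   /* (clientid, pid) identifies the lock owner */
        unsigned int       pid;
        unsigned int       seqid;      /* last sequence number seen from this owner */
    };

    /* returns 0 if the operation is in order, -1 if misordered or replayed */
    static int check_seqid(struct lock_owner *lo, unsigned int seqid)
    {
        if (seqid == lo->seqid + 1) {
            lo->seqid = seqid;         /* accept and advance */
            return 0;
        }
        return -1;
    }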

  18. server lock state • need to associate file, lock, lock owner, & lease • per lock owner: lock sequence number • per file state: in hash table • may move file state into struct file private area

  19. server lock state • lock owners in hash table • server doesn’t own the inode • lock state in linked list off file state • stateid: handle to server lock state • per client state: in hash table - lock lease
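
A compressed sketch of how the hash tables and lists in the last two slides might hang together; every name and field here is an assumption, and the hashing itself is omitted.

    /* illustrative layout of server lock state */
    struct nfs4_client {               /* hashed by clientid; carries the lock lease */
        unsigned long long  clientid;
        unsigned long       lease_expires;
        struct nfs4_client *hash_next;
    };

    struct nfs4_lock;                  /* byte-range lock state, below */

    struct nfs4_file {                 /* per-file state in a hash table; the server
                                        * doesn't own the inode */
        unsigned char       fh[128];   /* opaque nfsv4 filehandle */
        unsigned int        fh_len;
        struct nfs4_lock   *locks;     /* lock state in a linked list off the file state */
        struct nfs4_file   *hash_next;
    };

    struct nfs4_lock {                 /* the returned stateid is a handle to one of these */
        unsigned long long      stateid;
        unsigned long long      offset, length;
        struct nfs4_lockowner  *owner; /* hashed lock owner (clientid, pid) */
        struct nfs4_lock       *next;
    };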

  20. client lock state • lock owners: in hash table • per lock owner lock sequence number • use struct file private data area • client owns the inode, use private inode data area

  21. client lock state • use inode file_lock struct private data area for byte range lock state • (eventually) store same locking state as the server for delegated files • use the super block private data area to store per server state (returned clientid)
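
A small sketch of where the client-side state from the last two slides would live, with invented type names; the point is only that the vfs private data areas of the file_lock, inode, and super block carry the nfsv4 state.

    /* illustrative placement of client-side nfsv4 state in vfs private areas */
    struct nfs4_lock_info {            /* stored in the file_lock private data area */
        unsigned long long stateid;
        unsigned int       seqid;      /* per lock owner lock sequence number */
    };

    struct nfs4_server_info {          /* stored in the super block private data area */
        unsigned long long clientid;   /* returned by setclientid */
    };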

  22. delegation • intent is to reduce traffic • server decides to hand out delegation at open • if client accepts, client provides callback • many read delegations, or one write delegation

  23. delegation • when a client holds a delegation for a cached file, it handles: • all locking, share and byte range • future opens • a client can’t reclaim a delegation without a new open • no delegation for directories

  24. server delegation state • associates delegation with a file • delegation state in linked list off file state • stateid: separate from the lock stateid • client call back path
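
A minimal sketch of the delegation record described above, assuming invented names: it hangs off the per-file state, carries its own stateid distinct from the lock stateid, and points back at the client so the server can use the callback path to recall it.

    /* illustrative delegation state */
    struct nfs4_delegation {
        unsigned long long       stateid;     /* separate from the lock stateid */
        int                      is_write;    /* one write, or one of many reads */
        struct nfs4_client      *client;      /* holder, reachable via the callback path */
        struct nfs4_delegation  *next;        /* linked list off the file state */
    };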

  25. linux vfs changes • shared problem: open with o_excl, described by peter braam • nfsv4 implements win32 share locks, which require atomic open with create • the linux 2.2.x and linux 2.4 vfs are problematic

  26. linux vfs changes • to create and open a file, three inode operations are called in sequence • lookup resolves the last name component • create is called to create an inode • open is called to open the file

  27. xopen • inherent race condition means no atomicity • we partially solved this problem • we added a new inode operation which performs the open system call in one step • int xopen(struct file *filep, struct inode *dir_i, struct dentry *dentry, int mode) • if the xopen() inode operation is null, the current two step code is used • nfsv4 open subsumes lookup, create, open, access
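
A sketch of the idea, using the signature from the slide: the filesystem may supply one new inode operation that resolves, creates, and opens in a single step, and the existing lookup/create/open path is kept for filesystems that leave it null. The structure below is a model for illustration, not the actual vfs patch.

    /* illustrative model of the vfs change */
    struct file; struct inode; struct dentry;

    struct inode_operations_sketch {
        /* new, optional: resolve the last component, create if needed,
         * and open the file atomically */
        int (*xopen)(struct file *filep, struct inode *dir_i,
                     struct dentry *dentry, int mode);
        /* the existing lookup, create, and open operations are unchanged
         * and are used whenever xopen is null */
    };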

  28. user name space • local file system uses uid/gid • protocol specifies <user name>@<realm> • different security can produce different name spaces

  29. user name space • unix user name • kerberos 5 realm • pki realm - x500 or dn naming • gssd resolves <user name>@<realm> to local file system representation
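
A toy user-space illustration of the mapping step, assuming a made-up principal and realm: split the on-the-wire name at the '@', check the realm, and resolve the user part to a local uid with getpwnam. The real gssd logic (multiple realms, pki names, caching) is more involved.

    #include <pwd.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>

    /* toy mapping of "<user name>@<realm>" to a local uid; illustration only */
    static int map_principal(const char *principal, const char *local_realm,
                             uid_t *uid)
    {
        char name[256];
        const char *at = strchr(principal, '@');
        struct passwd *pw;

        if (!at || strcmp(at + 1, local_realm) != 0)
            return -1;                 /* unknown realm: no local mapping */

        snprintf(name, sizeof(name), "%.*s", (int)(at - principal), principal);
        pw = getpwnam(name);           /* local file system representation */
        if (!pw)
            return -1;
        *uid = pw->pw_uid;
        return 0;
    }

    int main(void)
    {
        uid_t uid;
        if (map_principal("user@EXAMPLE.REALM", "EXAMPLE.REALM", &uid) == 0)
            printf("mapped to uid %u\n", (unsigned)uid);
        return 0;
    }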

  30. open issues • local file system choices • currently ext2 • acl implementation will determine fs for linux 2.4 • kernel additions and changes • rpc rewrite • crypto in the kernel • atomic open

  31. next steps • march 31 - full linux 2.4 implementation, without acls • june 30 - acls added • network appliance sponsored nfsv3/v4 linux performance project

  32. any questions? http://www.citi.umich.edu/projects/nfsv4 http://www.nfsv4.org
