
Chapter 8: Distributed File Systems



  1. Chapter 8: Distributed File Systems Speaker : 呂俊欣 M9129001 5/2/2003

  2. Outline • Introduction • File service architecture • Sun Network File System (NFS) • Andrew File System (AFS) • Recent advances • Summary

  3. File System • A file system • Is responsible for the organization, storage, retrieval, naming, sharing and protection of files • Is designed to store and manage large numbers of files, with facilities for creating, naming and deleting files • Stores programs and data and makes them available as needed

  4. Layered File System

  5. Persistence • Probably one of the most important services provided by a file system is persistence • The files exist after the program, and even the computer, has terminated • Files typically do not go away • They are persistent and exist between sessions • In conventional systems files are the only persistent objects

  6. Distributed Objects • Using the OO paradigm, it is easy to build distributed systems • Place objects on different machines • Systems have been developed that allow this • Java RMI • CORBA ORB • Having a persistent object store would be useful • Java RMI activation daemon • Certain ORB implementations

  7. Properties of Storage Systems • Types of consistency between copies: 1 - strict one-copy consistency; ≈ - approximate consistency; X - no automatic consistency
  Figure 8.1 Storage systems and their properties (Sharing / Persistence / Distributed cache-replicas / Consistency maintenance / Example):
  • Main memory: no / no / no / 1 / RAM
  • File system: yes / yes / no / 1 / UNIX file system
  • Distributed file system: yes / yes / yes / ≈ / Sun NFS
  • Web: yes / yes / yes / X / Web server
  • Distributed shared memory: yes / no / yes / ≈ / Ivy (Ch. 16)
  • Remote objects (RMI/ORB): yes / no / no / 1 / CORBA
  • Persistent object store: yes / yes / no / 1 / CORBA Persistent Object Service
  • Persistent distributed object store: yes / yes / yes / ≈ / PerDiS, Khazana

  8. File Model • Files contain both data and attributes • Data is a sequence of bytes accessible by read/write operations • Attributes consist of a collection of information about the file

  9. Common File Attributes • File length • Creation timestamp • Read timestamp • Write timestamp • Attribute timestamp • Reference count • Owner • File type • Access control list

  10. Example A • Write a simple C program to copy a file using the UNIX file system operations shown in Figure 8.4.
  copyfile(char* oldfile, char* newfile)
  {
      <you write this part, using open(), creat(), read(), write()>
  }
  • Note: remember that read() returns 0 when you attempt to read beyond the end of the file.
  Figure 8.4 UNIX file system operations:
  • filedes = open(name, mode): opens an existing file with the given name
  • filedes = creat(name, mode): creates a new file with the given name; both operations deliver a file descriptor referencing the open file; the mode is read, write or both
  • status = close(filedes): closes the open file filedes
  • count = read(filedes, buffer, n): transfers n bytes from the file referenced by filedes to buffer
  • count = write(filedes, buffer, n): transfers n bytes to the file referenced by filedes from buffer; both operations deliver the number of bytes actually transferred and advance the read-write pointer
  • pos = lseek(filedes, offset, whence): moves the read-write pointer to offset (relative or absolute, depending on whence)
  • status = unlink(name): removes the file name from the directory structure; if the file has no other names, it is deleted
  • status = link(name1, name2): adds a new name (name2) for a file (name1)
  • status = stat(name, buffer): gets the file attributes for file name into buffer
  A solution:
  #include <stdio.h>
  #include <fcntl.h>
  #include <unistd.h>

  #define BUFSIZE 1024
  #define READ 0          /* equals O_RDONLY */
  #define FILEMODE 0644

  void copyfile(char* oldfile, char* newfile)
  {
      char buf[BUFSIZE];
      int n = 1, fdold, fdnew;
      if ((fdold = open(oldfile, READ)) >= 0) {
          fdnew = creat(newfile, FILEMODE);
          while (n > 0) {
              n = read(fdold, buf, BUFSIZE);   /* returns 0 at end of file */
              if (write(fdnew, buf, n) < 0) break;
          }
          close(fdold);
          close(fdnew);
      } else
          printf("copyfile: couldn't open file: %s\n", oldfile);
  }

  int main(int argc, char **argv)
  {
      copyfile(argv[1], argv[2]);
      return 0;
  }

  11. File system modules

  12. File Service • A file service allows for the storage and access of files on a network • Remote file access is identical to local file access • Convenient for users who use different workstations • Other services can be easily implemented • Makes management and deployment easier and more economical • File systems were the first distributed systems that were developed • Defines the service not the implementation

  13. File Server • A process that runs on some machine and helps to implement the file service • A system may have more than one file server

  14. File Service Models • Upload/download model: the file (e.g. ReadMe.txt) is transferred whole between server and client, worked on locally, and transferred back • Remote access model: the file (ReadMe.txt) stays at the server and the client sends individual operations to it

  15. File service Requirements • Transparency • Access • Programs are unaware of the fact that files are distributed • Location • Programs see a uniform file name space. They do not know, or care, where the files are physically located • Mobility • Programs do not need to be informed when files move (provided the name of the file remains unchanged)

  16. Transparency Revisited • Location Transparency • Path name gives no hint where the file is physically located • \\redshirt\ptt\dos\filesystem.ppt • File is on redshirt but where is redshirt?

  17. File service Requirements cont. • Transparency • Performance • Satisfactory performance across a specified range of system loads • Scaling • Service can be expanded to meet additional loads

  18. File service Requirements cont. • Concurrent File Updates • Changes to a file by one program should not interfere with the operation of other clients simultaneously accessing the same file • File-level or record-level locking • Other forms of concurrency control to minimize contention

  19. File Locking • A lock cannot be granted if another process already has a lock on the file (or block) • The client request gets put at the end of a FIFO queue • As soon as the lock is removed, the server grants the next lock to the client at the head of the queue • [Figure: three clients; the server holds the file on disk, the current holder's copy, and a FIFO queue of waiting lock requests]

  20. File service Requirements cont. • File Replication • A file may be represented by several copies of its contents at different locations • Load-sharing between servers makes service more scalable • Local access has better response (lower latency) • Fault tolerance • Full replication is difficult to implement. • Caching (of all or part of a file) gives most of the benefits (except fault tolerance)

  21. File service Requirements cont. • Hardware and Software Heterogeneity • Service can be accessed by clients running on (almost) any OS or hardware platform. • Design must be compatible with the file systems of different OSes • Service interfaces must be open - precise specifications of APIs are published.

  22. File service Requirements cont. • Fault Tolerance • The service can continue to operate in the face of client and server failures • Consistency • UNIX one-copy update semantics • Security • Based on identity of user making request • Identities of remote users must be authenticated • Privacy requires secure communication • Efficiency • Should offer facilities that are at least as powerful as those found in conventional systems

  23. File Sharing Semantics • When more than one user shares a file, it is necessary to define the semantics of reading and writing • For single processor systems • The system enforces an absolute time ordering on all operations and always returns the result of the last operation • Referred to as UNIX semantics

  24. UNIX Semantics • The original file contains "ab" • PID0 writes "c", making the file "abc" • A subsequent read by PID1 gets "abc"

  25. Distributed Semantics • Client 1: PID0 reads "ab", then writes "c" to its local copy, making it "abc" • Client 2: PID1 reads "ab"; a later read still gets "ab", not "abc", because the update has not reached Client 2's copy

  26. Summary

  27. Caching • Attempt to hold what is needed by the process in high speed storage • Parameters • What unit does the cache manage? • Files, blocks, … • What do you do when the cache fills up? • Replacement policy

  28. Cache Consistency • The real problem with caching and distributed file systems is cache consistency • If two processes are caching the same file, how do the local copies find out about changes made to the file? • When they close their files, who wins the race? • Client caching needs to be thought out carefully

  29. Cache Strategies

  30. Replication • Multiple copies of files are maintained • Increase reliability by having several copies of a file • Allow file access to occur even if a server is down • Load-balancing across servers • Replication transparency • To what extent is the user aware that some files are replicated?

  31. Types of Replication • Explicit file replication • Lazy file replication • Group replication • [Figure: a client C and servers S0, S1, S2 under each scheme: with explicit replication the client writes to every server itself; with lazy replication the client writes to one server, which propagates the update to the others later; with group replication the update is sent to the server group now]

  32. Update Protocols • Okay, so now we have replicas; how do we update them? • Primary Copy Replication • The change is sent to the primary • The primary sends the change to the secondary servers • If the primary is down, no updates can proceed • Voting • The client must receive the permission of multiple servers before making an update

  33. File Service Architecture (Figure 8.5) • Client computer: application programs invoke a client module • Server computer: the client module calls the directory service (Lookup, AddName, UnName, GetNames) and the flat file service (Read, Write, Create, Delete, GetAttributes, SetAttributes)

  34. System Modules • Flat File Service • Implements operations on the contents of files • UFIDs are used to refer to files (think I-node) • Directory Service • Provides a mapping between text names and UFIDs • Note that the name space is not necessarily flat • Might be a client of the flat file service if it requires persistent storage • Client Module • Provides client access to the system

  35. Server operations for the model file service (Figures 8.6 and 8.7)
  Flat file service:
  • Read(FileId, i, n) -> Data (i is the position of the first byte)
  • Write(FileId, i, Data) (i is the position of the first byte)
  • Create() -> FileId
  • Delete(FileId)
  • GetAttributes(FileId) -> Attr
  • SetAttributes(FileId, Attr)
  Directory service:
  • Lookup(Dir, Name) -> FileId
  • AddName(Dir, Name, File)
  • UnName(Dir, Name)
  • GetNames(Dir, Pattern) -> NameSeq
  FileId: a unique identifier for files anywhere in the network. Similar to the remote object references described in Section 4.3.3.
  Pathname lookup: pathnames such as '/usr/bin/tar' are resolved by iterative calls to Lookup(), one call for each component of the path, starting with the ID of the root directory '/', which is known in every client.
  Example B: show how each file operation of the program that you wrote in Example A would be executed using the operations of the model file service in Figures 8.6 and 8.7.

  36. Example B solution • Show how each file operation of the program that you wrote in Example A would be executed using the operations of the model file service in Figures 8.6 and 8.7.
  Server operations for copyfile("/usr/include/glob.h", "/foo"):
  fdold = open("/usr/include/glob.h", READ), client module actions:
  • FileId = Lookup(Root, "usr") - remote invocation
  • FileId = Lookup(FileId, "include") - remote invocation
  • FileId = Lookup(FileId, "glob.h") - remote invocation
  • The client module makes an entry in an open-files table with file = FileId, mode = READ and RWpointer = 0; it returns the table row number as the value for fdold
  fdnew = creat("/foo", FILEMODE), client module actions:
  • FileId = Create() - remote invocation
  • AddName(Root, "foo", FileId) - remote invocation
  • SetAttributes(FileId, attributes) - remote invocation
  • The client module makes an entry in its open-files table with file = FileId, mode = WRITE and RWpointer = 0; it returns the table row number as the value for fdnew
  n = read(fdold, buf, BUFSIZE), client module actions:
  • Read(openfiles[fdold].file, openfiles[fdold].RWpointer, BUFSIZE) - remote invocation
  • The client module assigns the returned data to buf and increments the RWpointer in the open-files table by the number of bytes actually read

  37. File Group • A collection of files that can be located on any server or moved between servers while maintaining the same names • Similar to a UNIX filesystem • Helps with distributing the load of file serving between several servers • File groups have identifiers which are unique throughout the system (and hence, for an open system, they must be globally unique) • To construct a globally unique ID we use some unique attribute of the machine on which the group is created, e.g. its IP address, even though the file group may move subsequently • File Group ID: a 32-bit IP address followed by a 16-bit date

  38. NFS • NFS was originally designed and implemented by Sun Microsystems • Three interesting aspects • Architecture • Protocol ( RFC 1094 ) • Implementation • Sun’s RPC system was developed for use in NFS • Can be configured to use UDP or TCP

  39. Overview • Basic idea is to allow an arbitrary collection of clients and servers to share a common file system • An NFS server exports one of its directories • Clients access exported directories by mounting them • To programs running on the client, there is almost no difference between local and remote files

  40. NFS Architecture (Figure 8.8) • Client computer: application programs make UNIX system calls to the kernel; a virtual file system layer routes operations on local files to the UNIX file system (or another file system) and operations on remote files to the NFS client module • Server computer: the NFS server module sits above the server's virtual file system and UNIX file system • The NFS client and server communicate using the NFS protocol (remote operations)

  41. Remote Procedure Call Model
  1. Client stub called; argument marshalling performed
  2. Local kernel called with network message
  3. Network message transferred to remote host
  4. Server stub is given the message; arguments are unmarshalled/converted
  5. Remote procedure is executed with the arguments
  6. Procedure return values given back to server stub
  7. Server stub converts/marshals the values; remote kernel is called
  8. Network message transferred back to local host
  9. Client stub receives the message from the kernel
  10. Return values are given to the client from the stub

  42. Virtual File System • Part of the Unix kernel • Makes access to local and remote files transparent • Translates between Unix file identifiers and NFS file handles • Keeps track of the file systems that are currently available, both locally and remotely • NFS file handles: • File system ID • I-node number • I-node generation number • File systems are mounted • The VFS keeps a structure for each mounted file system

  43. Virtual File System cont.

  44. NFS architecture: does the implementation have to be in the system kernel? No:
  • There are examples of NFS clients and servers that run at application level, as libraries or processes (e.g. early Windows and MacOS implementations, current PocketPC, etc.)
  But for a Unix implementation there are advantages:
  • Binary code compatible: no need to recompile applications
  • Standard system calls that access remote files can be routed through the NFS client module by the kernel
  • Shared cache of recently used blocks at the client
  • A kernel-level server can access i-nodes and file blocks directly (but a privileged (root) application program could do almost the same)
  • Security of the encryption key used for authentication

  45. NFS server operations (simplified), Figure 8.9
  fh = file handle: filesystem identifier + i-node number + i-node generation
  • read(fh, offset, count) -> attr, data
  • write(fh, offset, count, data) -> attr
  • create(dirfh, name, attr) -> newfh, attr
  • remove(dirfh, name) -> status
  • getattr(fh) -> attr
  • setattr(fh, attr) -> attr
  • lookup(dirfh, name) -> fh, attr
  • rename(dirfh, name, todirfh, toname)
  • link(newdirfh, newname, dirfh, name)
  • readdir(dirfh, cookie, count) -> entries
  • symlink(newdirfh, newname, string) -> status
  • readlink(fh) -> string
  • mkdir(dirfh, name, attr) -> newfh, attr
  • rmdir(dirfh, name) -> status
  • statfs(fh) -> fsstats
  For comparison, the model flat file service operations (Read, Write, Create, Delete, GetAttributes, SetAttributes) and directory service operations (Lookup, AddName, UnName, GetNames) are shown in Figures 8.6 and 8.7.

  46. NFS Client Module • Part of Unix kernel • Allows user programs to access files via UNIX system calls without recompilation or reloading • One module serves all user-level processes • A shared cache holds recently used blocks • The encryption key for authentication of user IDs is kept in the kernel

  47. NFS access control and authentication • Stateless server, so the user's identity and access rights must be checked by the server on each request • In the local file system they are checked only on open() • Every client request is accompanied by the userID and groupID • Not shown in Figure 8.9 because they are inserted by the RPC system • The server is exposed to impostor attacks unless the userID and groupID are protected by encryption • Kerberos has been integrated with NFS to provide a stronger and more comprehensive security solution • Kerberos is described in Chapter 7; integration of NFS with Kerberos is covered later in this chapter

  48. Access Control • NFS servers are stateless • User’s identity must be verified for each request • The UNIX UID and GID of the user are used for authentication purposes • Does this scare you? • Kerberized NFS

  49. Mount service • Mount operation: mount(remotehost, remotedirectory, localdirectory) • Server maintains a table of clients who have mounted filesystems at that server • Each client maintains a table of mounted file systems holding: <IP address, port number, file handle> • Hard versus soft mounts

  50. File Mounting • File mounting protocol • A client sends a path name to a server and can request permission to mount the directory • If the request is legal, the server returns a file handle to the client • The handle identifies the file system type, the disk, the I-node number of the directory, and security information • Subsequent calls to read/write in that directory use the file handle
