1 / 20

UNIX Files

UNIX Files. How UNIX Sees and Uses Files. I/O: UNIX approach. The basic model of the UNIX I/O system is a sequence of bytes that can be accessed either randomly or sequentially.

jeff
Download Presentation

UNIX Files

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. UNIX Files How UNIX Sees and Uses Files

  2. I/O: UNIX approach • The basic model of the UNIX I/O system is a sequence of bytes that can be accessed either randomly or sequentially. • The UNIX kernel uses a single data model, byte stream, to serve all applications. It imposes no structure on the data but instead views it as a stream of bytes. Put another way, everything to the kernel is a stream of bytes – stream I/O. • As a result an I/O stream from one program can be fed as input to any other program; • Pipelines can be formed between processes for exchanging data. • Applications may impose various levels of structure for their data, but the kernel imposes no structure on I/O. • Example: ASCII text editors process documents consisting of lines of characters where each line is terminated by ASCII line-feed character. Kernel knows nothing about this convention

  3. What is a file? • File is a named persistent collection of data; a linear array of bytes with a one name. • To the kernel, data is unstructured, sequential bytes is accessed by specifying an offset from beginning of the file. • File attributes (metadata) include owner(s), permissions, time stamps, size etc. • A file exists until all its names are deleted, or no process holds a descriptor for it. • In the kernel I/O devices are accessed as files. These are called special device files. • UNIX processes pass information to each other using in memory special files: pipes, FIFO, sockets. • User processes access all files as ordinary files - regular files on disk, device special files or in memory “pipe” files. • Terminals, printers, tapes are all accessed as if they were streams of bytes. They have names in the file system and are referred to through their descriptors.

  4. Special (Device) Files • The kernel can determine to what hardware device a special file refers and uses a resident module called device driver to communicate with the device. • Device special files are created by the mknode() system call (by the super-user only) • To manipulate device parameters ioctl() system call is used; • Different devices allow different operations through ioctl() • Devices are divided into two groups: – Block devices (structured) – Character devices (unstructured)

  5. Block devices • Random (anywhere in the stream) access devices; • Filetype “b” in “ls –al” listing of /dev directory • Internal implementation is based on the notion of block, a minimal group of bytes that can be transferred in one operation to and from the device. • A number of blocks can be transferred in one operation (for effiiciency), but less then block bytes of data is not transferred. • To user application, the block structure of the device is transparent through internal buffering being done in kernel. User process may read/write a single byte because it works with I/O stream abstraction • Examples: tapes, magnetic disks, drums, cd-roms, zip disks, floppy disks, etc.

  6. Character devices • Sequential access devices. • Filetype “c” in “ls-al” listing of /dev directory • Internal implementation often supports the notion of block transfer, • In many cases the blocks supported by character devices are very large due to efficiency considerations (e.g., communication interfaces) • Called character because the first such devices were terminals, Mouse, keyboard, display, network interface, printer, etc. • Sometime devices listed as “rxxx”, raw devices of xxx block devices.

  7. Disk devices • Disk device files are natively defined as block devices. On some systems also support character device interface – “raw disk”. This feature is going away on some systems. • File systems, organized, collections of files, are always created on the block devices, and never on the character devices. • Single physical block device can be partitioned into a number of logical devices (partition). The physical device is usually named hda, hdb, sda, sdb … representing IDE drive 1, SCSI drive 1, 2 • Each such logical device can have its own file system and is represented by its own special device file. sda1, sda2, sda3, sda4, sda5 ….

  8. “In memory” files • Interprocess communication occurs in memory. • These files are always sequential, may be unidirectional or bidirectional. • Usually limited in size, may be represented by a temporary file in /tmp for user processes.

  9. Pipe • A linear array of bytes as files, but they are unidirectional sequential communication links between the related processes (parent/child).; • Transient objects of limited size • They get their file names in the /tmp directory automatically, but open() cannot be used for them. • Descriptors obtained from pipe() system call. • Data written to a pipe can be read only once from it, and only in the order it was written (FIFO);

  10. FIFO • There is a special kind of pipes, called named pipes. • They are identical to unnamed pipes in function, size and use. Except they have normal names, as any other file, and descriptors for them can be obtained through open() system call; • Processes that wish to communicate through FIFO in both directions must open one FIFO for each direction.

  11. Socket • Socket is a transient object that is used for inter-process communication; usually over a network protocol. • It exists only as long as some process holds a descriptor on it. • Descriptor is created through the socket() system call. • Sequential access; similar to pipes; • Different types of sockets exist: Local/Remote, RPC/IPC, reliable/unreliable etc.

  12. File Descriptor • A control structure, File Control Block (FCB) – e.g. descriptor, is associated with each file in the file system • Each FCB has a unique identifier (FCB ID) • UNIX: i-node, identified by i-node number • When opened, a filenumber presented to process as filehandle • FCB structure: • File attributes (INODE) • A data structure for accessing the file’s data • Operations: OPEN: Associate an FCB with a resource path READ: Bring a specified chunk of data from file into the process virtual address space WRITE: Write a specified chunk of data from the process virtual address space to the file CLOSE: Close FCB, release resource pathh • Other operations: CREATE, DELETE, SEEK, TRUNCATE, SET_ATTRIBUTES

  13. File Descriptors • open, close, set_attributes Unix processes use descriptors to reference I/O streams. • Descriptors are small unsigned integers. • Descriptors are obtained from system calls open(), socket(), pipe(). • System calls read() and write() are applied to descriptors to transfer data. • System call lseek() is used to specify position in the stream referred by descriptor. • System call close() is used to de-allocate descriptors and the objects they refer to. • File Descriptors represent objects whose access supported by the kernel: file, pipe, socket

  14. File Descriptor Table • The kernel maintains a per-process descriptor table that kernel uses to translate the external representation of I/O stream into internal representation. • Descriptor is simply an index into this table. Consequently, descriptors have only local meaning. Note the standard filehandles provided to all processes 0 (STDIN), 1 (STDOUT), 2 (STDERR) • Different descriptors in different processes can refer to the same I/O stream; • Descriptor table is inherited upon fork(); • Descriptor table is preserved upon exec(); • When a process terminates the kernel reclaims all descriptors that were in use by this process • Which is why UNIX processes inherit variables in only one direction

  15. File descriptor

  16. File descriptor

  17. File descriptor

  18. File descriptor

  19. File descriptor

  20. File commands • cp, rm, touch, ln ….etc • base64, uuencode, uudecode • ls (-i for inode) • lsof • stat • fuser • od • find –inum (see ls –i)

More Related