Introduction to Unix (CA263): File Processing
Objectives • Explain UNIX and Linux file processing • Use basic file manipulation commands to create, delete, copy, and move files and directories • Employ commands to combine, cut, paste, rearrange, and sort information in files • Create a script file • Use the join command to link files using a common field • Use the awk command to create a professional-looking report Guide to UNIX Using Linux, Third Edition
Understanding File Structures • Files can be structured in many ways, depending on the kind of data they store • Employee information can be stored one record per line, with fields separated by a delimiter such as a colon. This type of record is known as a variable-length record. • Another way to create records is to give each field a fixed number of columns. This type of record is known as a fixed-length record.
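The two record layouts can be sketched as follows. This is only an illustration: the employee data, the file names, and the 10-column field width are assumptions, not taken from the slides. cut -d and -f pick a field out of a delimited record, while cut -c picks a column range out of a fixed-length record.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Variable-length records: fields separated by a colon (hypothetical data).
printf '%s\n' 'Smith:Ann:Accounting' 'Ng:Bo:IT' > employees_delimited

# Fixed-length records: last name in columns 1-10, first name in columns 11-20.
printf '%-10s%-10s%s\n' Smith Ann Accounting Ng Bo IT > employees_fixed

# Field 2 of each delimited record.
first_names_delimited=$(cut -d: -f2 employees_delimited)

# Columns 11-13 of each fixed-length record (short names keep their padding).
first_names_fixed=$(cut -c11-13 employees_fixed)

printf '%s\n' "$first_names_delimited"
printf '%s\n' "$first_names_fixed"
```

With delimiters the field position is counted in separators; with fixed-length records it is counted in characters, so the widths must be known in advance.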
Understanding File Structures (continued) • UNIX/Linux stores data, such as letters and product records, as flat ASCII files • The three kinds of regular files are: • Unstructured ASCII characters: you can store any kind of data in any order. You cannot retrieve a particular column; you have to print everything to get what you need. • Unstructured ASCII records: data is stored as a sequence of fixed-length records, with similar information about different people on different rows. • Unstructured ASCII trees: data is structured as a tree of records, which can be organized as fixed-length or variable-length records. Each record contains a key that helps in finding the record quickly.
Manipulating Files • Creating files • Deleting files when no longer needed • Finding a file • Combining files using the paste command and output redirection • Separating files using the cut command • Sorting the contents of a file
Create Files
• The output redirection operator (>) and the touch command each create an empty file.
$ > accountsfile
$ touch accountsfile2
Syntax: touch [-options] [filename(s)]
Useful options include:
-a update the access time only
-m update the modification time only
-c do not create the file if it does not exist
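A minimal sketch of both ways to create an empty file, run in a scratch directory. The name no_such_file is hypothetical, and the no-op command : is placed in front of > only so the redirection also behaves the same way in shells that treat a bare > differently.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Redirecting the output of a command that prints nothing creates an empty
# file (and empties the file if it already exists).
: > accountsfile

# touch creates the file if it is missing; otherwise it updates the timestamps.
touch accountsfile2

# -c: do not create a file that does not already exist.
touch -c no_such_file

ls -l
```

Note the difference on an existing file: redirection discards its contents, while touch leaves the contents alone and changes only the timestamps.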
Delete Files
• Delete files or directories permanently when they are no longer needed
$ rm -i phonebook.bak
Syntax: rm [-options] [filename / directory]
Useful options include:
-i prompt for confirmation before deleting each file
-r remove a directory and everything it contains
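The two options can be tried safely in a scratch directory, as in this sketch. The old_reports directory and its contents are made up for the example, and the answer to the -i prompt is piped in so the commands run without a terminal.

```shell
# Work in a scratch directory so no real files are at risk.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

touch phonebook.bak
mkdir -p old_reports/2023
touch old_reports/2023/summary

# -i asks before deleting; "y" is supplied on standard input here.
echo y | rm -i phonebook.bak

# -r removes the directory and everything below it.
rm -r old_reports

ls -a
```

There is no undo for rm, so -i is a reasonable habit when deleting with wildcards.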
Removing Directories
• Remove a directory permanently when it is no longer needed
$ rmdir documents
Syntax: rmdir [-options] [directory]
Useful options include:
-p also remove the parent directories named in the path, if they become empty
• Note: A directory must be empty before rmdir can delete it. The -i and -r options belong to rm, not rmdir; use rm -r to remove a directory together with its contents.
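The empty-directory rule can be checked with a short sketch like the following, run in a scratch directory. The file name notes is an assumption made for the example.

```shell
# Work in a scratch directory so no real directories are at risk.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir documents
touch documents/notes

# rmdir refuses to remove a directory that still has contents.
if rmdir documents 2>/dev/null; then
    first_attempt=removed
else
    first_attempt=refused
fi

# Empty the directory first; then rmdir succeeds.
rm documents/notes
rmdir documents

echo "first attempt: $first_attempt"
```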
Finding a File
• Finding a file helps you locate it in the directory structure
$ find . -name phonebook.bak
Syntax: find [pathname] [-name filename]
pathname is the directory where the search starts; . means the current directory
-name indicates that you are searching for a file with a specific name. You can use wildcards if you quote the pattern.
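A small sketch of an exact-name search and a wildcard search. The directory layout under home is invented for the example.

```shell
# Build a small hypothetical directory tree in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p home/backup
touch home/backup/phonebook.bak home/phonebook

# Search from the current directory (.) for an exact file name.
exact_match=$(find . -name phonebook.bak)

# Quote wildcard patterns so the shell passes them to find unexpanded.
wildcard_matches=$(find . -name 'phonebook*' | sort)

printf '%s\n' "$exact_match"
printf '%s\n' "$wildcard_matches"
```

Without the quotes, the shell would try to expand phonebook* against the current directory before find ever sees it.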
Combining Files • Files can be combined using output redirection • cat command: concatenates the text of two or more files; output redirection saves the result in a new file • paste command: joins the text of different files side by side
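The difference between the two commands is easiest to see on two tiny files. In this sketch the file names vegetables, breads, stacked, and side_by_side are assumptions chosen for the example.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' Lettuce Spinach > vegetables
printf '%s\n' Sourdough 'White bread' > breads

# cat writes the files one after the other; > saves the result in a new file.
cat vegetables breads > stacked

# paste joins line 1 with line 1, line 2 with line 2, separated by a tab.
paste vegetables breads > side_by_side

cat stacked
cat side_by_side
```

Here stacked has four lines, while side_by_side has two lines of two tab-separated columns.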
Sorting
• The sort command sorts a file's contents alphabetically or numerically.
Syntax: sort [-options] [filename]
Useful options include:
-k n sort on the key field specified by n
-t specifies the character that separates fields
-m merges input files that have been previously sorted
-o writes the output to the specified file
-d sorts in dictionary order (only blanks and alphanumeric characters are compared)
-g sorts in general numeric order
-r sorts in reverse order
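Several of these options are usually combined. The sketch below sorts hypothetical colon-separated staff records by salary, highest first; the data and file names are assumptions, and -n (numeric comparison) is used for the salary field.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical records: name:department:salary
printf '%s\n' 'Ng:IT:52000' 'Smith:Accounting:48000' 'Diaz:IT:61000' > staff

# -t:  fields are separated by a colon
# -k 3 sort on field 3
# -n   compare numerically, -r reverse the order
# -o   write the result to the named file
sort -t: -k 3 -n -r -o staff_by_salary staff

cat staff_by_salary
```

The ordering options are given globally here rather than attached to the key (as in -k 3n), because a key that carries its own ordering letters no longer inherits the global -r.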
Sorting Example
$ sort -k 3 food > sortedfood
$ cat sortedfood
Lettuce Sourdough Beef
Spinach White bread Chicken
Beans Pumpernickel Mutton
Carrots Whole Wheat Turkey
Sorting -n Example
• The -n option to sort specifies that the first field on the line is to be treated as a number.
$ cat data
5 27
2 12
3 33
23 2
-5 11
15 6
14 -9
$ sort data
-5 11
14 -9
15 6
2 12
23 2
3 33
5 27
$ sort -n data
-5 11
2 12
3 33
5 27
14 -9
15 6
23 2
Sorting +1n Example
• The +1 says to skip the first field, and the n says to sort numerically. Similarly, +5n would mean to skip the first five fields.
$ cat data
5 27
2 12
3 33
23 2
-5 11
15 6
14 -9
Skip the first field in the sort
$ sort +1n data
14 -9
23 2
15 6
-5 11
2 12
5 27
3 33
• Note: The +n form is obsolete and may be rejected by current versions of sort; the equivalent modern command is sort -k 2n data.
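The same sort can be written with the -k option, as in this sketch. It assumes the fields in data are separated by a single space; the slide does not show the actual separator.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The sample data from the slide: two numeric fields per line.
printf '%s\n' '5 27' '2 12' '3 33' '23 2' '-5 11' '15 6' '14 -9' > data

# -k 2n: start the sort key at field 2 and compare it numerically.
sorted_on_second=$(sort -k 2n data)

printf '%s\n' "$sorted_on_second"
```

Field numbers for -k start at 1, so "skip one field" in the old notation becomes "key starts at field 2".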
Creating Script Files • UNIX/Linux users create shell script files to hold commands that run sequentially as a set; this makes it possible to automate commands and reuse them • A script file can be created with the vi editor and then made executable using the chmod command with the x argument
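The create, chmod, run cycle might look like the sketch below. The script name listfiles.sh and its contents are hypothetical, and a here-document stands in for the interactive vi session so the example can run unattended.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write a short script; in practice this text would be typed in vi.
cat > listfiles.sh <<'EOF'
#!/bin/sh
# Print a heading, then the number of entries in the current directory.
echo "Entries in $(pwd):"
ls | wc -l
EOF

# Add execute permission so the file can be run as a command.
chmod +x listfiles.sh

# Run the script and capture what it prints.
script_output=$(./listfiles.sh)
printf '%s\n' "$script_output"
```

Without the chmod step, ./listfiles.sh would fail with a "Permission denied" error, although sh listfiles.sh would still work.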
Using the join Command on Two Files • Sometimes you want to link the information in two files • The join command is often used in relational database processing • The join command associates information in two different files on the basis of a common field or key in those files
The join Command
Syntax: join [-options] file1 file2
• file1 and file2 are the two input files; both must be sorted on the join field.
Useful options include:
-1 fieldnum specifies the common join field in file 1
-2 fieldnum specifies the common join field in file 2
-o specifies a list of fields to output
-t specifies the field separator character (by default, fields are separated by blanks)
-a filenum also produces a line for each unpairable line in file filenum
-e str replaces empty output fields with the string specified by str
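A minimal sketch of join on two small files. The employee IDs, names, and departments are invented for the example; both files are keyed on field 1 and already sorted on it.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical data: employee ID in field 1 of both files.
printf '%s\n' '101:Smith' '102:Ng' '103:Diaz' > names
printf '%s\n' '101:Accounting' '103:IT' > departments

# Pair the lines whose first fields match.
joined=$(join -t: -1 1 -2 1 names departments)

# -a 1 also prints the lines of file 1 that have no partner in file 2.
joined_with_unpaired=$(join -t: -a 1 names departments)

printf '%s\n' "$joined"
printf '%s\n' "$joined_with_unpaired"
```

Employee 102 has no department record, so it appears only in the second result, as the bare line 102:Ng.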
sed
• sed is a program used for editing data. It stands for stream editor.
• Example: To change the first occurrence of “Unix” to “UNIX” on every line of intro
$ cat intro
The Unix operating system was pioneered by Ken
Thompson. Main goal of Unix was to create Unix
environment for efficient program development
$ sed 's/Unix/UNIX/' intro
The UNIX operating system was pioneered by Ken
Thompson. Main goal of UNIX was to create Unix
environment for efficient program development
sed with g option
• Example: To change all occurrences of “Unix” to “UNIX” on every line of intro
$ cat intro
The Unix operating system was pioneered by Ken
Thompson. Main goal of Unix was to create Unix
environment for efficient program development
$ sed 's/Unix/UNIX/g' intro
The UNIX operating system was pioneered by Ken
Thompson. Main goal of UNIX was to create UNIX
environment for efficient program development
sed with -n option
• Print just the first 2 lines
$ cat intro
The Unix operating system was pioneered by Ken
Thompson. Main goal of Unix was to create Unix
environment for efficient program development
$ sed -n '1,2p' intro
The Unix operating system was pioneered by Ken
Thompson. Main goal of Unix was to create Unix
sed with -n option
• Print only the lines containing Unix
$ cat intro
The Unix operating system was pioneered by Ken
Thompson. Main goal of Unix was to create Unix
environment for efficient program development
$ sed -n '/Unix/p' intro
The Unix operating system was pioneered by Ken
Thompson. Main goal of Unix was to create Unix
sed with d option
• Delete lines 1 and 2
$ cat intro
The Unix operating system was pioneered by Ken
Thompson. Main goal of Unix was to create Unix
environment for efficient program development
$ sed '1,2d' intro
environment for efficient program development
• Delete all lines containing Unix
$ sed '/Unix/d' intro
environment for efficient program development
A Brief Introduction to the Awk Program • Awk, a pattern-scanning and processing language, helps to produce professional-looking reports • Awk provides a powerful programming environment that can perform actions on files that are difficult to duplicate with a combination of other commands
A Brief Introduction to the Awk Program (continued) • Awk checks to see if the input records in the specified files satisfy a pattern • If so, awk executes the specified action • If no pattern is provided, awk applies the action to every record
awk command
Syntax: awk [-Fsep] ['pattern {action} ...'] [filename]
Useful options include:
-F: means the field separator is a colon
$ awk 'BEGIN { print "This is an awk print line." }'
This is an awk print line.
$ awk -F: '{printf "%s\t %s\n", $1, $2}' datafile
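Putting pattern, action, and field separator together, a small report could be sketched like this. The contents of datafile are hypothetical (name:department:salary), and the END block, which runs after the last record, is standard awk that the slides do not cover.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical colon-separated records: name:department:salary
printf '%s\n' 'Smith:Accounting:48000' 'Ng:IT:52000' > datafile

report=$(awk -F: '
    BEGIN      { printf "%-10s %-12s\n", "Name", "Department" }  # header, printed once
    $2 == "IT" { it_count++ }                                    # action runs only when the pattern matches
               { printf "%-10s %-12s\n", $1, $2 }                # no pattern: runs for every record
    END        { printf "IT staff: %d\n", it_count }             # footer, printed after the last record
' datafile)

printf '%s\n' "$report"
```

The %-10s and %-12s formats left-align each field in a fixed-width column, which is what gives the output its report-like layout.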