Sunday, July 20, 2014

The Linux/Unix file system

The Unix file system

  • All of the files in the UNIX file system are organized into a multi-leveled hierarchy called a directory tree. 
  • The structure of the tree can be thought of as an inverted tree.
  • At the very top of the file system is a single directory called "root", which is represented by a / (slash). All other files are "descendants" of root.
  • The number of levels is largely arbitrary, although most UNIX systems share some organizational similarities. The "standard" UNIX file system is discussed later. 

File types

The UNIX filesystem contains several different types of files:
  • Ordinary Files
    • Used to store your information, such as some text you have written or an image you have drawn. This is the type of file that you usually work with.
    • Always located within/under a directory file
    • Do not contain other files
  • Directories
    • Branching points in the hierarchical tree
    • Used to organize groups of files
    • May contain ordinary files, special files or other directories
    • Never contain "real" information which you would work with (such as text). Basically, just used for organizing files.
    • All files are descendants of the root directory (named /), located at the top of the tree.
  • Special Files
    • Used to represent a real physical device such as a printer, tape drive or terminal, used for Input/Output (I/O) operations
    • Unix considers any device attached to the system to be a file - including your terminal:
      • By default, a command treats your terminal as the standard input file (stdin) from which to read its input
      • Your terminal is also treated as the standard output file (stdout) to which a command's output is sent
      • Stdin and stdout will be discussed in more detail later
    • Two types of I/O: character and block
    • Usually only found under directories named /dev
  • Pipes
    • UNIX allows you to link commands together using a pipe. The pipe acts as a temporary file which only exists to hold data from one command until it is read by another
    • For example, to pipe the output from one command into another command:
       
           who | wc -l 
           
      This command tells you how many users are currently logged into the system. The standard output of the who command is a list of all logged-in users, one per line. This output is piped into the wc command as its standard input. Used with the -l option, wc counts the number of lines in its standard input and displays the result on its standard output: your terminal.
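
As a rough sketch of the same idea, the lines below pipe a fixed list of three names into wc -l. The names and the line_count variable are made up for illustration; printf stands in for who so that the result does not depend on who happens to be logged in.

```shell
# printf writes three lines to its standard output;
# the pipe hands them to wc -l as its standard input.
line_count=$(printf 'alice\nbob\ncarol\n' | wc -l)

# Some systems pad the number with spaces, so normalize it arithmetically.
line_count=$((line_count))
echo "lines counted: $line_count"
```

Replacing the printf command with who gives the original example again.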

Directory Paths

  • Absolute path names start from root.
    /report/style/book
  • Relative path names start from the current directory.
    style/book
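
The difference can be tried out safely in a scratch directory. This is only a sketch: it assumes the mktemp command is available, and the base variable and the file contents are invented for the example. The report/style/book layout mirrors the path names above, but lives under the scratch directory rather than under the real root.

```shell
# Work in a throwaway directory so nothing real is touched.
base=$(mktemp -d)
mkdir -p "$base/report/style"
echo "draft" > "$base/report/style/book"

# Relative path: resolved from the current directory ($base/report).
cd "$base/report"
rel_content=$(cat style/book)

# Absolute path: starts at the root, so it works from any directory.
cd /
abs_content=$(cat "$base/report/style/book")

echo "relative: $rel_content"
echo "absolute: $abs_content"
rm -r "$base"
```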

File names - Traversing through the file system

  • UNIX permits file names to use most characters, but avoid spaces, tabs and characters that have a special meaning to the shell, such as:
    
         & ; ( ) | ? \ ' " ` [ ] { } < > $ - ! / 
         
  • Case Sensitivity: uppercase and lowercase are not the same! These are three different files:
    
         NOVEMBER       November     november 
         
  • Length: typically up to 255 characters per name, although the exact limit depends on the file system
  • Extensions: may be used to identify types of files
    
         libc.a       - archive, library file 
         program.c    - C language source file 
         alpha2.f     - Fortran source file 
         xwd2ps.o     - Object code 
         mygames.Z    - Compressed file 
         
  • Hidden Files: have names that begin with a dot (.) For example:
    
         .cshrc      .login      .mailrc     .mwmrc  
      
  • Uniqueness: as children in a family, no two files with the same parent directory can have the same name. Files located in separate directories can have identical names.
  • Reserved Filenames:
    
         /   - the root directory (slash)
         .   - current directory (period)
         ..  - parent directory (double period)
         ~   - your home directory (tilde)
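
A small experiment shows hidden files and the . and .. names in action. It is a sketch under a few assumptions: mktemp is available, and the file and variable names (visible.txt, .hidden, demo, plain, all, back, top) are invented for the example.

```shell
demo=$(mktemp -d)
cd "$demo"
touch visible.txt .hidden

plain=$(ls)       # the dot file is left out
all=$(ls -a)      # -a also lists ., .. and .hidden

mkdir sub
cd sub
cd ..             # .. is the parent, so this returns to the demo directory
back=$(pwd -P)
top=$(cd "$demo" && pwd -P)

echo "ls shows:"
echo "$plain"
echo "ls -a shows:"
echo "$all"
cd / && rm -r "$demo"
```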
         

File and Directory Commands

UNIX provides a number of commands for working with files. The more common ones are described in this section. Note that these commands usually have several options and accept wildcard characters as arguments. For details, see the man page for each command.
  • ls - lists files
    
         ls        - show contents of working directory 
         ls file   - list file, if it exists in working directory 
         ls dir    - show contents of the directory dir 
         ls -a     - shows all your files, including hidden ones 
         ls -al    - give detailed listing of contents 
         ls -F     - mark directories with "/" and executable 
                     files with "*" 
         ls *.doc  - show all files with suffix ".doc" 
         
  • more - browses/displays files one screen at a time. Use h for help, spacebar to page, b for back, q to quit, /string to search for string
    
         more sample.f
         
  • pg - browses/displays files one screen at a time. Similar to the more utility in function but has different commands and options. See the man page for details.
    
         pg sample.f
         
  • less - similar to more, but with more features. Not available on every system.
    
         less sample.f
         
  • head - displays the first n lines of a file
    
         head sample.f      - display first 10 lines (default) 
         head -5 sample.f   - display first 5 lines  
         head -25 sample.f  - display first 25 lines 
         
  • tail - displays the last n lines or n characters of a file
    
         tail sample.f       - display last 10 lines (default) 
         tail -5 sample.f    - display last 5 lines  
         tail -5c sample.f   - display last 5 characters 
         tail -25 sample.f   - display last 25 lines 
         
  • cat - dumps the entire file to the screen without paging. This command is more useful for concatenating (hence the name "cat") files together than it is for reading files.
    
         cat myprog.c              - displays entire file 
         cat -b myprog.c           - numbers non-blank lines 
         cat file1 file2 > file3   - concatenates file1 and file2
                                     into file3 
         
  • cp - copies files. Will overwrite unless otherwise specified. Must also have write permission in the destination directory.
    
         cp  sample.f  sample2.f   - copies sample.f to sample2.f 
         cp -R dir1 dir2           - copies contents of directory 
                                     dir1 to dir2
         cp -i file.1  file.new    - prompts if file.new will be 
                                     overwritten 
         cp *.txt chapt1           - copies all files with .txt 
                                     suffix to directory chapt1 
         cp /usr/doc/README  ~     - copies file to your home 
                                     directory 
         cp ~betty/index    .      - copies the file "index" from
                                     user betty's home directory 
                                     to current directory 
         
  • mv - moves files. Will overwrite unless otherwise specified. Must also have write permission in the destination directory.
    
         mv  sample.f  sample2.f   - moves sample.f to sample2.f 
         mv dir1 newdir/dir2       - moves contents of directory 
                                     dir1 to newdir/dir2 
         mv -i file.1  file.new    - prompts if file.new will be 
                                     overwritten 
         mv *.txt chapt1           - moves all files with .txt 
                                     suffix to directory chapt1 
         
  • rm - deletes/removes files or directories if file permissions permit.
    
         rm  sample.f    - deletes sample.f 
         rm  chap?.txt   - deletes all files named chap, then 
                           exactly one character, then .txt 
                           (for example chap1.txt) 
         rm -i *         - deletes all files in current directory 
                           but asks first for each file 
         rm -r /olddir   - recursively removes all files in the 
                           directory olddir, including the 
                           directory itself 
         
    Begin the Filesystem exercises - Part 1.
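
Before trying the exercises, the commands above can be rehearsed in a scratch directory where mistakes cost nothing. The following is a minimal sketch, assuming mktemp is available; the file names and variables are made up, and the -n form of head and tail is used because it is the spelling accepted by current systems.

```shell
work=$(mktemp -d)
cd "$work"

# Build a 12-line sample file (the contents are arbitrary).
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > sample.f

first=$(head -n 3 sample.f)      # first three lines
last=$(tail -n 2 sample.f)       # last two lines

cp sample.f sample2.f            # copy: both names now exist
mv sample2.f renamed.f           # rename: sample2.f is gone
cat sample.f renamed.f > both.f  # concatenate into a new file
total=$(wc -l < both.f)
total=$((total))                 # strip any padding around the count

# ? matches exactly one character, * matches any run of characters.
touch chap1.txt chap2.txt chap10.txt
rm chap?.txt
survivors=$(ls chap*.txt)        # chap10.txt does not match chap?.txt

echo "$first"
echo "$last"
echo "lines in both.f: $total"
echo "left after rm chap?.txt: $survivors"
cd / && rm -r "$work"
```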
  • file - identifies the "type" of file. The command syntax is:
    
         file  filename
         
    For example:
    
         file  *         - reports all files in current 
                           directory and their types.  The
                           output might appear as shown below: 
     
         about.html:      ascii text
         bin:             directory
         staff.directory: English text
         bggen:           executable or object module not stripped
         bmbinc:          commands text
         machines.sp1:    [nt]roff, tbl, or eqn input text
         man2html:        executable or object module not stripped
         man2html.c:      ascii text
         
  • find - finds files. The syntax of this command is:
    
         find pathname -name filename -print
         
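    The pathname defines the directory to start from. Each subdirectory of this directory will be searched. On older systems the -print option must be given to display results; current find implementations print matches by default when no other action is specified.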
    You can define the filename using wildcards. If these are used, the filename must be placed in 'quotes'.
    
         find . -name mtg_jan92 -print - looks for the file 
                                         mtg_jan92 in the current 
                                         directory and below
         find ~/ -name README -print   - looks for files called 
                                         README throughout your 
                                         home directory
         find . -name '*.fm' -print    - looks for all files with 
                                         .fm suffix in the current 
                                         directory and below
         find /usr/local -name gnu -type d -print  
                                       - looks for a directory 
                                         called gnu within the 
                                         /usr/local directory 
         
  • diff - compares two files or directories. Indicates which lines need to be added (a), deleted (d) or changed (c). Lines in file1 are identified with a (<) symbol; lines in file2 with a (>) symbol.
    
         diff file1 file2          - compares file1 to file2 
         diff -iw file1 file2      - compares two files ignoring 
                                     letter case and spaces 
         diff dir1 dir2            - compares two directories 
                                     showing files which are 
                                     unique to each and also, 
                                     line by line differences 
                                     between any files in common. 
         
    For example, if file1 and file2 are:
    
         John erpl08@ed           John erpl08@ed
         Joe  CZT@cern.ch         Joe  CZT@cern.ch
         Kim  ks@x.co             Jean JRS@pollux.ucs.co
         Keith keith@festival     Jim  jim@frolix8
                                  Kim  ks@x.co
                                  Keith keith@festival
         
    Running diff file1 file2 yields the output:
    
         2a3,4
         > Jean JRS@pollux.ucs.co
         > Jim  jim@frolix8
         
    Which means that to make these files match you need to add (a) lines 3 and 4 (3,4) of file2 (>) after line 2 in file1.
  • sdiff - similar to diff, but displays each line of the two files side by side, making it easier to see the differences between them. Lines that are different are shown with a | symbol. Lines unique to file1 are identified by a < symbol; lines unique to file2 by a > symbol. Identical lines appear next to each other. The option -w 80 sets the width of the output to 80 characters. The default is 130 characters.
    
         sdiff -w 80 file1 file2
         Mike erpl08@ed                  | John erpl08@ed
         Joe  CZT@cern.ch                Joe  CZT@cern.ch
                                         >  Jean JRS@pollux.ucs.co
                                         >  Jim  jim@frolix8
         Kim  ks@x.co                    Kim  ks@x.co
         Sam  s.wally@aston              <
         Keith keith@festival            Keith keith@festival
         
  • ln - link one file name to another. The command syntax is:
    
         ln source linkname
         
    Making a link to a file or directory does not create another copy of it. It simply makes a connection between the source and the linkname. Allows a single file to be "pointed to" by other filenames without having to duplicate the file.
    
         ln  results.1  last.run  - links filename "last.run" to 
                                    the real file results.1 in 
                                    the current directory. 
         ln  ../Notes.jan notes   - links filename "notes" in 
                                    current directory to real file
                                    Notes.jan in parent directory. 
          
  • sort - sorts files, merges files that are already sorted, and checks files to determine if they have been sorted. The command syntax is:
    
         sort  options  filename
         
    By default, lines in "filename" are sorted and displayed to the screen. If the "filename" parameter specifies more than one file, the sort command concatenates the files and sorts them as one file. An output file can be specified with the -o flag.
    Files can be sorted by "fields" - single or multiple.
    The sort command supports many options. See the man page for details.
    
         sort addresses                   - sorts the file
                                            addresses and displays
                                            output on screen 
         sort -o sorted addresses         - sorts the file
                                            addresses and writes
                                            output to the file 
                                            called sorted. 
         sort -u -o mail_labels addresses - removes all duplicate
                                            lines from the file 
                                            addresses and writes 
                                            the output in the 
                                            file mail_labels. 
         sort -k 3,4 addresses            - sorts the file by
                                            its third and fourth
                                            fields.  Older systems
                                            write this as 
                                            sort +2 -4, meaning 
                                            skip the first two 
                                            fields and stop after
                                            the fourth field.
         
    Continue the Filesystem exercises - Part 2.
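
The find, diff, ln and sort examples can also be replayed in a scratch directory. This is a sketch rather than a reference: it assumes mktemp is available and that the scratch file system supports hard links, and names such as docs, fruits and the shell variables are invented. The two address files are the ones from the diff example, so a traditional diff should report the same 2a3,4 change shown there.

```shell
work=$(mktemp -d)
cd "$work"

# find: search the current directory and everything below it.
mkdir -p docs/old
touch docs/a.fm docs/old/b.fm docs/notes.txt
found=$(find . -name '*.fm' -print | sort)

# diff: recreate file1 and file2 from the text.
printf 'John erpl08@ed\nJoe  CZT@cern.ch\nKim  ks@x.co\nKeith keith@festival\n' > file1
printf 'John erpl08@ed\nJoe  CZT@cern.ch\nJean JRS@pollux.ucs.co\nJim  jim@frolix8\nKim  ks@x.co\nKeith keith@festival\n' > file2
# diff exits with status 1 when the files differ, hence the || true.
diff_out=$(diff file1 file2 || true)

# ln: a link is a second name for the same file, not a copy.
echo "run 1" > results.1
ln results.1 last.run
echo "run 2" >> last.run       # written through the new name...
linked=$(cat results.1)        # ...and visible through the old one

# sort: -u drops duplicate lines, -o names the output file.
printf 'pear\napple\npear\nfig\n' > fruits
sort -u -o sorted fruits
sorted_out=$(cat sorted)

echo "$found"
echo "$diff_out"
echo "$linked"
echo "$sorted_out"
cd / && rm -r "$work"
```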
  • pwd - print working directory. Tells you which directory you are currently in.
    
         pwd
         
  • mkdir - make directory. Will create the new directory in your working directory by default.
    
         mkdir  /u/training/data
         mkdir  data2
         
  • cd - change to specified directory. May specify either the absolute or relative pathname. cd with no pathname changes to your home directory.
    
         cd  /usr/local    - change to /usr/local  
         cd  doc/training  - change to doc/training in current 
                             directory 
         cd  ..            - change to parent directory 
         cd  ~/data        - change to data directory in 
                             home directory 
         cd  ~joe          - change to user joe's home directory 
         cd                - change to home directory 
         
  • rmdir - remove directory. Directories must be empty before you remove them.
    
         rmdir  project1
         
    To recursively remove nested directories, use the rm command with the -r option:
    
         rm -r  directory_name
         
    Continue the Filesystem exercises - Part 3.
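
The directory commands fit together as in the short sketch below. It assumes mktemp is available; the directory names reuse data2 and project1 from the examples above, while nested and the shell variables are invented. The -p option of mkdir, which creates missing parent directories, is not covered in the text.

```shell
base=$(mktemp -d)
cd "$base"

mkdir data2                  # created in the working directory
mkdir -p project1/nested     # -p also creates project1 itself

cd data2
here=$(basename "$(pwd)")    # last component of the working directory
cd ..

# rmdir refuses to remove a directory that is not empty.
if rmdir project1 2>/dev/null; then
    rmdir_status=removed
else
    rmdir_status=refused
fi

rmdir data2                  # empty, so this succeeds
rm -r project1               # removes project1 and everything in it
left=$(ls -A)                # nothing should remain

echo "was in: $here"
echo "rmdir on non-empty directory: $rmdir_status"
cd / && rm -r "$base"
```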
  • A summary of commands and utilities related to the UNIX file system appears below. See the corresponding man pages for detailed information.
    
         awk      -search for and process patterns in a file
         cat      -display, or join, files
         cd       -change working directory
         chgrp    -change the group that is associated with a file
         chmod    -change the access mode of a file
         chown    -change the owner of a file
         comm     -compare sorted files
         cp       -copy files
         df       -display the amount of available disk space
         diff     -display the differences between two files
         du       -display information on disk usage
         file     -display file classification
         find     -find files
         fsck     -check and repair a file system
         grep     -search for a pattern in files
         head     -display the first few lines of a file
         ln       -create a link to a file
         lp       -print files (System V)
         lpr      -print files (Berkeley)
         ls       -list information about files
         mkdir    -create a directory
         more     -display a file one screen at a time (Berkeley)
         mv       -move and/or rename a file
         od       -dump a file
         pg       -display a file one screen at a time (System V)
         pr       -paginate a file
         pwd      -print the working directory
         rm       -remove (delete) files
         rmdir    -remove (delete) a directory
         sed      -stream editor (non-interactive)
         sort     -sort and/or merge files
         spell    -check a file for spelling errors
         tail     -display the last few lines of a file
         tar      -store or retrieve files from an archive file
         umask    -set file creation permissions
         uniq     -display the lines in a file that are unique
         wc       -count lines, words and characters in a file
         whatis   -list man page entries for a command
         whereis  -show where executable is located in path
         which    -locate an executable program using "path"
         



Standard UNIX File System

  • There is no single standard UNIX file structure. Most UNIX systems, however, follow a general convention for filesystem organization at the highest level.
    
         /(root)      - The top level directory referred to as root.  
                        Contains all files in the file system. 
    
         /bin         - Executable files for standard UNIX  
                        utilities 
    
         /dev         - Files that represent input/output devices 
    
         /etc         - Miscellaneous and system administrative  
                        files such as the password file and system  
                        start up files. 
    
         /lib         - UNIX program libraries 
    
         /tmp         - Temporary space that can be used by  
                        programs or users. 
    
         /usr/bin     - More UNIX utilities.  By convention /bin 
                        contains standard utilities and /usr/bin  
                        contains less common utilities. 
    
         /usr/bin/X11 - X windows binaries 
    
         /usr/lib     - More UNIX libraries 
    
         /usr/lib/X11 - X windows libraries 
    
         /usr/local   - Programs installed by local site 
    
         /usr/ucb     - Berkeley utilities 
    
         /u           - User home directories 
    
         /var         - Variable sized files - can grow and
                        shrink dynamically, such as users' mail
                        spool and print spool files. 
         
  • Begin the Standard UNIX Filesystem Exercises
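
Since the layout varies from system to system, one way to start is to check which of the directories listed above exist on the machine at hand. The check_dirs function below is a hypothetical helper written for this purpose, not a standard command, and its output will differ between systems.

```shell
# Report, for each path given, whether it exists as a directory.
check_dirs() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "$dir: present"
        else
            echo "$dir: not present"
        fi
    done
}

report=$(check_dirs /bin /dev /etc /lib /tmp /usr/bin /usr/lib /usr/local /usr/ucb /u /var)
echo "$report"
```

On many current systems /usr/ucb and /u are likely to be reported as not present, which illustrates that the table is a convention rather than a rule.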
