A Linux User Reference

Search tips
  • search ignores words that are less than 4 characters in length
  • searches are case insensitive
  • if a search returns nothing, try it in Boolean mode and then in Query expansion mode by checking the appropriate radio button. For example, searching for 'cron' in just the Administration category returns nothing, presumably because the 50% threshold is reached. Boolean mode ignores this threshold, so a search for 'cron' returns several hits
  • in Boolean mode preceding a word with a '+' means the result must include that word, a '-' means it must not
  • in Boolean mode '+crontab -anacron' means match articles about crontab that DO NOT mention anacron
  • to match a phrase e.g. 'manage system' check the Boolean mode radio button and enclose the phrase in quotes "some phrase ..."
  • in Query expansion mode the search context is expanded beyond the keywords you entered - relevancy of hits may well be degraded

FILES AND DIRECTORIES

Regular expressions

  • Regular Expressions

    An expression

    • A regular expression is a pattern that describes a set of strings.
    • Patterns consist of characters (literals) and metacharacters (characters with special meaning).
    Has three parts

    Position anchors (optional)

    Symbol Description
    ^ Match at beginning of a line
    $ Match at end of a line
    \< \> Delimit a word and match on word boundaries

    Character Sets

    Symbol Description
    [abc] Match any single character appearing inside []
    [^abc] Match any single character not appearing inside []
    [a-z] Match any single character in the range specified in []
    [^a-z] Match any single character not in the range specified in []
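
The anchors and character sets above can be tried out with a short script. This is a minimal sketch: the file name words.txt and its contents are made up for illustration, and the \< \> word anchors are assumed to be available, as they are in GNU grep.

```shell
#!/bin/sh
# Use the C locale so that ranges such as [a-z] follow ASCII order.
LC_ALL=C; export LC_ALL

# Hypothetical sample data, created in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'cat sat' 'concat' 'Cat 9' 'dog' > "$tmp/words.txt"

# ^ anchors the match to the start of the line.
starts=$(grep '^c' "$tmp/words.txt")

# $ anchors the match to the end of the line; [0-9] is a range set.
ends_digit=$(grep '[0-9]$' "$tmp/words.txt")

# \< and \> restrict the match to a whole word (GNU grep assumed).
word=$(grep '\<cat\>' "$tmp/words.txt")

# [^a-z] matches any single character outside the range a-z.
non_lower=$(grep '^[^a-z]' "$tmp/words.txt")

printf '%s\n' "$starts" '---' "$ends_digit" '---' "$word" '---' "$non_lower"
rm -rf "$tmp"
```

Only 'cat sat' survives the word-anchored search, because in 'concat' the letters 'cat' do not start on a word boundary.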

    Quantity modifiers (either Basic or Extended - optional)

    Basic Extended Description
    * * Match 0 or more of the character/character set that precedes it
    \? ? Match 0 or 1 instance of the character/character set that precedes it
    \+ + Match 1 or more of the character/character set that precedes it
    \{n,m\} {n,m} Match a range of occurrences of the single character/character set that precedes it
    \{n\} {n} Match exactly n occurrences
    \{n,\} {n,} Match at least n occurrences
    \{n,m\} {n,m} Match any number of occurrences in the range n to m
    \| | Match the character/character set either before or after
    \( \) ( ) Group a character/character set
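
To illustrate the Basic and Extended columns, the sketch below runs the same interval match in both syntaxes. The sample file q.txt is hypothetical. Only the interval, grouping and alternation forms are exercised here; note that \?, \+ and \| in basic syntax are GNU extensions and may not be accepted by other grep implementations.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL

# Hypothetical sample data.
tmp=$(mktemp -d)
printf '%s\n' 'ac' 'abc' 'abbc' 'abbbbc' 'xyz' > "$tmp/q.txt"

# Interval of 2 to 3 'b': escaped braces in basic syntax, bare braces with -E.
basic=$(grep 'ab\{2,3\}c' "$tmp/q.txt")
extended=$(grep -E 'ab{2,3}c' "$tmp/q.txt")

# Grouping plus alternation in extended syntax.
alt=$(grep -E '^(ac|xyz)$' "$tmp/q.txt")

# * is written the same way in both syntaxes: zero or more 'b'.
star_count=$(grep -c 'ab*c' "$tmp/q.txt")

printf '%s\n' "$basic" "$extended" "$alt" "$star_count"
rm -rf "$tmp"
```

Both interval searches select only 'abbc': 'abc' has too few and 'abbbbc' too many 'b' characters between the 'a' and the 'c'.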
  • Print lines matching a pattern
    /bin/grep, egrep, fgrep, rgrep

    Search stdin or files for lines containing a match to a regular expression (regex). Read man page for all the available options.

    Three variant programs are available:

    • 'egrep' is the same as 'grep -E'
    • 'fgrep' is the same as 'grep -F'
    • 'rgrep' is the same as 'grep -r'

    Direct invocation as either 'egrep' or 'fgrep' is deprecated though still provided for backwards compatibility.

    grep [options] pattern [file ...]
    grep [options] [-e pattern | -f file] [file ...]
    
    Common options:
    
    Matcher selectors:
     -E | --extended-regexp            Interpret PATTERN as an extended regex.
     -F | --fixed-strings              Interpret PATTERN as a list of fixed strings,
                                       separated by newlines, any of which is to be matched.
     -G | --basic-regexp               Interpret PATTERN as a basic regular expression. 
                                       Default.
     -P | --perl-regexp                Interpret PATTERN as a Perl regular expression.
                                       Experimental.
    
    Matching control:
     -e PATTERN | --regexp=PATTERN     Use PATTERN as the pattern.  Useful to protect 
                                       patterns beginning with hyphen-minus (-).
     -f FILE | --file=FILE             Obtain patterns from FILE, one per line
     -i                                Ignore case.
     -w | --word-regexp                Select only matches that form whole words; a word
                                       consists of letters, digits and the underscore.
     -x | --line-regexp                Exactly match the whole line.
     -v                                Inverts the match, lines not containing regex.
    
    Output control:
     -L | --files-without-match        Print the name of each input file which does NOT
                                       have a match.
     -l | --files-with-matches         Print the name of each input file which does have
                                       a match.
     -m NUM | --max-count=NUM          Stop reading a file after NUM matching lines.
     -o | --only-matching              Print only the matched (non-empty) parts of a 
                                       matching line.
     -s | --no-messages                Suppress error messages about non-existent or 
                                       unreadable files.
    
    File and Directory Selection:
     -a | --text                       Process a binary file as if it were text -  
                                       equivalent to '--binary-files=text'.
     --binary-files=TYPE               Tell grep how to treat a binary file.
     -D ACTION | --devices=ACTION      If an input file is a device, FIFO or socket, 
                                       use ACTION to process it.  Default ACTION is 
                                       read. ACTION = skip to silently skip.
     -d ACTION | --directories=ACTION  As above but for directories.
     --exclude=GLOB                    Skip files whose base name matches GLOB.
     --exclude-from=FILE               As above but get 'GLOBs' from a file.
     --exclude-dir=DIR                 Exclude directories matching the pattern DIR
                                       from recursive searches.
     -I                                Process a binary file as if it did not contain 
                                       matching data - skip it.
     --include=GLOB                    Search only files whose base name matches GLOB.
     -R | -r | --recursive             Read all files under each directory, recursively.
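
Several of the file selection and output options can be combined in one recursive search. The following is a sketch only: the directory layout and file names are invented for the demonstration, and --include and --exclude-dir are assumed to behave as in GNU grep.

```shell
#!/bin/sh
# Hypothetical directory tree built in a temporary location.
tmp=$(mktemp -d)
mkdir -p "$tmp/src" "$tmp/cache"
printf 'alpha\nTODO fix me\n' > "$tmp/src/a.txt"
printf 'beta\n'               > "$tmp/src/b.txt"
printf 'TODO in a log\n'      > "$tmp/src/c.log"
printf 'TODO cached\n'        > "$tmp/cache/d.txt"

# -r recurses, -l prints file names only, --include limits the search to
# *.txt files and --exclude-dir skips the cache directory.
with=$(cd "$tmp" && grep -r -l --include='*.txt' --exclude-dir=cache 'TODO' . | sort)

# -L prints the names of the files that do NOT contain a match.
without=$(cd "$tmp" && grep -r -L --include='*.txt' --exclude-dir=cache 'TODO' . | sort)

# -i ignores case, -w requires a whole word, -c counts matching lines.
count=$(grep -i -w -c 'todo' "$tmp/src/a.txt")

printf '%s\n' "$with" "$without" "$count"
rm -rf "$tmp"
```

The pattern for --include is quoted so that the shell passes it to grep unexpanded.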
    

    Test file - file1

    1234abcdefg   linux  system
    12abcccdefgh Linux rules
    ab1abdefg
    

    Match lines with at least three adjacent digits in file1

    $ grep '[0-9][0-9][0-9]' file1
    1234abcdefg   linux  system
    

    Match lines with at least two adjacent digits in file1

    $ grep '[0-9][0-9][0-9]*' file1
    1234abcdefg   linux  system
    12abcccdefgh Linux rules
                                       (or)
    $ grep '[0-9]\{2,\}' file1
    1234abcdefg   linux  system
    12abcccdefgh Linux rules
    

    Match lines with at least a single digit in file1

    $ grep '[0-9][0-9]\?' file1
    1234abcdefg   linux  system
    12abcccdefgh Linux rules
    ab1abdefg
    

    Match lines with at least two adjacent digits in file1, using '\+'

    $ grep '[0-9][0-9]\+' file1
    1234abcdefg   linux  system
    12abcccdefgh Linux rules
    

    Match lines with the word Linux or linux in file1

    $ egrep '\<[Ll]inux\>' file1
    1234abcdefg   linux  system
    12abcccdefgh Linux rules
    

    Match lines with the string 'abcc' or lines starting with 'ad' in file1

    $ egrep '(abcc)|(^ad)' file1
    12abcccdefgh Linux rules
    

    Find an earlier command with 'Rule' or 'rule' in it

    $ history | grep '[Rr]ule'
         545  grep 'rules$' file1
    

    Print only the matched string, stopping after 3 matching lines

    $ grep -o -m 3 man cmds-man.txt
    man
    man
    man
    
    $ grep -m 2 man cmds-man.txt
    Set manual pages paging program
    - Plain text manual pages require a Paging program.
    

    Print the name of all file(s) that contain the string "Man"

    $ grep -l Man *
    cmds-file-mode.txt
    cmds-man.txt
    cmds-proc.txt
    cmds-sh.txt
    grep: rootonly.txt: Permission denied
    
    $ grep -l -s Man *
    cmds-file-mode.txt
    cmds-man.txt
    cmds-proc.txt
    cmds-sh.txt
    

    '-s' suppresses error messages.

  • A stream editor
    /bin/sed
    • A stream editor that can run single or multiple editing commands on the command line or from a file/script.
    • A script used by sed does not need to be executable.

    Command usage

    $ sed --help
    
    Usage: sed [OPTION]... {script-only-if-no-other-script} [input-file]...
    
      -n, --quiet, --silent
                     suppress automatic printing of pattern space
      -e script, --expression=script
                     add the script to the commands to be executed
      -f script-file, --file=script-file
                     add the contents of script-file to the commands to be executed
      --follow-symlinks
                     follow symlinks when processing in place
      -i[SUFFIX], --in-place[=SUFFIX]
                     edit files in place (makes backup if extension supplied)
      -l N, --line-length=N
                     specify the desired line-wrap length for the `l' command
      --posix
                     disable all GNU extensions.
      -r, --regexp-extended
                     use extended regular expressions in the script.
      -s, --separate
                     consider files as separate rather than as a single continuous
                     long stream.
      -u, --unbuffered
                     load minimal amounts of data from the input files and flush
                     the output buffers more often
          --help     display this help and exit
          --version  output version information and exit
    
    If no -e, --expression, -f, or --file option is given, then the first
    non-option argument is taken as the sed script to interpret.  All
    remaining arguments are names of input files; if no input files are
    specified, then the standard input is read.
    .....
    

    Use commands in 'editcmds' file to edit afile

    $ sed -f editcmds afile
    
    $ cat editcmds
    1,10 {                          (For lines 1 through to 10)
    s/#/(/2                         (Replace the second occurrence of '#' with '(')
    s/#/)/3 }                       (Then replace the third remaining '#' with ')')
    

    NB. The comments in ( .. ) are not part of the script file.
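
As a sketch of how such a script file behaves, the example below writes a copy of the script and a one-line input file (both created in a temporary directory purely for illustration) and runs sed over them. The closing brace is placed on its own line here, which is an adjustment made for portability rather than something the script requires under GNU sed.

```shell
#!/bin/sh
tmp=$(mktemp -d)

# Script file without the explanatory comments.
cat > "$tmp/editcmds" <<'EOF'
1,10 {
s/#/(/2
s/#/)/3
}
EOF

# Hypothetical input line containing four '#' characters.
printf '%s\n' 'a#b#c#d#e' > "$tmp/afile"

result=$(sed -f "$tmp/editcmds" "$tmp/afile")
echo "$result"
rm -rf "$tmp"
```

The commands run in sequence on each line, so the second substitution counts the '#' characters that remain after the first one has run. In the sample line that is the '#' that was originally fourth.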

    Run multiple editing commands

    $ sed -e 'cmd1' -e 'cmd2' [-e '...'] [file]
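
A small concrete instance of the form above, reading from standard input. The input lines are invented for the example.

```shell
#!/bin/sh
# Two editing commands applied in order to each input line:
# the first deletes comment lines, the second replaces every 'abc'.
result=$(printf '%s\n' '# a comment' 'abc abc' 'xyz' |
  sed -e '/^#/d' -e 's/abc/xyz/g')
echo "$result"
```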
    

    Delete lines 3 through to 5 in afile

    $ sed '3,5d' afile
    

    Delete all lines in afile that start with a #

    $ sed '/^#/d' afile
    

    Back up and translate in place

    $ sed -i".orig" 'y/abc/xyz/' afile
    

    Backs up to 'afile.orig' first then translates (in place) all occurrences of 'a' to 'x', 'b' to 'y' and 'c' to 'z' in afile.
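
The backup behaviour can be checked with a throwaway file. This sketch assumes a sed that supports '-i' with an attached suffix, as GNU sed does. The file contents are made up.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' 'aabbcc' 'cab' > "$tmp/afile"

# Edit in place, keeping the original as afile.orig.
sed -i.orig 'y/abc/xyz/' "$tmp/afile"

edited=$(cat "$tmp/afile")
backup=$(cat "$tmp/afile.orig")
printf '%s\n' "$edited" '---' "$backup"
rm -rf "$tmp"
```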

    Replace all empty lines in afile with '@'

    $ sed 's/^$/@/' afile
    

    Replace all instances of '#' with '', i.e. delete

    $ sed 's/#//g' afile
    

    Replace the third occurrence of '#' on each line with ''

    $ sed 's/#//3' afile
    

    Delete all empty lines in afile

    $ sed '/^$/d' afile
    

    Replace the first occurrence of abc on each line with xyz

    $ sed 's/abc/xyz/' file2
    xyzabcabc
    

    Replace all occurrences of abc on each line with xyz

    $ sed 's/abc/xyz/g' file2
    xyzxyzxyz
    

    For lines 3 to 6, replace all 'abc' with 'xyz'

    $ sed '3,6s/abc/xyz/g' afile
    

    Delete the first 12 lines of all files in current directory

    $ sed -i '1,12d' *
    

    Delete last 6 lines of all files in current directory

    $ sed -i -n -e :a -e '1,6!{P;N;D;};N;ba' *
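
Because this command is hard to read, it is worth trying it on known input before using '-i' on real files. The sketch below feeds ten numbered lines through the same script without '-i', so nothing is modified on disk.

```shell
#!/bin/sh
# The script keeps a sliding window of lines in the pattern space: once
# more than 6 lines have been read, P prints the oldest line in the
# window, N appends the next input line and D drops the oldest line.
# The 6 lines still in the window at end of input are never printed.
kept=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 |
  sed -n -e :a -e '1,6!{P;N;D;};N;ba')
echo "$kept"
```

With GNU coreutils, 'head -n -6 afile' prints the same lines, though it does not edit the file in place.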
    

    Append text to end of afile

    $ sed -i '$a\
    \
    Back to the top of the page\
    The end.' afile
    

    Delete all div tags and their contents with a class="no_js"

    $ cat afile.txt
    abcdefg
    <div class="no_js">
          <p> ..... some text or what have you ....</p>
    </div>
    hijklmn
    <div class="no_js">
          <br />
          <p> ..... some text or what have you ....</p>
    </div>
    opqrstu
    <div class="no_js">
          <p> ..... some text or what have you ....</p>
          <p> ..... some text or what have you ....</p>
          <br />
    </div>
    vwxyz
    
    $ sed -e '/^<div class="no_js">$/,/^<\/div>$/ d' afile.txt
    abcdefg
    hijklmn
    opqrstu
    vwxyz
    
  • Swap two lines
    a detailed sed example

    When a line with class="cmdhead" is followed by a line with class="qref" swap them

    ----- From ------
    <span class="cmdhead">Group account shadow password file - \
       <a class="toglink" href="#gshadow">/etc/gshadow</a></span>
    <div class="qref" id="gshadowlnk">   
    
    ----- To ------
    <div class="qref" id="gshadowlnk">   
    <span class="cmdhead">Group account shadow password file - \
       <a class="toglink" href="#gshadow">/etc/gshadow</a></span>
    

    Lines wrapped for presentation purposes - line continuation symbol '\' used.

    The sed command

    $ sed -i -n '/cmdhead/{h;n;/class="qref/{p;x;bb;};H;x;};:b p' *.php
    

    In English'ish

    Edit file in place                                  (-i)
    Do not print lines                                  (-n)
    
    for each line
       if a line contains 'cmdhead'                      (/cmdhead/{)
          copy it to the hold space                      (h)
          read the next line                             (n)
          if the next line matches 'class="qref'            (/class="qref/{)
            print it                                        (p)
            swap its contents with the hold space           (x)
            go to label b                                   (bb)
          else                                              (})
            append it to the hold space                  (H) 
            swap the hold space into the pattern space   (x)
                                                         (})
       label:b                                           (:b)
       print the pattern space                           (p)
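
The same logic can be tested safely by writing the script with one command per line and running it without '-i' on a small sample. The sample file below is hypothetical and much simpler than the real pages.

```shell
#!/bin/sh
tmp=$(mktemp -d)
cat > "$tmp/sample.php" <<'EOF'
<span class="cmdhead">Title A</span>
<div class="qref" id="a">
<p>body</p>
<span class="cmdhead">Title B</span>
<p>not a qref</p>
EOF

# Same commands as the one-liner, one per line, printing to stdout.
swapped=$(sed -n '
/cmdhead/{
  h
  n
  /class="qref/{
    p
    x
    bb
  }
  H
  x
}
:b
p
' "$tmp/sample.php")
echo "$swapped"
rm -rf "$tmp"
```

Only the first pair is swapped; the second 'cmdhead' line is followed by an ordinary line, so both are printed in their original order. One caveat that appears to follow from how 'n' works: if a 'cmdhead' line is the very last line of a file, 'n' finds no further input and sed stops, so with '-n' that line would not be printed.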
    
  • An AWK interpreter
    /usr/bin/mawk
    • AWK is a programming language.
    • Commands can be entered on the command line as 'params' to an interpreter such as mawk or they can be written in a file and the file passed to the interpreter - usually via the '-f ' option.
    • 'mawk' is a pattern scanning and text processing language
    mawk [-W option] [-F value] [-v var=value] [--] 'program text' [file ...]
    mawk [-W option] [-F value] [-v var=value] [-f program-file] [--] [file ...]
    
    Options:
    Available with any Posix compatible implementation of AWK.
    
     -F value              Sets the field separator, FS, to value.
     -f file               Program text is read from file instead of from
                           the command line. Multiple -f options are allowed.
     -v var=value          Assigns value to program variable var.
     --                    Indicates the unambiguous end of options.
    
    Implementation specific options are prefaced with -W.  mawk provides six:
    
     -W version            Writes its version and copyright to stdout
     -W dump               Writes an assembler like listing of the internal
                           representation of the program to stdout.
     -W interactive        Sets unbuffered writes to stdout and line buffered
                           reads from stdin.  Records from stdin are lines 
                           regardless of the value of RS.
     -W exec file          Program text is read from file and this is the 
                           last option.  Useful on systems that support the
                           #!  "magic number" convention for executable scripts.
     -W sprintf=num        Adjusts the size of mawk's internal sprintf buffer to
                           num bytes.
     -W posix_space        Forces mawk not to consider '\n' to be space.
    
    The short forms -W[vdiesp] are recognised and on some systems '-We' is 
    mandatory to avoid command line length limitations.
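
A short sketch of the POSIX options in use. It calls the interpreter by the generic name 'awk' so that it runs wherever any POSIX awk is installed; 'mawk' can be substituted where available. The data file and program file names are invented for the example.

```shell
#!/bin/sh
tmp=$(mktemp -d)

# Hypothetical colon-separated data in the style of /etc/passwd.
printf '%s\n' 'root:x:0:0' 'alice:x:1000:1000' 'bob:x:1001:1001' > "$tmp/users.txt"

# -F sets the field separator; -v assigns a variable before the program runs.
names=$(awk -F: -v min=1000 '$3 >= min { print $1 }' "$tmp/users.txt")

# -f reads the program text from a file instead of the command line.
printf '%s\n' '$3 >= min { print $1 }' > "$tmp/prog.awk"
from_file=$(awk -F: -v min=1000 -f "$tmp/prog.awk" "$tmp/users.txt")

printf '%s\n' "$names" '---' "$from_file"
rm -rf "$tmp"
```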
    
  • [m]awk program structure
    • An AWK program is a sequence of pattern {action} pairs and user function definitions.
    • Either pattern or {action} may be omitted but not both.
    • If the {action} is omitted then {print} is implied; if the pattern is omitted then every input record matches.

    A pattern can be

       BEGIN
       END
       expression
       expression , expression
    

    {action} only examples

    $ mawk '{print $0}' a.txt                    (Print complete line)
    abc def ghi
    #jkl #mno #pqr 
    tuv #wxyz
    
    $ mawk '{print $2}' a.txt                    (Print 2nd. field, FS=' ')
    def
    #mno
    #wxyz
    
    $ mawk -F# '{print $2}' a.txt                (Print 2nd. field, FS='#')
    
    jkl 
    wxyz
    

    Pattern only examples

    $ mawk  '/^abc/' a.txt
    abc def ghi
    
    $ mawk '/#/' a.txt
    #jkl #mno #pqr 
    tuv #wxyz
    

    BEGIN and END patterns

    $ mawk 'BEGIN{ print "Pre file processing statements"} {print $0 }
    > END { print "Post file processing statements" }' a.txt
    Pre file processing statements
    abc def ghi
    #jkl #mno #pqr 
    tuv #wxyz
    Post file processing statements
    

    BEGIN { .... } and END { .... } pattern action pairs are processed before any line is read in and after the last line has been processed, respectively.
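
The remaining pattern forms, a plain expression and a range written as 'expression , expression', can be sketched in the same way. The generic 'awk' command name is used so the example runs with any POSIX awk, and the input file is made up.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' 'one' 'START' 'two' 'three' 'STOP' 'four' > "$tmp/r.txt"

# Expression pattern: true for every even-numbered record.
even=$(awk 'NR % 2 == 0' "$tmp/r.txt")

# Range pattern: from a record matching /START/ through to the next
# record matching /STOP/, inclusive.
range=$(awk '/START/,/STOP/' "$tmp/r.txt")

# END pattern: NR still holds the number of records read.
count=$(awk 'END { print NR }' "$tmp/r.txt")

printf '%s\n' "$even" '---' "$range" '---' "$count"
rm -rf "$tmp"
```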

  • Read in a file and store its contents in an array
    mawk

    The problem with arrays, as this example demonstrates, is that the order in which 'for (var in array)' traverses an array is not defined.

    $ mawk 'BEGIN { while ( (getline val) > 0 ) lines[val] ;
    > }
    > END { for ( var in lines ) print var;
    > }' dummy.txt
    2. second line
    5. fifth line
    1. first line
    3. third line
    4. fourth line
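
One common way around the undefined traversal order is to index the array by record number and loop over the indices numerically. This is a sketch, not the only approach. It uses the generic 'awk' command name and a three-line stand-in for dummy.txt.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' '1. first line' '2. second line' '3. third line' > "$tmp/dummy.txt"

# Store each line under its record number, then print by counting
# from 1 to NR instead of using 'for (var in lines)'.
ordered=$(awk '{ lines[NR] = $0 }
END { for (i = 1; i <= NR; i++) print lines[i] }' "$tmp/dummy.txt")

echo "$ordered"
rm -rf "$tmp"
```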
    
  • Defining and using a function
    mawk

    Functions are defined outside any pattern's { action } braces

    $ mawk 'BEGIN { while ( (getline val) > 0 ) array[val];
    }
    
    function printarray(starttxt, endtxt) {
       print starttxt
       for (var in array) {
           print var
       }
       print endtxt
       return "printarray() done ..."  
    }
    
    END { res=printarray("starting ..", "ending ..")
          print res
    }' dummy.txt
    starting ..
    2. second line
    5. fifth line
    1. first line
    3. third line
    4. fourth line
    ending ..
    printarray() done ...
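
The function above works on global variables. AWK has no local variable declarations, but extra parameters that the caller does not pass act as locals. The sketch below uses a hypothetical total() function to show the convention; the generic 'awk' command name is used, and the extra spaces in the parameter list are only a visual hint.

```shell
#!/bin/sh
result=$(printf '%s\n' 3 4 5 | awk '
# total() sums the first n elements of arr; i and sum are extra
# parameters, which is the usual AWK convention for local variables.
function total(arr, n,    i, sum) {
    sum = 0
    for (i = 1; i <= n; i++) sum += arr[i]
    return sum
}
{ values[NR] = $1 }
END { i = "untouched"; print total(values, NR), i }')
echo "$result"
```

The global 'i' set in the END block keeps its value, because the 'i' inside total() is a separate, local name.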