Linux: (g) awk – search and replace.

January 29th, 2010 Jimmy

awk or gawk (GNU awk)

Find and Replace text, database sort/validate/index

Syntax

      awk <options> 'Program' Input-File1 Input-File2 ...

      awk -f PROGRAM-FILE <options> Input-File1 Input-File2 ...

Key
 -F FS
 --field-separator FS
     Use FS for the input field separator (the value of the `FS'
     predefined variable).

 -f PROGRAM-FILE
 --file PROGRAM-FILE
     Read the `awk' program source from the file PROGRAM-FILE, instead
     of from the first command line argument.

 -mf NNN
 -mr NNN
     The `f' flag sets the maximum number of fields, and the `r' flag
     sets the maximum record size.  These options are ignored by
     `gawk', since `gawk' has no predefined limits; they are only for
     compatibility with the Bell Labs research version of Unix `awk'.

 -v VAR=VAL
 --assign VAR=VAL
     Assign the variable VAR the value VAL before program execution
     begins.

 -W traditional
 -W compat
 --traditional
 --compat
     Use compatibility mode, in which `gawk' extensions are turned off.

 -W lint
 --lint
     Give warnings about dubious or non-portable `awk' constructs.

 -W lint-old
 --lint-old
     Warn about constructs that are not available in the original
     Version 7 Unix version of `awk'.

 -W posix
 --posix
     Use POSIX compatibility mode, in which `gawk' extensions are
     turned off and additional restrictions apply.

 -W re-interval
 --re-interval
     Allow interval expressions in regexps.

 -W source=PROGRAM-TEXT
 --source PROGRAM-TEXT
     Use PROGRAM-TEXT as `awk' program source code.  This option allows
     mixing command line source code with source code from files, and is
     particularly useful for mixing command line programs with library
     functions.

 --
     Signal the end of options.  This is useful to allow further
     arguments to the `awk' program itself to start with a `-'.  This
     is mainly for consistency with POSIX argument parsing conventions.

'Program'
     A series of patterns and actions: see below

Input-File
     If no Input-File is specified then `awk' applies the Program to
     "standard input" (the piped output of some other command, or the
     terminal). Typed input continues until end-of-file (typing `Control-d').
Basic functions

The basic function of awk is to search files for lines (or other units of text) that contain a pattern. When a line matches, awk performs a specific action on that line.

The Program tells `awk' what to do; it consists of a series of "rules". Each rule specifies one pattern to search for, and one action to perform when that pattern is found.

For ease of reading, each line in an `awk' program is normally a separate rule, like this:

     pattern { action }
     pattern { action }
     ...

e.g. Display lines from my_file containing the string "123" or "abc" or "some text":

awk '/123/ { print $0 }
     /abc/ { print $0 }
     /some text/ { print $0 }' my_file

A regular expression enclosed in slashes (`/') is an `awk' pattern that matches every input record whose text contains a match for the expression. e.g. the pattern /foo/ matches any input record containing the three characters `foo', *anywhere* in the record.
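
A small sketch of the "anywhere in the record" behaviour, next to an anchored version for comparison. The input lines and the variable names are invented for illustration. Both rules omit the action, in which case awk prints the matching record.

```shell
# Hypothetical three-line input.
input=$(printf 'foo\nseafood\nbar\n')

# /foo/ matches "foo" anywhere in the record, so "seafood" matches too.
# With no action given, awk prints the whole matching record.
anywhere=$(printf '%s\n' "$input" | awk '/foo/')

# Anchoring with ^ and $ restricts the match to records that are exactly "foo".
exact=$(printf '%s\n' "$input" | awk '/^foo$/')

echo "$anywhere"
echo "$exact"
```

The unanchored pattern should print foo and seafood; the anchored one should print only foo.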

`awk' patterns may be one of the following:

/Regular Expression/        - Match
Pattern && Pattern          - AND
Pattern || Pattern          - OR
! Pattern                   - NOT
Pattern ? Pattern : Pattern - If, Then, Else
Pattern1, Pattern2          - Range, start to end
BEGIN                       - Perform action BEFORE input file is read
END                         - Perform action AFTER input file is read
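
The pattern forms listed above can be tried out on a tiny input. This is only a sketch: the numbers and the variable names (`nums`, `and_out` and so on) are made up for the example.

```shell
# Hypothetical input: one number per line, 1 through 6.
nums=$(printf '1\n2\n3\n4\n5\n6\n')

# AND: records greater than 2 and less than 5.
and_out=$(printf '%s\n' "$nums" | awk '$1 > 2 && $1 < 5')

# NOT: every record that does not match the regexp /[135]/.
not_out=$(printf '%s\n' "$nums" | awk '!/[135]/')

# Range: from the first record matching /2/ through the next one matching /4/.
range_out=$(printf '%s\n' "$nums" | awk '/2/, /4/')

# BEGIN and END run once each, before and after all input is read.
sum_out=$(printf '%s\n' "$nums" |
    awk 'BEGIN { print "start" } { s += $1 } END { print "sum: " s }')

printf '%s\n' "$and_out" "$not_out" "$range_out" "$sum_out"
```

Here the AND rule should select 3 and 4, the NOT rule 2, 4 and 6, the range 2 through 4, and the last command should print start followed by sum: 21.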

In addition to simple pattern matching, `awk' has a wide range of text and arithmetic functions, variables and operators.
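
Among the text functions, sub() and gsub() are the usual way to do the search and replace named in the title. The sketch below is a minimal illustration; the sample sentence and the variable names are assumptions made for the example.

```shell
line='the cat sat on the mat'

# sub() replaces only the first match in the target (by default $0,
# the whole record).
first=$(echo "$line" | awk '{ sub(/the/, "a"); print }')

# gsub() replaces every match and returns the number of replacements made.
all=$(echo "$line" | awk '{ n = gsub(/the/, "a"); print n ": " $0 }')

# A third argument limits the replacement to one field, here $2.
field=$(echo "$line" | awk '{ gsub(/a/, "A", $2); print }')

printf '%s\n' "$first" "$all" "$field"
```

awk does not edit the input file. To keep the result, redirect the output to a new file.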

`gawk' will ignore newlines after any of the following:

    , { ? : || && do else
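
As a sketch of what this allows, the rule below is split across several lines after `&&` and after `{` and is still read as one rule. The input values and the name `out` are made up for the example.

```shell
# The newline after && continues the pattern, and the newline
# after { continues the action, so this is a single rule.
out=$(printf '5\n15\n25\n' | awk '$1 > 10 &&
$1 < 20 {
    print "in range: " $1
}')
echo "$out"
```

Only the middle value falls between 10 and 20, so this should print in range: 15.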

Comments start with a `#' and continue to the end of the line:

 # This program prints a nice friendly message

Examples

This program prints the length of the longest input line:

 awk '{ if (length($0) > max) max = length($0) }
      END { print max }' data

This program prints every line that has at least one field. This is an easy way to delete blank lines from a file (or rather, to create a new file similar to the old file but from which the blank lines have been deleted).

 awk 'NF > 0' data

This program prints seven random numbers from zero to 100, inclusive.

 awk 'BEGIN { for (i = 1; i <= 7; i++)
                print int(101 * rand()) }'

This program prints the total number of bytes used by FILES.

 ls -l FILES | awk '{ x += $5 } ; END { print "total bytes: " x }'

This program prints a sorted list of the login names of all users.

 awk -F: '{ print $1 }' /etc/passwd | sort

This program counts lines in a file.

 awk 'END { print NR }' data

This program prints the even numbered lines in the data file. If you were to use the expression `NR % 2 == 1' instead, it would print the odd numbered lines.

 awk 'NR % 2 == 0' data