2.0.0 The Bash Shell

The Bash shell, or Bourne Again Shell, is the GNU version of the stock Bourne shell (sh) with GNU extensions. To date, Bash is the most popular Unix shell.

Bash is used as the starting point for the examples because it has a low barrier to entry and at the same time is pivotal to the system. What many Unix users are not aware of (but administrators, developers and home users know all too well) is that the system shell is used to bring up and manage most services on a Unix system. This makes looking at shell scripting examples a little more interesting, because knowledge gained here can be leveraged elsewhere.

2.0.1 Relatively Small Bash Scripts

2.0.1.i. Qtop

A shell script can do an amazing amount of work with a relatively small amount of typing. The following script from Shelldorado (http://shelldorado.com/) gets a list of the top 15 processes and refreshes it every 5 seconds:

        #!/bin/bash
        DISPPROC=15   # Number of processes to display
        DELAY=5       # Seconds between refreshes
        clear
        while true
        do
                clear
                echo "----------------------------------------------------------------------------"
                echo "                                  Top Processes"
                uname -a
                uptime
                date
                echo "----------------------------------------------------------------------------"
                /bin/ps aux | head -$DISPPROC
                sleep $DELAY
        done

Amazingly small yet it creates a nice display:

----------------------------------------------------------------------------
                                  Top Processes
Linux adm102 2.6.16.46-0.12-default #1 Thu May 17 14:00:09 UTC 2007 x86_64 x86_64 x86_64 GNU/Linux
 1:08pm  up 61 days 5:32, 4 users, load average: 0.49, 0.76, 0.70
Wed Nov 26 13:08:51 EST 2008
----------------------------------------------------------------------------

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0    784    76 ?        S    Sep26  11:06 init [3]  
root         2  0.0  0.0      0     0 ?        SN   Sep26   0:00 [ksoftirqd/0]
root         3  0.0  0.0      0     0 ?        S<   Sep26   0:04 [events/0]
root         4  0.0  0.0      0     0 ?        S<   Sep26   0:00 [khelper]
root         5  0.0  0.0      0     0 ?        S<   Sep26   0:00 [kthread]
root         7  0.0  0.0      0     0 ?        S<   Sep26   0:02 [kblockd/0]
root         8  0.0  0.0      0     0 ?        S<   Sep26   0:00 [kacpid]
root        91  0.0  0.0      0     0 ?        S    Sep26  11:28 [pdflush]
root        94  0.0  0.0      0     0 ?        S<   Sep26   0:00 [aio/0]
root        93  0.0  0.0      0     0 ?        S    Sep26   8:33 [kswapd0]
root       302  0.0  0.0      0     0 ?        S<   Sep26   0:00 [cqueue/0]
root       303  0.0  0.0      0     0 ?        S<   Sep26   0:00 [kseriod]
root       334  0.0  0.0      0     0 ?        S<   Sep26   0:00 [kpsmoused]
root       674  0.0  0.0      0     0 ?        S<   Sep26   0:00 [scsi_eh_0]

2.0.1.ii MD5 Script

The next script generates an md5 checksum of a file specified on the command line using the md5 binary. First a look at the surrounding pieces of the script; comments are included where they are needed to explain what each piece does. Also note that functions are being used for the first time.

        progname=${0##*/} # Capture the name of the script
        toppid=$$ # Capture the PID
        trap "exit 1" 1 2 3 15 # Trap on these exits

        # We use this routine to bail out. Print our error then 
        # kill ourselves with great discrimination.
        bomb()
        {
                cat >&2 <<ERRORMESSAGE

        ERROR: $@
        *** ${progname} aborted! ***
        ERRORMESSAGE
                kill ${toppid}
                exit 1
        }

Pretty simple so far. Now the next two functions: one generates the hash and the other prints a usage message:

        # Based on the number being passed either generate a quiet sum or loud one

        DoGen()
        {
                if [ $3 -ne "0" ]; then
                        md5 $1 > $2 || bomb "Could not gen md5 for ${1}"
                else
                        md5 $1 | awk '{print $4}' > $2 || bomb "Could not gen md5 for ${1}"
                fi
        }
        # Print a usage message
        Usage()
        {
                cat <<__usage__
        Usage: ${progname} [-f file-to-use] [-F]
        Arguments:
            -f The file to get a signature from
            -o Output file (md5 is the default)
        Flags:
            -F Create a signature using all of md5 output
        __usage__
        }

Note the usage message is printed atomically, that is, all at once with a single cat and a here document (which is also why the body and the terminating __usage__ marker are not indented any further). There is a logical reason for this: each echo, print, printf and so on costs at least one separate command and write, while one atomic print uses a single one. While not a big deal now, over time such a practice can add up to time saved.
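As a quick aside (a sketch only, with made-up message text), compare a message printed with several echo statements against the same message handed to a single cat and here document:

        # One command, and at least one write, per line
        echo "Usage: example [-f file]"
        echo "Arguments:"
        echo "    -f The file to use"

        # One atomic print - the whole block goes out through a single cat
        cat <<__msg__
        Usage: example [-f file]
        Arguments:
            -f The file to use
        __msg__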

Now on to the main part of the script. Here the options are parsed in a case statement to assign the input file, the output file for the sum, and which type of sum to generate:

        FULLMD5FILE=0
        OUTFILE="md5"
        while [ "$#" -gt "0" ]
        do
        opt="${1//-}"; pt=`cat $opt | cut -c 1`
        case $1 in
                F) FULLMD5FILE=1 ;;
                f)      shift ; SRCFILE=$1 ;;
                o) shift ; OUTFILE=$1 ;;
                u) Usage ; exit 0  ;;
                *) echo "Syntax Error" ; Usage ; exit 1  ;;
        esac
        shift
        done

        DoGen $SRCFILE $OUTFILE $FULLMD5FILE

        exit 0

One little oddity about shell scripting: when calling functions, the function_name(parameter list...) nomenclature is not used; arguments are simply listed after the function name and picked up inside the function as $1, $2 and so on (a brief sketch follows the full listing below). Programmers coming from other languages might find this a little hard to get used to at first. Now the whole listing:

        progname=${0##*/} # Capture the name of the script
        toppid=$$ # Capture the PID
        trap "exit 1" 1 2 3 15 # Trap on these exits

        # We use this routine to bail out. Print our error then
        # kill ourselves with great discrimination.
        bomb()
        {
                cat >&2 <<ERRORMESSAGE

        ERROR: $@
        *** ${progname} aborted! ***
        ERRORMESSAGE
                kill ${toppid}
                exit 1
        }

        DoGen()
        {
                if [ $3 -ne "0" ]; then
                        md5 $1 > $2 || bomb "Could not gen md5 for ${1}"
            else
                        md5 $1 | awk '{print $4}' > $2 || bomb "Could not gen md5 for ${1}"
                fi
        }

        Usage()
        {
            cat <<__usage__
        Usage: ${progname} [-f file-to-use] [-F]
        Arguments:
            -f The file to get a signature from
            -o Output file (md5 is the default)
        Flags:
            -F Create a signature using all of md5 output
        __usage__
        }

        FULLMD5FILE=0
        OUTFILE="md5"
        while [ "$#" -gt "0" ]
        do
        opt="${1//-}"; pt=`cat $opt | cut -c 1`
        case $1 in
            F) FULLMD5FILE=1 ;;
            f)  shift ; SRCFILE=$1 ;;
            o) shift ; OUTFILE=$1 ;;
            u) Usage ; exit 0  ;;  
            *) echo "Syntax Error" ; Usage ; exit 1  ;;  
        esac
        shift
        done
   
        DoGen $SRCFILE $OUTFILE $FULLMD5FILE
   
        exit 0
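To make the calling convention concrete, here is a minimal sketch (separate from the md5 script; the names are placeholders) of defining a function and passing it arguments positionally:

        # Define a function; note there is no parameter list in the parentheses
        greet()
        {
                name=$1   # First positional argument
                msg=$2    # Second positional argument
                echo "${msg}, ${name}"
        }

        # Call it: arguments simply follow the function name, separated by spaces
        greet world Hello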

2.0.2 Bash and I/O

One of the jobs of the Unix shell is to navigate and work with Unix filesystems. Naturally, if one of the jobs of the shell is to work with filesystems and files, then shell scripting is an ideal solution to the same end. The next two scripts tackle two distinctly different topics:

  1. Finding header files without taxing a system.
  2. Checking Network Filesystem(s)

2.0.2.i. Finding Header Files

Once upon a time there was a sysadmin who wanted to be able to quickly look for header files without running the find command across too many filesystems, and so the hfind script was born. Outside of the core function the hfind script comes packed with the usual pieces plus two extra globals: where to look and the maximum number of passes, that is, when to stop looking across the $ldpaths string:

        program=${0##*/}
        toppid=$$
        ldpaths="/usr/include /usr/local/include /opt /usr"
        passmax=4
        trap "exit 1" 1 2 3 15

        bomb()
        {
                cat >&2 <<ERRORMESSAGE

        ERROR: $@
        *** ${program} aborted ***
        ERRORMESSAGE
                kill ${toppid}      # in case we were invoked from a subshell
                exit 1
        }

        usage()
        {
                cat <<usage
        Usage: ${program} [option arg][option]
        Usage: ${program} [-f filename][-p passes] [-u]
        Passes: Number of different paths to try and pass through.
        usage
        }

The next helper function simply prints out where the search will look and some advice if someone needs to add (or subtract) paths:

        showpaths()
        {
                echo "Current paths searched (in order):"
                for pth in ${ldpaths} ; do
                        echo ${pth}
                done
                echo "Modify the \$ldpaths variable at the top of the script if you like"
        }

Now take a look at the main loop, not altogether different from previous examples but formatted to be a little more compact. This shows the flexibility of the shell: scripts are interpreted literally, exactly as they are fed in:

        passes=1
        verbose=0
        while [ "$#" -gt "0" ]
        do
                opt="${1//-}"
                opt=$(echo "${opt}" | cut -c 1 2>/dev/null)
                case $opt in
                        f) shift;filename=$1;;
                        p) shift;passes=$1;;
                        s) showpaths;exit 0;;
                        u) usage;exit 0;;
                        v) verbose=1;;
                        *) usage;exit 1;;
                esac
                shift
        done

        if [ ! $filename ]; then
                echo "Error: No filename specified"
                usage
                exit 1
        fi

        search $passes $filename

Finally, on to the core function of the script: the actual search, which is relatively simple - it goes to the top of each path in turn and runs find within that tree only:

        search()
        {
                pass=$1
                file=$2

                cnt=0

                for pth in ${ldpaths} ; do
                        cnt=$(($cnt+1))
                        if [ "$verbose" -gt "0" ]; then
                                echo "Searching for ${file} in ${pth}"
                        fi

                        find $pth -type f -name $file
                        if [ "$cnt" -eq "$pass" ]; then
                                break;
                        elif [ "$cnt" -eq "$passmax" ]; then
                                echo "Search exhuasted"
                                exit 0
                        fi

                done

                return $?
        }

It is worth noting the pattern in how the hfind script was presented. Outside of the main portion of the script, everything else was considered a helper function. The core function is really the key to the entire script; it is a central axiom of Unix programming that much of a program exists outside the core algorithm either to support it or, in the case of the main routine, to gather the information the core algorithm needs. This idea of doing one thing well will continue to crop up throughout this book. Now for the finished program:

        program=${0##*/}
        toppid=$$
        ldpaths="/usr/include /usr/local/include /opt /usr"
        passmax=4

        trap "exit 1" 1 2 3 15

        #-----------------------------------------------------------------------------
        # bomb - Simple death routine; display ERRORMESSAGE, kill toppid and exit.
        #
        # requires: ERRORMESSAGE
        # returns : exit 1
        #-----------------------------------------------------------------------------
        bomb()
        {
                cat >&2 <<ERRORMESSAGE

        ERROR: $@
        *** ${program} aborted ***
        ERRORMESSAGE
                kill ${toppid}      # in case we were invoked from a subshell
                exit 1
        }

        #-----------------------------------------------------------------------------
        # search - Search $ldpaths
        #
        # requires: Npass
        # returns : $?
        #-----------------------------------------------------------------------------
        search()
        {
                pass=$1
                file=$2

                cnt=0

                for pth in ${ldpaths} ; do
                        cnt=$(($cnt+1))
                        if [ "$verbose" -gt "0" ]; then
                                echo "Searching for ${file} in ${pth}"
                        fi

                        find $pth -type f -name $file
                        if [ "$cnt" -eq "$pass" ]; then
                                break;
                        elif [ "$cnt" -eq "$passmax" ]; then
                                echo "Search exhuasted"
                                exit 0
                        fi

                done

                return $?
        }

        #-----------------------------------------------------------------------------
        # usage - simple usage print
        #-----------------------------------------------------------------------------
        usage()
        {
                cat <<usage
        Usage: ${program} [option arg][option]
        Usage: ${program} [-f filename][-p passes] [-u]
        Passes: Number of different paths to try and pass through.
        usage
        }

        #-----------------------------------------------------------------------------
        # showpaths - print all  of the currently searched paths
        #-----------------------------------------------------------------------------
        showpaths()
        {
                echo "Current paths searched (in order):"
                for pth in ${ldpaths} ; do
                        echo ${pth}
                done
                echo "Modify the \$ldpaths variable at the top of the script if you like"
        }

        passes=1
        verbose=0
        while [ "$#" -gt "0" ]
        do
                opt="${1//-}"
                opt=$(echo "${opt}" | cut -c 1 2>/dev/null)
                case $opt in
                        f) shift;filename=$1;;
                        p) shift;passes=$1;;
                        s) showpaths;exit 0;;
                        u) usage;exit 0;;
                        v) verbose=1;;
                        *) usage;exit 1;;
                esac
                shift
        done

        if [ ! $filename ]; then
                echo "Error: No filename specified"
                usage
                exit 1
        fi

        search $passes $filename

There are a lot more comments in the full script. Good commenting, even if it is as rudimentary as simply explaining what a section of code is doing, is as important for the author as it is for someone who may wish to modify and use the script later.

2.0.2.ii Check NFS

The next example is not an entire program... yet - it will be revisited at the end of this section. Instead it is only a set of functions which use a few tricks. The goal is to check whether the count of NFS filesystems actually mounted matches a known minimum, by comparing the NFS mounts that should be mounted (according to /etc/fstab) against the NFS mounts that are mounted. The trick? They need to be checked on a remote host. There is one assumption: ssh public key authentication is working for the account being used. As with previous examples, detailed commenting is used to bypass lengthy discussion:

        chknfs()
        {       
        host=$1  # Pass in the name or IP of the host

        # SSH to the remote host and grep out of /etc/fstab any line
        # that is not a comment and is a nfs filesystem ONLY. 
        nfs_fstab=$(ssh $host  grep "^[^#].*nfs" /etc/fstab |  
                                awk '{print $2}' 2>/dev/null)

        # Check the ACTUAL number of NFS mounts currently mounted using
        # the mount command: DO NOT USE df - it likes to hang
        nfs_mount=$(ssh $host mount | grep nfs | awk '{print $3}' 2>/dev/null)

        # Compare the mounts (1 for 1) to see if a NFS mount is mounted more
        # than once or is not mounted at all
        for i in $nfs_fstab; do  # For every valid nfs mount in fstab
            matches=0        # no matches yet
                for j in $nfs_mount; do # Increment matches if we find it
                                if [ $i ==  $j ]; then
                                        matches=$(($matches+1))
                                fi
                        done

                # Print out an alarm if needed based on the number of
                # matches found (or not as the case may be)
                # Ruh-roh - no matches - we are missing one
                if [ "$matches" -eq 0 ]; then
                        cnerror $host "nfs: ${i} not mounted"
                        return 1
                # Crap; must be an old version of nfs... too many mounts
                elif [ "$matches" -gt 1 ]; then
                        cnerror $host "nfs: ${i} is mounted multiple times"
                        return 2
                fi
        done

        return 0
        }

The observant will note there is a lot more that could be done with the above function - for instance, what if the mount settings need to be checked? What if the NFS version needs to be checked? All in all, however, it is easy to see how powerful shell programming can be, especially for diagnostics.
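As a taste of such an extension, the sketch below reports the negotiated NFS version for each remote mount. It is an illustration only (not part of the final program) and assumes GNU grep for the -o flag and that the remote mount output lists options such as vers=:

        chknfsvers()
        {
        host=$1  # Pass in the name or IP of the host

        # List the mounted NFS filesystems and pull out any vers= option
        ssh $host mount | grep nfs | while read line; do
                mnt=$(echo "$line"  | awk '{print $3}')
                vers=$(echo "$line" | grep -o 'vers=[0-9.]*')
                echo "${host} ${mnt}: ${vers:-version not reported}"
        done

        return 0
        }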

2.0.3 Bash Network and OS Scripts

Even though the nfs check uses SSH, a better script that deals with networking would be nice, as would an example of how shell scripts interact closely with operating system services. Those are the next and final scripts before the full program.

2.0.3.i Remote Sync Script

The rsync utility can be used to synchronize a source and destination directory either on the same machine or across the network using some sort of transport mechanism. For this example the secure shell protocol is used again - the assumption about keys is not mandatory; if passwords are permitted the script will simply prompt for a password as soon as the synchronization begins. Unlike previous scripts the main concern is providing as much information as possible to rsync in order to simplify an rsync operation of this type. Additionally the POSIX getopts built-in is used, which means there will be an extra safety check. Here is the top part of the script:

        #!/bin/sh
        progname=${0##*/}
        toppid=$$
        PROTO=ssh
        UTIL=rsync
        UTIL_FLAGS="-az --delete -e $PROTO"
        bomb()
        {
                cat >&2 <<ERRORMESSAGE

        ERROR: $@
        *** ${progname} aborted ***
        ERRORMESSAGE
                kill ${toppid}
                exit 1
        }
        # We check to see if all the needed binaries are in PATH
        for i in $PROTO $UTIL
        do
                if ! type ${i} >/dev/null; then
                        bomb "${i} not found"
                fi
        done
        # make sure our shell supports getopt
        if ! type getopts >/dev/null 2>&1; then
                bomb "/bin/sh shell is too old; try ksh or bash"
        fi

Since some non-standard tools are being used, a few more safety checks are piled on. Once all of that is done, it is a matter of building up the needed command string by parsing the input and passing all of the information along to the command:

        while getopts s:d: ch; do
                case ${ch} in
                        s) SRC=${OPTARG};;
                        d) DST=${OPTARG};;
                esac
        done
        shift $((${OPTIND} - 1))

        $UTIL $UTIL_FLAGS $SRC $DST ||
                bomb "Could not run ${UTIL} ${UTIL_FLAGS} ${SRC} ${DST}"

        exit 0

Simple and sweet and ironically, all it is doing is the following:

        #
        # rsync -az --delete -e ssh hostname_or_ip:/path/to/sync/src /path/to/dst
        #

And reducing that to:

        #
        # script_name host:/path /path/to/dst
        #

Which generally is a lot easier to read in the crontab.
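For example, a nightly crontab entry using the wrapper might look like the following (the script name and paths are placeholders only):

        #
        # 0 2 * * * /usr/local/bin/syncwrap -s backuphost:/export/projects -d /srv/mirror/projects
        #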

2.0.3.ii Service Script

Several Linux distributions ship with a neat utility called service. All this utility does is locate and execute an init script with any supplied arguments. I have found this utility to be very handy, and it is not present on BSD or on several other Linux distributions, so of course I wrote my own based on a version from Red Hat.

In order to accommodate multiple platforms the script looks for the init directory; there should be only one:

        for dir in /etc/rc.d /sbin/init.d /etc/init.d ; do # set init dir
                if [ -d "${dir}" ];then
                        SERVICEDIR=${dir}
                        break
                fi
        done

        if [ ! "${SERVICEDIR}" ]; then
                echo "No init script directory found" && exit 1
        fi

Next check to make sure there are arguments:

        if [ $# -eq 0 ]; then
                echo $"${USAGE}" >&2
                exit 1
        fi

Go somewhere safe then parse out the arguments. If a valid operation was passed attempt to perform the operation, otherwise parse other arguments or error out:

        cd / # go someplace safe
        while [ $# -gt 0 ]; do
                case "${1}" in
                     --help | -h | --h* ) # Need help?
                             echo $"${USAGE}" >&2
                             exit 0
                             ;;
                     --list | -l | --l* ) # Lets see what is in the init dir
                             cd $SERVICEDIR && ls
                             exit 0
                             ;;
                     --version | -V ) # What version is this?
                             echo $"${VERSION}" >&2
                             exit 0
                             ;;
                     *)
                             SERVICE="${1}" # Try to perform the service op
                             COMMAND="${2}"
                             if [ -x "${SERVICEDIR}/${SERVICE}" ]; then
                                     env -i LANG=$LANG PATH=$PATH TERM=$TERM \
                                      "${SERVICEDIR}/${SERVICE}" ${COMMAND} || exit 1

                             else
                                     echo $"${SERVICE}: unrecognized service" >&2
                                     exit 1
                             fi
                             ;;
             esac
        done

        exit 2

Following is the full listing plus the missing information from the top:

        #!/bin/sh
        # Script ----------------------------------------------------------------------
        # service - A slightly enhanced version of the redhat service script.
        #           Supports several different service locations.
        # 
        #------------------------------------------------------------------------------
        PATH="/sbin:/usr/sbin:/bin:/usr/bin:/usr/X11R6/bin" ; export PATH
        VERSION="`basename $0` ver. 0.91"
        USAGE="Usage: `basename $0` < option > | [ service_name command ]"
        SERVICE=
        COMMAND=

        for dir in /etc/rc.d /sbin/init.d /etc/init.d ; do # set init dir
                if [ -d "${dir}" ];then
                        SERVICEDIR=${dir}
                        break
                fi
        done

        if [ ! "${SERVICEDIR}" ]; then
                echo "No init script directory found" && exit 1
        fi

        if [ $# -eq 0 ]; then
                echo $"${USAGE}" >&2
                exit 1
        fi

        cd /
        while [ $# -gt 0 ]; do
                case "${1}" in
                        --help | -h | --h* )
                                echo $"${USAGE}" >&2
                                exit 0
                                ;;
                        --list | -l | --l* )
                                cd $SERVICEDIR && ls
                                exit 0
                                ;;
                        --version | -V )
                                echo $"${VERSION}" >&2
                                exit 0
                                ;;
                        *)
                                SERVICE="${1}"
                                COMMAND="${2}"
                                if [ -x "${SERVICEDIR}/${SERVICE}" ]; then
                                        env -i LANG=$LANG PATH=$PATH TERM=$TERM \
                                         "${SERVICEDIR}/${SERVICE}" ${COMMAND} || exit 1
                                else
                                        echo $"${SERVICE}: unrecognized service" >&2
                                        exit 1
                                fi
                                ;;
                esac
        done

        exit 2

Not too difficult and easily ported from one platform to the next.

2.0.4 Bash Program: cnchk

Expanding on the example from the Filesystem and I/O section, the program in this example runs a battery of tests against another Linux host (or hosts). Keeping the central algorithm axiom in mind, the heart of cnchk (compute node check) is the ability to call the battery of tests easily and to add to them with relative ease.

To keep the size of the program to a minimum we are performing three diagnostic tests on each system:

  1. The nfs check
  2. Make sure root is readable
  3. Check to see if any local filesystems are at 100 percent

The nfs check is already done, so here are the other two checks heavily commented:

        chkdsk() # The name of the function
        {
        host=$1 # Checking this host

        # Get fs usage in percentages from local filesystems ONLY
        # (if there are nfs problems they should not interfere)
        fsperc=$(ssh $host df -l | grep -v Use | awk '{ print $5 }' 2>/dev/null)

        # Check each local filesystem percentage to see if it is 100%
        for i in $fsperc; do
                if [ ${i} = "100%" ]; then
                        echo $i
                        echo -n "filesystem: a local filesystem is full"
                        return 1 # We hit a snag - let the caller know
                fi
        done

        return 0 # We are okay
        }

        chkwrite()
        {
        host=$1 # Checking this host

        ssh $host touch /tmp/nodechk.$$ # See if we can create a file

        # If we couldn't create a file throw an alarm 
        if [ $? -gt 0 ]; then
                echo "${host} filesystem: Read only root filesystem"
                return 1 # Return bad news
        else
                ssh $host rm /tmp/nodechk.$$ # Otherwise clean up
        fi

        return 0 # we are okay
        }

A few things are worth noting. The names of the check functions are consistent (albeit lame) - this is on purpose. In the larger listing it helps to differentiate helper functions from the test cases. Now, armed with the common bomb routine, all this program needs is the parser. To mix things up, let's create the program in such a way that it can handle either one host, a range of hosts with a common name, or a comma delimited list of hosts:
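One note before moving on: the chknfs check calls a cnerror helper that is never shown in this section. Presumably it just reports a problem against a host; a minimal stand-in (an assumption, not the original) could be as simple as:

        # cnerror - report a problem for a host (minimal stand-in)
        cnerror()
        {
        host=$1   # The host the problem was found on
        shift
        echo "${host} $*" >&2
        }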

        # Input parsing - the usage explains what each one does

        OPER=""
        if [ $# -gt 0 -a "$1" = "-h" ];then
        usage
        exit 0
        fi
        if [ $# -gt 0 -a "$1" = "--help" ];then
        usage
        exit 0
        fi

        while [ "$#" -gt "0" ]; do
        case $1 in
        # Create a list of hosts as n1,n2,n3
        --node=*)
                NODELIST="${1#*=}"
                NODELIST="${NODELIST//,/ }"
                OPER="list"
                ;;
        --node|-n)
                NODELIST="$2"
                shift
                NODELIST="${NODELIST//,/ }"
                OPER="list"
                ;;
        # Create a range of hosts as hostname[n-N]
        --range=*)
                RANGE="${1#*=}"
                RANGE="${RANGE//-/ }"
                RANGE="${RANGE//[a-z]/ }"
                PREFIX="${1#*=}"
                PREFIX="${PREFIX%%[0-9]*}"
                OPER="range"
                ;;
        --range|-r)
                RANGE="$2"
                RANGE="${RANGE//-/ }"
                RANGE="${RANGE//[a-z]/ }"
                PREFIX="${2#*=}"
                PREFIX="${PREFIX%%[0-9]*}"
                OPER="range"
                shift
                ;;
       # Default - just enter a hostname
       *)
                NODELIST="${1}"
                NODELIST="${NODELIST//,/ }"
                OPER="list"
                ;;
        esac
        shift
        done

The program just got a lot more complex. There are essentially two operating modes - a list of nodes or a range of nodes - which have to be dealt with differently. The nodelist can simply be iterated over, while the range has to be counted through; how to tell them apart? That is what the OPER variable is for:

        # If a nodelist was specified, just iterate through it
        if [ ${OPER} = "list" ]; then
                for node in $NODELIST; do
                        for diag in chknfs chkwrite chkdsk ; do
                                $diag $node
                                [ $? -gt 0 ] && break   # Node fails: report and skip remaining tests
                        done
                done
        elif [ ${OPER} = "range" ]; then # Range specified
                cur="$(echo ${RANGE} | awk '{ print $1 }' 2>/dev/null)"   # Start node
                end="$(echo ${RANGE} | awk '{ print $2 }' 2>/dev/null)"   # Last node
                prefix=$(echo ${PREFIX} | awk '{ print $1 }' 2>/dev/null) # Node prefix

                while [ "$cur" -le "$end" ]; do # For the duration of the range
                        node="${prefix}${cur}"  # Build the hostname from the prefix and counter

                        for diag in chknfs chkwrite chkdsk ; do
                                $diag $node
                                [ $? -gt 0 ] && break   # Node fails: report and skip remaining tests
                        done

                        cur=$(($cur+1)) # Move on to the next node in the range
                done
        else
                echo "Error! Nodes improperly specified"
                exit 2
        fi
        exit 0

Not so bad after all - notice anything missing? The bomb routine is not there. We do not want to completely bomb out if one node or check fails - we just want to know about it and move on. Also note some redundant typing; the checks could be grouped into one variable to save time later, for instance:

        ...
        CHECKS="chknfs chkdsk chkwrite"
        ...

Then replace them in the for loops where they are explicitly typed in.
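A minimal sketch of that substitution, using the CHECKS variable above, would look something like this:

        for node in $NODELIST; do
                for diag in $CHECKS ; do
                        $diag $node
                        [ $? -gt 0 ] && break
                done
        done

Now for the final full listing: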

        #!/bin/bash
        # Script ---------------------------------------------------------------------
        # Program   : cnchk (Compute Node CHECK)
        # Author    : Jay Fink <fink.jr.1@pg.com>
        # Purpose   : Run a battery or selective tests on node(s).
        #-----------------------------------------------------------------------------
        PROG=${0##*/}
        TOPPID=$$
        HOST=$(hostname 2>/dev/null)
        trap "exit 1" 1 2 3 15

        #-----------------------------------------------------------------------------
        # chkdsk - Check local filesystem, if over the threshold send a message.
        #          SSH to the host, parse df -lhP output for 100%.
        #
        # requires: hostname
        # returns:  A 1 if a fs is > 100% else a 0
        #-----------------------------------------------------------------------------
        chkdsk() # The name of the function
        {
        host=$1 # Checking this host

        # Get fs usage in percentages from local filesystems ONLY
        # (if there are nfs problems they should not interfere)
        fsperc=$(ssh $host df -l | grep -v Use | awk '{ print $5 }' 2>/dev/null)

        # Check each local filesystem percentage to see if it is 100%
        for i in $fsperc; do
                if [ ${i} = "100%" ]; then
                        echo $i
                        echo -n "filesystem: a local filesystem is full"
                        return 1 # We hit a snag - let the caller know
                fi
        done

        return 0 # We are okay
        }

        #-----------------------------------------------------------------------------
        # chkwrite - Make sure / is read/writable.
        #            SSH onto the host, create a tmp file then delete it.
        #
        # requires: hostname
        # returns:  1 if the filesystem is r-o else a 0
        #-----------------------------------------------------------------------------
        chkwrite()
        {
        host=$1 # Checking this host

        ssh $host touch /tmp/nodechk.$$ # See if we can create a file

        # If we couldn't create a file throw an alarm
        if [ $? -gt 0 ]; then
                echo "${host} filesystem: Read only root filesystem"
                return 1 # Return bad news
        else
                ssh $host rm /tmp/nodechk.$$ # Otherwise clean up
        fi

        return 0 # we are okay
        }

        #-----------------------------------------------------------------------------
        # chknfs - First do a nfs numeric count, then ensure what is listed in fstab
        #          is in fact mounted.
        #
        # requires: hostname
        # returns:  1 if a problem was encountered else a 0
        #-----------------------------------------------------------------------------
        chknfs()
        {       
        host=$1 
                        
        nfs_fstab=$(ssh $host  grep "^[^#].*nfs" /etc/fstab |  
                  awk '{print $2}' 2>/dev/null)

        nfs_mount=$(ssh $host mount | grep nfs | awk '{print $3}' 2>/dev/null)
                
        for i in $nfs_fstab; do
  
                matches=0       
                        for j in $nfs_mount; do
                                if [ $i ==  $j ]; then
                                        matches=$(($matches+1))
                                fi
                        done

                if [ "$matches" -eq 0 ]; then
                        cnerror $host "nfs: ${i} not mounted"
                        return 1
                elif [ "$matches" -gt 1 ]; then
                        cnerror $host "nfs: ${i} is mounted multiple times"
                        return 2
                fi
        done

        return 0
        }

        #-----------------------------------------------------------------------------
        # usage - Usage message
        #-----------------------------------------------------------------------------
        usage()
        {
        if [ -n "$*" ]; then
                echo " "
                echo "${PROG}: $*"
        fi
            cat <<usage
        ${PROG} [option argument][option]
        ${PROG} [nodename] [option arg]
        ${PROG} [--node=NODE1,NODE2|-node NODE1,NODE2|-n NODE1,NODE2]
                [--range=NODE1-NODEn|--range NODE1-NODEn|-r NODE1-NODEn]
        Examples:
         Check nodes 1-16 on the dev cluster:
        ${PROG} --range=dev1-16
        ${PROG} -r dev1-16
        usage                 
        }

        #-----------------------------------------------------------------------------
        # Main Loop
        #-----------------------------------------------------------------------------
        # Input parsing - the usage explains what each one does
        OPER=""
        if [ $# -gt 0 -a "$1" = "-h" ];then
        usage
        exit 0
        fi
        if [ $# -gt 0 -a "$1" = "--help" ];then
        usage
        exit 0
        fi

        while [ "$#" -gt "0" ]; do
        case $1 in
        # Create a list of hosts as n1,n2,n3
        --node=*)
                NODELIST="${1#*=}"
                NODELIST="${NODELIST//,/ }"
                OPER="list"
                ;;
        --node|-n)
                NODELIST="$2"
                shift
                NODELIST="${NODELIST//,/ }"
                OPER="list"
                ;;
        # Create a range of hosts as hostname[n-N]
        --range=*)
                RANGE="${1#*=}"
                RANGE="${RANGE//-/ }"
                RANGE="${RANGE//[a-z]/ }"
                PREFIX="${1#*=}"
                PREFIX="${PREFIX%%[0-9]*}"
                OPER="range"
                ;;
        --range|-r)
                RANGE="$2"
                RANGE="${RANGE//-/ }"
                RANGE="${RANGE//[a-z]/ }"
                PREFIX="${2#*=}"
                PREFIX="${PREFIX%%[0-9]*}"
                OPER="range"
                shift
                ;;
        # Default - just enter a hostname
       *)
                NODELIST="${1}"
                NODELIST="${NODELIST//,/ }"
                OPER="list"
                ;;
        esac
        shift
        done

        # If a nodelist was specified, just iterate through it
        if [ ${OPER} = "list" ]; then
                for node in $NODELIST; do
                        for diag in chknfs chkwrite chkdsk ; do
                                $diag $node
                                [ $? -gt 0 ] && break   # Node fails: report and skip remaining tests
                        done
                done
        elif [ ${OPER} = "range" ]; then # Range specified
                cur="$(echo ${RANGE} | awk '{ print $1 }' 2>/dev/null)"   # Start node
                end="$(echo ${RANGE} | awk '{ print $2 }' 2>/dev/null)"   # Last node
                prefix=$(echo ${PREFIX} | awk '{ print $1 }' 2>/dev/null) # Node prefix

                while [ "$cur" -le "$end" ]; do # For the duration of the range
                        node="${prefix}${cur}"  # Build the hostname from the prefix and counter

                        for diag in chknfs chkwrite chkdsk ; do
                                $diag $node
                                [ $? -gt 0 ] && break   # Node fails: report and skip remaining tests
                        done

                        cur=$(($cur+1)) # Move on to the next node in the range
                done
        else
                echo "Error! Nodes improperly specified"
                exit 2
        fi
        exit 0

2.0.5 Bash Program: vmware init

The free vmware product, vmware-server (formerly GSX), does not have auto power on for certain guests. A simple workaround for not being able to auto power on guests using the vmware interface is to call the vmware command line utility at boot up from the local init script. A better way is to write an init script to handle the start up and possibly other functions. This text will examine a simple method of creating a control script for managing power functions in vmware-server.

The Ultra Cheap Method

The easiest method, without bothering to write a ctl script or wrapper, is to find the vmid of the vmware guest to start up and embed a direct call to the vmware-vim-cmd interface in the system's local init script (generally /etc/rc.local). First get the id of the guest:

        # vmware-vim-cmd  vmsvc/getallvms
        Vmid          Name                                    File                               Guest OS      Version                Annotation              
        112    freebsd7              [standard] freebsd7/freebsd7.vmx \
         freebsd64Guest   vmx-07    vela: irc server running freebsd7    
        208    netbsd5.99.10_amd64   [standard] \
         netbsd5.99.10_amd64/netbsd5.99.10_amd64.vmx\
         otherGuest64     vmx-07                                         
        240    opensuse11-prime      [standard] \
         opensuse11-prime/opensuse11-prime.vmx  \
         suse64Guest      vmx-07                                         
        96     freebsd8              [standard] \
         freebsd8/freebsd8.vmx           freebsd64Guest  \
         vmx-07    pyxis: freebsd-8.0 development server

A bit messy looking but the ids are easy enough, now just add it to the local init script:

        # we get the VMID using  vmware-vim-cmd  vmsvc/getallvms
        if [ -x /usr/bin/vmware-vim-cmd ]; then
                echo "Starting Guest VMID 112, sleeping for 16 seconds"
                sleep 16
                /usr/bin/vmware-vim-cmd vmsvc/power.on 112
        fi

Of course that is far too simple; among the most common operations performed on vmware guests are powering on, powering off and resetting a guest. A good sysadmin is lazy, so it is time to draft an ipmitool-like ctl script for controlling the power of the guests.

Config File

Because the output format of vmware-vim-cmd vmsvc/getallvms does not show a hostname or IP address, mapping each vmid to a host or IP address ahead of time saves some effort. Following is the format of the gsxhosts file:

        # hostname VMID 
        vela    112
        carina  208
        pyxis   96

The format may not make much sense now; however, when it is parsed in the script the overly simple format will make more sense. To get things started, set up the hosts file, the vmsvc command, the usage message and a small error exit routine to keep error handling simple - note the script supports all power operations:

        #!/bin/sh
        # gsx-ipmi - An ipmilike shell wrapper for vmware-vim-cmd vmsvc/power.*
        PROG=${0##*/} # The name of the script (used in the usage messages)
        HOSTSFILE=/etc/gsxhosts # This can be anywhere the admin likes
        VIMVC=" vmware-vim-cmd  vmsvc/" # We just tack on the oper
        usage()
        {
            if [ -n "$*" ]; then
                echo " "
                echo "${PROG}: $*"
            fi
            cat <<usage
        ${PROG} [host][oper cmd]|[-u]
        ${PROG} [host][power getstate|hibernate|off|on|reboot|shutdown|suspend]|[-u]
        Commands:
          getstate    Display the current power state of the guest
          hibernate   Place the guest power into hibernate mode (OS must support)
          off         Power off the guest
          on          Power on and boot up the guest
          reboot      Normal reboot of the guest
          reset       Power reset (cold) the guest
          shutdown    Normal shutdown of the guest
          suspend     Place the guest into suspended mode
        Notes:
         The user must have appropriate privileges to perform power operations on 
        guests.
        usage
        }
        # Only call this if there was an input error because it displays the usage
        # message
        error_exit()
        {
            message=$1
            exit_code=$2

            echo $message
            usage
            exit $exit_code
        }

Take note of the usage: the script requires an explicit operation, and the reason for this is to be able to add functionality later on. Even though for now the scope of the script is limited to power commands, done properly the script could later have other vimvc operations added to it, and the usage remains similar to the ipmi command. Now on to the meat of the script. Believe it or not it is straightforward, since all that has to be done is tack "power.$operation" onto the command string; there are 3 basic steps:

  1. validate input
  2. ascertain the vmid
  3. attempt to execute the command

First the validation:

        # Input parsing - the usage explains what each one does
        if [ $# -gt 0 -a "$1" = "-u" ];then
            usage
            exit 0
        fi

        guest=$1
        oper=$2
        subcmd=$3

        if [ ! $guest ]; then
            echo "Error: No guest specified"
            usage
            exit 1
        fi

        [ ! $guest ]  && error_exit "No guest specified" 1
        [ ! $oper ]   && error_exit "No operation specified" 1
        [ ! $subcmd ] && error_exit "No subcommand specified" 1

Not too difficult. Next, try to get the vmid using grep and awk; this is where the simple file format comes into play:

        #
        vmid=`grep $guest $HOSTSFILE|awk '{print $2}'`
        [ ! $vmid ] && error_exit "${guest} did not match anything in $HOSTSFILE" 2

With the vmid in hand the last step is to determine the operation then execute:

        #
        case $oper in
            power)
                $VIMVC"power."$subcmd $vmid 2>/dev/null
                if [ $? -gt 0 ]; then
                    error_exit "$subcmd failed" 1
                fi
                ;;
            *)
                error_exit "Invalid operation" 2
                ;;
        esac
        exit 0

The power case could be more exotic, but to keep from having to include all of the valid subcommands (even as a sed compare) we simply let an invalid one fail. Note how casing on the operation leaves the doorway open to add other vimvc commands.
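For instance, the getallvms listing used earlier to find vmids could be exposed through a new branch such as the sketch below (illustrative only; the subcommand check above would also have to be relaxed for an operation that takes no subcommand):

            list)
                ${VIMVC}getallvms 2>/dev/null
                if [ $? -gt 0 ]; then
                    error_exit "getallvms failed" 1
                fi
                ;;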

The Full Script

        #!/bin/sh
        # gsx-ipmi - An ipmilike shell wrapper for vmware-vim-cmd vmsvc/power.*
        PROG=${0##*/} # The name of the script (used in the usage messages)
        HOSTSFILE=/etc/gsxhosts # This can be anywhere the admin likes
        VIMVC=" vmware-vim-cmd  vmsvc/" # We just tack on the oper

        usage()
        {
            if [ -n "$*" ]; then
                echo " "
                echo "${PROG}: $*"
            fi
            cat <<usage
        ${PROG} [host][oper cmd]|[-u]
        ${PROG} [host][power getstate|hibernate|off|on|reboot|shutdown|suspend]|[-u]
        Commands:
          getstate    Display the current power state of the guest
          hibernate   Place the guest power into hibernate mode (OS must support)
          off         Power off the guest
          on          Power on and boot up the guest
          reboot      Normal reboot of the guest
          reset       Power reset (cold) the guest
          shutdown    Normal shutdown of the guest
          suspend     Place the guest into suspended mode
        Notes:
         The user must have appropriate privileges to perform power operations on
        guests.
        usage
        }

        # Only call this if there was an input error because it displays the usage
        # message
        error_exit()
        {
            message=$1
            exit_code=$2

            echo $message
            usage
            exit $exit_code
        }

        # Input parsing - the usage explains what each one does
        if [ $# -gt 0 -a "$1" = "-u" ];then
            usage
            exit 0
        fi

        guest=$1
        oper=$2
        subcmd=$3

        if [ ! $guest ]; then
            echo "Error: No guest specified"
            usage
            exit 1
        fi

        [ ! $guest ]  && error_exit "No guest specified" 1
        [ ! $oper ]   && error_exit "No operation specified" 1
        [ ! $subcmd ] && error_exit "No subcommand specified" 1

        vmid=`grep $guest $HOSTSFILE|awk '{print $2}'`
        [ ! $vmid ] && error_exit "${guest} did not match anything in $HOSTSFILE" 2

        case $oper in
            power)
                $VIMVC"power."$subcmd $vmid 2>/dev/null
                if [ $? -gt 0 ]; then
                    error_exit "$subcmd failed" 1
                fi
                ;;
            *)
                error_exit "Invalid operation" 2
                ;;
        esac

        exit 0

That concludes the Bash scripts and programs; next up is Perl.

2.1.0 The Perl Programming Language

The Practical Extraction and Reporting Language was invented by Larry Wall as a means to combine more solid programming constructs with the ease of scripting, making administration and system programming easier. One way to look at Perl is as a middle ground between C and shell scripting, which makes it an ideal stepping stone between shell scripting and C programming.

This section follows a format similar to the shell scripting one: several smaller examples separated by type, followed by a final full fledged program.

2.1.1 Small Perl Script Examples

The first two examples are very small. The first comes from an old Matt's Script Archive example (http://www.scriptarchive.com/), while the second is really more of a function, but the examination treats it as a program unto itself.

2.1.1.i Nuke M's

One of the big problems of old (not so much these days) was that when users transferred a text file from a FAT32 or WINNT filesystem over to a Unix system, the file would retain the non-Unix end of line control characters: CTRL+Ms. The following snippet alleviates the problem; just run the script on the Unix host:

        #!/usr/bin/perl
        while (<>) {
                $_ =~ s/\cM\n/\n/g;
                print $_;
        }

The program simply prints out each line after the change; as is plain to see, it reads the input a line at a time and uses a substitution regular expression to strip off the Control-M, leaving a plain Unix newline. A simple yet powerful example of Perl.
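Invoking it is just as simple. Assuming the snippet is saved as nukem.pl (the name is only a placeholder), the cleaned copy can be captured with a redirect:

        #
        # perl nukem.pl dosfile.txt > unixfile.txt
        #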

2.1.1.ii Return the Contents of File

Perl is great at opening, reading and arranging the contents of a file (or a directory, for that matter) internally. In the next segment of code a file is read into an array, sorted, then printed out:

        open FILE "/path/to/file" || die "Cannot Open File"
        my @file_contents = <FILE>;
        close FILE;
        sort (@file_contents);
        foreach(@file_contents) {
                print $_;
        }

The first line attempts to open a file handle (a pointer to a file) called FILE on the given path, or die and print a short message. Next the file contents are assigned to an array, the file handle is closed, the array is sorted (and reassigned, since sort returns a new list), and finally everything is printed out in a foreach loop.

2.1.2 File I/O and Filesystems

One of the most common uses of Perl is producing well formatted reports, and one of the most common tasks for system administrators is producing reports. Naturally, Perl and administrative reports go hand in hand.

2.1.2.i Group Report

There are two primary methods of ascertaining data about the groups on a Unix system via Perl: open the /etc/group file, or use the built-in getgrnam() call to poll for the information. The following script does both - it gets all of the group names from the group file, then gets detailed information about each group from the system call and finally formats a nice ASCII report. First the /etc/group file is opened and the names split out into an array:

        my @group_file = load_file("/etc/group");
        my @groupnames;

        while (my $i = shift @group_file) {
            my @temp = split(":", $i);
            push @groupnames, $temp[0];
        }

Predeclare the format of the report; note the period that terminates the format definition:

        format =
        @<<<<<<<<<<<<<<<<<<<< @<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
        $name,                $gid,  @who
                                     @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~~
        .
        $name = "Group Name";
        $gid  = "Group ID";
        @who  = "Members";
        write();

Iterate over the groups and pull the interesting information from them:

        foreach (@groupnames) {
            my ($grpname, $grpw, $ggid, $members);
            ($grpname,
             $grpw,
             $ggid,
             $members) = getgrnam($_);

            $name = $grpname;
            $gid  = $ggid;
            @who  = split ' ', $members; # getgrnam returns the members as one space separated string
            write();

        }

Finally, the load_file function called earlier needs to be defined; it opens a file and returns its contents:

        sub load_file {
            my ($file) = shift; # what we're reading
            my @flist;          # where we're going to stick it

            open(FILE, $file) or die "Cannot open $file: $!";
            @flist = <FILE>;
            close FILE;

            return(@flist); # send the list back to the caller
        }

2.1.2.ii Logging Stop and Restart Functions

In any program it is often useful to have file logging for a variety of purposes. The next two small functions start and restart logging respectively; the operations should look familiar (file opening):

        sub start_logging {
            local($log_file) = $_[0];
            if (open(LOG, ">$log_file")) {
                $log = 1;
                $| = 1;   # turn on autoflush for the selected output handle
                return 1;
            }
        }
        sub restart_logging {
            local($log_file) = $_[0];
            if (open(LOG, ">>$log_file")) {
                $log = 1;
                return 1;
            }
        }

In Perl, functions are called subroutines - hence the sub keyword before the function name. Also, two declarations can be used to control scope: my creates a lexical variable visible only within the enclosing block or file, while local temporarily gives a package (global) variable a new value for the duration of the enclosing block and anything called from it.
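A tiny illustration of the difference (separate from the logging code above; the names here are made up for the example):

        our $level = "global";

        sub show { print "$level\n"; }

        sub with_local {
            local $level = "dynamic";  # temporarily replaces the package variable
            show();                    # prints "dynamic"
        }

        sub with_my {
            my $level = "lexical";     # a new lexical, visible only in this sub
            show();                    # still prints "global"
        }

        with_local();
        with_my();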

2.1.3 Network and OS

As illustrated earlier, Perl has access to many network and OS capabilities. It is particularly useful on systems that expose system information in an easy to read form, such as Linux kernel based distributions with /proc. In addition to working with OS details, Perl itself can be used to write a daemon (or service, in non-Unix parlance).

2.1.3.i IP Connection Tracker

Linux systems have a firewall package called iptables. One problem that used to occur on Linux kernel based systems was that the IP connection tracker in the kernel would become overloaded and start dropping valid packets. The following script reads the current and maximum connection tracking values from /proc and, if the current count is approaching the maximum, raises the maximum - unless doing so would push past a hard limit derived from the amount of RAM, in which case it complains and exits.

Note that for brevity the load_file() subroutine discussed earlier is not repeated:

        # Check the current connections situation...
        sub check_ip_conntrack {
            my $ip_conntrack_max = `cat /proc/sys/net/ipv4/ip_conntrack_max`;
            my $ip_conntrack_cur = `wc -l /proc/net/ip_conntrack`;
            my $hi_water_mark = ($ip_conntrack_max * .6);
            my $ip_conntrack_hard_limit = get_mem_max();

            if ($ip_conntrack_cur >= $hi_water_mark) {
                my $new_value = ($hi_water_mark * 2);

                if (($new_value * 65000) >= $ip_conntrack_hard_limit) {
                    print "Error! IP CONNTRACK HAS REACHED THE 70% of RAM HIMARK!\n";
                    exit 0;
                } else {
                    system("echo $new_value > /proc/sys/net/ipv4/ip_conntrack_max");
                }
            }
        }
        # Calculate how much memory we can gobble up for conntrack
        sub get_mem_max {
            my @kmeminfo = load_file("/proc/meminfo");
            my $ram_total = $kmeminfo[3];

            $ram_total =~ s{MemTotal:}{};
            $ram_total =~ s{kB}{};
            $ram_total =~ s{MB}{};

            return (.7 * $ram_total)*.06;
        }
        
        check_ip_conntrack(); # No input required - just run

The above script could be run periodically by hand or simply scheduled using the cron system scheduler.

2.1.3.ii Creating A Daemon Process With Perl

The hypothetical scenario is simple: check for a subdirectory under /mnt/net, specifically ajaxx, and then log the status.

The Initial Prototype

To see if it is even feasible, a very simple program is banged out just to get the concept down:

        #!/usr/bin/perl
        my $DELAY = 300;
        my $mount_point = "/mnt/net/ajaxx";

        unless(fork()) {
                unless(fork()) {
                        while(1) {
                                if ( -d "$mount_point" ) {
                                        print "Mount point $mount_point is present\n";
                                } else {
                                        print "Mount point $mount_point missing\n";
                                }
                                sleep $DELAY;
                        }
                }
        }

An experienced Perl programmer probably sees one glaring flaw with the code (even though it does work): the nested

        unless(fork()) {

                unless(fork()) {

construct has two problems: the return value of fork() is never captured, so a failed fork and the parent process are never handled explicitly, and the child never detaches from its controlling terminal (there is no setsid, no umask, and output still goes to the screen), so it is not really a daemon. It works, however, so a good starting point has been established. Note the $DELAY, which is in seconds.

Version 0.2

Now it is time to start plugging in the desired parts:

        use strict;
        use POSIX qw(setsid);

Use strict checking and setsid from POSIX.

        my $DELAY       =  300;
        my $MNT         = "/mnt/net/ajaxx";
        my $LOG         = "/tmp/mntchkd.log";
        my $PROGRAM     = "mntchkd";

5 minute delay, the directory being checked on, logfile location and the name of the program.

        sub appendfile {
                my ($fp, $msg) = @_;

                if (open(FILE, ">>$fp")) {
                        print FILE ("$msg\n");
                        close FILE;
                }
        }

A very simple logging utility that appends to a logfile by opening, appending, then closing. The actual filename is given by the first argument and assigned to fp; the second argument is the full message string. The file is then opened for append and the message written.

        sub insert0 {
                my ($date) = shift;

                if ($date < 10) {
                        return "0$date";
                } 

                return $date;
        }

Insert zeros into a date part. Helps to make a consistent column format for dates in the logfile.
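The same padding could also be done with sprintf; this alternative is not what the script uses, but shows the idea:

        # Zero-pad a number to two digits
        $mon = sprintf("%02d", $mon);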

        sub longfmt {
                my ($sec,$min,$hour,$mday,$mon,$year,
                        $wday,$yday,$iddst) = localtime(time);
                my $datestring;

                $year += 1900;
                $mon++;
                $mon  = insert0($mon);
                $mday = insert0($mday);
                $min  = insert0($min);
                $datestring = "$year-$mon-$mday $hour:$min";

                return($datestring);
        }

A function that sets up and formats a date string for a log entry. Note the entire date string format, it can be customized easily.

        unless(my $pid = fork()) {
                exit if $pid;
                setsid;
                umask 0;
                my $date = longfmt();
                appendfile($LOG, "$date Starting $PROGRAM");
                while(1) {
                        $date = longfmt();
                        if ( -d "$MNT" ) {
                                appendfile($LOG, "$date Mount point $MNT present");
                        } else {
                                appendfile($LOG, "$date Mount point $MNT missing");
                        }
                sleep $DELAY;
                }
        }

Now the fun part. A simple fork: the parent falls through and exits while the child calls setsid to start a new session, detaching it from the controlling terminal. Next a umask of 0 is set and an initial date is generated for the "start" log message. The message is written and a forever while loop is started. Note that the date is reformatted at each iteration of the loop.

Last but not least, the loop actually checks for the subdirectory and sleeps for the predefined delay.
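Once the daemon is running, the entries written to /tmp/mntchkd.log would look something like the following (the dates are only illustrative):

        2008-11-26 13:08 Starting mntchkd
        2008-11-26 13:13 Mount point /mnt/net/ajaxx present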

2.1.4 Perl Program: Host Watch Daemon

Expanding on the previous examples, the Perl program in this section daemonizes and then checks for processes on another host (or simply checks whether the host is alive).

At the beginning is our initialization with descriptions:

        #!/usr/bin/perl
        ##
        # Perldaemon process to check a heartbeat
        ##
        use strict;
        use POSIX qw(setsid);
        my ($FALSE,$TRUE) = (0,1); # A boolean w/o declaration
        my $DELAY       =  30; # The delay in seconds between checks
        my $HOST        = "my.foo.net"; # The name of the host to check
        my $PROGRAM     = "host_watch";
        my $LOG         = "/var/log/$PROGRAM.log"; # Our logfile
        my $ACTIVE      = $FALSE; # This host is not the active host - yet
        # Signals to Trap and Handle
        $SIG{'INT' } = 'interrupt';
        $SIG{'HUP' } = 'interrupt';
        $SIG{'ABRT'} = 'interrupt';
        $SIG{'QUIT'} = 'interrupt';
        $SIG{'TRAP'} = 'interrupt';
        $SIG{'STOP'} = 'interrupt';
        $SIG{'TERM'} = 'interrupt';

So now that the basics are out of the way some helper functions:

        ##
        # Append file: Append a string to a file.
        ##
        sub appendfile {
                my ($fp, $msg) = @_;

                if (open(FILE, ">>$fp")) {
                        print FILE ("$msg\n");
                        close FILE;
                }
        }

        ##
        # Insert 0: Fix up date strings
        ##
        sub insert0 {
                my ($date) = shift;

                if ($date < 10) {
                        return "0$date";
                }

                return $date;
        }

        ##
        # Interrupt: Simple interrupt handler
        ##
        sub interrupt {
                my $date = longfmt();
                appendfile($LOG, "$date: caught @_ exiting");
                print "caught @_ exiting\n";
                die;
        }       

        ##      
        # Long format: Custom datestring for the logfile
        ##              
        sub longfmt {
                my ($sec,$min,$hour,$mday,$mon,$year,
                        $wday,$yday,$iddst) = localtime(time);
                my $datestring;

                $year += 1900;
                $mon++;
                $mon  = insert0($mon);
                $mday = insert0($mday);
                $min  = insert0($min);
                $datestring = "$year-$mon-$mday $hour:$min";
        
                return($datestring);
        }       

With the helper functions out of the way (some of them should look familiar) it is time to look at the two core functions. The central algorithm is broken apart because there is a need for regression: that is, to make absolutely sure a failure is real:

        ##
        # Ping Test: Check to see if a host is alive and return
        ##           2 if not, 0 if yes and 3 if unknown
        sub pingtest {
                my $host = shift;

                my $ping_cnt = `/bin/ping -c 8 $host | /usr/bin/wc -l`;

                if ( $ping_cnt <= 4 ) {
                        return 2; # No replies at all - host looks down
                } elsif ( $ping_cnt >= 13 ) {
                        return 0; # Every ping was answered
                } else {
                        return 3; # Partial replies - state unknown
                }
        }

        ##
        # Regress: basically rerun the ping test just to be sure
        ##
        sub regress {
                my $host = shift;

                my $date = longfmt();
                my $ping_result = pingtest($host); # Check again!

                if ( $ping_result == 2 ) { # Still no replies - definitely a problem
                        appendfile($LOG,
                          "$date: Problem with $host");
                                        # DO SOMETHING HERE
               } elsif ( $ping_result == 0 ) {
                        appendfile($LOG,
                          "$date: Missed ping on $host; now seems to be okay");
                } else {
                        appendfile($LOG,
                          "$date: $host is intermittent");
                }
        }

Essentially ping is used: on a typical Linux system a ping -c 8 that gets all eight replies produces roughly thirteen lines of output counting the header and summary, while an unreachable host produces only four or five, which is where the thresholds in pingtest come from. As the code below illustrates, even when a ping check fails the regression check is called, just to be absolutely sure:

        ##
        # MAIN: Fork and setsid().
        ##
        unless(my $pid = fork()) {
                exit if $pid;
                setsid;
                umask 0;
                my $date = longfmt();
                appendfile($LOG, "$date: Starting $PROGRAM");
                while(1) {
                        $date = longfmt();
                        my $ping_result = pingtest($HOST);
                        if (( $ping_result == 2 ) &&
                         ($ACTIVE == $FALSE)) {
                          appendfile($LOG, "$date: $HOST not replying; regressing");
                          regress($HOST);
                        } elsif ( $ping_result == 0 ) {
                          appendfile($LOG, "$date: $HOST answered all pings");
                          if ($ACTIVE == $TRUE) {
                                appendfile($LOG,
                                  "$date: Primary $HOST resumed. Stopping locally.");
                                sleep 2; # Give log time to flush to disk
                                # DO SOMETHING HERE
                          }
                        } else {
                          if ($ACTIVE == $FALSE) {
                                appendfile($LOG,
                                  "$date: $HOST didn't answer all pings");
                          } else {
                                appendfile($LOG,
                                "$date: Primary down; local host is active");
                          }
                        }
                sleep $DELAY;
                }
        }

Additionally, if it detects that the primary host is back up, the current host stops doing whatever it had taken over.

2.1.5 Perl Program: Nagios Check System Health

Ironically there has not been much crossover between my current work environment (at least as of this writing) and my hobbyist/home coding and administration life in quite some time. For once I have found something I wrote at my job that translates directly into something other admins would find useful: a script that performs and reports multiple checks at once. In this, the first part of the series, a look at the motivation, helper functions and some core generic functions of the script.

I provision a lot of systems - almost one a week, although the frequency varies. Some of these systems are virtual machines, some are cloned virtual machines, some are physical servers and in rare instances some are just devices of some type (or really dumb servers). Part of the post installation setup is adding systems to the appropriate Nagios monitors. After doing this for over a year I saw a pattern emerge - every system needed to have a set of common checks:

        - DNS resolution of the hostname
        - a ping to make sure the host is reachable
        - system load
        - system time
        - root disk usage

Of course depending upon one's environment that list may need more or less.

There are many solutions to this sort of problem. One could, for instance, keep a config file for every host under one directory and simply drop new values in as hosts are added. The only quick solution at the time for my configuration was to wrap multiple checks into one. I looked at how I might do that within Nagios and decided I was too lazy to work out whether some sort of dependency relationship could be used; being lazy I then looked for a plugin that already did this, but found it needed an agent, and again I did not want to install agents unless there was an absolute need. The answer (as usual) became obvious - write a wrapper.

The script presented here is a first draft and there are known bugs; however, the idea behind this text is to present an approach other admins can adopt to formulate their own similar script and be more efficient.

Starting Bits

So first up, some signals to trap and some variables; the code is heavily commented to avoid having to explain the details in the text:

        # Signals we are interested in dealing with, the right operand is the 
        # subroutine which handles the given interrupt type
        $SIG{'INT' } = 'interrupt';
        $SIG{'HUP' } = 'interrupt';
        $SIG{'ABRT'} = 'interrupt';
        $SIG{'QUIT'} = 'interrupt';
        $SIG{'TRAP'} = 'interrupt';
        $SIG{'STOP'} = 'interrupt';
        # Globals
        my $USER1="/usr/local/nagios/libexec"; # Be consistent wrt Nagios
        my $CHECK="HEALTH"; # the name of the check; feel free to change
        my $OUTFILE = "/var/tmp/healthcheck.tmp"; # an outfile for later use
        # Where we store cherry picked results; init these to a space in case they
        # are not all collected
        my @LOAD_VALUES = " ";
        my @SYSTIME_VALUE = " ";
        my @ROOTDISK_VALUE = " ";
        # Default values for LOAD, ROOTDISK Usage
        my $DEF_LOAD_WARN = "4,2,2";
        my $DEF_LOAD_CRIT = "5,4,3";
        my $DEF_DISK_WARN = 95;
        my $DEF_DISK_CRIT = 98;
        my $DEF_SNMP_COMMUNITY = "public";
        my $STATUS = 0; # A status var to be returned to nagios
        # Flags
        my $DNS = 1; # do check that this host has a DNS entry
        my $PING = 0; # don't preping by default since nagios does, switch to 1 if
                        # you want to preping before bothering with the rest
        # Brain dead interrupt handler
        sub interrupt { # usage: interrupt('sig')
            my ($sig) = @_;
            die $sig;
        }
        # Generic sub: Load a file into an array and send the array back 
        sub loadfile {
            my ($file) = shift;
            my @flist;
            open(FILE, $file) or die "Unable to open logfile $file: $!\n";
            @flist = <FILE>;
            close FILE;
            return(@flist);
        }

So far so good; we set up our LOAD and DISK parameters in addition to arrays to capture returned results. We are relying upon SNMP checks for these, but note the script could be modified to use SSH etc.

Now it is time to move on to the functions that do the work; first up is a generic return parser that constructs what will be sent back to Nagios:

        # Handle results status and print a final message with values of collated data
        sub check_exit { # usage: check_exit("message string",RETVAL)
            my ($msg,$ret) = @_;
            # determine our status and exit appropriately
            if ($ret >= 3) {
                print "$CHECK UNKNOWN: $msg ";
            } elsif ($ret == 2) {
                print "$CHECK CRIT: $msg ";
            } elsif ($ret == 1) {
                print "$CHECK WARN: $msg ";
            } elsif ($ret == 0) {
                print "$CHECK OK: $msg ";
            } else{
                print "$CHECK UNKNOWN STATE: $msg ";
            }
            # print what we collected - note if one fails we do not collect the rest
            chomp (@SYSTIME_VALUE);
            chomp (@LOAD_VALUES);
            print("@SYSTIME_VALUE, System Load @LOAD_VALUES, Rootdisk @ROOTDISK_VALUE");
            unlink($OUTFILE); # delete the temp file for good
            exit ($ret);      # exit appropriately so nagios knows what to do
        }

The exit function uses the return number to determine the warning level (if any) and passes along an optional message string. Note that regardless of which check failed the function prints all available data; the idea (when it was written) was that if disk is low it might be causing a high load, and so on. The next function greps CRITICAL or WARN from the output file; this is needed because the script actually calls other Nagios checks, which leave their status string in the output file:

        # Check the outfile in some cases for a SNMP warn or critical
        # send back the appropriate signal for nagios 
        sub check_outfile { # usage: check_outfile
            my @critical = `grep CRITICAL $OUTFILE`;
            if (@critical) {
                return 2;
            }
            my @warn = `grep WARN $OUTFILE`;
            if (@warn) {
                return 1;
            }

            return 0;
        }

Next is the usage message; note that not all of the capabilities have been scripted yet, so this is something of a look ahead at the next text:

        # ye olde usage message
        sub usage {
            print "Usage: $0 [-u] [-H host [-lw warn][-lc crit][-dw warn][-dc crit]]\n";
            print "Usage: $0 [--nodns][--noping][--snmp \"community [user] [pass]\"]\n";
            print "Options:\n";
            print " -H       Check system called  (required)\n";
            print " -lw     Set load warning values\n";
            print "                Default: $DEF_LOAD_WARN\n";
            print " -lc     Set load critical values\n";
            print "                Default: $DEF_LOAD_CRIT\n";
            print " -dw     Set rootdisk warning percent\n";
            print "                Default: $DEF_DISK_WARN\n";
            print " -dc     Set rootdisk critical percent\n";
            print "                Default: $DEF_DISK_CRIT\n";
            print " --nodns        Do not check for DNS resolution\n";
            print " --noping       Do not preping to make sure the host is up\n";
            print "                Note: this will improve performance\n";
            print " --snmp   Set SNMP community name\n";
            print "                Default: $DEF_SNMP_COMMUNITY\n";
            print " -u             Print usage message and exit\n";
        }

Using the usage message as a roadmap, the first check is the load check. For this the script simply calls the existing check_snmp Nagios check; note the SNMP community is an argument:

        # Check load
        sub load { # usage: load($host_or_ip,warn,critical,community)
            my ($host,$warn,$crit,$comm) = @_;
            system("$USER1/check_snmp -H $host -C $comm "
              . "-o .1.3.6.1.4.1.2021.10.1.3.1,.1.3.6.1.4.1.2021.10.1.3.2,"
              . ".1.3.6.1.4.1.2021.10.1.3.3 "
              . "-w $warn -c $crit -l \"Load 1min/5min/10min\" > $OUTFILE");
            my $r = check_outfile();
            @LOAD_VALUES = `cat $OUTFILE |
              awk '{ print \$3 \" \" \$5 \" \" \$6 \" \" \$7}'`;
            if ($r > 0) {
                if ($STATUS < $r) {
                    $STATUS = $r;
                }
            }
        }

The function gets the values, stores them and finally checks their status. Next in order is a simple one: using SNMP again, check the root filesystem:

        # Check rootdisk
        sub rootdisk { # usage: rootdisk(host_or_ip,warn,crit,community)
            my ($host,$warn,$crit,$comm) = @_;
            system("$USER1/check_snmp -H $host -C $comm "
              . "-o .1.3.6.1.4.1.2021.9.1.9.1,.1.3.6.1.4.1.2021.9.1.7.1,"
              . ".1.3.6.1.4.1.2021.9.1.8.1,.1.3.6.1.4.1.2021.9.1.3.1,"
              . ".1.3.6.1.4.1.2021.9.1.2.1 -w $warn -c $crit > $OUTFILE");
            my $r = check_outfile();
            @ROOTDISK_VALUE = `cat $OUTFILE |
               awk '{print \$4 \" \" \$5 \" \" \$6}'`;
            if ($r > 0) {
                if ($STATUS < $r) {
                    $STATUS = $r;
                }
            }
        }

With the core checks out of the way it is time for the main loop:

        # MAIN
        # init our default values; then parse input to see if we want to change any 
        my $load_warn = $DEF_LOAD_WARN;
        my $load_crit = $DEF_LOAD_CRIT;
        my $disk_warn = $DEF_DISK_WARN;
        my $disk_crit = $DEF_DISK_CRIT;
        my $snmp_community = $DEF_SNMP_COMMUNITY;
        my $host;
        while ( my $i = shift @ARGV ) {
            if ($i eq '-u') {
                usage();
                exit (0);
            } elsif ($i eq '-H') {
                $host = shift @ARGV;
            } elsif ($i eq '-lw') {
                $load_warn = shift @ARGV;
            } elsif ($i eq '-lc') {
                $load_crit = shift @ARGV;
            } elsif ($i eq '-dw') {
                $disk_warn = shift @ARGV;
            } elsif ($i eq '-dc') {
                $disk_crit = shift @ARGV;
            } elsif ($i eq '--nodns') {
                $DNS = 0;
            } elsif ($i eq '--ping') {
                $PING = 1;
            } elsif ($i eq '--snmp') {
                $snmp_community = shift @ARGV;
            }
        }
        # there is no spoon...
        if (!$host) {
            print "Error: no host specified\n";
            usage();
            exit (1);
        }
        # if we wanna ping go ahead XXX-jrf do we care about stats?
        if ($PING == 1) {
            preflight($host);
        }
        # if we wanna resolve then resolve
        if ($DNS == 1) {
            dns($host);
        }
        # Call checks
        load($host,$load_warn, $load_crit,$snmp_community);
        systime($host,$snmp_community);
        rootdisk($host,$disk_warn,$disk_crit,$snmp_community);
        # we're all good - go ahead and exit
        check_exit ("",$STATUS);
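Wiring the wrapper into Nagios is outside the scope of this excerpt, but a hypothetical command definition (the command name and script path are made up purely for illustration) might look something like:

        define command{
                command_name    check_health
                command_line    $USER1$/check_health.pl -H $HOSTADDRESS$ --snmp public
                }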

2.2.0 The C Programming Language

The C programming language is arguably the most popular systems programming language available to date. It was developed at Bell Labs by Dennis Ritchie, and popularized by Brian Kernighan and Ritchie's book The C Programming Language, with the goal of being easily ported from one hardware platform to another while still being efficient. In programming terms it can be viewed as a step above assembly language and a step (if only a slight one) below languages with built in object oriented constructs.

The C language is also statically typed; in the previous languages the types of variables and data structures were implied by context. In C, types are declared along with the variables so the compiler knows how to treat them.

The C language also requires a compilation stage. Whereas in Perl and the shell one simply makes the source file executable, a C program must be built first and then the resulting executable object can be run. C also depends on a large set of libraries pulled in via the include directive. Lastly, C has strong macro creation capabilities. This text will not cover the details of includes and macros: it is assumed the reader already has a basic understanding of compiling and executing a C program.

As a quick refresher, however, following is the classic hello world program, saved in the hello_world.c file:

        #include <stdio.h>

        int main (void) 
        {
                printf("Hello world\n");
                return 0;
        }

To compile then execute simply:

        cc hello_world.c -o hello
        ./hello
        Hello world

2.2.1 Small C Programs

As with any programming language, the usefulness and power of a C program is not proportional to the amount of code required to create it. So once again the jumping-off point will be small but potent little programs. The difference in C versus the previous languages is that C can make excellent use of the libraries available on the system.

2.2.1.i Print User Information

4.4-BSD and Net/2-BSD based systems come with a great userinfo program that prints out the GECOS field of a user in a terminal. Unfortunately not all Unix systems ship this program, so it is time to write one, with the ability to specify multiple users. First up, the included header files from the system's standard library and a defined constant:

        #include <grp.h>        /* Header file with group info   */
        #include <pwd.h>        /* The password db header        */

        #include <sys/types.h>  /* System internal types         */

        /* The rest are for printing etc. */
        #include <unistd.h>
        #include <stdlib.h>
        #include <string.h>
        #include <stdio.h>

        #define PACKAGE "pwuser" /* The name of the program */

Easy enough; now onto prototypes. In C we need to make sure that the parts of the program which need to know about other functions can in fact see their declarations. Declaring prototypes also provides a sort of road map to how a particular program is constructed:

        void getuserinfo (char *username); /* Get and print user info  */
        void usage (void);                 /* In case someone needs it */

Now it is time for the central algorithm of the program: getting and printing the user information. The program leverages the standard library functions for accessing the system databases and assigns the results to standard library data structures; using those structures it then prints out the data:

        /* Since we do the print here we do not need to return anything */
        void getuserinfo (char *username)
        {
            struct passwd *user_pwd_info; /* The pwd-db entries */
            struct group  *user_grp_info; /* The group entries  */
            char          **user_members; /* List of groups     */

            user_pwd_info = getpwnam(username); /* Access the username */

            /* Ooops - couldn't find them */
            if (!user_pwd_info) {
                printf("Could not find info about user %s\n", username);
                return;
            }

                /* Get group info */
            user_grp_info = getgrgid(user_pwd_info->pw_gid);

                /* Now print it all out in one shot */
            printf("UserID: %d\n"
                "GroupID: %d\n"
                "Username: %s\n"
                "Default Group: %s\n"
                "Home Directory: %s\n"
                "Default Login Shell: %s\n"
                "Misc Information: %s\n",
                user_pwd_info->pw_uid, user_pwd_info->pw_gid,
                user_pwd_info->pw_name, user_grp_info->gr_name,
                user_pwd_info->pw_dir, user_pwd_info->pw_shell,
                user_pwd_info->pw_gecos);
        }

It looks daunting but truly is not. The easier of the two functions is the usage print.

        void usage (void)
        {
            printf( PACKAGE " [user1 user2 user3...]\n"
                PACKAGE " usage\n"
                );
        }

Not too difficult at all. Last is the main function - as per the norm it is heavily commented for instructional use:

        /* We use argc as the array counter and argv is the array of
           possible usernames requested                              */
        int main (int argc, char *argv[])
        {
                int c; /* Index counter for looping over the names */

                        /* If someone forgot usernames */
                if (argc <= 1) {
                     printf("Syntax error\n"); /* Ooopsee */
                     usage(); /* Print the usage message */
                     return 1; /* let the shell know something is wrong */
                }

                        /* If someone didn't know how to use the program */
                if (strcmp(argv[1], "usage") == 0) {
                        usage(); /* Print the usage message */
                        return 0; /* They wanted to know so exit 0 */
                }

                        /* Loop over the names; call getuserinfo for each one */
                for (c = 1; c < argc; c++) {
                        getuserinfo(argv[c]);

                                        /* Tack on an extra line in between names */
                        if (argv[c + 1])
                                printf("\n");

                }

                return 0; /* We did good - exit 0 */
        }
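Building and running it against the root account might look something like the following; the exact values will of course vary from system to system:

        cc pwuser.c -o pwuser
        ./pwuser root
        UserID: 0
        GroupID: 0
        Username: root
        Default Group: root
        Home Directory: /root
        Default Login Shell: /bin/bash
        Misc Information: root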

2.2.1.ii etu v 0.0.1

As stated at the beginning of the C section, the C language can make excellent use of programming libraries. One excellent set of libraries comes from the Enlightenment project (http://enlightenment.org/). The Enlightenment project has a jpeg thumbnailing library called epeg. In its most basic form epeg can take a jpeg file and rescale it (generally smaller). The following ethumb.c program is an extremely simple example of a thumbnailer in fewer than 21 lines of code:

        #include "Epeg.h" /* This has to be installed ! */

        int main(int argc, char **argv) /* Point to the argv array this time */
        {
                Epeg_Image *image; /* An empty image instance */

                if (argc != 3) { /* Need the program name plus two arguments */
                        printf("Usage: %s input.jpg thumb.jpg\n", argv[0]);
                        return 1; /* Oops */
                }

                image = epeg_file_open(argv[1]); /* Call epeg to open the file */

                if (!image) { /* If no image error with a -1 */
                        printf("Cannot open %s\n", argv[1]);
                        exit(-1);
                }
  
                epeg_decode_size_set           (image, 128, 96); /* WidthxHeight     */
                epeg_quality_set               (image, 75);      /* Quality          */
                epeg_file_output_set           (image, argv[2]); /* Set the new name */
                epeg_encode                    (image);          /* Encode to new    */
                epeg_close                     (image);          /* Close            */
   
                return 0; /* All conditions normal ... */
        }

Pretty easy! Of course the enlightenment libraries and epeg libs in particular need to be installed in order for it to work. It is easy to see how the ethumb program could be expanded - and it is later on in this section.
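Assuming the Epeg headers and library are installed where the compiler can find them, building ethumb is much like the earlier hello world example (include and library paths may vary by system):

        cc ethumb.c -o ethumb -lepeg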

2.2.2 File and I/O Examples in C

Reading from and writing to files in C is actually very similar to Perl (or is that the other way around?) with the exception of typing. In the next two examples, system information is only read and then reformatted out to the screen.

2.2.2.i lscpu

Linux systems come with a very nice interface which provides a great deal of information about a system using a pure file interface instead of system call APIs or the BSD sysctl interface: a virtual filesystem called /proc which, along with process information, contains hardware and kernel data. Both of the example programs utilize /proc by reading data out of it and reformatting it for more convenient use. The first of these programs is called lscpu, which does exactly what it sounds like:

        ./lscpu 
        Processor Information for argos
        OS: Linux version 2.6.26-1-amd64
        CPU 0 is Processor Type: AuthenticAMD  AMD
        Processor Speed in MHz: 1000.000
        Processor Cache Size: 512
        Processor Speed in Bogomips: 2011.10
        CPU 1 is Processor Type: AuthenticAMD  AMD
        Processor Speed in MHz: 1000.000
        Processor Cache Size: 512
        Processor Speed in Bogomips: 2011.10
        RTC Current Time: 21:09:02      RTC Date: 2009-01-10
        RTC Periodic Frequency: 2048    RTC Battery Status: okay

lscpu lists out CPU, OS and real time clock information. Believe it or not, it takes a lot of reading to get this information. First a look at one of the smaller functions that reads some of it; the rest of the program listing will follow. Here are the headers and definitions:

        #include <errno.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #define MAXLEN 1024

        /* the file handles we will be using */
        static char *pfh[]=
        {
            "/proc/cpuinfo",
            "/proc/sys/kernel/hostname",
            "/proc/sys/kernel/ostype",
            "/proc/sys/kernel/osrelease",
            "/proc/driver/rtc",
        };

        /* Function Prototypes */
        void error_output_mesg(char *locale);
        void get_hostname_info(void);
        void get_ostype_info(void);
        void get_osrelease_info(void);
        void get_cpu_info(void);
        void get_rtc_info(void);

A lot more functions than dealt with so far. Here is the error output function:

        void error_output_mesg(char *locale)
        {
            fprintf(stderr, ("%s: %s\n"), locale, strerror(errno));
            exit (1);
        }

Next a detailed look at the hostname function; the rest of the functions all work similarly to this one except that they scan different parts of different files. The pfh array supplies the file path needed by each function:

        void get_hostname_info()
        {
            FILE *hostfp;  /* A filehandle for the host info file */
            static char hostch[MAXLEN]; /* A buffer for the file */
            static char hostname[MAXLEN]; /* The resulting hostname */

                /* Try to open the file or fail */
            if((hostfp = fopen(pfh[1], "r")) == NULL)
                error_output_mesg(pfh[1]);

                /* While the file is open scan in the hostname field from it 
           then assign it to the hostname variable; finally print it out */
            while(fgets(hostch, MAXLEN, hostfp) != NULL) {
                sscanf(hostch, "%s", hostname);
                printf("Processor Information for %s\n", hostname);
            }

            fclose(hostfp);
        }

Now for the rest of the helper functions. Note that they are all similar to get_hostname_info(), just gathering up and printing more information:

        void get_ostype_info()
        {
            FILE *osfp;
            static char osch[MAXLEN];
            static char ostype[MAXLEN];

            if((osfp = fopen(pfh[2], "r")) == NULL)
                error_output_mesg(pfh[2]);

            while(fgets(osch, MAXLEN, osfp) != NULL) {
                sscanf(osch, "%s", ostype);
                printf("OS: %s", ostype);
            }

            fclose(osfp);
        }

        void get_osrelease_info()
        {
            FILE *osrfp;
            static char osrch[MAXLEN];
            static char osrelease[MAXLEN];

            if((osrfp = fopen(pfh[3], "r")) == NULL)
                error_output_mesg(pfh[3]);

            while(fgets(osrch, MAXLEN, osrfp) != NULL) {
                sscanf(osrch, "%s", osrelease);
                printf(" version %s\n", osrelease);
            }

            fclose(osrfp);
        }

        void get_cpu_info()
        {
            FILE *cpufp;
            static char ch[MAXLEN];
            char line[MAXLEN];

            if ((cpufp = fopen(pfh[0], "r")) == NULL)
                error_output_mesg(pfh[0]);

            while (fgets(ch, MAXLEN, cpufp) != NULL) {
                if (!strncmp(ch, "processor", 9)) {
                    sscanf(ch, "%*s %*s %s", line);
                    printf("CPU %s", line);
                } else if (!strncmp(ch, "vendor_id", 9)) {
                    sscanf(ch,"%*s %*s %s", line);
                    printf(" is Processor Type: %s ", line);
                } else if (!strncmp(ch, "model name", 10)) {
                    sscanf(ch, "%*s %*s %*s %s", line);
                    printf(" %s\n", line);
                } else if (!strncmp(ch, "cpu MHz", 7)) {
                    sscanf(ch, "%*s %*s %*s %s", line);
                    printf("Processor Speed in MHz: %s\n", line);
                } else if (!strncmp(ch, "cache size", 10)) {
                    sscanf(ch, "%*s %*s %*s %s", line);
                    printf("Processor Cache Size: %s\n", line);
                } else if (!strncmp(ch, "bogomips", 8)) {
                    sscanf(ch, "%*s %*s %s", line);
                    printf("Processor Speed in Bogomips: %s\n", line);
                }
            }
            fclose(cpufp);
        }

        void get_rtc_info()
        {
            FILE *rtcfp;
            static char ch[MAXLEN];
            char line[MAXLEN];

            if ((rtcfp = fopen(pfh[4], "r")) == NULL)
                error_output_mesg(pfh[4]);

            /* Just grab stuff from a certain position and dump it to stdout */
            while (fgets(ch, MAXLEN, rtcfp) != NULL) {
                if (!strncmp(ch, "rtc_time", 8)) {
                    sscanf(ch, "%*s %*s %s", line);
                    printf("RTC Current Time: %s\t", line);
                } else if (!strncmp(ch, "rtc_date", 8)) {
                    sscanf(ch, "%*s %*s %s", line);
                    printf("RTC Date: %s\n", line);
                } else if (!strncmp(ch, "periodic_freq", 13)) {
                    sscanf(ch, "%*s %*s %s", line);
                    printf("RTC Periodic Frequency: %s\t", line);
                } else if (!strncmp(ch, "batt_status", 11)) {
                    sscanf(ch, "%*s %*s %s", line);
                    printf("RTC Battery Status: %s\n", line);
                }
            }
            fclose(rtcfp);
        }

Ironically with all of the functions doing so much work and not needing to return anything the main() function ends up looking pretty boring:

        int main(void)
        {
            get_hostname_info();
            get_ostype_info();
            get_osrelease_info();
            get_cpu_info();
            get_rtc_info();

            printf("\n");

                return 0;
        }

So why use this program when awk or Perl would do just as well? Good question: the program was authored when CPU cycles were still somewhat expensive, so C was the choice at the time; nowadays a good shell script can do the job just as well.
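For example, a rough equivalent of the CPU portion is a single grep against /proc/cpuinfo (the field names vary by architecture):

        grep -E 'model name|cpu MHz|cache size|bogomips' /proc/cpuinfo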

2.2.2.ii mmw

Micro Memory Watcher or mmw is a clone of the free program that is slightly prettier. Mmw has a lot more complexity than the lscpu program. In addition to reading and printing some data from /proc it also:

        - parses short and long command line options using GNU getopt_long()
        - polls repeatedly at a configurable interval, much like vmstat
        - can optionally report swap usage alongside memory
        - can scale the output to human readable units
        - uses recursion instead of an explicit loop to drive the polling

First up in mmw the includes, definitions, prototypes and helper functions:

        #include <ctype.h>
        #include <getopt.h>
        #include <stdio.h>
        #include <errno.h>
        #include <string.h>
        #include <unistd.h>
        #include <stdlib.h>

        #define MAXLEN 256
        #define PROGRAM "mmw" 
        #define VERSION "1.9"
        #define MEMINFO "/proc/meminfo"

        void version(void);
        void usage(void);
        void print_head(unsigned long int fws, char * units, unsigned int dsize);
        void read_meminfo(unsigned int polls, unsigned int interval,
                            unsigned int sf, unsigned int hflag, unsigned int dsize);

        /* simple reusable version print */
        void version() { printf("%s %s\n", PROGRAM, VERSION); }

        /* reusable usage print */
        void usage()
        {
            printf("usage: %s [options args]\n", PROGRAM);
            printf("usage: %s [-h|--human][-i|--interval][-p|--polls POLLS]\n",PROGRAM);
            printf("   [-u|--usage][-v|--version]\n");
            printf("options:\n");
            printf("  -h|--human             Human readable format.\n");
            printf("  -i|--interval SECONDS  Seconds between polls.\n");
            printf("  -p|--polls    NPOLLS   Times to poll.\n");
            printf("  -s|--swap              Poll swap information as well.\n");
            printf("  -u|--usage             Print usage message.\n");
            printf("  -v|--version           Print version and exit.\n");
        }

The first part of the source file should all seem pretty familiar. In order to present the information, mmw uses a tabular format since it can poll (similar to vmstat), so a header is needed. Because the data can be of variable size, an if/else ladder determines the print width of the fields:

        void print_head(unsigned long int fws, char * units, unsigned int dsize)
        {
            int count;

            static char *header[]=
            {
                "total",
                "free",
                "shared",
                "buffer",
                "cached",
                "swap",
                "sfree",
            };

            printf("Memory Usage in: %s\n", units);

            /* determine the field width for the header */
            if(fws <= 100000) {
                for(count = 0; count <= dsize; count++)
                    printf("%-8s", header[count]);
            } else if(fws <= 100000000) {
                for(count = 0; count <= dsize; count++)
                    printf("%-11s", header[count]);
            } else if(fws > 100000000) {
                for(count = 0; count <= dsize; count++)
                    printf("%-14s", header[count]);
            } else {
                for(count = 0; count <= dsize; count++)
                    printf("%-15s", header[count]);
            }

            printf("\n");
        }

Since the core algorithm is more complex than in previous examples it is being saved for last; instead, now a jump into how the main() function looks. There are some constructs here worth studying:

        - default values that are set before any options are parsed
        - option parsing with GNU getopt_long() and its long_options table
        - simple input validation with isalpha() before calling atoi()
        - a single call into the recursive read_meminfo() at the end

There is a lot of commenting to help in this larger than usual main():

        int main(int argc, char *argv[])
        {
                /*
                 * c = index for switch and getopt
                 * hflag = Human readable format?
                 * dsize = default field width size
                 * poll = how many times to poll
                 * interval = interval in seconds to poll
                 */
            int c, hflag, interval, poll, dsize;

            /* Defaults */
            interval = 5;
            poll     = 5;
            hflag    = 0;
            dsize    = 4;

                /* If no arguments are specified do one pass and exit */
            if(argc == 1) {
                read_meminfo(poll, interval, 1, hflag, dsize);
                return 0;
            }

                /* 
                 * Now we do a parsing loop using the GNU getopt capability
                 * A structure is setup containing names of the  switches
                 * and their short letter versions.
                 */
            while (1) {
                static struct option long_options[] =
                    {
                        {"human",       no_argument,        0,  'h' },
                        {"interval",    required_argument,  0,  'i' },
                        {"poll",        required_argument,  0,  'p' },
                        {"swap",        no_argument,        0,  's' },
                        {"version",     no_argument,        0,  'v' },
                        {"usage",       no_argument,        0,  'u' },
                        {0,0,0,0} /* This is a filler for -1 */
                    };

                int option_index = 0; /* the option index counter */

                        /* call getopt long to fill out which options are being used */
                c = getopt_long (argc, argv, "hi:p:svu", long_options, &option_index);

                if (c == -1) break; /* break out when we have counted down */

                switch(c) {
                    case 'h':
                        hflag = 1;
                        break;
                    case 'i':
                        if (isalpha(*optarg)) { /* make sure it is a number! */
                            fprintf(stderr, "Error: interval must be a number\n");
                            usage();
                            exit (2);
                        }
                        interval = atoi(optarg);
                        break;
                    case 'p':
                        if (isalpha(*optarg)) {
                            fprintf(stderr, "Error: poll must be a number\n");
                            usage();
                            exit (2);
                        }
                        poll = atoi(optarg);
                        break;
                    case 's':
                                        /* If we want swap then the dsize needs to be bigger */
                        dsize = (dsize + 2);
                        break;
                    case 'v':
                        version();
                        return 0;
                        break;
                    case 'u':
                        usage();
                        return 0;
                        break;
                    default:
                        usage();
                        exit (2);
                        break;
                }
            }

                /* Run the read_meminfo callback function - recurse until done */
            read_meminfo(poll, interval, 1, hflag, dsize);

            return 0;
        }

Again, it looks like a lot, but upon closer examination the code is pretty clear. Time for the fun (okay, maybe not so fun) part: the core algorithm. read_meminfo does what it says - it reads and reformats data from /proc/meminfo - but it does so recursively. Essentially it will call itself until the number of polls it passes to itself has elapsed. This is nothing more than clever programming and saved the author the hassle of having to write loops in other parts of the program. In this particular program recursion mainly saves programmer time rather than system time; often, however, recursion can be leveraged to save both.
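To make the pattern clearer, here is a minimal, hypothetical sketch of the same recurse-instead-of-loop idea reduced to a few lines; the names and values are purely illustrative and not part of mmw:

        /* poll_demo() calls itself with polls - 1 until polls reaches zero,
           the same tail-recursion-as-loop pattern read_meminfo() uses */
        #include <stdio.h>
        #include <unistd.h>

        static void poll_demo(unsigned int polls, unsigned int interval)
        {
            if (polls == 0)             /* base case: nothing left to do */
                return;

            printf("polls remaining: %u\n", polls);
            sleep(interval);            /* wait between iterations       */

            poll_demo(polls - 1, interval); /* recurse instead of looping */
        }

        int main(void)
        {
            poll_demo(3, 1); /* three polls, one second apart */
            return 0;
        }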

The core algorithm is pretty big and needs a lot of setup so instead of the usual print it and comment it is broken up into several chunks with commenting and discussion points after each section:

        /* recursively prints out mem information */
        void read_meminfo(unsigned int polls, unsigned int interval,
                            unsigned int sf, unsigned int hflag, unsigned int dsize)
        {
            FILE *fp;                         /* The file handle for this program */
            static char ch[MAXLEN];           /* Character buffer                 */
            unsigned long int mem_array[7];   /* Data structure to hold meminfo   */
            unsigned int count;               /* A counter                        */
            long int hdiv;                    /* Used for division                */
            char * units;                     /* The units being printed          */
            short int i;                      /* Positional number                */

            units = "kB"; /* The default units from the file */
            hdiv  = 1;    /* Default value                   */

            /* initialize the whole array */
            for (i = 0; i < 7; i++)
                mem_array[i] = 0;

            if(polls != 0) {
                sleep(interval); /* Go ahead and sleep unless done */

                        /* make sure we can read the file! */
                if((fp = fopen(MEMINFO, "r")) == NULL) {
                    fprintf(stderr, "could not open file %s\n", MEMINFO);
                    return; /* bail out instead of reading a NULL handle */
                }

Note that there is a lot of information being sent to the function. Also unlike previous functions there is a lot of setup involved. Next the mem_array data structure is populated using data from the /proc/meminfo file in a similar fashion to getting information for lscpu:

                while(fgets(ch, MAXLEN, fp) != NULL) {
                    if(!strncmp(ch, "MemTotal:", 9)) {
                        sscanf(ch + 10, "%lu", &mem_array[0]);
                    } else if(!strncmp(ch, "MemFree:",8)) {
                        sscanf(ch + 10, "%lu", &mem_array[1]);
                    } else if (!strncmp(ch, "MemShared:", 10)) {
                        sscanf(ch + 10, "%lu", &mem_array[2]);
                    } else if(!strncmp(ch, "Buffers:", 8)) {
                        sscanf(ch + 10, "%lu", &mem_array[3]);
                    } else if(!strncmp(ch, "Cached:", 7)) {
                        sscanf(ch + 10, "%lu", &mem_array[4]);
                    } else if((dsize > 4) && (!strncmp(ch, "SwapTotal", 9))) {
                        sscanf(ch + 10, "%lu", &mem_array[5]);
                    } else if((dsize > 4) && (!strncmp(ch, "SwapFree", 8))) {
                        sscanf(ch + 10, "%lu", &mem_array[6]);
                    }
                }

                fclose(fp); /* all done - close /proc/meminfo */

The next chunk of code deals with the human readable flag; the program determines the units to print simply by looking at the total memory value. Note it also sets the divisor that is used later on when deciding field widths and printing the values:

                if (hflag) {
                    if(mem_array[0] <= 999) {
                        hdiv = 1;
                    } else if(mem_array[0] <= 999999) {
                        hdiv = 1000;
                        units = "MB";
                    } else if(mem_array[0] <= 999999999) {
                        hdiv = (1000000);
                        units = "GB";
                    } else {
                        hdiv = (1000000000);
                        units = "TB";
                    }
                }

Time for the last piece of the core algorithm; determine the field width needed and print:

                /* If this is the first line print the header */
                if(sf == 1) print_head(mem_array[0]/hdiv, units, dsize);

                /* determine the field width for each printout */
                if(mem_array[0]/hdiv <= 100000) {
                    for(count = 0; count <= dsize; count++)
                        printf("%-8li", mem_array[count]/hdiv);
                } else if(mem_array[0]/hdiv <= 100000000) {
                    for(count = 0; count <= dsize; count++)
                        printf("%-11li", mem_array[count]/hdiv);
                } else if(mem_array[0]/hdiv > 100000000) {
                    for(count = 0; count <= dsize; count++)
                        printf("%-14li", mem_array[count]/hdiv);
                } else {
                    for(count = 0; count <= dsize; count++)
                        printf("%-15li", mem_array[count]/hdiv);
                }

                printf("\n");

Finally - call ourself and close the function out:

                read_meminfo(polls - 1, interval, 0, hflag, dsize);
            }
        }

Here is some sample output of mmw:

        [19:12:33 jrf@argos:~/src/mmw]$ ./mmw 5 5
        Memory Usage in: kB
        total      free       shared     buffer     cached     
        4030544    41724      0          98432      3120940    
        4030544    41740      0          98440      3120960    
        4030544    41756      0          98448      3120968    
        4030544    41748      0          98452      3120964    
        4030544    41872      0          98460      3120952    
        [19:13:10 jrf@argos:~/src/mmw]$ ./mmw -h -i 5 -p 3
        Memory Usage in: GB
        total   free    shared  buffer  cached  
        4       0       0       0       3       
        4       0       0       0       3       
        4       0       0       0       3       

2.2.3 Networking and OS C Programs

For the systems and networking programs it is time to take a look at the level of detail the C programming language gives a programmer access to; this is especially true on BSD, Linux and other Unix systems since C was invented both on and for Unix.

2.2.3.i Making Forks

The various system calls on Unix and Unix-like systems can provide rich, detailed information about how an operating system is performing. The following example is a very small program that is designed to create an artificial load using forks. It was part of the 4.4 BSD Operating System curriculum when it was still taught at colleges and corporate campuses in the 1990s.

        #include <sys/wait.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <unistd.h>

        int
        main(argc, argv)
            int argc;
            char    *argv[];
        {
            register int nforks, i;
            char *cp;
            int pid, child, status, brksize;

            if (argc < 3) { /* If not enough arguments - bail */
                printf("usage: %s number-of-forks sbrk-size\n", argv[0]);
                exit(1);
            }

            nforks = atoi(argv[1]); /* Check that number of forks will work */
            if (nforks < 0) {
                printf("%s: bad number of forks\n", argv[1]);
                exit(3);
            }

            brksize = atoi(argv[2]); /* Check the break size */
            if (brksize < 0) {
                printf("%s: bad size to sbrk\n", argv[2]);
                exit(3);
            }

            cp = (char *)sbrk(brksize); /* Setup cp */
            if (cp == (char *)-1) {
                perror("sbrk");
                exit(4);
            }
        
            for (i = 0; i < brksize; i += 1024)
                cp[i] = i;

            while (nforks-- > 0) { /* Spin through fork generation */
                child = fork();
                if (child == -1) {
                    perror("fork");
                    exit(-1);
                }
                if (child == 0)
                    exit(-1);
                while ((pid = wait(&status)) != -1 && pid != child)
                    ;
            }

            return 0;
        }

Note the succinctness of the fork generation program. It is small, compact, simple and does not have a great deal of dependencies, but it is incredibly powerful and has the potential to even be somewhat dangerous.
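Assuming the source above is saved as forks.c, building and running it to generate a modest amount of load might look like this (the argument values are arbitrary):

        cc forks.c -o forks
        ./forks 100 65536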

2.2.3.ii A Tiny Packet Sniffer

In contrast to the tiny forks program is a powerful tool indeed: a packet sniffer that utilizes the libpcap library (http://tcpdump.org/). In a more full fledged sniffer there would be a great deal more work, but in its smallest form a sniffer can actually be even smaller than the lscpu program. First the header files and another callback function:

        #include <pcap.h>
        #include <stdio.h>
        #include <ctype.h>  /* isprint() */
        #include <string.h>
        #include <stdlib.h>

        #define MAXBYTECAP 2048 /* The maximum bytes to capture - not packets */

        void pcap_callback(u_char *arg, const struct pcap_pkthdr * pkthdr,
                                                const u_char * packet)
        {
                int i = 0;
                int * counter = (int *) arg; /* typecast arg as counter */

                printf("Packet Count: %d\n", ++(*counter));
                printf("Received Packet Size: %d\n", pkthdr->len);
                printf("Payload:\n");
                for (i = 0; i < pkthdr->len; i++) {
                        if (isprint(packet[i]))   /* printable characters as-is */
                                printf("%c ", packet[i]);
                        else
                                printf(". ");     /* everything else as a dot   */

                        if ((i % 16 == 0 && i != 0) || i == pkthdr->len - 1)
                                printf("\n");
                }

                return;
        }

There are some interesting constructs in the above code. The key point is that the function is compact and does just what it says: it prints out packet data only. Of course it could be massively expanded to include all sorts of information. Also note that, for brevity, there are no prototypes. Sometimes if a program is small enough prototypes are not needed; the code is self explanatory since there is such a small amount of it. Now the main() function:

        int main() 
        {
                int count = 0;
                pcap_t *descr = NULL;
                char errbuf[PCAP_ERRBUF_SIZE], *device = NULL;
                memset(errbuf, 0, PCAP_ERRBUF_SIZE);

                device = pcap_lookupdev(errbuf);
                printf("Opening device: %s\n", device);
                descr = pcap_open_live(device, MAXBYTECAP, 1, 512, errbuf);
                pcap_loop(descr, -1, pcap_callback, (u_char *)&count);

                return 0;
        }

If it is not clear, the second argument to pcap_loop() controls how many packets are read; the -1 means loop forever, so the callback runs continuously while count simply tracks how many packets have been seen. To see how the pcap functions work, if libpcap (and its development files) are installed on your system simply type

man pcap

And read on - there is a great deal one can do with libpcap!
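Assuming the sniffer source is saved as sniffer.c and the libpcap development files are present, building it just means linking against libpcap (capturing packets normally requires root privileges):

        cc sniffer.c -o sniffer -lpcap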

2.2.4 C Program: The Enlightenment Thumbnailing Utility

The Enlightenment Thumbnailing Utility - etu for short - is a more fleshed out version of the earlier ethumb program. In this program there is a separate header file and compact, well contained functions. Additionally, this fuller fledged program can handle other image types by utilizing the imlib2 image rendering library. First up, the header file:

        #ifndef ETUH /* We need to use this in case later we add other source */
        #define ETUH /* files  to the program and call the header many times */

        #include <errno.h>
        #include <getopt.h>
        #include <unistd.h>
        #include <stddef.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <limits.h>     /* PATH_MAX, used by fullpath() below */
        #include <dirent.h>
        #include <argp.h>

        #include <sys/types.h>
        #include <sys/stat.h>

        #include <Imlib2.h>

        #include <Epeg.h>

        #define PACKAGE "etu"

        /* 
         * Prototypes - if we were to add other source files we might want
         * to move some of these into the source file where they are used
         * only. Since we only have one source file though we are putting them
         * here to make our roadmap
         */
        void update_image    (char *, char *, int, int, int);
        void update_rescaled (char *, char *);
        void scale_epeg      (char *, char *, int, int, int);
        void scale_imlib2    (char *, char *, int, int, int);
        void usage           (void);
        int  check_handle    (char *);
        char *fullpath       (char *, char *);
        char *gettype        (char *);

        #endif /* We close off our compile time check */

Note the comment about prototypes - it is a very true statement: you should not expose any more information about a program than need be. In the case of etu, since it consists of a single header file and a single source file, putting the prototypes in the header is perfectly acceptable. If one added other source files to the program then it might make more sense to move prototypes into the source files that use them exclusively and put only shared prototypes in the common header files.

Now onto the main program, which even though it does a great deal is not nearly as daunting as it appears to be. Instead of following the order of the functions, looking at them by type and working up towards the more complex ones is a better approach. First up are some utility functions:

        int check_handle(char *dir)
        {
            struct stat statbuf; /* using gnu libs we make sure it is there */

            if (stat(dir, &statbuf) != 0)
                return 1;

            return 0;
        }

        char *fullpath(char *file, char *dir) /* Establish a full path */
        {
            static char path[PATH_MAX];

            strcpy(path, dir);
            strcat(path, "/");
            strcat(path, file);

            return (path);
        }

        char *gettype(char * img_src) /* here we leverage imlib to get the type */
        {
            char *image;
            char *format;

            image = imlib_load_image(img_src);
            imlib_context_set_image(image);
            if (( format = imlib_image_format()) == NULL) {
                    fprintf(stderr, "Internal error\n");
                    return (NULL);
            }

            imlib_free_image();
            return (format);
        }

Those three functions can be considered utility functions or helper functions. They probably could have been somewhere else or even combined but for simplicity they are factored down to the simplest form to allow the program core to be easily read and manipulated by a programmer. The next function is the usage message for the program:

        void usage(void)
        {
            printf( PACKAGE " [options][arguments]\n"
                    PACKAGE " [-D|--daemonize interval]\n"
                    PACKAGE " [-d|--dir   dir][-s|--src][-h|--height height]\n"
                    PACKAGE " [-w|--width width][-q|--quality percent]\n"
                    PACKAGE " [-f|--file  filename][-o|--output filename]\n"
                            "Single file Options:\n"
                            " -f|--file   file  Single input image filename\n"
                            " -o|--output file  Output image filename\n"
                            "Directory Options:\n"
                            " -s|--src dir   Original images directory\n"
                            " -d|--dir dir   Output directory\n"
                            "Global  Options:\n"
                            " -h|--height  size  Height in pixels      default: 96px\n"
                            " -q|--quality level Quality level (1-100) default: 75%\n"
                            " -w|--width   size  Width in pixels       default: 128px\n"
            );
        }

Straightforward, atomic printing. Now on to a set of algorithms. Unlike most of the other C programs here, this one relies on a set of fall-through checks to decide exactly which rendering engine to use: if the image is a JPEG then call out to epeg, otherwise fall back to imlib2 - with the added catch that if the image has already been scaled into the destination directory it is skipped entirely. The program also needs to determine whether it is dealing with a single file or a whole directory, which leads to jumping (a little awkwardly) into the main() function. A condensed sketch of the decision chain appears below.
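For one candidate image the chain boils down to the following. This is only a sketch that reuses the program's own helpers - the function rescale_one() does not exist in etu itself:

        /* sketch of the dispatch chain - not a function in etu */
        static void rescale_one(char *src_path, char *dst_path,
                                int height, int quality, int width)
        {
            char *type;

            if (check_handle(dst_path) == 0)
                return;                        /* already scaled - skip it */

            type = gettype(src_path);          /* ask imlib2 what it is    */

            if (type != NULL && strcmp(type, "jpeg") == 0)
                scale_epeg(src_path, dst_path, height, quality, width);
            else
                scale_imlib2(src_path, dst_path, height, quality, width);
        }

With that roadmap in mind, main() is broken up below - first the header include, the declarations and all of the local variables with their defaults: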

        #include "etu.h"

        int main(int argc, char **argv)
        {
            int  c;           /* opt counter                                 */
            int  interval;    /* interval for daemon mode                    */
            int  dst_height;  /* height of the destination image(s)          */
            int  dst_quality; /* quality of the destination image(s)         */
            int  dst_width;   /* width of the destination image(s)           */
            char *dst_dp;     /* The directory that new images will be in    */
            char *src_dp;     /* Where the source images live                */
            char *src_fp;     /* Individual file handle for input            */
            char *dst_fp;     /* Individual file handle for output           */
            char *type;       /* Image type by string                        */

            /* Defaults */
            interval    = 0;
            src_fp      = NULL;
            dst_fp      = NULL;
            src_dp      = NULL;
            dst_dp      = NULL;
            dst_height  = 96;
            dst_quality = 75;
            dst_width   = 128;
            type        = NULL;

Look at all that stuff... the program can either rescale a single image or rescale an entire source directory into a destination directory of images. The catch is that when it is working from a source directory to a destination directory it first checks whether the image is already in the destination - why? Because that means it can also be used as a thumbnailing cache engine that runs right from the command line. Now onto the options parser - again GNU getopt_long() and its associated data structures are used:

            /* 
             * Options parsing:
             * D -daemonize Run as a daemon, rescanning every interval seconds
             * d -dir       Destination directory for a set of input images
             * h -height    Height of the destination image(s)
             * f -file      Input image file
             * o -output    Output file
             * q -quality   Set the quality of the destination image(s)
             * s -src       Source directory of original files
             * w -width     Width of the destination image(s)
             */
            while (1) {
                static struct option long_options[] = {
                    {"daemonize", required_argument,  0,  'D'},
                    {"dir",       required_argument,  0,  'd'},
                    {"height",    required_argument,  0,  'h'},
                    {"file",      required_argument,  0,  'f'},
                    {"output",    required_argument,  0,  'o'},
                    {"quality",   required_argument,  0,  'q'},
                    {"src",       required_argument,  0,  's'},
                    {"width",     required_argument,  0,  'w'},
                    {0, 0, 0, 0}
                };

                int option_index = 0;

                c = getopt_long (argc, argv, "D:d:f:h:o:q:s:w:",
                            long_options, &option_index);

                if (c == -1)
                    break;

                switch (c) {
                    case 'D':
                        interval = atoi(optarg);
                        break;
                    case 'd':
                        dst_dp = optarg;
                        break;
                    case 'h':
                        dst_height = atoi(optarg);
                        break;
                    case 'f':
                        src_fp = optarg;
                        break;
                    case 'o':
                        dst_fp = optarg;
                        break;
                    case 'q':
                        dst_quality = atoi(optarg);
                        break;
                    case 's':
                        src_dp = optarg;
                        break;
                    case 'w':
                         dst_width = atoi(optarg);
                        break;
                    default:
                        usage();
                        return 1;
                        break;
                  }
            }

The switch case and getopt bits should look familiar - not unlike the mmw program. Next there are some decisions that have to be made; instead of a lot of text, the code relies upon solid commenting to explain them:

            /* Run through a battery of checks before calling the main updater */
                /* if this is a single image file - just pre-check now and handle  */
                /* determine the type then call the right function to scale it     */
            if ((src_fp != NULL) && (dst_fp != NULL))  {

                type = gettype(src_fp); /* call gettype to get the image type */

                if ((type != NULL) && (strcmp(type, "jpeg") == 0))  { /* jpeg - use epeg */
                    scale_epeg(src_fp, dst_fp,
                            dst_height, dst_quality, dst_width);
                } else { /* anything else falls through to imlib2 */
                    scale_imlib2(src_fp, dst_fp, dst_height, dst_quality, dst_width);
                }

                return 0;
            }

                /* It is a directory not just an image - make sure the dir exists! */
            if ((src_dp == NULL) || (check_handle(src_dp) != 0))  {
                fprintf(stderr, "No input directory specified\n");
                usage();
                return 1;
            }

                /* A destination directory must be named as well, and if it
                   does not exist yet go ahead and create it */
            if (dst_dp == NULL) {
                fprintf(stderr, "No output directory specified\n");
                usage();
                return 1;
            }

            if (check_handle(dst_dp) != 0)
                if (mkdir(dst_dp, 0755)) {
                    fprintf(stderr,
                        "Could not create directory %s\n", dst_dp);
                    return 1;
                }

One of the things an astute reader may have picked up on by now is the daemonize option. In the next section of code, if an interval is defined the program sets up and executes a fork, then calls the image rendering functions in a loop - otherwise it calls the same rendering functions once:

            if (interval) {
                pid_t pid, sid;

                pid = fork();

                if (pid < 0) {
                    exit (EXIT_FAILURE);
                } else if (pid > 0) {
                    exit (EXIT_SUCCESS);
                }

                umask (0);

                sid = setsid();

                if (sid < 0)
                    exit (EXIT_FAILURE);

                if ((chdir("/")) < 0)
                    exit (EXIT_FAILURE);

                while (1) {
                    update_image(src_dp, dst_dp, dst_height, dst_quality, dst_width);
                    update_rescaled(src_dp, dst_dp);
                    sleep(interval);
                }

                exit(EXIT_SUCCESS);
            } else {
                update_image(src_dp, dst_dp, dst_height, dst_quality, dst_width);
                update_rescaled(src_dp, dst_dp);
            }

            return 0;
        }

In the preceding code notice the setsid() call; in the Perl section there was also a setsid() call. When daemonizing, setsid() is very much a constant - but note that while it is used here, the brute force forks program mentioned earlier leaves it out on purpose. That is a difference of function, and it is important to notice in any program. The purpose of etu is to provide a clean, quick method to rescale images (usually down), while the forks program is deliberately meant to fork and slam the operating system as quickly and - if need be - as messily as possible. As a programmer you are beholden to one rule: the program should do what it was meant to do as well as you can make it. Do not let dogma stand in the way of function - ever. In the case of etu being very safe and deliberate is a good thing; in the case of forks the opposite is true.
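For reference, the daemonizing steps used above could be pulled into a small reusable helper. This is a minimal sketch - the name daemonize() is made up here, and it assumes the unistd.h, stdlib.h, sys/types.h and sys/stat.h headers the code above already relies on - not something that exists in etu:

        static int daemonize(void)
        {
            pid_t pid;

            if ((pid = fork()) < 0)     /* fork and let the parent exit */
                return (-1);
            if (pid > 0)
                exit(EXIT_SUCCESS);

            umask(0);                   /* reset the file creation mask */

            if (setsid() < 0)           /* become the session leader    */
                return (-1);

            if (chdir("/") < 0)         /* do not pin any directory     */
                return (-1);

            return (0);
        }

The caller is then free to run whatever loop it wants, exactly as etu does.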

If only it were that simple - there is still a lot to do. The update_image() function is the beginning of the end of this exhaustive program; it decides how to deal with each file in the associated directories:

        void update_image(char *images, char *dst, int height, int quality, int width)
        {
            char dst_path[PATH_MAX]; /* Full path in the destination dir */
            char src_path[PATH_MAX]; /* Full path in the source dir      */
            char * type;             /* What type of image?              */
            struct dirent *src_dp;   /* Directory entry                  */
            DIR * src_dp_handle;     /* .. and the handle                */

            src_dp_handle = opendir(images); /* Open the source directory */
            if (src_dp_handle == NULL) {
                fprintf(stderr, "Could not open directory %s\n", images);
                return;
            }

                /* Walk the directory and have at it .... */
            while ((src_dp = readdir(src_dp_handle)) != NULL)  {
                    /* skip the . and .. entries */
                if (strcmp(src_dp->d_name, ".") == 0 ||
                    strcmp(src_dp->d_name, "..") == 0)
                    continue;

                    /* build the destination path; bound the copy to the
                       buffer and terminate it ourselves */
                strncpy(dst_path, fullpath(src_dp->d_name, dst),
                    sizeof(dst_path) - 1);
                dst_path[sizeof(dst_path) - 1] = '\0';

                    /* only rescale when the destination is not already there */
                if (check_handle(dst_path) != 0)  {
                    strncpy(src_path, fullpath(src_dp->d_name, images),
                        sizeof(src_path) - 1);
                    src_path[sizeof(src_path) - 1] = '\0';

                        /* For each image get the type then call the right
                           function */
                    type = gettype(src_path);
                    if (type == NULL)
                        continue;   /* not an image imlib2 understands */

                    if (strcmp(type, "jpeg") == 0) {
                        scale_epeg(src_path, dst_path,
                                        height, quality, width);
                    } else {
                        scale_imlib2(src_path, dst_path, height,
                                        quality, width);
                    }
                }
             }

            closedir(src_dp_handle);
        }

In a sense that function is the core algorithm; the remaining functions are utility functions that carry out the decisions of the directory looper. The next function prunes rescaled images whose source files no longer exist, keeping the destination directory in sync with the source:

        void update_rescaled(char *images, char *dst)
        {
            char dst_path[PATH_MAX];
            char src_path[PATH_MAX];
            struct dirent *dst_img;
            DIR * dst_img_handle;

            dst_img_handle = opendir(dst);
            if (dst_img_handle == NULL)
                return;

            while ((dst_img = readdir(dst_img_handle)) != NULL) {
                    /* skip the . and .. entries */
                if (strcmp(dst_img->d_name, ".") == 0 ||
                    strcmp(dst_img->d_name, "..") == 0)
                    continue;

                    /* does the original still exist in the source dir? */
                strncpy(src_path, fullpath(dst_img->d_name, images),
                    sizeof(src_path) - 1);
                src_path[sizeof(src_path) - 1] = '\0';

                if (check_handle(src_path) != 0) {
                        /* no - remove the orphaned rescaled copy */
                    strncpy(dst_path, fullpath(dst_img->d_name, dst),
                        sizeof(dst_path) - 1);
                    dst_path[sizeof(dst_path) - 1] = '\0';

                    unlink(dst_path);
                }
            }
            closedir(dst_img_handle);
        }

Now on to the last two functions: the first scales a JPEG using epeg and the second scales any other image type that imlib2 understands:

        void scale_epeg(char *jpeg, char *dstimg, int height, int quality, int width)
        {
            Epeg_Image * jpeg_image;

            jpeg_image = epeg_file_open(jpeg);

            if (!jpeg_image) {
                fprintf(stderr, "Cannot open %s\n", jpeg);
                exit (1);
            }

            epeg_decode_size_set(jpeg_image, width, height);
            epeg_quality_set(jpeg_image, quality);
            epeg_file_output_set(jpeg_image, dstimg);
            epeg_encode(jpeg_image);
            epeg_close(jpeg_image);
        }

        void scale_imlib2(char *src, char *dst, int height, int quality, int width)
        {
            Imlib_Image in_img, out_img;

            (void)quality; /* only the epeg path honors the quality setting */

            in_img = imlib_load_image(src);
            if (!in_img) {
                fprintf(stderr, "Unable to load %s\n", src);
                exit(1);
            }

            imlib_context_set_image(in_img);

            out_img = imlib_create_cropped_scaled_image(0, 0, imlib_image_get_width(),
                                                 imlib_image_get_height(),
                                                 width, height);

            if (!out_img) {
                fprintf(stderr, "Failed to create scaled image\n");
                exit(1);
            }

            imlib_context_set_image(out_img);
            imlib_save_image(dst);

            imlib_free_image();              /* free the scaled image   */
            imlib_context_set_image(in_img);
            imlib_free_image();              /* ... and the original    */
        }

... and we are done. Be sure to check the indexes for full program listings.
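As a closing illustration of how the options fit together - the paths and the 300 second interval here are only examples - etu can rescale a single image or be left running as a thumbnail cache for a whole directory; the first invocation below rescales one image, the second keeps ~/pics/thumbs in sync with ~/pics, rescanning every 300 seconds:

        ./etu -f photo.jpg -o thumb.jpg -w 200 -h 150
        ./etu -s ~/pics -d ~/pics/thumbs -D 300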

2.2.5 C Program: Network Decoder - ndecode

The last C program in this section is a packet decoder. Unlike the previous program that dealt with network data, this one looks exclusively at the payload of a network packet instead of the header. The ndecode program is relatively simple considering what it does; it achieves that simplicity by leveraging the pcap library.

First a look at the top of the file. There are a lot of included headers in this example, along with some preprocessor directives that pull in different headers when building on a NetBSD system:

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <ctype.h>      /* isprint()/isdigit() are used below */
        #include <sys/types.h>
        #include <sys/time.h>
        #include <netinet/in_systm.h>
        #include <pcap.h>
        #ifndef NETBSD
        #include <net/ethernet.h>
        #endif
        #include <netinet/in.h>
        #include <netinet/ip.h>
        #include <signal.h>
        #include <math.h>
        #include <unistd.h>
        #include <arpa/inet.h>
        #include <sys/socket.h>
        #include <netdb.h>
        #include <semaphore.h>
        #include <fcntl.h>
        #include <getopt.h>
        #include <errno.h>
        #include <netinet/udp.h>
        #include <net/if.h>
        #ifdef NETBSD
        #include <net/if_ether.h>
        #endif
        #include <sys/ioctl.h>
        #include <time.h>

The reason for the directives is that some of the structures used live in different headers on NetBSD than on other systems.

Next is the PACKAGE macro which is just the name of the program:

        #define PACKAGE "ndecode"

Next comes a ceiling. MAXBYTES2CAPTURE is not a packet count but the snapshot length - the maximum number of bytes of each packet that will be captured and handed to the decoder:

        #define MAXBYTES2CAPTURE 2048

Now, how exactly should the program be controlled - that is, what options and arguments should it accept? So far it is known that there should be an option for the number of polls. The program is also tied to the interface packets are read from, so an option to use an interface other than the first one libpcap finds on its own needs to be added. Finally, if there are options then there should also be a usage print:

        /* simple usage message */
        static void usage()
        {
            printf(PACKAGE " [option][arguments]\n"
                   PACKAGE
                   " "
                   "[-i <interface>][-p <number][-u]\n"
                   "Options:\n"
                   " -i <dev>   Specify the interface to watch\n"
                   " -p <int>   Exit after analyzing int polls\n"
                   " -u         Display help\n");
        }

Note that since pcap is being used - not unlike the previous example - this program can take filter arguments as well, so a function to copy off the filter expression is needed again:

        /*
         * copy_argv: Copy off an argument vector into a single
         *            space separated string.
         * requires: argument vector
         */
        static char *copy_argv(char **argv)
        {
            char **p;
            u_int len = 0;
            char *buf;
            char *src, *dst;
            p = argv;
            if (*p == 0)
                return 0;
            while (*p)
                len += strlen(*p++) + 1;
            buf = (char *)malloc(len);
            if (buf == NULL) {
                fprintf(stderr, "copy_argv: malloc failed\n");
                exit(EXIT_FAILURE);
            }
            p = argv;
            dst = buf;
            while ((src = *p++) != NULL) {
                while ((*dst++ = *src++) != '\0') ;
                dst[-1] = ' ';
            }
            dst[-1] = '\0';
            return buf;
        }

Before moving on to the meat it is time to set up the main part of the program, where arguments are parsed and assigned and the filter is set up. As an example, running "ndecode -i eth0 tcp port 80" leaves "tcp port 80" behind once getopt() is finished, and copy_argv() turns those leftovers into the single filter string handed to pcap. The rest should look familiar:

        /* forward declaration: payload_print() is defined after main() */
        static void payload_print(u_char *, const struct pcap_pkthdr *,
                      const u_char *);

        int main(int argc, char *argv[])
        {
            struct bpf_program program;
            pcap_t *descr = NULL;
            char errbuf[PCAP_ERRBUF_SIZE];
            char *filter = NULL;
            char *pcap_dev = NULL;
            int c;
            bpf_u_int32 mask;   /* pcap's own 32 bit type */
            bpf_u_int32 net;
            int npolls = -1;    /* -1 lets pcap_loop() run until interrupted */
            while ((c = getopt(argc, argv, "i:p:u")) != -1) {
                switch (c) {
                case 'i':
                    pcap_dev = optarg;
                    break;
                case 'p':
                    if (optarg != NULL && isdigit(*optarg)) {
                        npolls = atol(optarg);
                        if (npolls < 0) {
                            fprintf(stderr,
                                "Packets must be > than 0\n");
                            return EXIT_FAILURE;
                        }
                    } else {
                        fprintf(stderr, "Invalid packet number\n");
                        return EXIT_FAILURE;
                    }
                    break;
                case 'u':
                    usage();
                    return EXIT_SUCCESS;
                    break;
                default:
                    usage();
                    return EXIT_FAILURE;
                    break;
                }
            }
            /* Got root? */
            if (getuid()) {
                fprintf(stderr, "Must be root user.\n");
                return EXIT_FAILURE;
            }
            memset(errbuf, 0, PCAP_ERRBUF_SIZE);
            /* Any non-getopt arguments become the pcap filter expression */
            if (!filter)
                filter = copy_argv(&argv[optind]);
            /* Initialize the interface to listen on */
            if ((!pcap_dev)
                && ((pcap_dev = pcap_lookupdev(errbuf)) == NULL)) {
                fprintf(stderr, "%s\n", errbuf);
                return EXIT_FAILURE;
            }
            if ((descr = pcap_open_live(pcap_dev, MAXBYTES2CAPTURE,
                             0, 0, errbuf)) == NULL) {
                fprintf(stderr, "%s\n", errbuf);
                return EXIT_FAILURE;
            }
            pcap_lookupnet(pcap_dev, &net, &mask, errbuf);  /* Get netinfo */
            if (filter) {
                if (pcap_compile(descr, &program, filter, 0, net) == -1) {
                    fprintf(stderr, "Error - `pcap_compile()'\n");
                    return EXIT_FAILURE;
                }

                if (pcap_setfilter(descr, &program) == -1) {
                    fprintf(stderr, "Error - `pcap_setfilter()'\n");
                    return EXIT_FAILURE;
                }

                pcap_freecode(&program);
            }
            pcap_loop(descr, npolls, payload_print, NULL);
            /* Exit program */
            printf("Closing capturing engine...\n");
            pcap_close(descr);
            return 0;
        }

Note that main() looks very similar to the other program that uses pcap, with one exception: the pcap looper calls payload_print(). Now onto the real meat - the part of the program that actually decodes and prints the packet data:

        /*
         * payload_print: the callback handed to pcap_loop(); print every
         * captured byte, substituting '.' for anything unprintable.
         */
        static void payload_print(u_char * arg, const struct pcap_pkthdr *header,
                      const u_char * packet)
        {
            u_int i = 0;

            (void)arg; /* no user data is passed to this callback */

            printf("Packet RECV Size: %d Payload:\n", header->len);
            for (i = 0; i < header->caplen; i++) { /* walk only captured bytes */
                if (isprint(packet[i]))
                    printf("%c ", packet[i]);
                else
                    printf(". ");
                if ((i % 16 == 0 && i != 0) || i == header->caplen - 1)
                    printf("\n");
            }
            return;
        }
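If a more traditional hex plus ASCII dump were wanted, the same callback could be reshaped along these lines - a sketch only, since payload_hexdump() is not part of ndecode:

        /* sketch: offsets and hex on the left, printable ASCII on the right */
        static void payload_hexdump(u_char * arg, const struct pcap_pkthdr *header,
                      const u_char * packet)
        {
            u_int i, j;

            (void)arg;
            printf("Packet RECV Size: %d Payload:\n", header->len);
            for (i = 0; i < header->caplen; i += 16) {
                printf("%04x  ", i);
                for (j = i; j < i + 16; j++)           /* hex column   */
                    if (j < header->caplen)
                        printf("%02x ", packet[j]);
                    else
                        printf("   ");                 /* pad last row */
                printf(" ");
                for (j = i; j < i + 16 && j < header->caplen; j++)
                    printf("%c", isprint(packet[j]) ? packet[j] : '.');
                printf("\n");
            }
        }

ndecode itself sticks with the simpler dotted format; passing payload_hexdump to pcap_loop() instead would be the only change needed.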

None of that is too difficult - here is some sample output:

        # sudo ./ndecode -p 2
        Packet RECV Size: 60 Payload:
        . . 3 . . . . . . . . . . . E . . 
        . . . @ . @ . . . . . . . . . . 
        . . J . F b . . . . ~ l . P . . 
        . . . . . * . . . . . 
        Packet RECV Size: 60 Payload:
        . . . . . . . . 3 . . . . . E   . 
        ( . . @ . k . . . . . . . . . . 
        . . F . J . ~ l . b . . . P . @ 
        . u . . . . . . . . . 
        Closing capturing engine...

And yes, that data will print plain text passwords!