December 2004

Canned posix-sh Bits

Recently I performed the system administration portion of a large migration from a Hewlett Packard UNIX (HP-UX) system to Red Hat Linux (RH Linux). The migration involved the creation and copying of roughly thirty accounts and 80GB of data. The data portion (since it was stored in group accessible common areas) turned out to be trivial relative to the user accounts.

The original plan was simply to add the users by hand and transfer the appropriate parts of their former $HOME over. As usual, something went wrong: another problem cropped up and cost valuable time, time that was meant to be spent creating accounts and secure copying files.

So, I wrote some scripts . . .

On the Fly (by the seat of your pants) Scripting

On-the-spot scripting is a common practice among UNIX users. In this case, circumstances were somewhat different from the norm. Most of the time, a shell script for something as trivial as moving bits is not a tall order; this time, however, both correctness and timeliness were the order of the day.

Some quick decisions about what to do and how had to be scribbled onto note paper. Two minutes of thought produced the quick list below:

    Steps
        1. Create the accounts
        2. Create the initial passwords
        3. Run chage -d 0 -M 999 on each account (aging policy was
           still in the air at the time and in committee)
        4. Copy the directories
        5. Fixup permissions
    Rules
        - no file deletion, mistakes must be fixed by hand
        - it has to work quickly, as in little testing
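
Step 3 lends itself to the same treatment as the other steps. What follows is only a sketch, assuming a list of login names with one name per line; the function name age_accounts and the DRY_RUN switch are made up for illustration and were not part of the original work:

```shell
#!/bin/sh
# age_accounts: read login names on stdin, one per line, and apply the
# aging settings from step 3 to each of them.
# DRY_RUN=1 (a hypothetical switch) prints the commands instead of
# running them, which gives a cheap check when there is little time to test.
age_accounts()
{
        while read -r user
        do
                [ -n "$user" ] || continue      # skip blank lines
                if [ "${DRY_RUN:-0}" = 1 ]
                then
                        echo "chage -d 0 -M 999 $user"
                else
                        chage -d 0 -M 999 "$user" ||
                                echo "chage failed for $user" >&2
                fi
        done
}
```

Running it once with DRY_RUN=1 and the list on standard input shows exactly what would be done before anything is changed.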

The realization was simple: this was not the time or place to write some glorious all-in-one program or script to take care of the problem now and forever (that project was added the day after ...).

Little Worker Scripts

There were some known items as well. Each user's $HOME directory matched the login name (which made things very easy). Additionally, no profiles were being copied over.

The main problem with the migration was that even though the accounts all resided under /home, they were in three different groups. The solution was simple, one list of users per group:

        ls -al /home | grep groupname | awk '{print $3}' > groupname.out
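
The grep above matches groupname anywhere on the line, so a login or directory name containing the same string would slip into the list. A stricter filter could look like the sketch below. It assumes the usual ls -l layout (owner in field 3, group in field 4), and the function name users_in_group is invented for this example:

```shell
#!/bin/sh
# users_in_group GROUP: read "ls -l" style lines on stdin and print the
# owner (field 3) of every line whose group (field 4) is exactly GROUP.
# The "total" line has no fourth field, so it never matches.
users_in_group()
{
        awk -v grp="$1" '$4 == grp { print $3 }'
}

# Example use (not run here):
#   ls -al /home | users_in_group groupname > groupname.out
```

The exact comparison on field 4 is the whole point; everything else stays as it was.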

Next, time to create the accounts:

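        cat groupname.out | xargs add_accounts.sh

        [add_accounts.sh]
        # Login names arrive as arguments; xargs turns the piped list
        # into them. One account is created per name.
        for i in "$@"
        do
                useradd -g groupname -mk /usr/local/etc/skel \
                        -d /home/$i -c "$i User Account" $i
        done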

Note that the GECOS field is a bit dry, on purpose: it is not required on this particular system, since an LDAP server already holds that information. At most, it was decided to fill out the field for each user as side work, in spare time.

Root public keys were already set up just for the occasion, so now it was a matter of getting the files. That turned out to be slightly trickier than first guessed: due to scp's nature, wildcards could not be used, so temporary space was used instead:

        mkdir /tmp/homes
        cat groupname.out | xargs scopyhomes.sh

        [scopyhomes.sh]
        #!/bin/sh
        # Pull each user's home directory into temporary space.
        for i in "$@"
        do
                scp -r host:/home/$i /tmp/homes
        done

With the directories now across, it was time to copy in just the regular (non-dot) files; the * glob skips dot files, so no profiles came along:

        cd /tmp/homes
        for i in *
        do
                cd $i
                cp -R * /home/$i
                cd ..
        done

The last step, fixup perms:

        cat groupname.out | xargs fixperms.sh

        [fixperms.sh]
        # Hand each home directory back to its owner.
        for i in "$@"
        do
                chown -R $i:groupname /home/$i
        done

So... what's wrong with this picture?

Problems and the Cure

The logic of the material above is ... questionable. Any seasoned programmer would toss the bits shown so far out the window, and rightly so. They are riddled with potential problems. For instance, what if one of the secure copies hung? A keyboard signal to break out of it means the loop would simply keep processing, potentially missing an account. Why the constant catting of a file? What happens if the filesystem has mystery disappearing file problems? What if an essential part of one script just does not work?
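
On the catting question, the shell can hand the list straight to a loop through a redirect, with no extra process and no pipe. A minimal sketch, assuming one login name per line; for_each_user is a hypothetical helper, not one of the original scripts:

```shell
#!/bin/sh
# for_each_user FILE COMMAND...: run COMMAND once per login name in FILE.
# The list is read on file descriptor 3 so that COMMAND (scp, for
# instance) keeps the script's normal standard input.
for_each_user()
{
        list=$1
        shift
        [ -f "$list" ] || { echo "no such list: $list" >&2; return 1; }
        while read -r user <&3
        do
                [ -n "$user" ] || continue      # skip blank lines
                "$@" "$user"
        done 3< "$list"
}

# Example use (not run here):
#   for_each_user groupname.out chage -d 0 -M 999
```

The test on the list file up front also covers the case where the list itself has gone missing.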

Filetests

The easiest one to answer is file tests. In any operation involving directories and files, a simple if [ -d $i ]; then or if [ -f $i ]; then would have guarded against a file or directory that had gone missing.
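
Applied to the copy step, the tests might look like the sketch below. The function name copy_home and its two directory arguments are assumptions for illustration; the original loop used /tmp/homes and /home directly.

```shell
#!/bin/sh
# copy_home USER SRCROOT DSTROOT: copy SRCROOT/USER/* into DSTROOT/USER,
# but only when both directories are really there.
copy_home()
{
        user=$1 src=$2/$1 dst=$3/$1
        if [ ! -d "$src" ]; then
                echo "skipping $user: $src is missing" >&2
                return 1
        fi
        if [ ! -d "$dst" ]; then
                echo "skipping $user: $dst is missing" >&2
                return 1
        fi
        # The subshell keeps the cd from leaking into the caller;
        # the * glob leaves dot files behind, as in the original loop.
        ( cd "$src" && cp -R * "$dst" )
}

# Example use (not run here):
#   copy_home someuser /tmp/homes /home
```

A failed test returns non-zero, so the caller can decide whether to skip the account or stop altogether.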

Canned Traps and Bombs

Another issue is trapping and dying, which can be cured with just a few lines of shell code:

        toppid=$$
        trap "exit 1" 1 2 3 15
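
The trap turns signals 1, 2, 3 and 15 (HUP, INT, QUIT and TERM) into an exit of the whole script, so the loop cannot carry on to the next account after an interrupt. The sketch below shows the effect in a throwaway child shell. The name demo_trap and the magic word hang are invented for the demonstration, and the child signals itself with TERM as a stand-in for an interrupted copy:

```shell
#!/bin/sh
# demo_trap ITEM...: run a loop over the items in a child shell that has
# the trap installed. When the item "hang" comes up, the child sends
# itself signal 15; the trap then ends the child with exit status 1 and
# no later item is processed.
demo_trap()
{
        sh -c '
                trap "exit 1" 1 2 3 15
                for i in "$@"
                do
                        echo "processing $i"
                        if [ "$i" = hang ]; then kill -15 $$; fi
                done
        ' demo_trap "$@"
}

# Example use (not run here):
#   demo_trap a hang b; echo "status $?"
```

Without the trap line, signal 15 would still stop the child, but through the default action rather than through an exit the script controls.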

Finally, what to do about outright explosions? What if a file is not where it should be? The answer, a generic bomb routine:

        progname=${0##*/}
        ...
        bomb()
        {
                # Print the message on stderr; the here-document runs
                # up to the ERRORMESSAGE line.
                cat >&2 <<ERRORMESSAGE

        ERROR: $*
        ${progname} ABORTED
        ERRORMESSAGE
                kill ${toppid}
                exit 1
        }

To use the bomb routine, look at the cp command:

        cp -R * /home/$i || bomb "Cannot copy to $i ... exiting"

Simple enough.
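
The kill of the top pid matters when the failing command sits in a subshell, for example a loop on the receiving end of a pipe, where a plain exit would only leave the subshell. The sketch below demonstrates this in a child shell. The name demo_bomb is invented, and bomb is written with echo instead of a here-document only to keep the quoting simple:

```shell
#!/bin/sh
# demo_bomb DIR...: in a child shell, feed the arguments through a pipe
# to a loop that bombs on the first one that is not a directory. The
# kill inside bomb reaches the top of the child shell, whose trap exits
# with status 1, so "not reached" is never printed.
demo_bomb()
{
        sh -c '
                progname=demo
                toppid=$$
                trap "exit 1" 1 2 3 15
                bomb()
                {
                        echo "ERROR: $*" >&2
                        echo "${progname} ABORTED" >&2
                        kill ${toppid}
                        exit 1
                }
                printf "%s\n" "$@" | while read -r dir
                do
                        [ -d "$dir" ] || bomb "$dir is not a directory"
                        echo "ok $dir"
                done
                echo "not reached"
        ' demo_bomb "$@"
}

# Example use (not run here):
#   demo_bomb /tmp /no/such/place /tmp; echo "status $?"
```

Dropping the kill line from bomb lets some shells run on past the loop, which is exactly the kind of quiet failure the routine is meant to prevent.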

Lessons Learned

Fortunately, nothing went wrong. It all worked. In retrospect, though, having on hand the canned routines that should be used with every script would have averted potential disaster, or at least mitigated it to a manageable point. If those simple few-line routines (and others, to be sure) had been lying around, a few more safety nets could have been cast.