Dec 2006

Why retval is your friend

Paul Graham would say that hacking is similar to painting - and it is - but the similarity goes beyond method or technique. One thing many writers on hacking skip over is experience. This text deals with one big lesson drawn from experience: why using return values (or internal signals, if you will) is important. Using a signal-based paradigm is not only a good idea, it makes life easier.

What about retstrings?

Return strings, as practically every piece of interface documentation shows, do not have to be strings. A return value can be whatever the programmer wants it to be, regardless of the language. This is important because some languages always return something. Perl returns the result of the last operation evaluated - much like the shell. A program written in C/C++ or Java can do the same thing; the only difference is that a few more hoops might need to be jumped through to implement a non-string return on a function that was designed to return a string. On the flip side, there is also the guaranteed null return. If a string operation fails for whatever reason, the program hands back a real NULL for the platform (which should be handled by glibc or libc) and thus guarantees a proper NULL return. The last and perhaps most effective approach is simply to set a string value that is the equivalent of a signal. Using an error string value is done all of the time: a set of predefined string names within the local context of a program simply means something bad. In short, retval does not mean just numbers.
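As a minimal sketch of the error-string idea in shell: the function name `port_for`, the helper `describe` and the sentinel `ERR_UNKNOWN_SERVICE` are all made up for illustration and do not come from any real tool. The point is only that the caller compares the returned string against a value that is agreed to mean "something bad".

```shell
#!/bin/sh
# Hypothetical sentinel: a string that means "something bad" in this program only.
ERR_UNKNOWN_SERVICE="ERR_UNKNOWN_SERVICE"

# port_for NAME: print the port for a service name, or the sentinel string.
port_for()
{
        case "$1" in
                ssh)  echo 22 ;;
                smtp) echo 25 ;;
                http) echo 80 ;;
                *)    echo "$ERR_UNKNOWN_SERVICE" ;;
        esac
}

# describe NAME: the caller checks the returned string against the sentinel.
describe()
{
        port=$(port_for "$1")
        if [ "$port" = "$ERR_UNKNOWN_SERVICE" ]; then
                echo "no port known for $1"
                return 1
        fi
        echo "$1 uses port $port"
}
```

Here `describe gopher` reports the failure and returns 1, so the string signal and the numeric one can travel together.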

Does this Make void and no-ops bad?

Nope. One of the big beefs out there is that Perl insists on returning something, which may not be wanted. If a routine can be void, why should a programmer have to say so? The reverse policy should be used instead: if an implicit return is desired, use it; if not, do not force people to deal with one. One of the big upsides of the C language is how retvals can be tossed out the window. Usually, automatic returns do no harm. There are cases, such as subroutines parsing buffered input in Perl and shell scripts, where forced returns can cause problems like buffer mangling, but in general they are harmless. Using void functions is completely legitimate, since 9 times out of 10 their return values would be inconsequential.

Case Study: Checker Shell Script

There exist so many shell scripts that check system status that anyone with a penny for each might not be rich but would be well-to-do. One of the dangers of shell scripting is in-the-box thinking, especially on a team. This case study shows two versions of a very simple shell script that behave differently.

The Bad One

The first script is the bad one; it acts only for itself and does nothing to help anyone else:

...
for i in "$@"; do
        ssh "$i" uptime || logger -i -s "$0 could not reach ${i}"
done
...
# Always reports success, even when a host could not be reached.
exit 0

While syslog may be getting monitored somewhere, what if one wanted to use this script from another script?

A Better One

A better script might offer a few alternatives to just logging:

...
sshchk()
{
        errors=0

        for i in "$@"; do
                ssh "$i" uptime
                if [ $? -gt 0 ]; then
                        logger -i -s "SSH Connect error to ${i}"
                        errors=$((errors+1))
                fi
        done

        # The status is the number of hosts that failed.
        return $errors
}
...
exit $errors

Now at least the caller - shell or script - will know there was a problem.
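To illustrate how a caller might consume that status, here is a small sketch. It assumes a stand-in function `probe` in place of `ssh $i uptime` so that it runs without a network; `probe`, `hostchk`, `report` and the host names are hypothetical and not part of the original script.

```shell
#!/bin/sh
# Stand-in for "ssh HOST uptime" so the sketch runs without a network.
# Made-up rule: any host whose name starts with "down" is unreachable.
probe()
{
        case "$1" in
                down*) return 1 ;;
                *)     return 0 ;;
        esac
}

# Same shape as sshchk: count the failures and return the count.
hostchk()
{
        errors=0
        for i in "$@"; do
                probe "$i" || errors=$((errors+1))
        done
        return $errors
}

# A caller can branch on the status instead of scraping syslog.
report()
{
        hostchk "$@"
        failed=$?
        if [ "$failed" -eq 0 ]; then
                echo "all hosts reachable"
        else
                echo "$failed host(s) unreachable"
        fi
        return "$failed"
}
```

Calling `report web1 down1 down2` would print a summary and pass the same status up to whatever called it.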

It may even be permissible to say at least one check failed and do the following:

...
sshchk()
{
        errors=0

        for i in "$@"; do
                ssh "$i" uptime
                if [ $? -gt 0 ]; then
                        logger -i -s "SSH Connect error to ${i}"
                        errors=$((errors+1))
                fi
        done

        # Collapse the count into a plain pass/fail status.
        if [ "$errors" -gt 0 ]; then
                return 1
        fi
        return 0
}
...

In which case something went wrong; it is just not known how many times.
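One possible middle ground, sketched here under assumptions, is to keep the status as a plain pass/fail and print the count on standard output, which ties back to the idea that a return does not have to be a number. This also sidesteps the fact that exit statuses are limited to the range 0 to 255, so a raw count is not always a safe status. The names `probe`, `hostchk_count` and `summary` are hypothetical, and `probe` again stands in for `ssh $i uptime`.

```shell
#!/bin/sh
# Stand-in for "ssh HOST uptime"; hosts named "down*" count as unreachable.
probe()
{
        case "$1" in
                down*) return 1 ;;
                *)     return 0 ;;
        esac
}

# Print the failure count on stdout; the status is 0 only if nothing failed.
hostchk_count()
{
        errors=0
        for i in "$@"; do
                probe "$i" || errors=$((errors+1))
        done
        echo "$errors"
        [ "$errors" -eq 0 ]
}

# The caller gets both signals: the status from the command substitution
# and the count from the captured output.
summary()
{
        if count=$(hostchk_count "$@"); then
                echo "ok"
        else
                echo "failed: $count"
        fi
}
```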

In C Please

Many ideas in any programming language first come from inline testing. Inline testing is a nice way of saying cram it in using ifdefs. A good example of using returns in C is when if-then-else is not needed or may not even be applicable.

Case Study: On the Side

In the example below, written in pseudocode, a mount point is checked and then an additional check is added: error if it is group or world readable. It does not matter if it is C, Perl or Java - the idea is the same. One version checks within the existing body of code while the other, very succinctly, does not.

Original Version

if MOUNTPOINT
        return 0
endif

In Line Version

if MOUNTPOINT
        if EXEC
                return 1
        if WORLDREAD
                return 1
        if GROUPREAD
                return 1
return 0

Using retval to save the day...

if MOUNTPOINT
        retval=check_mount_perms
        return retval
...
check_mount_perms
        if EXEC
                return 1
        if WORLDREAD
                return 1
        if GROUPREAD
                return 1
return 0
...

So what does the additional function do? It does two things. First, it takes the complexity of the permission checks out of the simple mount point check. Second, it makes it possible to add or alter the checks as needed. What if checking for SGID or GGID were needed? What if just group and user were needed? In the latter version, such checks can be added without obfuscating the mount check too much.

What if more is needed? It is much easier to do this:

if MOUNTPOINT
        retval=check_mount_perms
        return retval
...
check_mount_perms
        if EXEC
                return 1
        if WORLDREAD
                return 1
        if GROUPREAD
                return 1
        if GGID != MYGROUP
                return 1
return 0
...

That beats adding yet another check to the main calling program.
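For the curious, the pseudocode might translate into shell roughly as follows. This is only a sketch under assumptions: it reads the mode string from `ls -ld`, it covers just the group-readable and world-readable checks (the EXEC and GGID tests are left out because the pseudocode does not say how they are defined), and the helper name `check_mode` is made up.

```shell
#!/bin/sh
# check_mode MODE: MODE is an ls-style string such as "drwxr-x---".
# Returns 1 if the group or world read bit is set, 0 otherwise.
check_mode()
{
        case "$1" in
                ????r*) return 1 ;;      # 5th character: group read
        esac
        case "$1" in
                ???????r*) return 1 ;;   # 8th character: world read
        esac
        return 0
}

# check_mount_perms DIR: fetch the mode with ls -ld and hand it to check_mode.
# A directory that cannot be listed is treated as a failure.
check_mount_perms()
{
        mode=$(ls -ld "$1" 2>/dev/null) || return 1
        check_mode "${mode%% *}"
}
```

The caller stays as short as the pseudocode: `check_mount_perms /some/dir || echo "bad permissions"`. New checks go into `check_mode` without touching the caller.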

How and When is too Much?

The question of how much message passing to do versus actually doing work is as old as computational devices. There is no simple answer. The best answer, and here I refer yet again to Paul Graham, is to do what seems natural. Just keep in mind that well-formed returns, whether they are strings, numbers or NULL, are ultimately up to the writer - not the machine.

When not to be Prudent

As mentioned earlier there are times when a program does not need to return a value, so when might that be? One good example is just information, the classic usage:

#include <stdio.h>

/* Prints help text only; there is nothing useful to report to the caller. */
void usage(void) {
        printf("Here is some info...\n");
}

Such a function applies everywhere. A simple echo command is just as fine, and even well-proven operations such as flag checking work well without explicit return values.

Summary

Return values help users and programmers alike every day. Making prudent use of them as a shell scripter, Perl monger or C programmer makes life easier for all of us. The key thing to remember is judicious use: if a check seems intrusive, push it into a module and return something; otherwise, just try to do the right thing.