More shell inconsistencies

2019/10/2
Tags: [ shell ] [ Today I learned ] [ wtf ]

Another episode of Shell WTF for those who are interested in the topic[1].

All the following behaviours are well documented and POSIX-compliant, yet to me they look surprising or simply wrong. None of them is new, but I believe all of them are common pitfalls. I can certainly say that I got caught by them recently, even though I have been using the shell for years.

For some reason they're all about error checking.

[1]: note to self: I should make per-tag RSS feeds in PFT!

Failures in a pipe

This is a simple and well-known problem, yet I fell for it a few days ago. Consider this line, in which two commands are in a pipe and the first of them fails:

$ sh -c 'echo hello; exit 1' | sh -c 'sed s/o$//'; echo $?

The pipeline as a whole does not fail, because its exit status is that of the last command. In fact this will print:

hell
0

The solution would be set -o pipefail, but at the time of writing it is a bashism, and here we are talking about the POSIX shell.

A possible work-around, ugly but effective, could be something like this:

# Save in $self the pid of the parent shell.  It needs to be global, as
# the sub-shell within a pipe does not know the parent's pid.
self=$$
die() { kill $self; }

# An output-emitting program that can (and will) fail
faulty() {
    seq 10
    return 1
}

{ faulty || die; } | sed s/3/5/

A shell executing this script is terminated by the signal that die sends, so its caller sees a non-zero exit status.
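A gentler alternative, which does not kill the parent shell, is to smuggle the exit status of the first command out through a spare file descriptor. What follows is a minimal sketch of that well-known idiom, not part of the original script: the descriptor numbers 3 and 4 are arbitrary choices, and faulty is the same stand-in function used above.

```shell
#!/bin/sh
# Stand-in for an output-emitting program that fails.
faulty() {
    seq 10
    return 1
}

# fd 4 is a copy of the real stdout, so that sed can still print to it
# from inside the command substitution below.
exec 4>&1

# Inside $( ... ), fd 3 points at the substitution's capture pipe.
# The left side of the pipeline writes its own exit status there,
# while its regular output goes through the pipe to sed.
status=$( { { faulty; echo "$?" >&3; } | sed 's/3/5/' >&4; } 3>&1 )

# Close the spare descriptor again.
exec 4>&-

echo "faulty exited with status $status"
```

The transformed output still reaches stdout, and $status can then be checked like any other exit status, for example with [ "$status" -eq 0 ] || exit 1.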

See follow-up discussion on Lobsters

find(1) some fun

Did you know that find(1) doesn't generally report the exit status of sub-processes invoked via -exec? A non-zero exit status is only reported if the + variant of -exec is used.

$ find ~/bin -exec false {} \;
$ echo $?
0
$ find ~/bin -exec false {} +
$ echo $?
1

I believe that a far less surprising behaviour would be to return the number of failed jobs (0 if none, N != 0 if any). But this odd behaviour is actually the POSIX-compliant one! True story.
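If a command has to run once per file and its failure still has to be visible, one option is to wrap it in a small inline sh script and use the + variant. This is only a sketch under assumptions: the scratch directory, the file names and the grep check are hypothetical stand-ins for whatever job you actually run.

```shell
#!/bin/sh
# Hypothetical scratch directory with two files; only one contains "ok".
dir=${TMPDIR:-/tmp}/find-exec-demo.$$
mkdir -p "$dir"
echo ok   > "$dir/good"
echo oops > "$dir/bad"

# ";" variant: grep fails on one file, yet find itself exits with 0.
find "$dir" -type f -exec grep -q ok {} \;
semicolon_status=$?

# "+" variant: the inline script runs grep once per file and remembers
# any failure; find reports a non-zero status if the script does.
find "$dir" -type f -exec sh -c '
    rc=0
    for f in "$@"; do
        grep -q ok "$f" || rc=1
    done
    exit "$rc"
' sh {} +
plus_status=$?

rm -rf "$dir"
echo "semicolon variant: $semicolon_status"
echo "plus variant: $plus_status"
```

The first status is 0 and the second is non-zero. The exact non-zero value is up to the find implementation, so it is safer to test for "not 0" than for a specific number.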

Inconsistent behaviour with errexit (AKA set -e)

Consider this snippet:

set -e

f() {
    echo within f
    false
    echo after false, within f
}

echo before f
f
echo after f

Because of set -e this will only print:

before f
within f

But if the result of the f function is evaluated, e.g. within an if statement like this:

set -e

f() {
    echo within f
    false
    echo after false, within f
}

echo before f
if f; then
    echo f succeeded
fi
echo after f

Then the output is:

before f
within f
after false, within f
f succeeded
after f

Which "makes sense" after you read the relevant page in the standard:

The -e setting shall be ignored when executing the compound list following the while, until, if, or elif reserved word, a pipeline beginning with the ! reserved word, or any command of an AND-OR list other than the last.

No, running set -e within the function won't do much, not even if you define the function as a sub-shell (f() ( ...; )).

Basically the invocation context (which you can neither change nor detect from within the function - this ain't Perl!) determines the actual behaviour. Good luck with that.
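To see this for yourself, here is a small sketch. The functions g and h and the variables reached and h_status are hypothetical names introduced only for this demonstration, and I am assuming a shell that follows the rule quoted above (bash and dash do, as far as I can tell).

```shell
#!/bin/sh
set -e

reached=no
g() {
    set -e          # re-enabling errexit inside the function changes nothing
    false
    reached=yes     # still executed, because g is called from an if
}

h() (
    set -e          # same attempt, this time in a sub-shell body
    false
    exit 7          # reached only if false did not abort the sub-shell
)

# "if" context: errexit is ignored for the whole body of g.
if g; then :; fi

# AND-OR list context: h is not the last command, so -e is ignored too.
h_status=0
h || h_status=$?

echo "reached=$reached h_status=$h_status"
```

Both functions run past false: reached ends up as yes, and h exits with 7 rather than with the status of false.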

Of course in our example the echo after false, within f command will succeed, so the failure of false won't even be detected. To get consistent behaviour, I think you just have to fail explicitly everywhere:

f() {
    test ... || return 1
    echo more commands
}

Which is, in fact, what you would do in C: explicitly check for errors after every single statement.
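A runnable sketch of this pattern, where step is a hypothetical helper standing in for any command that may fail and result is just a variable used to record the outcome:

```shell
#!/bin/sh
set -e

# Hypothetical helper: exits with the status given as its argument.
step() { return "$1"; }

f() {
    step 0 || return 1
    step 3 || return 1    # fails: f stops here whatever the calling context
    echo "not reached"
}

# Even in an if, where -e is ignored, f reports the failure.
if f; then
    result=succeeded
else
    result=failed
fi
echo "f $result"
```

Since the function never relies on errexit, calling it from an if, from an AND-OR list or as a plain command gives the same outcome: it stops at the first failing step and returns non-zero.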