Kill process on broken pipe from within a bash script

Question

Consider the code:

printf '%s\n' 1 2 3 4 5 | head -n 2

which has the output:

1
2

My understanding is that when the head process breaks the pipe after reading the first two lines, the printf process catches the broken pipe and exits gracefully.

In a script of mine I use a Python application which does not exit gracefully on broken pipes. Instead, it keeps running until it has completed what it set out to do or terminates for some other reason. Every time it tries to write to stdout and fails, it complains about the broken pipe.

This article explains how to handle broken pipes in python.

I could perhaps encourage the application developers to implement handling of broken pipes, but for certain reasons I doubt that they will. I could perhaps fork the application, but it is rather complex, so I would rather not.

The only option I seem to be left with is trying to find a way to make the shell (Bash) kill the process when it tries to write to the pipe. Is this possible? If so, how would I do that?

Asked By: fuumind

Answer 1

Yes, python3 has that annoying behaviour whereby it ignores SIGPIPE and instead raises an exception when a write() fails with EPIPE, even though broken pipes are meant to be part of life and of normal operation.
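A quick way to see that for yourself is to ask the interpreter what it did with SIGPIPE at startup. This is only a minimal sketch; sigpipe_disposition is a helper name made up for this illustration, and it assumes a stock CPython on a Unix-like system:

```python
import signal


def sigpipe_disposition():
    """Return a short label for this interpreter's current SIGPIPE disposition."""
    handler = signal.getsignal(signal.SIGPIPE)
    if handler == signal.SIG_IGN:
        return "ignored"
    if handler == signal.SIG_DFL:
        return "default"
    return "handled"


# A freshly started CPython is expected to report that SIGPIPE is ignored,
# which is why a write to a broken pipe raises BrokenPipeError instead of
# killing the process.
print(sigpipe_disposition())
```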

You could indeed work around it with a wrapper that forwards python3's output and kills python3 (with SIGTERM, for instance) when its own output becomes a broken pipe.

bash is not the shell I’d use for this kind of thing though.

You could use perl instead:

perl -e '
  $pid = open CMD, "-|", @ARGV;
  $SIG{PIPE} = "IGNORE";
  while (sysread CMD, $buf, 8192) {
    if (!syswrite STDOUT, $buf) {
      kill "TERM", $pid;
      last;
    }
  }
  close CMD;
  exit($? & 127 ? ($? & 127) | 128 : $? >> 8)' -- your-python-program

If it has to be a shell, zsh would be a better choice:

zsh -c '
  zmodload zsh/system
  coproc {"$0" "$@" <&3 3<&-} 3<&0
  trap "" PIPE
  while sysread -s 8192 buf <&p; do
    syswrite -- $buf || {
      kill $! 2> /dev/null
      break
    }
  done
  wait $!' your-python-program

Example:

$ python3 -uc 'import time; print("foo"); time.sleep(1); print("bar")' | head -n1
foo
Traceback (most recent call last):
  File "<string>", line 1, in <module>
BrokenPipeError: [Errno 32] Broken pipe

$ zsh -c that-code python3 -uc 'import time; print("foo"); time.sleep(1); print("bar")' | head -n1
foo
$ echo $pipestatus
143 0
$ kill -l 143
TERM
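For comparison, the same forwarding wrapper could probably be written in Python itself. The following is only a sketch under a few assumptions: forward_until_broken_pipe and shell_status are names invented here, the wrapper runs under CPython (so its own failed write shows up as BrokenPipeError rather than killing it), and the 8192-byte buffer simply mirrors the perl and zsh versions:

```python
import os
import signal
import subprocess
import sys


def forward_until_broken_pipe(argv, out_fd=1, bufsize=8192):
    """Run argv, copy its stdout to out_fd, and SIGTERM it once out_fd is a broken pipe.

    Returns the exit status as subprocess reports it (a negative signal
    number if the child was killed by a signal).
    """
    child = subprocess.Popen(argv, stdout=subprocess.PIPE)
    src = child.stdout.fileno()
    try:
        while True:
            chunk = os.read(src, bufsize)
            if not chunk:  # EOF: the child closed its stdout
                break
            view = memoryview(chunk)
            while view:  # os.write may write only part of the buffer
                view = view[os.write(out_fd, view):]
    except BrokenPipeError:
        # The wrapper ignores SIGPIPE too, so the failed write surfaces as EPIPE here.
        child.terminate()  # sends SIGTERM
    finally:
        child.stdout.close()
    return child.wait()


def shell_status(returncode):
    """Map a subprocess return code to the 128+signal convention shells use."""
    return 128 - returncode if returncode < 0 else returncode
```

As a command-line wrapper it would end with something like sys.exit(shell_status(forward_until_broken_pipe(sys.argv[1:]))), so that the pipeline reports 143 just like the zsh version when the child is terminated.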

Now it seems a bit overkill to have a process spending its time shoving that output through an extra pipe just to work around that annoyance of python3.

Another approach could be to prevent python3 from ignoring those signals in the first place.

$ strace -qqqZ -e signal=none -e rt_sigaction -e inject=rt_sigaction:retval=0 python3 -uc 'import time; print("foo"); time.sleep(1); print("bar")' | head -n1
foo

Where strace effectively short-circuits all of python3’s invocations of the rt_sigaction() system call preventing it from installing any signal handler or changing signal disposition (of any signal).

So it also removes the annoying messages you get when stopping python3 scripts with ^C for instance, but it may be dangerous if your python script installs signal handlers to clean up after itself when killed.

It would be better to do that only for such calls about the SIGPIPE signal, but AFAICT, strace can’t do that. You should be able to achieve it using some $LD_PRELOAD trick though:

$ cat leave-sigpipe-alone.c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdlib.h>
#include <sys/types.h>
#include <signal.h>

int sigaction(int signum, const struct sigaction * restrict act, struct sigaction * restrict oldact)
{
  static int (*orig_sigaction)(int, const struct sigaction * restrict, struct sigaction * restrict) = 0;

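  /* Look up the real sigaction() once, on first use. */
  if (!orig_sigaction)
    orig_sigaction = (int (*)(int, const struct sigaction * restrict, struct sigaction * restrict)) dlsym (RTLD_NEXT, "sigaction");

  /* For SIGPIPE, drop the requested change but still fill in oldact for callers that query it. */
  if (signum == SIGPIPE) act = NULL;
  return orig_sigaction(signum, act, oldact);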
}
$ gcc -fPIC -shared -o leave-sigpipe-alone.so leave-sigpipe-alone.c -ldl
$ LD_PRELOAD=$PWD/leave-sigpipe-alone.so python3 -uc 'import time; print("foo"); time.sleep(1); print("bar")' | head -n1
foo
$ echo $pipestatus
141 0
$ kill -l 141
PIPE

python3 was silently killed by SIGPIPE as well-behaved executables do.

With strace and without the -f option, the rt_sigaction() calls are only intercepted in the process that strace starts (the one that executes python3). strace will carry on intercepting them even if that process executes another command, but not in child processes (unless you pass the -f option). The LD_PRELOAD trick, by contrast, affects all child processes as well, even after they execute other commands, as long as those commands are dynamically linked.

A cleaner approach that works with the cpython interpreter of the python3 programming language (the one most people use) is to restore SIGPIPE’s default signal disposition in a custom sitecustomize.py:

$ cat ~/lib/python3-leave-sigpipe-alone/sitecustomize.py
from signal import signal, SIGPIPE, SIG_DFL
signal(SIGPIPE,SIG_DFL)
$ PYTHONPATH=~/lib/python3-leave-sigpipe-alone python3 -uc 'import time; print("foo"); time.sleep(0.2); print("bar")' | head -n1
foo
$ echo $pipestatus
141 0

Here, we don’t change the system’s sitecustomize.py but use one in a dedicated directory, so we can set the PYTHONPATH variable to that directory only for those python3 scripts that we want killed by SIGPIPE when they attempt to write to a broken pipe.

The sitecustomize.py approach doesn’t seem to work with pypy3.
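If you want to check the sitecustomize.py approach from a script rather than by eye, something along these lines might do. It is a sketch, not a definitive test: status_after_reader_closes and check_sitecustomize are hypothetical helper names, a temporary directory stands in for ~/lib/python3-leave-sigpipe-alone, and it assumes sys.executable is a CPython that honours PYTHONPATH:

```python
import os
import signal
import subprocess
import sys
import tempfile

# Same two lines as the sitecustomize.py shown above.
SITECUSTOMIZE = (
    "from signal import signal, SIGPIPE, SIG_DFL\n"
    "signal(SIGPIPE, SIG_DFL)\n"
)

# Same child program as in the examples, with a 1 second pause.
CHILD_CODE = 'import time; print("foo"); time.sleep(1); print("bar")'


def status_after_reader_closes(pythonpath_dir=None):
    """Run CHILD_CODE, read one line like head -n1 would, close the pipe, return (line, status)."""
    env = dict(os.environ)
    if pythonpath_dir is not None:
        env["PYTHONPATH"] = pythonpath_dir
    child = subprocess.Popen(
        [sys.executable, "-uc", CHILD_CODE],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        env=env,
    )
    first_line = child.stdout.readline()
    child.stdout.close()  # the reader goes away while the child is still sleeping
    return first_line, child.wait()


def check_sitecustomize():
    """Write sitecustomize.py into a temporary directory and run the child with it on PYTHONPATH."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "sitecustomize.py"), "w") as f:
            f.write(SITECUSTOMIZE)
        return status_after_reader_closes(tmp)
```

With the custom sitecustomize.py in place, the returned status should be -signal.SIGPIPE, which is what the shell reports as 141.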

Answered By: Stéphane Chazelas