Turn off buffering in pipe

I have a script which calls two commands:

long_running_command | print_progress

The long_running_command prints progress, but I’m unhappy with how it looks. I’m using print_progress to make it nicer (namely, I print the progress on a single line).

The problem: connecting a pipe to stdout also activates a 4 KiB buffer, so the nice print program gets nothing … nothing … nothing … a whole lot … 🙂

How can I disable the 4K buffer for the long_running_command (no, I do not have the source)?

Asked By: Aaron Digulla


According to this, the pipe buffer size is set in the kernel, so altering it would require recompiling the kernel.

Answered By: second

I don’t think the problem is with the pipe. It sounds like your long-running process is not flushing its own buffer frequently enough. Changing the pipe’s buffer size would be a hack to get around it, but I don’t think it’s possible without rebuilding the kernel – something you wouldn’t want to do as a hack, since it would probably adversely affect a lot of other processes.

Answered By: anon

You can use the unbuffer command (which comes as part of the expect package), e.g.

unbuffer long_running_command | print_progress

unbuffer connects to long_running_command via a pseudo-terminal (pty), which makes the system treat it as an interactive process and therefore skip the 4 KiB pipeline buffering that is the likely cause of the delay.

For longer pipelines, you may have to unbuffer each command (except the final one), e.g.

unbuffer x | unbuffer -p y | z
Answered By: cheduardo

It used to be the case, and probably still is the case, that when standard output is written to a terminal, it is line buffered by default – when a newline is written, the line is written to the terminal. When standard output is sent to a pipe, it is fully buffered – so the data is only sent to the next process in the pipeline when the standard I/O buffer is filled.
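A quick way to see this difference for yourself (a sketch assuming a POSIX shell and awk, standing in for the real commands):

```shell
# Emit one line per second and copy each line through awk.
# Run without the trailing `| cat`: awk writes to the terminal,
# so lines appear one per second (line buffered).
# Run with it: awk writes to a pipe, so all lines appear at once
# when awk exits and flushes its buffer (fully buffered).
for n in 1 2 3; do echo "$n"; sleep 1; done | awk '{ print }' | cat
```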

That’s the source of the trouble. I’m not sure whether there is much you can do to fix it without modifying the program writing into the pipe. You could use the setvbuf() function with the _IOLBF flag to unconditionally put stdout into line buffered mode. But I don’t see an easy way to enforce that on a program. Or the program can do fflush() at appropriate points (after each line of output), but the same comment applies.

I suppose that if you replaced the pipe with a pseudo-terminal, then the standard I/O library would think the output was a terminal (because it is a type of terminal) and would line buffer automatically. That is a complex way of dealing with things, though.

Answered By: Jonathan Leffler

If the problem is libc modifying its buffering/flushing when output does not go to a terminal, you should try socat. It can create a bidirectional stream between almost any two kinds of I/O mechanism. One of those is a forked program speaking to a pseudo-tty.

socat EXEC:long_running_command,pty,ctty STDIO

What it does is

  • create a pseudo tty
  • fork long_running_command with the slave side of the pty as stdin/stdout
  • establish a bidirectional stream between the master side of the pty and the second address (here it is STDIO)

If this gives you the same output as long_running_command, then you can continue with a pipe.

Edit: Wow, I did not see the unbuffer answer! Well, socat is a great tool anyway, so I might just leave this answer here.

Answered By: shodanex

Another way to skin this cat is to use the stdbuf program, which is part of GNU Coreutils (FreeBSD also has its own).

stdbuf -i0 -o0 -e0 command

This turns off buffering completely for input, output and error. For some applications, line buffering may be more suitable for performance reasons:

stdbuf -oL -eL command

Note that it only works for stdio buffering (printf(), fputs()…) for dynamically linked applications, and only if that application doesn’t otherwise adjust the buffering of its standard streams by itself, though that should cover most applications.
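For example, in a multi-stage pipeline every stage before the last may need the treatment (a sketch with grep and sed as stand-ins for the real commands): without stdbuf, grep’s stdout feeding a pipe would be block buffered; with -oL it is flushed at every line.

```shell
# grep's stdout goes to a pipe, so it would normally be block buffered;
# stdbuf -oL makes it line buffered, and sed -u unbuffers sed's output.
printf 'ERROR one\nINFO two\nERROR three\n' \
  | stdbuf -oL grep ERROR \
  | sed -u 's/^/>> /'
```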

Answered By: a3nm

For grep, sed and awk you can force output to be line buffered. You can use:

grep --line-buffered

Force output to be line buffered.  By default, output is line buffered when standard output is a terminal and block buffered otherwise.

sed -u

Make output line buffered.

See this page for more information:
http://www.perkin.org.uk/posts/how-to-fix-stdio-buffering.html
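The flags above cover grep and sed; awk has no equivalent buffering flag, but you can get the same effect by flushing explicitly after each record with its built-in fflush() function:

```shell
# Print each record and flush stdout immediately, so a downstream
# consumer sees every line as soon as it is produced rather than
# when awk's buffer fills.
printf 'one\ntwo\n' | awk '{ print "got:", $0; fflush() }' | cat
```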

Answered By: yaneku

Yet another way to turn on line-buffered output for the long_running_command is to use the script command, which runs your long_running_command in a pseudo-terminal (pty).

script -q /dev/null long_running_command | print_progress      # (FreeBSD, Mac OS X)
script -q -c "long_running_command" /dev/null | print_progress # (Linux)
Answered By: chad

You can use

long_running_command 1>&2 |& print_progress

The problem is that libc line-buffers stdout when it goes to a screen and block-buffers it when it goes to a file or pipe, but does not buffer stderr at all.

I don’t think the problem is with the pipe buffer; it’s all about libc’s buffering policy.

Answered By: Wang HongQin

According to this post here, you could try reducing the pipe ulimit to a single 512-byte block. It certainly won’t turn off buffering, but 512 bytes is way less than 4K :3

Answered By: RAKK

I know this is an old question that already has a lot of answers, but if you wish to avoid the buffering problem, just try something like:

stdbuf -oL tail -f /var/log/messages | tee -a /home/your_user_here/logs.txt

This will output the logs in real time and also save them into the logs.txt file, and buffering will no longer affect the tail -f command.

Answered By: Marin N.

I found this clever solution: (echo -e "cmd 1\ncmd 2" && cat) | ./shell_executable

This does the trick: cat reads additional input (until EOF) and passes it to the pipe after echo has put its arguments into the input stream of shell_executable.

Answered By: jaggedsoft

In a similar vein to chad’s answer, you can write a little script like this:

# save as ~/bin/scriptee, or so
script -q /dev/null sh -c 'exec cat > /dev/null'

Then use this scriptee command as a replacement for tee.

my-long-running-command | scriptee

Alas, I can’t seem to get a version like that to work perfectly on Linux, so it seems limited to BSD-style unixes.

On Linux, this is close, but you don’t get your prompt back when it finishes (until you press Enter, etc.):

script -q -c 'cat > /proc/self/fd/1' /dev/null
Answered By: jwd

jq has an --unbuffered flag:

Flush the output after each JSON object is printed (useful if you’re piping a slow data source into jq and piping jq’s output elsewhere).

Answered By: seeker_of_bacon

Python has the -u (unbuffered) flag.

$ man python3
[...]
       -u     Force the stdout and stderr streams to be unbuffered.  This option has no effect on the stdin stream.
[...]
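For example (the PYTHONUNBUFFERED environment variable has the same effect as -u):

```shell
# Without -u, the print output could sit in Python's stdout buffer
# until the process exits; with -u (or PYTHONUNBUFFERED=1) it is
# written through to the pipe immediately.
python3 -u -c 'print("tick")' | cat
PYTHONUNBUFFERED=1 python3 -c 'print("tock")' | cat
```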
Answered By: Walter Tross