Turn off buffering in pipe
I have a script which calls two commands:
long_running_command | print_progress
The long_running_command prints progress, but I’m unhappy with how it looks, so I’m using print_progress to make it nicer (namely, I print the progress on a single line).
The problem: connecting a pipe to stdout also activates a 4 KiB buffer, so the nice print program gets nothing … nothing … nothing … a whole lot … 🙂
How can I disable the 4 KiB buffer for the long_running_command (no, I do not have the source)?
According to this, the pipe buffer size seems to be set in the kernel, and altering it would require you to recompile your kernel.
I don’t think the problem is with the pipe. It sounds like your long-running process is not flushing its own buffer frequently enough. Changing the pipe’s buffer size would be a hack to get around it, but I don’t think it’s possible without rebuilding the kernel – something you wouldn’t want to do as a hack, as it would probably adversely affect a lot of other processes.
You can use the unbuffer command (which comes as part of the expect package), e.g.
unbuffer long_running_command | print_progress
unbuffer connects to long_running_command via a pseudoterminal (pty), which makes the system treat it as an interactive process, therefore not using the 4 KiB buffering in the pipeline that is the likely cause of the delay.
For longer pipelines, you may have to unbuffer each command (except the final one), e.g.
unbuffer x | unbuffer -p y | z
It used to be the case, and probably still is the case, that when standard output is written to a terminal, it is line buffered by default – when a newline is written, the line is written to the terminal. When standard output is sent to a pipe, it is fully buffered – so the data is only sent to the next process in the pipeline when the standard I/O buffer is filled.
That’s the source of the trouble. I’m not sure whether there is much you can do to fix it without modifying the program writing into the pipe. You could use the setvbuf() function with the _IOLBF flag to unconditionally put stdout into line-buffered mode, but I don’t see an easy way to enforce that on a program. Or the program could call fflush() at appropriate points (after each line of output), but the same comment applies.
I suppose that if you replaced the pipe with a pseudo-terminal, then the standard I/O library would think the output was a terminal (because it is a type of terminal) and would line buffer automatically. That is a complex way of dealing with things, though.
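You can watch this terminal-versus-pipe switch happen with nothing more than grep (a sketch assuming GNU grep; the sleep only makes the buffering visible):

```shell
# grep block-buffers its output when writing into a pipe, so "one" tends to
# show up only after grep's buffer fills or grep exits:
{ echo one; sleep 2; echo two; } | grep . | cat

# Forcing line buffering makes each match appear as soon as it is read:
{ echo one; sleep 2; echo two; } | grep --line-buffered . | cat
```

In the first pipeline both lines typically arrive together at the end; in the second, each line comes through immediately.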
If the problem is libc modifying its buffering/flushing when output does not go to a terminal, you should try socat. You can create a bidirectional stream between almost any kind of I/O mechanism. One of those is a forked program speaking to a pseudo tty.
socat EXEC:long_running_command,pty,ctty STDIO
What it does is
- create a pseudo tty
- fork long_running_command with the slave side of the pty as stdin/stdout
- establish a bidirectional stream between the master side of the pty and the second address (here it is STDIO)
If this gives you the same output as long_running_command, then you can continue with a pipe.
Edit: Wow, I did not see the unbuffer answer! Well, socat is a great tool anyway, so I might just leave this answer.
Another way to skin this cat is to use the stdbuf program, which is part of the GNU Coreutils (FreeBSD also has its own one).
stdbuf -i0 -o0 -e0 command
This turns off buffering completely for input, output and error. For some applications, line buffering may be more suitable for performance reasons:
stdbuf -oL -eL command
Note that it only works for stdio buffering (printf(), fputs() …) for dynamically linked applications, and only if that application doesn’t otherwise adjust the buffering of its standard streams by itself, though that should cover most applications.
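As a sketch of stdbuf in the middle of a pipeline (the messages are made up; the sleep just makes the buffering visible):

```shell
# Without -oL, grep sitting in the middle of a pipeline block-buffers its
# output into the next pipe; -oL switches it to line buffering instead.
{ echo "error: disk full"; sleep 2; echo "error: again"; } |
  stdbuf -oL grep error |
  stdbuf -oL cut -c1-40
```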
For grep, sed and awk you can force output to be line buffered. You can use:
grep --line-buffered
Force output to be line buffered. By default, output is line buffered when standard output is a terminal and block buffered otherwise.
sed -u
Make output line buffered.
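awk itself has no line-buffering flag, but the usual idiom is to call fflush() after each print (a sketch with made-up input):

```shell
# fflush() after every print defeats awk's block buffering into the pipe,
# so each processed line is passed on as soon as it is read.
{ echo "step 1"; sleep 1; echo "step 2"; } |
  awk '{ print "progress:", $0; fflush() }'
```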
See this page for more information:
http://www.perkin.org.uk/posts/how-to-fix-stdio-buffering.html
Yet another way to turn on line-buffered output mode for the long_running_command is to use the script command, which runs your long_running_command in a pseudo terminal (pty).
script -q /dev/null long_running_command | print_progress # (FreeBSD, Mac OS X)
script -q -c "long_running_command" /dev/null | print_progress # (Linux)
You can use
long_running_command 1>&2 |& print_progress
The problem is that libc line-buffers stdout when it goes to a terminal and block-buffers it when it goes to a file or pipe, while stderr is always unbuffered. I don’t think the problem is with the pipe buffer; it’s all about libc’s buffering policy.
According to this post, you could try reducing the pipe ulimit to a single 512-byte block. It certainly won’t turn off buffering, but well, 512 bytes is way less than 4K :3
I know this is an old question that already has a lot of answers, but if you wish to avoid the buffering problem, just try something like:
stdbuf -oL tail -f /var/log/messages | tee -a /home/your_user_here/logs.txt
This will output the logs in real time and also save them into the logs.txt file, and the buffer will no longer affect the tail -f command.
I found this clever solution: (echo -e "cmd 1\ncmd 2" && cat) | ./shell_executable
This does the trick. cat will read additional input (until EOF) and pass it to the pipe after echo has put its arguments into the input stream of shell_executable.
In a similar vein to chad’s answer, you can write a little script like this:
# save as ~/bin/scriptee, or so
script -q /dev/null sh -c 'exec cat > /dev/null'
Then use this scriptee command as a replacement for tee.
my-long-running-command | scriptee
Alas, I can’t seem to get a version like that to work perfectly on Linux, so it seems limited to BSD-style unixes.
On Linux, this is close, but you don’t get your prompt back when it finishes (until you press enter, etc.)…
script -q -c 'cat > /proc/self/fd/1' /dev/null
jq has an --unbuffered flag:
Flush the output after each JSON object is printed (useful if you’re piping a slow data source into jq and piping jq’s output elsewhere).
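A terminating sketch (the JSON messages are made up; the sleep stands in for a slow data source):

```shell
# Each object is flushed as soon as it is printed instead of pooling
# in jq's output buffer.
{ echo '{"msg":"first"}'; sleep 1; echo '{"msg":"second"}'; } |
  jq --unbuffered -r '.msg'
```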
Python has the -u (unbuffered) flag.
$ man python3
[...]
-u Force the stdout and stderr streams to be unbuffered. This option has no effect on the stdin stream.
[...]
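For example, a sketch with a made-up slow producer (exporting PYTHONUNBUFFERED=1 in the environment has the same effect as -u):

```shell
# With -u each line leaves Python immediately instead of sitting in a
# 4 KiB stdio buffer until the script exits.
python3 -u -c '
import time
for i in range(3):
    print("tick", i)
    time.sleep(0.5)
' | cat
```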