Pseudo files for temporary data

I often want to feed relatively short strings (possibly several lines each) to command-line programs that accept input only from files (e.g. wdiff), and to do so repeatedly. Sure, I can create one or more temporary files, save the strings there, and run the command with the file names as parameters. But this looks highly inefficient if the data is actually written to disk, and it could also wear the disk more than necessary if I repeat the procedure many times, e.g. if I want to feed single lines of long text files to wdiff. Is there a recommended way to circumvent this, say by using pseudo files such as pipes to hold the data temporarily without actually writing it to disk (or writing it only if it exceeds a critical length)? Note that wdiff takes two arguments and, as far as I understand, it will not be possible to feed the data with something like wdiff <"text".

Asked By: highsciguy


Use a named pipe. By way of illustration:

mkfifo fifo
echo -e "hello world\nnext line\nline 3" > fifo

The -e tells echo to interpret the newline escape (\n). This will block, i.e., your shell will hang until something reads the data from the pipe.

Open another shell somewhere and in the same directory:

cat fifo

You’ll read the echo, which will release the other shell. Although the pipe exists as a file node on disk, the data which passes through it does not; it all takes place in memory. You can background (&) the echo.

The pipe has a 64 KiB buffer by default (on Linux) and, like a socket, will block the writer when full, so you will not lose data as long as you do not prematurely kill the writer.
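For the two-argument case in the question, the same idea extends to two pipes. The following is a minimal sketch, assuming a shell with mkfifo and mktemp available. diff stands in for wdiff (which may not be installed), and the directory and variable names are only illustrative.

```shell
#!/bin/sh
# Sketch: feed two in-memory strings to a command that wants two filenames.
# diff stands in for wdiff here; tmpdir, a and b are illustrative names.
tmpdir=$(mktemp -d)            # private directory for the pipe nodes
mkfifo "$tmpdir/a" "$tmpdir/b"

# Writers run in the background; each blocks until the reader opens its pipe.
printf 'hello\nhello1\n' > "$tmpdir/a" &
printf 'hello\nhello2\n' > "$tmpdir/b" &

# diff exits with status 1 when the inputs differ, so that status is ignored.
result=$(diff "$tmpdir/a" "$tmpdir/b" || true)
wait                           # reap the background writers
rm -r "$tmpdir"                # only the pipe nodes themselves were on disk

printf '%s\n' "$result"
```

The writers have to be backgrounded (or run from another shell), because opening a pipe for writing blocks until a reader opens the other end.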

Answered By: goldilocks

In Bash, you can use the command1 <( command0 ) syntax, which connects command0's stdout to a pipe and passes a filename referring to that pipe to command1 as a command-line argument. This is called process substitution.

Some programs that take filename command-line arguments actually need a real random-access file, so this technique won’t work for those. However, it works fine with wdiff:

user@host:/path$ wdiff <( echo hello; echo hello1 ) <( echo hello; echo hello2 )

In the background, this creates a pipe (or a FIFO on systems without /dev/fd), connects the output of the command inside the <( ) to it, and passes a path to it (such as /dev/fd/63) as an argument. To see what’s going on, try using it with echo to print the argument without doing anything with it:

user@host:/path$ echo <( echo hello )

Creating a named pipe is more flexible (if you want to write complicated redirection logic using multiple processes), but for many purposes this is enough, and is obviously easier to use.
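Applied to the question's use case (comparing strings that live in shell variables), process substitution might be wrapped as follows. This is a sketch that assumes Bash. diff stands in for wdiff, and compare_strings is a hypothetical helper name, not something from the answer above.

```shell
#!/usr/bin/env bash
# Sketch: compare two strings held in variables without temporary files.
# diff stands in for wdiff; compare_strings is a hypothetical helper name.
compare_strings() {
    # Each <( ) is replaced by a path (e.g. /dev/fd/63) that diff opens and reads.
    diff <(printf '%s\n' "$1") <(printf '%s\n' "$2")
}

old='the quick brown fox'
new='the quick red fox'

# diff exits 0 when the inputs match and 1 when they differ.
if compare_strings "$old" "$new" > /dev/null; then
    verdict=same
else
    verdict=different
fi
echo "$verdict"
```

Calling the helper once per line of a long text file repeats the comparison without creating any regular files.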

There’s also the >( ) syntax for when you want to use it as output, e.g.

$ someprogram --logfile >( gzip > out.log.gz )

See also the bash man page "process substitution" section and the Bash redirections cheat sheet for related techniques.

Answered By: Mechanical snail

wdiff is a special case due to it requiring 2 filename arguments, but for all commands that only require 1 argument and which stubbornly refuse to take anything but a filename argument, there are 2 options:

  • The filename '-' (that is, a minus sign) works about half of the time. It depends on the command in question and on whether its developer treats that name as meaning standard input. For example:

    $> ls | cat -

  • There is a pseudo file named /dev/stdin that exists on Linux and can be used where a filename is absolutely required by a command. This is more likely to work since it does not require any special filename handling from the command. If a FIFO or the Bash process substitution method works, then this should also work, and it is not shell specific. For example:

    $> ls | cat /dev/stdin
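Both options can be tried with a string held in a variable. This is a minimal sketch, assuming a system that provides /dev/stdin. cat is only an example of a command that takes a filename, and the variable names are illustrative.

```shell
#!/bin/sh
# Sketch: give piped data a filename via "-" or /dev/stdin.
# Assumes the system provides /dev/stdin (Linux does); cat is just an example reader.
text='alpha
beta'

via_dash=$(printf '%s\n' "$text" | cat -)
via_dev=$(printf '%s\n' "$text" | cat /dev/stdin)

printf '%s\n' "$via_dash"
printf '%s\n' "$via_dev"
```

Since a process has only one standard input, at most one of the two filename arguments of a command like wdiff can be supplied this way.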

Answered By: dabuntu