Pipe null-terminated file paths twice to the same output, but the second time sorted by basename

So, let’s say I have a script that uses find to print paths with null-terminated filenames.

I also want to print a second copy of the output in which the paths are sorted by basename. I want to achieve this without running find again, piping both copies to the same command as if they were a single output.

I assume this can be done using Perl, but I’m not very familiar with it.

Here are some important points to consider:

  1. The initial output should be passed through unchanged and in real time. In other words, on the first pass the command should simply forward each record as-is to the second command, while also capturing it for sorting once the input (e.g., find) completes.

  2. After find (or any other program that outputs zero-terminated paths) completes, the script should sort the received paths by basename. The sorting should ignore everything before the last / character, unless the path ends with a trailing slash. In that case, the sorting should be based on the last path component before the trailing slash, so that directories with trailing slashes are not grouped together. The sorted output should then be printed after the original output.

How can this be achieved easily?

Asked By: Eduardo Perez

||

With perl, that could look like:

find . ... -print0 |
  perl -MFile::Basename -0 -lne '
    print; push @files, $_;
    END {
      print $_->[0] for sort {$a->[1] cmp $b->[1]} map {[$_, basename $_]} @files
    }' | ...

With:

  • -MFile::Basename: loads the File::Basename module for its basename() function.
  • -0: sets the input record separator to the NUL byte.
  • -n: sed -n mode, where the -e expression is run for each input record.
  • -l: strips the input record separator from the end of each record, so $_ contains only the contents of the record in the -e expression, and sets the output record separator to the same value as the input record separator (NUL here).
  • For each record, we print it (followed by the output record separator), then push it onto the @files list.
  • In the END block, we print the list sorted by basename using a Schwartzian transform: each path is paired with its basename, the pairs are sorted on the basename, and the original path is printed from each pair. basename() ignores trailing slashes, so dir/ sorts as dir.
Answered By: Stéphane Chazelas