man global search with regex `.*word1.*word2.*` does not find a page that contains both words

I want to use man more efficiently, so I decided to try the --regex option. However:

~$ man --regex -K '.*textdomain.*perl.*'
--Man-- next: Locale::Messages(3pm) [ view (return) | skip (Ctrl-D) | quit (Ctrl-C) ]
^C
~$ man --regex -K '.*perl.*textdomain.*'
No manual entry for .*perl.*textdomain.*

I extracted the source of the first page that man offers to open. grep finds nothing in it for either word order, although it finds each word (textdomain, perl) separately:

~/Documents$ grep '.*textdomain.*perl.*' Locale::libintlFAQ.3pm
~/Documents$ grep '.*perl.*textdomain.*' Locale::libintlFAQ.3pm

In the file, perl appears (e.g. on line 74) before textdomain (e.g. on line 120). Why can't man --regex -K '.*perl.*textdomain.*' find the page when the reversed word order does? grep shows that neither sequence occurs on a single line. How does man --regex -K actually work? Should .* match newlines or not? For the last question, I guess the answer is "it depends on the system" (based on https://stackoverflow.com/questions/11924480/search-in-man-page-for-words-at-the-beginning-of-line).

Asked By: Alex Martian

||

In the man implementation from man-db as found on most GNU/Linux distributions, the search is line-based and case insensitive by default (unless you pass the -I / --match-case option).

$ man -Iw --regex -K '.*textdomain.*perl.*'
No manual entry for .*textdomain.*perl.*
$ man -w --regex -K '.*textdomain.*perl.*'
/usr/share/man/man3/Locale::libintlFAQ.3pm.gz
/usr/share/man/man3/Locale::TextDomain.3pm.gz
/usr/share/man/man3/Locale::libintlFAQ.3pm.gz
/usr/share/man/man3/Locale::TextDomain.3pm.gz
$ zgrep '.*textdomain.*perl.*' /usr/share/man/man3/Locale::libintlFAQ.3pm.gz
$ zgrep -i '.*textdomain.*perl.*' /usr/share/man/man3/Locale::libintlFAQ.3pm.gz
Locale::TextDomain::FAQ - Frequently asked questions for libintl-perl
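
The same behaviour can be reproduced without man. The following is only a sketch: the sample file and its three lines are made up for illustration (the first line is copied from the zgrep output above), and plain grep stands in for the line-based matching that man -K appears to perform.

```shell
# Hypothetical sample file standing in for a man page source.
sample=$(mktemp)
printf '%s\n' \
  'Locale::TextDomain::FAQ - Frequently asked questions for libintl-perl' \
  'a line mentioning perl only' \
  'a line mentioning textdomain only' > "$sample"

# grep -c exits non-zero when the count is 0, hence the "|| :".
# Case-sensitive: lower-case "textdomain" never precedes "perl" on one line.
sensitive=$(grep -c '.*textdomain.*perl.*' "$sample" || :)
# Case-insensitive: "TextDomain ... perl" on the first line matches.
insensitive=$(grep -ci '.*textdomain.*perl.*' "$sample" || :)
# Reversed order: "perl" only precedes "textdomain" across lines, so no match.
reversed=$(grep -ci '.*perl.*textdomain.*' "$sample" || :)

printf 'case-sensitive: %s\ncase-insensitive: %s\nreversed: %s\n' \
  "$sensitive" "$insensitive" "$reversed"
rm -f -- "$sample"
```

Only the case-insensitive search in the textdomain-then-perl order reports a matching line, which mirrors what the two man invocations in the question show.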

To search for man pages that contain both perl and textdomain case sensitively, you could do (on a GNU system):

xargs -rd '\n' -a <(
  man -IKw perl | sort -u | xargs -rd '\n' zgrep -l textdomain --
  ) man -l --

perl is searched for with man -IKw, which returns the paths of the man pages containing it; textdomain is then searched for in those files with zgrep, and the resulting list of paths is passed to man -l.
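
To see the filtering step in isolation, here is a small sketch that runs the same zgrep -l chain on throwaway files. The directory and file names are hypothetical stand-ins for compressed man page sources, and the first zgrep -l plays the role that man -IKw plays above. It assumes GNU xargs (for -r and -d) and the zgrep shipped with gzip.

```shell
# Hypothetical stand-ins for compressed man page sources.
dir=$(mktemp -d)
printf 'perl here\ntextdomain there\n' | gzip > "$dir/both.3pm.gz"
printf 'perl only\n' | gzip > "$dir/perl-only.3pm.gz"
printf 'textdomain only\n' | gzip > "$dir/textdomain-only.3pm.gz"

# Step 1: list the files containing the first word.
# Step 2: among those, keep the files that also contain the second word.
# Some zgrep versions exit non-zero if any file lacks a match, hence "|| :".
matches=$(
  zgrep -l perl -- "$dir"/*.gz |
    xargs -rd '\n' zgrep -l textdomain --
) || :

printf '%s\n' "$matches"
rm -rf -- "$dir"
```

Only both.3pm.gz survives both steps, even though its two words sit on different lines, which is exactly what a single line-based regex cannot express.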

A man-pages-with-all-words script could be written as:

#! /bin/zsh -
die() {
  print -rC1 -u2 - "$@"
  exit 1
}
(( $# )) || die "Usage: $ZSH_SCRIPT <word> [<word>...]"

typeset -U pages
pages=(
  ${(f)"$(man -IKw -- "$1")"}
)
shift
while (( $#pages && $# )); do
  pages=(
    ${(f)"$(zgrep -lF -- "$1" $pages)"}
  )
  shift
done
if (( $#pages )); then
  man -l -- $pages
else
  die "No matching man page found."
fi

Or the same with POSIX sh syntax:

#! /bin/sh -
die() {
  [ "$#" -eq 0 ] || printf >&2 '%s\n' "$@"
  exit 1
}
[ "$#" -gt 0 ] || die "Usage: $0 <word> [<word>...]"

# tune sh split+glob: split on (sequences of) newline + no glob
IFS='
'; set -o noglob

pages=$(man -IKw -- "$1" | LC_ALL=C sort -u)
shift

while [ "$#" -gt 0 ] && [ -n "$pages" ]; do
  pages=$(
    zgrep -lF -- "$1" $pages
  )
  shift
done
if [ -n "$pages" ]; then
  man -l -- $pages
else
  die "No matching man page found."
fi
Answered By: Stéphane Chazelas