On Mon, Mar 18, 2024 at 11:54:42AM +0100, Manny wrote:
> Searching the whole DB for a whole word requires using --regex and
> then using word boundaries. So to find pages that reference the TZ
> environment variable, this *should* work (in principle):
>
>   $ man -aK --regex '\<TZ\>'
>
> It appears to work because it finds many pages. But it misses the
> “tree” package (/usr/share/man/man1/tree.1.gz).
>
>   $ zgrep 'TZ' /usr/share/man/man1/tree.1.gz
>   \fBTZ\fP Timezone for timefmt output, see \fBstrftime\fP(3).
>
> As you can see, the nroff language interferes with matching the
> regular expression as “TZ” is surrounded by code. Users of man-db
> obviously do not intend to have their regex matched against nroff
> code. Thus operations are being performed in the wrong order. The
> regular expression matching needs to happen on nroff-decoded text.

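To make the false negative concrete, here is a minimal sketch using only the source line quoted above. It assumes a grep that supports the \< and \> word-boundary operators (GNU grep does). The strip_fonts helper is hypothetical and only deletes \fB, \fI, \fP and \fR font escapes; it is a crude stand-in for real rendering, not what man-db does.

```shell
#!/bin/sh
# The tree.1 source line quoted above, kept verbatim (backslashes are literal).
src='\fBTZ\fP Timezone for timefmt output, see \fBstrftime\fP(3).'

# Hypothetical helper: delete only \fB, \fI, \fP and \fR font escapes.
# This is an approximation for illustration, not real nroff rendering.
strip_fonts() {
    sed 's/\\f[BIPR]//g'
}

# Count matching lines in the raw source. The "B" of \fB sits directly
# before "TZ", so there is no word boundary and nothing matches.
# grep exits non-zero on zero matches, hence the "|| true".
raw_hits=$(printf '%s\n' "$src" | grep -c '\<TZ\>' || true)

# Count again once the font escapes are gone.
clean_hits=$(printf '%s\n' "$src" | strip_fonts | grep -c '\<TZ\>' || true)

echo "raw=$raw_hits clean=$clean_hits"   # prints: raw=0 clean=1
```

So the pattern itself is fine; it fails only because the preceding font escape ends in a word character.
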
In principle I certainly agree that this would be more usable, but I've
considered this in the past and given up, as making it perform well
would have been very difficult. There's a note about this in man(1),
under the description of -K:

    Note that this searches the sources of the manual pages, not the
    rendered text, and so may include false positives due to things
    like comments in source files, or false negatives due to things
    like hyphens being written as "\-" in source files. Searching the
    rendered text would be much slower.

-- 
Colin Watson (he/him)                          [cjwat...@debian.org]