Re: [Groff] Parsing specific section of man page

Ingo Schwarze Sun, 22 Jan 2012 07:57:37 -0800

Hi,

Siteshwar Vashisht wrote on Sun, Jan 22, 2012 at 07:29:23PM +0530:


> I am working on a shell, and the idea is to show suggestions for
> command line options on pressing tab. So let's say on pressing
> "ls<TAB>", I will get all the command line options for ls
> command.  Now a good idea is to fetch these options from manual
> pages.

OUCH.

No, please don't.
This is a misguided, terrible idea in so many ways.

 1) Do not bloat the shell.
    If there are two programs in userland that you don't want to
    be large, complicated, fragile, and slow, they are init(8) and
    the shell.  The shell is a critical component of the system;
    it is not the right place for bells and whistles.  And you
    don't want yet more non-standards-compliant features in the
    shell.  Shells diverge too much from each other and have too
    much bloat already.

 2) Do not bloat *any* utility program with unrelated bells
    and whistles.  That's just not the Unix way.
    Write small, focussed tools that make life easy when used
    in concert.
    For finding command line options, there is man(1).
    Most programs have a one-line usage() message, too.
    There is no need to do the same thing again, in a third place.

 3) Options typically are a dash and one character.
    If any frequently needed option is called
    --hey-look-i-invented-a-fantastically-long-name,
    that's a bug in the program; fix it, call it -l
    if it's really needed for every second invocation.
    If you need it twice a year, fine, type it in.
    No problem to solve here, move on.

 4) Whatever you do, it will be crude guesswork.
    Different systems document options in different ways.
    Heck, different *programs* document options in different ways.
    There are different languages for producing documentation,
    and both the source and the output look different.
    Some systems install manual source code only, some
    install preformatted pages only, some install them gzip'ed,
    some don't, some cache them, some don't, some install them
    in this directory, some in another, some use man.conf(5),
    some use manpath(1), i could go on and on...
    Even if documentation authors try to follow some usual
    way, many get it wrong and the results are hard to parse;
    after all, most documentation authors are programmers in
    the first place and don't know or care that much about the
    finer points of formatting documentation.  In any case,
    you don't want crude guesswork in the shell.
    Or even if you accept that, it will maybe work half of
    the time on your own computer, but almost nowhere else.

 5) How are you going to find out whether you are even looking
    at the right manual?  There may be different versions of
    ls(1):  The native one, the UCB one, the GNU one, and then
    there may be a POSIX manual which doesn't correspond to any
    actual implementation.  Different packages are installed
    everywhere, and the file system layout is different
    everywhere, as well.

 6) You are trying to do semantic analysis with groff(7).
    Bad idea, don't do that, that's not at all what Roff
    is about, and Roff is doing a terrible job in that area,
    simply because it's beside the point: it was designed to
    produce high-quality typesetting output.
    If you want to do semantic analysis, use tools written
    for semantic analysis, like the mandoc(3) libraries
    written by Kristaps Dzonsons.  But *PLEASE*, don't link
    those into a shell...

 7) Even though the mdoc(7) language, allowing concise
    semantic annotations, was invented more than 20 years
    ago (and already used for the Net/2 4.4BSD release),
    most manuals still use the old-fashioned man(7) format
    nowadays - which doesn't even provide specific markup
    for command line options!  So, what are you even going
    to parse for?  You have almost nothing to work on in the
    first place, not even mandoc(3) will be able to help you
    much with man(7) input, even if you have the source code...

 8) ...
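
To make points 6 and 7 concrete, here is a minimal Python sketch.
It is only an illustration under stated assumptions: the helper
names classify() and mdoc_flags() and the two sample pages are
made up, and the matching is deliberately crude.  It is not how
mandoc(3) works.  It shows that mdoc(7) source marks options with
the Fl macro, while typical man(7) source only carries font
escapes, so there is nothing semantic to look for.

```python
import re

# Hypothetical sample pages, shortened for illustration only.
MDOC_PAGE = r""".Dd January 22, 2012
.Dt LS 1
.Sh DESCRIPTION
.Bl -tag -width Ds
.It Fl a
Include directory entries whose names begin with a dot.
.It Fl l
List in long format.
.El
"""

MAN_PAGE = r""".TH LS 1
.SH OPTIONS
.TP
\fB\-a\fR
do not ignore entries starting with a dot
"""


def classify(source: str) -> str:
    """Guess the macro language from the first telltale macro seen."""
    for line in source.splitlines():
        if line.startswith((".Dd", ".Sh")):
            return "mdoc"
        if line.startswith((".TH", ".SH")):
            return "man"
    return "unknown"


def mdoc_flags(source: str) -> list[str]:
    """Collect the first argument of '.Fl x' and '.It Fl x' lines.

    Crude on purpose: it ignores Fl used inside other macros
    (for example '.Op Fl a') and Fl without arguments.
    """
    flags = []
    for line in source.splitlines():
        match = re.match(r"\.(?:It\s+)?Fl\s+(\S+)", line)
        if match:
            flag = "-" + match.group(1)
            if flag not in flags:
                flags.append(flag)
    return flags
```

On the mdoc(7) sample, mdoc_flags() finds the two options; on the
man(7) sample it finds nothing, and anything beyond that would be
guessing at bold text.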

I'm sure i could go on like this.
But why should i?
Why kill a dead horse again and again to get it even more dead?

This is utterly hopeless.
Just drop the whole project.

Yours,
  Ingo
