"G. Branden Robinson" <[email protected]> writes:
> At this point I must ask that you direct me to a specific document that
> you think will actually be adversely affected by this change.
Yeah, looking at this in more detail shows that way more authors even than
I had realized have stopped using two spaces after sentences. If they use
one space after a sentence in POD source, they're not affected by this
change (or arguably even helped at line breaks) because they're already
going to break *roff's detection of sentence boundaries. I found a bunch
of examples in prose with a quick search, but nearly all of them were by
authors that used one space after sentences and thus are unaffected by
this proposed change.
Once one excludes those, and authors who use the style of putting the
period outside the quotes (more common in Europe as I recall), I agree
that there aren't a ton of examples.
Here are the ones that I found on my local system that I believe would be
adversely affected:
Perl/Critic/DEVELOPER.pod:what's wrong." The explanation can be either a string with further
SQL/Translator/Schema/Constraint.pm:then returns "1." The argument is evaluated by Perl for True or
SQL/Translator/Manual.pod:with "sqlt." Here are the scripts and a description of what they each
Sub/Exporter.pm:"import." In addition to the normal exporter configuration, a few named
WWW/Mechanize.pm:A value of C<0> means "no history at all." By default, the max stack depth
I didn't do a very sophisticated search (only Perl modules, a very simple
search pattern), so I doubtless missed some. The search that seemed the
most effective was:
grep -r '^[^ ].*\." [^ ]' /usr/share/perl5
(That first bracketed section contains a space and a literal tab.)
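For anyone who wants to rerun this without worrying about the literal tab,
a rough Python equivalent of that search might look like the following.
The names PROSE_QUOTE_END and find_prose_hits are made up for
illustration, and the sample lines are invented apart from the first,
which is one of the hits listed above. The leading anchor is what skips
POD verbatim paragraphs, since those start with whitespace.

```python
import re

# Same pattern as the grep: a line that does not start with a space or a
# tab (so it is not a POD verbatim paragraph), containing `."`, then a
# space, then a non-space character.  The tab is spelled \t here.
PROSE_QUOTE_END = re.compile(r'^[^ \t].*\." [^ ]')


def find_prose_hits(lines):
    """Return the lines matched by PROSE_QUOTE_END (hypothetical helper)."""
    return [line for line in lines if PROSE_QUOTE_END.search(line)]


sample = [
    'with "sqlt." Here are the scripts and a description',  # prose hit
    '    print "done." if $verbose;',                        # indented: skipped
    'the string "!"',                                        # no `."` at all
]

for hit in find_prose_hits(sample):
    print(hit)
```

Dropping the `^[^ \t]` anchor from the pattern would pick up the indented
line as well, which is the noisier search described below.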
I found it easier to search the actual POD source, not the generated man
pages, which confuse matters with other sorts of markup. Pod::Man will
pass through those lines essentially verbatim, so these should be valid
examples of man pages that I think would then be misformatted.
Note that this search looks specifically for prose examples, since that's
what I was looking for, and will exclude some code examples that you may
also be interested in. You can find the code as well by dropping the
initial anchoring, but then you match a lot of actual source code instead
of POD, and the search problem becomes harder than I have the brainpower
to deal with this late in the evening.
> Incidentally, here's more context of the "hit" in our search.
> ---snip---
> $ sed -n '423,438p' /usr/share/perl5/HTML/Tree/AboutTrees.pod
> So, for example, when HTML::TreeBuilder builds the tree for the above
> HTML document source, the object for the "body" element has these pieces of
> data:
>   * element name: "body"
>   * nodes it contains:
>      the string "I've got "
>      the object for the "em" element
>      the string "!"
>   * its parent:
>      the object for the "html" element
>   * bgcolor: "#d010ff"
> Now, once you have this tree of objects, almost anything you'd want to
> do with it starts with searching the tree for some bit of information
> in some element.
> ---end snip---
> I don't know what indentation is _supposed_ to mean in POD, so I can't
> form an opinion as to whether Burke abused the input language or not.
> This looks more like an attempt at an itemized list (with nesting) than
> a code display. Even so, I reiterate that he _didn't set off_ the land
> mine you suspect I am planting.
It's a code display and a fairly common use of one in POD if you want a
specific formatting layout that POD's very limited list support doesn't
allow for.
> And even within the remaining 10%, it appears one can still get pretty
> quote-happy without adverse effect.
If, like Sean, you put the punctuation outside the quotes instead of
inside it, sure. Then the string `."` doesn't occur at the end of a
sentence, and of course you're not affected. :)
> Have I given you enough evidence to update your Bayesian priors?
Somewhat (I was certainly wrong about the straight frequency of `."`!) but
not entirely, mostly because I still think the problem you're trying to
fix is practically nonexistent. It's not *just* use of `."` in code blocks
in a man page; it's specifically use followed by two spaces that would be
erroneously detected as the end of a sentence. Meanwhile, I was able to
find several examples (albeit fewer than I expected!) of normal prose use
of quotation marks that would be misformatted by your proposed change.
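To make that condition concrete, here is a deliberately simplified model
of the detection rule, written in Python rather than anything resembling
groff's actual implementation. It assumes a sentence ends at `.`, `?`, or
`!`, followed by any run of "transparent" characters (only a subset is
listed here), followed by two spaces or the end of the line. The function
name and the quote_is_transparent switch are hypothetical; the switch
stands in for the proposed `cflags 0 "` change as I understand it.

```python
import re


def sentence_ends(line, quote_is_transparent=True):
    """Return the sentence endings a simplified *roff-like rule would see.

    This is an illustrative model, not groff's real algorithm.
    """
    # Subset of characters that may sit between the punctuation and the gap.
    transparent = "')]*"
    if quote_is_transparent:
        transparent += '"'
    char_class = "[" + re.escape(transparent) + "]"
    # Punctuation, optional transparent run, then two spaces or end of line.
    pattern = r"[.?!]" + char_class + r"*(?=  |$)"
    return re.findall(pattern, line)


inside_two = 'then returns "1."  The argument is evaluated by Perl.'
outside_two = 'then returns "1".  The argument is evaluated by Perl.'
inside_one = 'then returns "1." The argument is evaluated by Perl.'

for text in (inside_two, outside_two, inside_one):
    print(sentence_ends(text), sentence_ends(text, quote_is_transparent=False))
```

Only the first string changes between the two modes: the period inside the
quotes followed by two spaces stops being seen as a sentence end, while
the period-outside and one-space styles behave the same either way.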
> `nf` is a mnemonic for "no fill [mode]", not "no formatting".
Yes, I do understand; I simply think it's a sufficiently poor design
choice that it would be worth changing. You are entitled to not care about
that opinion, and indeed probably shouldn't since I'm not volunteering to
do the changing. :)
I certainly agree that this would be a lot more disruptive and would
probably have all sorts of negative consequences that I am not thinking
of. It was a fairly idle thought and not one with a lot of value. I
generally agree with your maintenance philosophy that groff is a program
of sufficient age and expected reliability that it's best not to meddle
too much with its long-standing historical behavior.
>> But at least, in the current system, an author who *wants* to get this
>> right can get 99% of the way there by adopting a policy of adding a
>> line break after each sentence. If I understand the proposal correctly,
>> the `cflags 0 "` change would make the problem much worse, since there
>> would no longer be a way to tell *roff to consistently space a sentence
>> ending with `."`, even by adding line breaks after sentences.
> True, but who's doing this?
This is a fair point; I'm not sure if anyone at all is doing this in POD.
I only recently added that recommendation to the Pod::Man documentation
and maybe more people will see it, or come over from other communities
where this is common, but realistically, at this stage in Perl's lifetime,
probably not. I admit that I don't even do it myself, mostly because I
have a thing about consistency and quailed from the thought of
reformatting all of my many existing POD documents in that style, even
though I think it's strictly better for reasons entirely apart from
formatting, such as avoiding spurious changes in diffs due to rewrapping
paragraphs.
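For text that already uses two spaces after sentences, the conversion to
one sentence per line is at least mechanical. A rough sketch, assuming the
paragraph has already been unwrapped onto a single line and that two or
more spaces reliably mark a sentence boundary (neither of which Pod::Man
checks or guarantees); SENTENCE_GAP and one_sentence_per_line are invented
names, and the sample text is made up:

```python
import re

# Assumption: a sentence ends with `.`, `?`, or `!`, optionally followed by
# closing quotes or brackets, and is separated from the next sentence by
# two or more spaces.
SENTENCE_GAP = re.compile(r'([.?!]["\')\]]*) {2,}')


def one_sentence_per_line(text):
    """Replace each two-space sentence gap with a newline."""
    return SENTENCE_GAP.sub(r'\1\n', text)


paragraph = 'It returns "1."  The argument is evaluated.  Done.'
print(one_sentence_per_line(paragraph))
```

Text written with one space after sentences gives this approach nothing
to work with, which is the same information loss discussed earlier.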
I know of multiple examples of groups that do write all their
documentation with a newline after each sentence (my group at work, for
instance), but the ones I am personally aware of use Markdown and rST and
thus aren't very relevant to this discussion. POD is an odd corner niche
that is mostly legacy at this point, so in that sense nothing I do with
POD probably matters *that* much.
Nonetheless, it bothers me to remove a tool to write more semantically
correct man pages, mostly because I think you're doing it to solve a
problem that is, so far as I can tell, almost entirely theoretical. I
admit that's probably also how my objection looks to you, but maybe the
above examples will help make it *slightly* less theoretical!
I admit that I'm primarily arguing against this change because it feels
wrong to me and I can see some concrete things that it would break, not
because I think many people in the Perl community care (or will even
notice). Plenty of people already write POD with one space after
sentences and live with (or don't even notice) the inconsistent sentence
formatting they get when a sentence happens to end at the end of a line
and is then reflowed. I've gotten some complaints about that, hence the
Pod::Man documentation change, but not a ton.
I strongly suspect you and I care more about this issue than the sum of
the caring of at least half of the man page authors in the world. :)
> I'm not averse to adding an escape sequence that explicitly means
> "sentence ends here", or that makes the existing (and long-toothed) `\)`
> GNU troff extension "sticky" until the end of the word, which would
> achieve the same end.
Yeah, I'm dubious about adding more markup unless it comes with some sort
of machine learning model that adds the markup automatically. People
writing *roff directly already have other options to deal with this issue,
so it mostly just moves the problem up to *roff generators that are even
less prepared to deal with it. Pod::Man has no sentence boundary
generation algorithm at all, and I'm not eager to add or maintain one.
> Nevertheless I think Unicode input is the horse to bet on in the long
> term.
This is probably true, but at least the Perl community doesn't seem very
interested in adopting it. I could only find a single POD document that
used Unicode quotes on my system. (Ironically, they seem to be more common
in source code!)
--
Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>