On Sun, 24 Sep 2006, Daniel Burrows wrote:

On Sat, Sep 23, 2006 at 05:38:52PM +0200, Tomas Pospisek <[EMAIL PROTECTED]> 
was heard to say:
On Fri, 22 Sep 2006, Daniel Burrows wrote:

[...]
There are two real problems [with parsing and rendering lists in
package descriptions - tpo]:
[...]
(2) The bulleting scheme cannot distinguish between a full stop at the
    beginning of a legitimate line of text and a syntactically
    significant full stop.  Since these periods may occur after other
    text or after two spaces, the standard description algorithm will
    not strip this text -- meaning that aptitude will massively mangle
    a description that displays just fine in the standard formatting.

    Worse, you need to know about aptitude's formatting to get this right;
    full stops after indentation are just fine everywhere else, but not
    if you've written a bulleted list.  I presume that most Debian
    developers aren't even aware aptitude exists, and even if they are,
    they aren't going to be intimately familiar with its description
    parsing algorithm.  I can't really defend a feature that's likely to
    lead to surprising and undesired results.

    This actually occurred in the example that led to bug #373888, and
    at a quick check there's one other package (flac) that has this
    problem.

    I wasn't able to find any satisfactory solution to this problem.

What I do not understand here is why a '.' at the beginning of a line
has any syntactic influence on aptitude.

The Debian policy [1] distinguishes the following line starts:

* 1 space + whatever       -> paragraphs
* 1 space + dot            -> blank line
* 1 space + dot + whatever -> reserved / to be defined
* 2 spaces                 -> verbatim
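
The four cases above can be sketched as a small classifier. This is only an illustration of the rules as listed here, not code from aptitude or dpkg; the function name `classify_line` and the returned labels are made up for the example.

```python
def classify_line(line: str) -> str:
    """Classify one continuation line of a description field.

    Hypothetical helper: labels follow the four cases listed above,
    plus "error" for a line that does not start with a space.
    """
    if line.startswith("  "):
        return "verbatim"      # 2 or more spaces
    if line.startswith(" "):
        rest = line[1:]
        if rest == ".":
            return "blank"     # 1 space + dot
        if rest.startswith("."):
            return "reserved"  # 1 space + dot + whatever
        return "paragraph"     # 1 space + whatever
    return "error"             # no leading space at all
```

For instance, `classify_line(" .")` falls into the blank-line case, while a line with two leading spaces followed by a dot is classified as verbatim.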

Thus a line starting with:

* 1 dot            -> is a syntax error in the description
                   -> thus of no business to aptitude
* 1 space + dot    -> blank line -> of no business to aptitude
* 2 spaces + 1 dot -> per aptitude's list definition not part of a list
                      since the text of a list entry must be at least as
                      far indented as the first bullet and thus start
                      with at least 4 spaces off to th
                   -> thus render verbatim
* 3 spaces + 1 dot -> same as 2 spaces + 1 dot
* 4 spaces + 1 dot -> in case we're in a list entry (which is
                      determined by the previous line) then render the
                      dot as part of the list entry. Otherwise verbatim
* >4spaces + 1 dot -> ditto, except that you'll probably want to indent it
                      with the additional spaces

 All correct.

* 1 space + bullet + whatever -> standard rendering - no list entry
* 2 spaces + bullet + whatever -> list entry

 Ah.  The problem is that aptitude parses indented regions using the
standard grammar, but with N spaces stripped from each line before
applying the grammar.  This allows multi-paragraph list entries, like:

 * One list entry
 * Another list entry.
   .
   Here is some more text describing the list entry.

 The result of treating the last line as part of the list entry is that
it will be indented to exactly the same level as the rest of the entry,
and it will be word-wrapped instead of being treated as verbatim text.
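
The "strip N spaces, then apply the standard grammar" step described above can be sketched roughly as follows. This is a guess at the idea, not aptitude's actual code: the function name `strip_entry_indent` is hypothetical, and the sketch assumes that `n` is the column where the entry's text starts and that each stripped line gets back the single leading space the standard grammar expects.

```python
def strip_entry_indent(lines: list[str], n: int) -> list[str]:
    """Re-base the continuation lines of one list entry.

    Assumption: every line belongs to the entry and is indented by at
    least n spaces. The n spaces are dropped and one leading space is
    restored, so the result can be fed to the standard grammar again.
    """
    rebased = []
    for line in lines:
        if line[:n] != " " * n:
            raise ValueError(f"line is indented by fewer than {n} spaces: {line!r}")
        rebased.append(" " + line[n:])
    return rebased
```

Under these assumptions, a continuation line consisting of four spaces and a full stop becomes a standard blank line when `n` is 4, and a line with one extra space of indentation becomes a standard verbatim line.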

 Using the standard grammar also allows verbatim text in lists.  This
was another reason for always stripping exactly one space after the
bullet; it allows you to write

 *  Homepage: http://some.url.example.com

 and have it not get word-wrapped.  (Previously this would be considered
a normal bullet point with 2 spaces of indentation beyond the bullet.)

 Removing the ability to write multi-paragraph list entries is not
appealing to me.

Mh. Well I understand a bit better now, thanks a lot. So the conclusion is that the grammar/parser needs a change. However:

 Now that I look at this from a fresh perspective,
though, there is one other choice.  I could consider text following
a bullet that's at the same indentation level and separated only by
blank lines to be part of the bullet, as in:

 * One list entry
 * Another list entry.
.
   Here is some more text describing the list entry.

This is very ugly. Why not allow the grammar as it was and instead say that:

   * One list entry
     .
     Some more text

only this form will be accepted as a blank line, and no other. That means all other cases, such as:

   * ...
   * bla
     ...

etc. will be part of a paragraph?

 I think the path of least resistance, especially for getting this into
policy, is to have a backwards-compatible syntax.  aptitude's list
handling was supposed to be backwards-compatible, but I made a mistake
in handling full stops that made it not backwards-compatible.  I'm not
aware of any other technical problems, so I'll re-enable it in the next
release (once I implement the change above).

If you adopt the behaviour I suggest above, then the *only* change with respect to older description listers would be that the old listers would render the description:

  * bla bla
    .

as is. Whereas aptitude would render it as:

  * bla bla
[newline]

and that would, besides the word wrapping and bullet highlighting features you've already implemented, be the only change of behaviour.

If we can prove that ...

And to come back to my suggested part of the fix of the problem(-range):
one way (among others) to go would be to check *all* the descriptions, make sure that they render well, and if not, "fix" them [2]. This does not seem infeasible. Once this is done, there doesn't seem to be any technical hurdle to putting the list syntax into the Debian policy?

all existing packages are rendered correctly, even with the change of syntax above, then the need for backward compatibility will remain for an empty set of packages. I can see your point about backward compatibility, but IMHO it's not really sensible to require it if no existing package will be affected any more...

Currently there are 1729 packages that contain bullets that will be interpreted as such by aptitude and thus would need to be checked.

Using a bit of scripting it should be possible to do the latter in a day or so.
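
As a rough idea of what such a script could look like, here is a minimal sketch that lists the packages whose descriptions contain candidate bullet lines. It is not the script used to get the number above. The function name `packages_with_bullets` is made up, the input is assumed to be the text of a Packages-style file with stanzas separated by blank lines, and the set of bullet characters (`*`, `+`, `-`) is an assumption about what aptitude accepts.

```python
import re

# Assumption: a bullet line has at least 2 leading spaces, then one of
# the assumed bullet characters, then a space.
BULLET_LINE = re.compile(r"^ {2,}[*+-] ")


def packages_with_bullets(packages_text: str) -> list[str]:
    """Return names of packages whose description has a bullet line."""
    hits = []
    for stanza in packages_text.strip().split("\n\n"):
        name = None
        in_description = False
        found = False
        for line in stanza.splitlines():
            if line.startswith("Package:"):
                name = line.split(":", 1)[1].strip()
                in_description = False
            elif line.startswith("Description:"):
                in_description = True
            elif not line.startswith(" "):
                in_description = False  # some other field begins
            elif in_description and BULLET_LINE.match(line):
                found = True
        if name and found:
            hits.append(name)
    return hits
```

The resulting list would still have to be checked by eye, or rendered with both the old and the new grammar and compared, to see which descriptions actually change.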
*t

--
--------------------------------------------------------
  Tomas Pospisek
  http://sourcepole.com -  Linux & Open Source Solutions
--------------------------------------------------------

