On Sun, 24 Sep 2006, Daniel Burrows wrote:
On Sat, Sep 23, 2006 at 05:38:52PM +0200, Tomas Pospisek <[EMAIL PROTECTED]>
was heard to say:
On Fri, 22 Sep 2006, Daniel Burrows wrote:
[...]
There are two real problems [with parsing and rendering lists in
package descriptions - tpo]:
[...]
(2) The bulleting scheme cannot distinguish between a full stop at the
beginning of a legitimate line of text and a syntactically
significant full stop. Since these periods may occur after other
text or after two spaces, the standard description algorithm will
not strip this text -- meaning that aptitude will massively mangle
a description that displays just fine in the standard formatting.
Worse, you need to know about aptitude's formatting to get this right;
full-stops after indentation are just fine everywhere else, but not
if you've written a bulleted list. I presume that most Debian
developers aren't even aware aptitude exists, and even if they are, they
aren't going to be intimately familiar with its description parsing
algorithm. I can't really defend a feature that's likely to lead to
surprising and undesired results.
This actually occurred in the example that led to bug #373888, and
at a quick check there's one other package (flac) that has this
problem.
I wasn't able to find any satisfactory solution to this problem.
What I do not understand here is why a '.' at the beginning of
a line has any syntactic significance for aptitude.
The Debian policy [1] distinguishes the following line starts:
* 1 space + whatever -> paragraphs
* 1 space + dot -> blank line
* 1 space + dot + whatever -> reserved / to be defined
* 2 spaces -> verbatim
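The four line starts listed above can be sketched as a small classifier. This is only an illustration of the rules as quoted in this mail, not aptitude's or dpkg's actual code; the function name classify_line and the returned labels are made up for the example.

```python
def classify_line(line: str) -> str:
    """Classify one line of an extended description by its start.

    Labels are hypothetical names for the cases quoted from policy:
    "paragraph", "blank", "reserved", "verbatim", plus "error" for a
    line that does not start with a space at all.
    """
    if not line.startswith(" "):
        return "error"          # e.g. a bare dot in column 0
    if line.startswith("  "):
        return "verbatim"       # 2 or more spaces
    rest = line[1:]
    if rest == ".":
        return "blank"          # 1 space + dot
    if rest.startswith("."):
        return "reserved"       # 1 space + dot + whatever
    return "paragraph"          # 1 space + whatever


for sample in [" some text", " .", " .foo", "  pre-formatted", ".oops"]:
    print(repr(sample), "->", classify_line(sample))
```

Note that under these rules a dot after two or more spaces is simply verbatim text, which is the case the cases below walk through.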
Thus a line starting with:
* 1 dot -> is a syntax error in the description
-> thus of no business to aptitude
* 1 space + dot -> blank line -> of no business to aptitude
* 2 spaces + 1 dot -> per aptitude's list definition not part of a list
since the text of a list entry must be at least as
far indented as the first bullet and thus start
with at least 4 spaces off to th
-> thus render verbatim
* 3 spaces + 1 dot -> same as 2 spaces + 1 dot
* 4 spaces + 1 dot -> in case we're in a list entry (which is
determined by the previous line) then render the
dot as part of the list entry. Otherwise verbatim
* >4 spaces + 1 dot -> ditto, except that you'll probably want to indent it
with the additional spaces
All correct.
* 1 space + bullet + whatever -> standard rendering - no list entry
* 2 spaces + bullet + whatever -> list entry
Ah. The problem is that aptitude parses indented regions using the
standard grammar, but with N spaces stripped from each line before
applying the grammar. This allows multi-paragraph list entries, like:
* One list entry
* Another list entry.
.
Here is some more text describing the list entry.
The result of treating the last line as part of the list entry is that
it will be indented to exactly the same level as the rest of the entry,
and it will be word-wrapped instead of being treated as verbatim text.
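To make the "strip N spaces, then apply the standard grammar" idea concrete, here is a rough sketch. It is an assumption-laden illustration, not aptitude's implementation: the helper names are invented, and the value of N used in the demo (3) is picked only for the example.

```python
def classify(line: str) -> str:
    """Standard grammar, reduced to the line starts discussed in this thread."""
    if not line.startswith(" "):
        return "error"
    if line.startswith("  "):
        return "verbatim"
    rest = line[1:]
    if rest == ".":
        return "blank"
    if rest.startswith("."):
        return "reserved"
    return "paragraph"


def classify_in_list(lines, n):
    """Strip n leading spaces from each line of an indented region,
    then classify what remains with the standard grammar."""
    result = []
    for line in lines:
        if line[:n].strip():
            raise ValueError(f"line is indented by fewer than {n} spaces: {line!r}")
        result.append(classify(line[n:]))
    return result


# Hypothetical body of a list entry, indented by 4 spaces or more.
body = [
    "    .",
    "    Here is some more text describing the list entry.",
    "      $ some verbatim command",
]
print(classify_in_list(body, 3))
```

The first line shows the problem under discussion: a dot that is plain verbatim text in the standard formatting (four leading spaces) turns into a blank-line marker once the indentation has been stripped.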
Using the standard grammar also allows verbatim text in lists. This
was another reason for always stripping exactly one space after the
bullet; it allows you to write
* Homepage: http://some.url.example.com
and have it not get word-wrapped. (Previously, this would have been considered
a normal bullet point with 2 spaces of indentation beyond the bullet.)
Removing the ability to write multi-paragraph list entries is not
appealing to me.
Mh. Well I understand a bit better now, thanks a lot. So the conclusion
is that the grammar/parser needs a change. However:
Now that I look at this from a fresh perspective,
though, there is one other choice. I could consider text following
a bullet that's at the same indentation level and separated only by
blank lines to be part of the bullet, as in:
* One list entry
* Another list entry.
.
Here is some more text describing the list entry.
This is very ugly. Why not allow the grammar as it was and instead say
that:
* One list entry
.
Some more text
only this form will be accepted as a blank line and none other. That means
all other cases such as:
* ...
* bla
...
etc. will be part of a paragraph?
I think the path of least resistance, especially for getting this into
policy, is to have a backwards-compatible syntax. aptitude's list
handling was supposed to be backwards-compatible, but I made a mistake
in handling full stops that made it not backwards-compatible. I'm not
aware of any other technical problems, so I'll re-enable it in the next
release (once I implement the change above).
In case you'd like my suggested behaviour above, then the *only* change
wrt older description listers would be that the old listers would render
the description:
* bla bla
.
as is. Whereas aptitude would render it as:
* bla bla
[newline]
and that would, besides the word wrapping and bullet highlighting
feature you've already implemented, be the only change of behaviour.
If we can prove that ...
And to come back to my suggested part of the fix of the problem(-range):
one (among other) ways to go would be to check *all* the descriptions
and to make sure that they render well and if not to "fix" them [2].
This does not seem to be an infeasible thing. Once this is done, then
there doesn't seem to be a technical hurdle to put the list syntax into
the Debian policy?
all existing packages are rendered correctly, even with the change of
syntax above then the need for backward compatibility will remain for an
empty set of packages. I can see your point about backward compatibility,
but IMHO it's not really sensible to require it even if no existing
package will be affected any more...
Currently there are 1729 packages that contain bullets that will be
interpreted as such by aptitude and thus would need to be checked.
Using a bit of scripting it should be possible to do the latter in a day or
so.
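The kind of scripting meant here could look roughly like the sketch below. It is not the script that produced the 1729 figure. The bullet characters ('*', '+', '-'), the "two or more spaces" rule and the dict-of-descriptions input are all assumptions made for the example; the thread itself only shows '*' bullets.

```python
import re

# Assumption: a list entry is a line with two or more leading spaces,
# then a bullet character, then a space.
BULLET_LINE = re.compile(r"^ {2,}[*+-] ", re.MULTILINE)


def packages_with_bullets(descriptions):
    """Return the sorted names of packages whose extended description
    contains at least one line that looks like a list entry.

    `descriptions` maps a package name to its extended description text.
    """
    return sorted(
        name for name, text in descriptions.items()
        if BULLET_LINE.search(text)
    )


# Made-up sample data, only to show the call.
sample = {
    "with-list": " Features:\n  * one thing\n  * another thing",
    "one-space": " Features:\n * rendered as a normal paragraph",
    "verbatim": " Usage:\n  some-command --flag",
}
print(packages_with_bullets(sample))
```

The resulting list is only a set of candidates; each description would still have to be rendered and checked by eye.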
*t
--
--------------------------------------------------------
Tomas Pospisek
http://sourcepole.com - Linux & Open Source Solutions
--------------------------------------------------------