I would say a single newline has no semantic meaning. Far too often, people
insert a newline purely for formatting reasons. Likewise, a no-break space has
no semantic meaning; it is just formatting.

It doesn't make sense to preserve single newlines, since the output lines
would have wildly different lengths anyway. To handle line-formatted
paragraphs properly, you'd have to detect that the input is line-formatted
before translating, record the length of the longest line, and then, after
translating, break the output into lines of at most that length.

Multiple newlines in a row should constitute a paragraph break.
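
To make the idea concrete, here is a rough sketch in Python of what I mean.
It is not how lt-proc or any Apertium tool works today. The `translate`
argument is a hypothetical stand-in for whatever does the translation, and
the function names are made up for illustration. It treats runs of two or
more newlines as paragraph breaks, remembers the longest input line of each
line-formatted paragraph, and re-wraps the translated text to that width.

```python
import re
import textwrap


def split_paragraphs(text):
    """Split text on runs of two or more newlines (blank lines)."""
    return [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]


def rewrap_translated(paragraph, translate):
    """Translate one paragraph; re-wrap it if the input was line-formatted.

    `translate` is a hypothetical callable taking and returning a string.
    """
    lines = paragraph.split("\n")
    if len(lines) < 2:
        # Not line-formatted: nothing to re-wrap.
        return translate(paragraph)
    # Remember the longest input line before the newlines are discarded.
    width = max(len(line) for line in lines)
    # Single newlines carry no meaning, so join the lines with spaces.
    joined = " ".join(line.strip() for line in lines)
    # Words longer than `width` are left whole rather than split.
    return textwrap.fill(
        translate(joined),
        width=width,
        break_long_words=False,
        break_on_hyphens=False,
    )


def process(text, translate):
    """Translate paragraph by paragraph, keeping only paragraph breaks."""
    return "\n\n".join(
        rewrap_translated(p, translate) for p in split_paragraphs(text)
    )
```

Calling `process(text, str.upper)` with `str.upper` as a dummy "translator"
shows the effect: single newlines are dropped before translation, and the
output gets as many lines as the translated text needs at the original
maximum width, so the line count in and out need not match.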

-- Tino Didriksen


On 8 November 2017 at 13:25, Francis Tyers <[email protected]> wrote:

> Very many times people come to me and ask why they get
> a different number of lines out of lt-proc than they
> put in.
>
> The answer is invariably that there is some multiword
> that is gobbling up a newline.
>
> My question is: Is this ever the right thing to do? I
> struggle to come up with use cases for this. I'm not
> sure how hard it would be to fix. But I thought I'd start
> a discussion.
>
> Fran
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
