On Mon, Mar 31, 2014 at 07:44:07PM -0400, Peter Schaffter wrote: > Here's the bare bones version of the algorithm I was thinking of > when I proposed improving line formatting by getting groff to > shoulder the burden for some of the work we do manually. It's > written out in brute-force pseudo-code; should be pretty clear. > To get a feeling for formating alternatives I implemented a simplified version of the Knuth-Plass algorithm.
The idea is, to pack charakters as dense as possible and to minimize the maximum gap at the end of any line. In a second pass intra- and interword spacing could then be adjusted along the lines of the heuristic proposed by Peter Schaffter. To make tests realistic I implemented a primitive hyphenation algorithm that sometimes works :-) Character widths were taken from /usr/share/groff/1.21/font/devps/HR For speed up, I first use a greedy algorithm and employ its result as an upper bound for any gap in the dynamic programming solution. However, the speed-up achieved that way seems to be minimal. I tested with an English paragraph containing 775 words (before hyphenation): "In olden times, when wishing still did some good, there lived a king whose daughters were all beautiful, ...". This is a stand-alone program, just for a feasibility test. It reads a text (which is considered one paragraph), and runs the formatter. Results: -------- Lines of code 370 plus 184 for my linked list class. The core algorithm takes just 95 lines of code. Break-objects generated: 823 or 0.9 per hyphenated syllable. After formatting, these objects go into a list of garbage and can be reused for the next paragraph. Running time under Ubuntu on Intel(R) Atom(TM) CPU N455 @ 1.66GHz: Less than one millisecond when compiled with g++ -O4. I used a line length of 100 times the width of the space-character. The maximum gap at the end of any lines was: greedy: 10.6 spaces KP: 7.1 spaces I think, the experiment shows that running time is no issue at all. Kind regards, ulrich