On 21/02/2025 08:00, David Wright wrote:
> I dragged the mouse
> across the Males table and dumped it in a file.
David, I recall you mentioned xpdf in your messages. It allows selecting
rectangular regions. That is sometimes convenient, since this strategy
does not depend on the order of objects inside the PDF file.
Other PDF viewers allow convenient selection of contiguous spans of
text, e.g. the end of one line and the beginning of the next.
Unfortunately, plenty of PDF files have pieces of text stored in almost
random order. At least in Firefox, selection may work in a quite
peculiar way, skipping some fragments and adding visually unrelated ones.
So text selection in PDF files may depend strongly on the viewer.
P.S. In some cases "pdftotext -layout" gives better results than running
it without "-layout". When the text file has properly aligned columns,
instead of "quoting" some spaces, it may be better to insert TAB
characters at certain positions on each line. Perhaps LibreOffice Calc
even has a GUI for selecting column widths when importing text files.
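
To illustrate the TAB idea, here is a minimal Python sketch. It assumes
the columns really are aligned (as "pdftotext -layout" output often is)
and that you have worked out the character offsets where each column
starts by looking at the file. The function name columns_to_tsv, the
offsets and the sample rows are all made up for the example; they are
not taken from the actual table.

```python
def columns_to_tsv(lines, cuts):
    """Split each line at the character offsets in `cuts`, join with TABs.

    `cuts` lists the 0-based offsets where the 2nd, 3rd, ... column
    starts. These are an assumption: read them off the aligned text.
    """
    bounds = [0] + sorted(cuts)
    result = []
    for line in lines:
        line = line.rstrip("\n")
        fields = []
        for i, start in enumerate(bounds):
            # The last field runs to the end of the line.
            end = bounds[i + 1] if i + 1 < len(bounds) else None
            # Strip the padding spaces that only served for alignment.
            fields.append(line[start:end].strip())
        result.append("\t".join(fields))
    return result


# Hypothetical aligned rows, built with padding so the offsets are exact:
# column 2 starts at offset 7, column 3 at offset 12.
sample = [
    "Name".ljust(7) + "Age".rjust(3) + "  Town",
    "Ann".ljust(7) + "41".rjust(3) + "  York",
]

for row in columns_to_tsv(sample, [7, 12]):
    print(row)
```

The resulting TAB-separated text should import into a spreadsheet
without any guessing about which spaces separate fields and which
belong to a value.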