On 7/27/25 9:09 AM, Greg Wooledge wrote:
On Sun, Jul 27, 2025 at 07:33:36 -0500, Richard Owlett wrote:
Now I have to relearn how to extract specific content from spreadsheets.
Something I haven't done in close to two decades.
What I usually ended up doing was opening the spreadsheet in LibreOffice,
then saving it as a "CSV" (comma-separated values) file. I'm not aware
of any way to do that purely from the command line.
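
For what it's worth, LibreOffice does have a headless mode that can do this export without opening the GUI, along the lines of "soffice --headless --convert-to csv". Below is a minimal Python sketch wrapping that call. The function names and the file name data.ods are made up for illustration, and the binary may be called libreoffice rather than soffice on a given system.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(spreadsheet, outdir, binary="soffice"):
    """Return the argument list for a headless LibreOffice CSV export."""
    return [
        binary,
        "--headless",            # do not open a window
        "--convert-to", "csv",   # output format is CSV
        "--outdir", str(outdir), # directory the converted file is written to
        str(spreadsheet),
    ]


def convert_to_csv(spreadsheet, outdir="."):
    """Run the conversion and return the path where the CSV is expected."""
    binary = shutil.which("soffice") or shutil.which("libreoffice")
    if binary is None:
        raise FileNotFoundError("no soffice or libreoffice binary found on PATH")
    subprocess.run(build_convert_command(spreadsheet, outdir, binary), check=True)
    # Assumption: the output keeps the input's base name with a .csv suffix.
    return Path(outdir) / (Path(spreadsheet).stem + ".csv")
```

Calling convert_to_csv("data.ods") should leave data.csv in the current directory, assuming LibreOffice is installed.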
I was thinking along that line.
My first sub-task will be to delete 29 of 37 columns as irrelevant ;}
I've just noticed that LibreOffice Calc has an option to save as a dBase
file, which should solve a number of potential problems.
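
The column-pruning sub-task mentioned above can be sketched in a few lines of Python, assuming the sheet has already been exported as CSV and that its first row holds column names. The function name keep_columns and the sample column names are hypothetical.

```python
import csv
import io


def keep_columns(rows, wanted):
    """Yield each row reduced to the named columns, in the order given."""
    rows = iter(rows)
    header = next(rows)
    # Look up each wanted name once; raises ValueError if a name is missing.
    indexes = [header.index(name) for name in wanted]
    yield [header[i] for i in indexes]
    for row in rows:
        # Short rows are padded with empty strings instead of failing.
        yield [row[i] if i < len(row) else "" for i in indexes]


if __name__ == "__main__":
    sample = io.StringIO("name,qty,notes\nbolt,12,zinc\nnut,40,\n")
    for row in keep_columns(csv.reader(sample), ["name", "qty"]):
        print(row)
```

Listing the 8 columns to keep is likely less error-prone than listing the 29 to drop.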
Once you have a CSV file, there are a plethora of tools you can use
to extract pieces of it. CSV is a set of plain text formats with some
punctuation characters serving as field and record delimiters. The
exact punctuation characters in use will vary, so you will need to
examine the file manually at first, to see what you're dealing with.
There are also settings you can use within LibreOffice, or whatever
program you used to produce the CSV file, to select your preferred
delimiters.
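
As a complement to eyeballing the file, Python's csv module can make a guess at the delimiter. This is only a sketch: the function name detect_dialect and the candidate delimiter list are assumptions, and the guess can be wrong or fail on unusual files, so a manual look is still worthwhile.

```python
import csv


def detect_dialect(sample_text, candidates=",;\t|"):
    """Guess the CSV dialect from a text sample, restricted to candidates.

    Raises csv.Error if no consistent delimiter can be found.
    """
    return csv.Sniffer().sniff(sample_text, delimiters=candidates)


def read_rows(path):
    """Read a whole CSV file using the guessed dialect."""
    with open(path, newline="") as handle:
        dialect = detect_dialect(handle.read(4096))
        handle.seek(0)  # rewind so the reader sees the file from the start
        return list(csv.reader(handle, dialect))
```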
If you give us the URL of a spreadsheet (or a CSV file) and tell us
precisely what parts of it you want to extract, I'm certain someone
here will be able to cobble together a program to do it, in some
programming language, possibly even one you've already got installed.
A CSV export of the 8-column version of the spreadsheet could likely be
pretty-printed by a half-dozen BASIC DO loops.
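
The same pretty-printing idea, sketched in Python rather than BASIC: measure the widest cell in each column, then pad every cell to that width. The function name format_table and the sample data are hypothetical.

```python
def format_table(rows, gap="  "):
    """Return the rows as left-aligned, space-padded text columns."""
    rows = [[str(cell) for cell in row] for row in rows]
    if not rows:
        return ""
    ncols = max(len(row) for row in rows)
    # Pad short rows so every row has the same number of cells.
    padded = [row + [""] * (ncols - len(row)) for row in rows]
    # Width of each column is the length of its longest cell.
    widths = [max(len(row[i]) for row in padded) for i in range(ncols)]
    lines = [
        gap.join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in padded
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_table([["name", "qty"], ["bolt", "12"], ["washer", "7"]]))
```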
All of this will be *significantly* easier than unspecified PDF file
manipulations.