On 2/20/25 11:20 AM, debian-u...@howorth.org.uk wrote:
> Richard Owlett <rowl...@access.net> wrote:
>> I wish to extract CSV formatted data from a PDF document. [1]
>> Page ES-7 has a weekly grocery list for males grouped by age.
>> I need only the first and last columns.
>> Can someone point me in a suitable direction?
>> TIA
>>
>> [1] https://www.fns.usda.gov/cnpp/thrifty-food-plan-2006
>>     Table ES-1. Thrifty Food Plan market baskets, quantities of food
>>     purchased for a week, by age-gender group, 2006
>
> If you look at
> https://www.fns.usda.gov/cnpp/thrifty-food-plan-2021 instead, you can
> find the underlying data in spreadsheet form (.xlsx). Perhaps that will
> be an adequate substitute?

You just demonstrated that "Murphy's Law" holds ;<
I click on the link you quoted in my default browser and a PDF is
displayed [actually my original starting point months ago].
If I use my alternate browser {Firefox instead of SeaMonkey} I get to
choose which of several files to view. {one of them is an .xlsx file}
Murphy gets a second jab in.
The 2006 version has the data I want in a slightly different layout than
the 2021 version. The first is a better match for how I do things ;/
Also, the PDFs behind the two links react slightly differently when
selecting with mouse movements/clicks. The 2006 version seems to allow
me to select only what I want. [2021 version grabs everything between
first and last click. 2006 appears to select only the columns of interest.]
Can't spend time right now to verify first impression. Will know more
this weekend.
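
If mouse selection does end up grabbing whole rows, the "first and last
columns only" step could be done after pasting, along these lines. This is
only a rough sketch under assumptions I have not checked against the actual
PDF: that each pasted row is a text label followed by whitespace-separated
numbers, and that rows without that shape (headers, blank lines) can be
skipped. The function names and the sample rows are made up for
illustration; the numbers are placeholders, not values from the table.

```python
import csv
import io
import re

# Assumption: a "number" looks like 12, 1.20 or .52 (optionally negative).
NUMBER = re.compile(r"^-?\d*\.?\d+$")

def first_and_last(line):
    """Return (label, last_value) for one pasted row, or None to skip it."""
    tokens = line.split()
    # Index of the first numeric token; everything before it is the label.
    first_num = next((i for i, t in enumerate(tokens) if NUMBER.match(t)), None)
    if not first_num:  # None: no numbers at all; 0: no label in front
        return None
    return " ".join(tokens[:first_num]), tokens[-1]

def to_csv(text):
    """Convert pasted table text to CSV with first and last columns only."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        row = first_and_last(line)
        if row:
            writer.writerow(row)
    return out.getvalue()

if __name__ == "__main__":
    # Placeholder rows, NOT real Thrifty Food Plan figures.
    pasted = "Food category Age A Age B\nWhole grains 1.20 1.50 2.10\nMilk .52 .60 .75\n"
    print(to_csv(pasted), end="")
```

Labels that contain a comma get quoted by the csv module, so the output
should still load cleanly into a spreadsheet.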
*THANK YOU*