On 2/20/25 11:20 AM, debian-u...@howorth.org.uk wrote:
Richard Owlett <rowl...@access.net> wrote:
I wish to extract CSV formatted data from a PDF document. [1]
Page ES-7 has a weekly grocery list for males grouped by age.
I need only the first and last columns.

Can someone point me in a suitable direction?

TIA

[1] https://www.fns.usda.gov/cnpp/thrifty-food-plan-2006
      Table ES-1. Thrifty Food Plan market baskets, quantities of food
       purchased for a week, by age-gender group, 2006

If you look at
https://www.fns.usda.gov/cnpp/thrifty-food-plan-2021 instead, you can
find the underlying data in spreadsheet form (.xlsx). Perhaps that will
be an adequate substitute?
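
If the spreadsheet route works out, the sheet can be saved as CSV from any spreadsheet program and then trimmed to the first and last columns. What follows is only a minimal Python sketch under stated assumptions: the function name and the sample rows in the comment are hypothetical, nothing here is taken from the USDA files, and it assumes the column of interest really is the last field of every row.

```python
import csv
import io


def first_and_last_columns(csv_text):
    """Return CSV text that keeps only the first and last field of each row."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if not row:  # skip completely empty lines
            continue
        # row[0] is the first column, row[-1] the last one.
        writer.writerow([row[0], row[-1]])
    return out.getvalue()


# Hypothetical usage with made-up rows (not real Thrifty Food Plan figures):
# print(first_and_last_columns("Food,Group A,Group B\nGrains,1.0,2.0\n"))
```

Using the csv module rather than splitting on commas keeps quoted labels such as "breads, rice, pasta" in one field.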



You just demonstrated that "Murphy's Law" holds ;<

I clicked on the link you quoted in my default browser and a PDF was displayed [actually my original starting point months ago].

If I use my alternate browser {Firefox instead of SeaMonkey} I get to choose which of several files to view. {one of them is an .xlsx file}

Murphy gets a second jab in.
The 2006 version has the data I want in a slightly different layout than the 2021 version. The first is a better match for how I do things ;/

Also, the PDFs at the two links react slightly differently when selecting with mouse movements/clicks. The 2006 version seems to allow me to select only what I want. [2021 version grabs everything between first and last click. 2006 appears to select only the columns of interest]
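
When a selection grabs whole rows, the pasted text can still be cut down to the first and last columns afterwards. The sketch below is a guess at how that might look in Python, not a verified recipe for either PDF: it assumes each pasted row is a text label followed by whitespace-separated numbers, and the function names and the sample row in the comment are invented for illustration.

```python
import re

# Assumed cell format: plain decimals such as 1, 0.25 or .5
_NUMBER = re.compile(r"\d+(\.\d+)?|\.\d+")


def is_number(token):
    """True if the whole token looks like a plain decimal number."""
    return _NUMBER.fullmatch(token) is not None


def label_and_last_value(line):
    """Split one pasted table row into (label, last numeric cell).

    Returns None for lines that do not look like a data row,
    for example titles or blank lines.
    """
    tokens = line.split()
    i = len(tokens)
    # Walk backwards over the trailing run of numeric cells.
    while i > 0 and is_number(tokens[i - 1]):
        i -= 1
    if i == 0 or i == len(tokens):
        return None  # no label, or no numeric cells at all
    return " ".join(tokens[:i]), tokens[-1]


# Hypothetical usage with a made-up row:
# label_and_last_value("Whole grain breads, rice, pasta 0.12 0.33 1.50")
```

Peeling numbers off the end of the line, rather than splitting on the first space, lets labels that contain spaces or commas stay intact.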

Can't spend time right now to verify first impression. Will know more this weekend.

*THANK YOU*


