On Sun 23 Feb 2025 at 22:13:55 (+0700), Max Nikulin wrote:
> On 22/02/2025 05:02, David Wright wrote:
> >
> > With mupdf, I don't even
> > know how to copy, as the mouse just drags the page around.
>
> I have not tried it, but...
> https://manpages.debian.org/bookworm/mupdf/mupdf.1.en.html#Right~
On 22/02/2025 05:02, David Wright wrote:
On Fri 21 Feb 2025 at 09:53:46 (+0700), Max Nikulin wrote:
P.S. "pdftotext -layout" in some cases is better than without
"-layout".
I think the results are roughly comparable with my scrapings,
for this document at least. Perhaps both pdftotext and xpd
On 2025-02-23, Max Nikulin wrote:
>
> I am sure there should be ready to use tools that extract tables from
> PDF and from aligned text. Out of curiosity I tried to create a small
> python script to process text you attached earlier. It does not try to
For previously created python wheels ther
On 22/02/2025 05:02, David Wright wrote:
With mupdf, I don't even
know how to copy, as the mouse just drags the page around.
I have not tried it, but...
https://manpages.debian.org/bookworm/mupdf/mupdf.1.en.html#Right~2
On Fri 21 Feb 2025 at 09:53:46 (+0700), Max Nikulin wrote:
When text fi
On Fri 21 Feb 2025 at 17:13:17 (-0500), Cindy Sue Causey wrote:
> On Fri, 2025-02-21 at 21:20 +, debian-u...@howorth.org.uk wrote:
> > For me, FF opens a normal web page and tries to download a PDF file as
> > well. Cheeky thing! For both the 2006 and 2021 pages. I can't be
> > bothered trying
fxkl4...@protonmail.com wrote:
> in discussions about pdf utilities i don't recall atril being mentioned
> it's become my goto viewer
perhaps because it is normally a part of the MATE
desktop?
i've been using it for years and so far no major issues
that i've noticed, but i'm also not doing
On 2025-02-21, David Wright wrote:
>> >
>> > I get:
>> >
>> > Access Denied
>> > You don't have permission to access
>> > "http://www.fns.usda.gov/cnpp/thrifty-food-plan-2006" on this server.
>> > Reference #18.dd831002.1740148075.35e89c97
>> >
>> > https://errors.edgesuite.net/18.dd831002.174
On Fri, Feb 21, 2025 at 03:59:55PM -0600, David Wright wrote:
> On Fri 21 Feb 2025 at 21:20:45 (+), debian-u...@howorth.org.uk wrote:
[...]
> > > I get:
> > >
> > > Access Denied
> > > You don't have permission to access
> > > "http://www.fns.usda.gov/cnpp/thrifty-food-plan-2006" on this se
in discussions about pdf utilities i don't recall atril being mentioned
it's become my goto viewer
On Fri, 2025-02-21 at 21:20 +, debian-u...@howorth.org.uk wrote:
> Greg wrote:
> > On 2025-02-21, David Wright wrote:
> > >
> > > > > > [1] https://www.fns.usda.gov/cnpp/thrifty-food-plan-2006
> > > > > > Table ES-1. Thrifty Food Plan market baskets,
> > > > > > quantities
> > > > > >
On Fri 21 Feb 2025 at 21:20:45 (+), debian-u...@howorth.org.uk wrote:
> On Fri 21 Feb 2025 at 14:30:08 (-), Greg wrote:
> > On 2025-02-21, David Wright wrote:
> > >
> > >> > > [1] https://www.fns.usda.gov/cnpp/thrifty-food-plan-2006
> > >> > > Table ES-1. Thrifty Food Plan market ba
On Fri 21 Feb 2025 at 09:53:46 (+0700), Max Nikulin wrote:
> On 21/02/2025 08:00, David Wright wrote:
> > I dragged the mouse
> > across the Males table and dumped it in a file.
>
> David, I recall you mentioned xpdf in your messages. It allows to
> select rectangular regions. Sometimes it is conv
Greg wrote:
> On 2025-02-21, David Wright wrote:
> >
> >> > > [1] https://www.fns.usda.gov/cnpp/thrifty-food-plan-2006
> >> > > Table ES-1. Thrifty Food Plan market baskets, quantities
> >> > > of food purchased for a week, by age-gender group, 2006
> >
> > I don't read PDFs /in/ the br
On 2025-02-21, David Wright wrote:
>
>> > > [1] https://www.fns.usda.gov/cnpp/thrifty-food-plan-2006
>> > > Table ES-1. Thrifty Food Plan market baskets, quantities of food
> >> > > purchased for a week, by age-gender group, 2006
>
> I don't read PDFs /in/ the browser: it downloads it i
On 21/02/2025 08:00, David Wright wrote:
I dragged the mouse
across the Males table and dumped it in a file.
David, I recall you mentioned xpdf in your messages. It allows to select
rectangular regions. Sometimes it is convenient since this strategy does
not depend on order of objects inside
On Thu 20 Feb 2025 at 13:52:06 (-0600), Richard Owlett wrote:
> On 2/20/25 11:20 AM, debian-u...@howorth.org.uk wrote:
> > Richard Owlett wrote:
> > > I wish to extract CSV formatted data from a PDF document. [1]
> > > Page ES-7 has a weekly grocery list for males grouped by age.
> > > I need only
On 2/20/25 11:20 AM, debian-u...@howorth.org.uk wrote:
Richard Owlett wrote:
I wish to extract CSV formatted data from a PDF document. [1]
Page ES-7 has a weekly grocery list for males grouped by age.
I need only the first and last columns.
Can someone point me in a suitable direction?
TIA
[
Am Donnerstag, 20. Februar 2025, 15:08:27 CET schrieb Richard Owlett:
> I wish to extract CSV formatted data from a PDF document. [1]
> Page ES-7 has a weekly grocery list for males grouped by age.
> I need only the first and last columns.
>
> Can someone point me in a suitable direction?
>
> TIA
Richard Owlett wrote:
> I wish to extract CSV formatted data from a PDF document. [1]
> Page ES-7 has a weekly grocery list for males grouped by age.
> I need only the first and last columns.
>
> Can someone point me in a suitable direction?
>
> TIA
>
> [1] https://www.fns.usda.gov/cnpp/thrifty
Try pdftotext.
--
John Hasler
j...@sugarbit.com
Elmwood, WI USA
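
As a rough illustration of where the pdftotext suggestion could lead: the
sketch below assumes the table page has already been dumped to text with
"pdftotext -layout", and that columns in that dump are separated by runs of
two or more spaces. The function names and the sample rows are made up for
illustration; they are not taken from the USDA document or from any script
posted in this thread.

```python
import csv
import io
import re

# Assumption: in "pdftotext -layout" style output, columns are separated by
# two or more spaces, while single spaces occur only inside a cell.
COLUMN_GAP = re.compile(r"\s{2,}")


def first_and_last_columns(layout_text, min_columns=3):
    """Return (first, last) cell pairs for lines that look like table rows."""
    rows = []
    for line in layout_text.splitlines():
        cells = COLUMN_GAP.split(line.strip())
        # Skip blank lines, captions and footers that have too few cells.
        if len(cells) < min_columns:
            continue
        rows.append((cells[0], cells[-1]))
    return rows


def to_csv(rows):
    """Serialise the row pairs as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample, NOT real data from the report.
SAMPLE = """\
Food category        12-13 years    14-18 years    71+ years
Whole grains             1.23           2.34          3.45
Dark-green vegetables    0.50           0.60          0.70

Page footer
"""

if __name__ == "__main__":
    print(to_csv(first_and_last_columns(SAMPLE)), end="")
```

Splitting on runs of spaces is fragile: an empty cell or a wrapped row label
will shift the columns, so the result probably needs checking by eye against
the PDF.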
On 2025-02-20 14:08, Richard Owlett wrote:
I wish to extract CSV formatted data from a PDF document. [1]
Page ES-7 has a weekly grocery list for males grouped by age.
I need only the first and last columns.
Can someone point me in a suitable direction?
TIA
[1] https://www.fns.usda.gov/cnpp/thr