Re: [Pan-users] pan cmd line things

Duncan Tue, 03 Sep 2013 05:20:49 -0700

markballard posted on Mon, 02 Sep 2013 15:47:14 -0400 as excerpted:

> I'm using linux (amd64/gentoo) and pan r0.139.


Greetings, fellow gentooer! =:^)

(~amd64 here, with layman and several overlays, and running a number of 
masked for testing prereleases and live-9999 ebuilds, including kde 4.11 
live-branch 4.11.49.9999 from the kde overlay, and pan-9999, my own ebuild 
based upon the (masked-as-live) pan-9999 ebuild available in the gentoo 
tree.)

> a couple things about cmd line/"--no-gui --verbose" for nzb files:
> 
> a)  I have primary and one fallback set in servers.xml but if I put an
> invalid port in for primary before starting pan (to test if cmd line
> mode handles multiple servers like gui apparently does) pan seems to
> ignore there's a backup and just sits there not d/l anything.

The standard way to temporarily disable a server is to set its 
connections to 0.  That disables it, without removing it entirely.

In theory, what pan does (at least in GUI mode, barring bugs) with an 
invalid port but non-zero connections set, is attempt connection to that 
server, but those tasks will remain in the task list, as it can't contact 
the server to execute them.  It SHOULD fall back to the other server, but 
the attempted connection to the first server will remain in the task list 
until pan is restarted, or until the task is manually removed.

Of course in no-gui mode, there's no GUI task list editing...

Meanwhile, there aren't a lot of users actually using no-gui mode, and 
last I knew, some bits of it were actually broken and their would-be 
entries in the pan --help output were commented out.  I remember reading 
the git-whatchanged entry for that commit when it happened.  But that was 
basically simply commenting out the --help entries for functionality that 
had been broken for a while, and I don't think that has made it to an 
actual release yet, so I believe the broken no-gui functionality still 
appears in the help output for the latest release, which I believe still 
is the 0.139 version you say you're using.

If you happen to be a coder[1], I'm sure patches would be welcomed to fix 
the broken functionality, and even if not, having someone that's actually 
testing no-gui mode is definitely welcomed, since apparently that 
functionality had been broken for some time and nobody had actually 
caught and reported it.

But I'm not sure whether the no-gui functionality you were using was 
actually part of the known-broken bit that has been commented out, or 
not.  You'd have to unmask and try the -9999 version to be sure, and bug 
report based on it, since I know SOME of the no-gui breakage is already 
known and the options thus removed from the --help output, but I haven't 
followed it closely enough, not using that functionality myself, to be 
sure whether your bits are there, or not.

> b)  ctl-q shows in gui as how to close pan but I can't figure out how to
> properly close pan when using --no-gui ("q" nor ctl-q stop it, I have to
> ctl-c).

Yes.  ctrl-q is a very common X-based-app "quit" keyboard shortcut -- I 
believe the standard one for gtk/gnome apps.

But being primarily an X-app standard keyboard shortcut, it's not for non-
gui mode, which is actually designed to be run "headless", as part of a 
script invoked via cron job or perhaps from a context menu entry as an 
action associated with say the *.nzb filetype.

For that, no "quit" keyboard shortcut should be needed.  If you're 
testing the command at the commandline prior to scripting it and it 
hangs, that's what ctrl-c is for.  Otherwise, you'd use the standard kill/
killall/pkill/etc commands to send an appropriate signal, just as you 
would with any other CLI command that goes haywire.
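
For anyone wrapping a headless run in a script, here's a minimal sketch of 
that "send an appropriate signal" step: ask politely with SIGTERM (what 
plain kill sends), and only fall back to SIGKILL if the process ignores 
it.  The function name stop_gracefully and the timeout value are made up 
for illustration, and the sketch assumes a POSIX system; nothing in it is 
specific to pan.

```python
import signal
import subprocess
import sys


def stop_gracefully(proc, timeout=10.0):
    """Ask a child process to exit with SIGTERM, falling back to SIGKILL.

    Returns the child's exit status (on POSIX, a negative signal number
    when the child was terminated by a signal).
    """
    if proc.poll() is not None:
        return proc.returncode  # already exited, nothing to do
    proc.send_signal(signal.SIGTERM)  # polite request, like plain `kill`
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL, like `kill -9`; cannot be caught or ignored
        return proc.wait()
```

A script would start its command with subprocess.Popen([...]) and call 
stop_gracefully(proc) when it needs to abort the run.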

> c)  I'm not finding a way to have pan not redownload part of a rar set
> that already exists.  for example I start an nzb d/l, have to stop pan,
> it has its tasks.nzb but it lists everything from the original nzb.  so
> to avoid "_copy" when I restart I have to edit those articles out of
> tasks.nzb (I could open queue and remove the existing ones there but in
> --no-gui that's not available; plus imm it's kind of clunky anyway
> having to manually tell pan in effect something was already d/l so don't
> d/l again).

Well, that's actually due to two factors.  The first is as already (sort-
of) stated, that pan's no-gui mode is designed to be invoked on-demand 
for a particular task, which it completes and then exits.  The assumption 
is that no-gui mode won't be terminated in the middle of a task and then 
restarted... people would use the GUI for that sort of thing.

The second has to do with how pan works with its message cache.  By 
default, the cache is only 10 MB, which is clearly smaller than many 
binaries these days... or even over a decade ago.

Story time! =:^)

I remember shortly after the turn of the century when I first started 
using pan after switching from MS and Outlook Express (I switched to 
Linux when MS did eXPrivacy, as that crossed a line I simply wasn't going 
to cross), and an ad-ware binary downloader of some sort I've long 
forgotten the name of now, how frustrated I was at how pan worked, 
because I expected to be able to download binaries to cache and then go 
thru them and save them off to permanent storage after they were all in 
local cache already.

As I had been accustomed to doing on the MS side, I set up pan to cache a 
bunch of binaries then went off to do something else (sleep or work, 
generally), expecting to come back to find them all downloaded and ready 
for me to go thru and save off locally.  But when I came back only about 
10% of the messages I'd told it to download were shown as cached!

So after some grumbling, particularly about the stuff that was no longer 
available as it had expired in the meantime, I set it up to do it again, 
and WATCHED it this time.  Then I watched in amazement as pan proceeded 
to delete the messages it had JUST downloaded, without my even getting a 
chance to read them!

Of course what was actually happening was that pan, still set to a 10 MB 
cache, was deleting the oldest messages in order to make way for the 
new ones, all the while keeping the cache under 10 MB.
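
To make that behavior concrete, here's a minimal sketch of oldest-first 
eviction under a size cap.  It only illustrates the idea and is not pan's 
actual code: the function name evict_oldest, the one-file-per-message 
layout, and the use of file modification time as "age" are all 
assumptions.

```python
import os


def evict_oldest(cache_dir, max_bytes):
    """Delete the oldest files in cache_dir until the total size fits.

    Returns the names of the removed files, oldest first.
    """
    entries = []
    for entry in os.scandir(cache_dir):
        if entry.is_file():
            info = entry.stat()
            entries.append((info.st_mtime, entry.name, info.st_size))
    entries.sort()  # oldest modification time first
    total = sum(size for _, _, size in entries)
    removed = []
    for _, name, size in entries:
        if total <= max_bytes:
            break  # cache is back under the cap
        os.remove(os.path.join(cache_dir, name))
        total -= size
        removed.append(name)
    return removed
```

With a cap smaller than a single multi-part binary, a loop like this 
throws away the early parts while the later ones are still arriving, 
which is exactly what I watched happen.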

Needless to say, when I figured that out, I quickly upped the cache to 
the then maximum allowed 1 GB, and my first pan bug report (and possibly 
my first Linux bug report actually filed) was a request to up the cap to 
the 4 GB or so that was at that time the size of the dedicated partition 
I was using for pan cache.

I remember Charles Kerr (pan lead dev back then) deciding to up the cap 
to the then huge (considering the size of disks back then) size of 20 
gig, tho he kept the default 10 MB size.  But I was happy, as with the 
appropriate cache size set, I could then use the full partition I had 
dedicated for pan cache, and messages no longer disappeared from cache 
before I even had a chance to read them. =:^)  I don't believe there 
actually is a maximum cache size cap at all, today.  If you have the 
space, I /think/ a setting of 100 Tebibytes would work in terms of cache 
size, tho that's untested and my cache remains only in the tens of gigs 
today.

(Additionally, it's worth noting that pan currently stores its entire 
threading tree in memory, reconstructing that tree every time it starts.  
So the practical cap is likely to be either the read-in time (minutes for 
a several gig cache on spinning rust, here, with a cold disk cache, tho 
switching to SSD lowered that to less than a minute again) or the size of 
the tree in memory.  The latter is a bottleneck people running 32-bit pan 
run into frequently these days, since AFAIK 32-bit apps are limited to 2 
GiB memory usage with the default kernel settings, and while that can be 
raised to 4 GiB with the appropriate kernel config, even that's too small 
on some groups.  Either limit is likely to be reached WELL before a 100 
TiB pan message cache would fill up!)

As with your use of no-gui mode today, the problem back then was one of 
assumptions.  As a user, my assumption was that what I downloaded to 
cache would stay there until I deleted it.  But pan had been created with 
rather different assumptions about cache, assumptions that are still in 
the defaults today, altho thankfully, those defaults can be changed to 
better fit the way *THIS* particular user works. =:^)  

Pan's assumption (or more correctly, that of its devs) back then (and 
still by default) was that users would save files off directly instead of 
saving them to cache first, then browsing the local cache from within 
pan, as I was used to doing on the MS side.

But the good thing was that with a little tweak to the cache size and 
ultimately to the max cache size, my assumptions could be met as well.

Anyway, how that relates to your issue is...

If messages happen to still be cached, pan won't redownload them, since a 
cached file for that message-id already exists (pan uses the message-id 
as the filename, which works since they're supposed to be globally 
unique, making it simple to see whether a particular message is already 
cached or not).

Which means if you set your cache size high enough, having a task attempt 
to run over again, with at least part of the component messages already 
cached, will at least not actually redownload those messages.
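
As a rough sketch of that check: map each message-id to a file name and 
skip anything that already exists on disk.  The names cache_path and 
needs_download, the percent-encoding, and the ".msg" suffix are 
assumptions made for the example, not necessarily how pan names its 
cache files.

```python
import os
from urllib.parse import quote


def cache_path(cache_dir, message_id):
    """Map a message-id to a file path inside cache_dir.

    Percent-encoding keeps characters such as '/' out of the file name.
    The encoding and suffix are assumptions for this sketch.
    """
    return os.path.join(cache_dir, quote(message_id, safe="@") + ".msg")


def needs_download(cache_dir, message_ids):
    """Return the message-ids that have no cached file yet, in order."""
    return [mid for mid in message_ids
            if not os.path.exists(cache_path(cache_dir, mid))]
```

Because message-ids are supposed to be globally unique, the existence 
test alone is enough; no index or database lookup is needed.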

But you'd still have the _copyN problem, since that's separate from 
whether the messages are actually cached or not.

Again, it's a matter of differing assumptions.  You're obviously making 
different assumptions about how no-gui mode is supposed to work, than the 
pan devs did.

But as with my case all those years ago, the tweaks necessary to support 
the differing assumptions should be pretty small.

The bottleneck there is actual developer time to implement the necessary 
patches.  Heinrich has been the developer that has done the most work 
with pan recently, including the commit that commented out the broken 
bits of no-gui mode from the help output.  But he hasn't had near the 
time recently to work on pan that he had for the six months to a year 
that he was so active, so not that much is happening ATM.

If you happen to be a coder, and DO have some time, that would be 
WONDERFUL, since then we'd actually have a coder working on pan that 
actually uses no-gui mode enough to give it the testing and support it 
has been lacking all these years, and pan could become a major gui-less 
news client as well, as befits a news client which has always had the 
goal of being the Pimp-Ass-Newsclient (thus the name PAN, which until the 
rewrite some years ago, was most properly always in ALL CAPS as it was 
and remains an acronym, tho of course the pimp-ass part is not so 
acceptable today, so "pan" it has become, with pretty much only the folks 
who have been around for a while or who have read my remarks about it 
here, knowing to what "pan" actually refers).

> or maybe the trouble is I can't find a way to stop pan gracefully in cmd
> line mode before entire nzb d/l has finished?  ie if I could it would
> remove from tasks.nzb/queue what's already been d/l?

Assumptions... and lack of people actually using no-gui mode, again...

> otoh wrt nzb imm it would still be handy if pan removed anything in the
> queue/tasks,nzb when it sees the filename is already there.

As mentioned, at least it does that with cache, not re-downloading 
messages already cached.  But you won't see it in action much with the 
default 10 MB cache size. =:^(

Also, I'd guess it'd need to be made a commandline option.  Because in 
some cases filenames are very common (for instance, grabNNNN.jpg as a 
default name for stills grabbed from a video), and people might actually 
want the _copyN files where they're *NOT* duplicates.

Alternatively, make it download and save to a _copyN file by default, 
then do a size and perhaps checksum comparison, and delete the _copyN 
file if they're identical.  That's safe enough to be default behavior 
(particularly if checksum-checked), since I can't imagine people actually 
wanting the _copyN files if they really are dups -- in that case if they 
wanted a copy they'd do a local copy.
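
A minimal sketch of that size-then-checksum comparison, for anyone who 
wants to clean up after the fact or turn the idea into a patch.  The 
function names and the choice of SHA-256 are mine, purely for 
illustration; pan itself has no such feature as far as I know.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def drop_if_duplicate(original, copy):
    """Delete `copy` if it is byte-identical to `original`.

    The cheap size comparison runs first; checksums are computed only
    when the sizes match.  Returns True if the copy was removed.
    """
    if os.path.getsize(original) != os.path.getsize(copy):
        return False
    if file_digest(original) != file_digest(copy):
        return False
    os.remove(copy)
    return True
```

Checking the size first means the expensive read of both files only 
happens when they could actually be identical.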

Then the commandline option could be used to turn on the rather less safe 
always overwrite mode, for people who were more concerned about disk 
space usage and a whole slew of _copyN files even if NOT identical, than 
they were about potentially losing an earlier non-duplicate download 
simply because it had the same name.

> d)  a minor thing is there's no "time/files remaining" kind of
> notification using "--no-gui --verbose".

As I said, no-gui mode is designed to be scripted and run headless, or 
for trouble-shooting, possibly from the CLI shell, and thus wouldn't 
normally need progress output.  However, were progress output available, 
--verbose would indeed be the logical method of enabling it.

Again, no-gui mode has really been quite neglected as few use it.  So 
someone actually using it to report bugs is definitely welcomed, but 
unless you're a coder that can provide patches as well, it might take 
a while to get anything more than punt-and-comment-out-the-corresponding-
help-output fixes.

But if you happen to be a coder that can actually provide the patches 
too, well then, your fixes should be even *MORE* welcomed! =:^)

In this specific case, I definitely see no reason why a reasonable no-gui 
mode progress output patch, with that output activated by the --verbose 
switch, wouldn't be welcomed.  (Of course the usual caveats about coding 
style etc. apply, but I guess coders should take that as a given.)
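
Just to show the sort of thing such a patch might print, here's a small 
sketch of a "files/time remaining" line.  The function name, its 
arguments, and the output format are all hypothetical; pan prints nothing 
like this today, and a real patch would be in pan's own C++ rather than 
Python.

```python
def progress_line(done_files, total_files, done_bytes, total_bytes, elapsed_s):
    """Format one progress line with files remaining and a rough ETA."""
    remaining_files = total_files - done_files
    if done_bytes > 0 and elapsed_s > 0:
        rate = done_bytes / elapsed_s  # average bytes per second so far
        eta_s = int((total_bytes - done_bytes) / rate)
        eta = "%d:%02d:%02d" % (eta_s // 3600, eta_s % 3600 // 60, eta_s % 60)
    else:
        eta = "unknown"  # nothing downloaded yet, so no rate to project
    percent = 100.0 * done_bytes / total_bytes if total_bytes else 100.0
    return "%d files remaining, %.1f%% done, ETA %s" % (
        remaining_files, percent, eta)
```

The ETA here is simply the remaining bytes divided by the average rate 
so far, which is crude but good enough for a log line.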

---
[1] Coder:  Unfortunately I'm not a coder, tho I know enough about it to 
sort of follow some patches and to occasionally do my own, as well as to 
follow the gist of many developer discussions.  I sometimes describe that 
as a "sysadmin level" of understanding, but it's basically the same level 
of understanding a typical gentooer might have, tho I've been on gentoo 
for a decade early next year as well, and I know my own understanding has 
continued to grow in that time, so I suppose I've a bit more experience 
and depth of understanding there too than the typical gentooer.

Meanwhile, my claim to any sort of authority here is simply due to the 
fact that I've been active on-list for over a decade now (and I believe 
it's fair to say, the MOST active in reasonable replies, for much of that 
time), and as such I function as sort of an institutional memory for both 
users and real developers, as I generally reply, as here, not only with 
direct answers, but with more detailed explanations of why it works that 
way, as well.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


_______________________________________________
Pan-users mailing list
Pan-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/pan-users
