Lacrocivious Acrophosist posted on Sun, 25 Sep 2011 19:50:33 +0000 as excerpted:
> Graham Lawrence <gl00637@...> writes:
>
>> Duncan, I'm using Pan 0.133, and thank you for the very detailed
>> response. I hope you just pasted most of it and didn't have to type it
>> all, because since I ran
>>
>> strace -feopen pan 2>&1 | grep -v 'icons\|cursors' | grep /home/g > pan.debug
>>
>> I think I know what the problem is, and it has nothing to do with Pan.
>> strace was very consistent in its output. After the initial preamble
>> for each task, it generated nothing but this pattern
>>
>> [pid 5641] open("/home/g/.pan2/article-cache/part20of78.2Wn&pF6dRhmMn7DI5Klr@...", O_RDONLY) = -1 ENOENT (No such file or directory)
>>
>> and it does this for every part in every task. There is nothing in
>> /home/g/.pan2/article-cache/ whose name even begins with p.
>>
>> My neighbor recommended newsgroups to me and offered to share his
>> account with me if I would split the cost with him. The first time I
>> used it all was well, except it posted a warning to the effect that
>> such downloads could only be made to a single computer, which I
>> dismissed at the time as an aberration, I was only downloading to one
>> computer. This time around is my second use, and now the penny has
>> dropped. My neighbor's computer is the single computer referred to,
>> and it has blocked downloading to mine.
>>
>> I am very sorry to have wasted your time on this.

FWIW, that was NOT a waste of time! =:^) If you note, I didn't really
post any solutions, because I didn't know what the problem was. The
entire goal was diagnostics, and it seems we've diagnosed the problem
(tho see below), so regardless where it ended up being, the post was
anything BUT a waste of time. =:^)

Meanwhile, however, that trace confirmed a suspicion of mine that it was
related to the cache. You may be right about the root of the issue being
server access restrictions, or not.
I've seen very similar issues when pan had permissions issues, when the
cache involved a bad symlink to a directory in an unmounted
filesystem[1], etc, which is why I immediately suspected a caching issue
of some sort. A caching issue of some sort was confirmed for sure, but we
do NOT yet know for sure what's causing it.

The most recent such situation here was when I tried out the new binary
posting code in HMueller's experimental git tree. Here's how it happened.

Some time ago, the pan of the time was noted to have what was arguably a
security issue. Attachments that were posted as executable, pan was
saving as executable as well. If someone clicked them and they WERE a
virus or the like, they'd try to run. The list discussion decided there
wasn't a good reason for that, and pan has for some time now removed the
executable bit, if set, when saving a file (tho there was a regression at
some point, for a version or two).

IIRC it was me that pointed out that pan does follow the umask it
inherits from its environment, and as such, before the bug was fixed, a
user could set the umask in pan's environment to something like 0137, and
pan would behave accordingly, stripping executable bits entirely (plus
stripping the writable bit for group and not allowing access at all for
other).

The problem with that, as you will likely have guessed if you understand
UNIX file permissions, was with directories. As long as all the
directories pan needed were pre-created and permissions set
appropriately, allowing directory entry (the same bit that's the
executable bit on files), all was fine, since it didn't need to actually
create the dir. But if one of pan's dirs didn't exist, with the 0137
umask I had set, it would create the dir fine, but couldn't actually
enter it to work with the files it wanted to put there!

Well, I've been using pan for years, and this didn't bother me as long as
pan kept the same directory structure, since it was already created.
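
To make the umask point concrete, here's a rough sketch, not anything
pan itself runs. new_dir_mode is a made-up helper, the posting-cache dir
lives in a throwaway mktemp directory rather than under pan's real
config, and it assumes GNU stat as found on a typical Linux box.

```shell
#!/bin/sh
# Print the octal mode a freshly created directory gets under a given umask.
# new_dir_mode is a hypothetical helper for illustration, not part of pan.
new_dir_mode() {
    scratch=$(mktemp -d) || return 1
    (
        umask "$1"                          # only affects this subshell
        mkdir "$scratch/posting-cache"
    )
    stat -c '%a' "$scratch/posting-cache"   # GNU stat; BSD stat uses -f '%Lp'
    rmdir "$scratch/posting-cache" "$scratch"
}

new_dir_mode 0137   # prints 640: rw-r-----, no x bit, so the dir can't be entered
new_dir_mode 0027   # prints 750: rwxr-x---, the owner can enter it
```

A dir that ends up at 640 can be fixed after the fact with chmod u+x on
it. Note that root bypasses these checks, so the failure only shows up
when running as a normal user.
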
But, the new binary posting code used a new posting-cache directory as
scratch-space for encoding, etc, so guess what problem I ran into as soon
as I tried actually using the new code? Right, it created the dir, but
couldn't reach into it, and I got *VERY* puzzling errors... until I asked
on-list about it, and someone's answer wasn't right on, but was
sufficiently close to jog my memory of setting the umask. Upon
investigation, sure enough, that was my problem!

That happened only probably a couple months ago (I could check the
archive to see when I posted the thread asking about it, if I /really/
needed the date, but it's not that important), so it's relatively fresh
in my memory.

So you see, I've had a bit of experience with strace -feopen pan myself,
and I sort of recognized the symptoms of cache error, but most folks
won't run into that sort of issue as they won't have anything like as
complex a pan setup as I do, so I was somewhat doubtful of my instincts.
But sure enough, straced ENOENT errors on what should be cached files
confirm it. Now we just need to figure out for sure what the problem is.

So, before you go labeling it a server restriction and give up, do
double-check your cache dir permissions. If for whatever reason the
executable/dir-entry bit is turned off for whatever permission level pan
runs at on your system (likely, it runs as your normal user), that would
do it. Turn it back on for all pan's directories. Similarly, check
SELinux permissions and the like if you run that, and user quotas for
that partition, if you have them. Also ensure that your cache isn't a
dead symlink if you are using a symlink in that path, that the
appropriate partitions are mounted, and that they're NOT mounted
read-only or some such.

Meanwhile, there's another bit of info that may be helpful.
It's not yet in the latest release (0.135), let alone 0.133, but someone
just a week or so ago requested a -v (verbose) switch for pan, such that
when it's downloading from nzbs, it prints to STDOUT the actual files
it's downloading. HMueller has been on the ball and AFAIK already
implemented the request, but it's only in his git repo, ATM. Having that
output would be very useful indeed in this sort of case, and may well
have eliminated the need for the whole thread.

But, whether you want to hassle the whole live git repo
compile-from-source thing is another question entirely. Presumably not,
if you're still on 0.133, but the feature is actually available now,
along with binary posting, auto-actions based on score (optionally
automatically mark-read or delete low scored posts, cache or
download-and-save high-scored posts), if you're willing to jump thru the
necessary hoops.

Finally, I've no idea what sort of news account you and your neighbor
split, but it sounds like it could be a monthly-pay, unlimited per-month,
deal, which is why they are so particular about multiple access. FWIW,
unless one or both of you download *HUGE* amounts, it may be worthwhile
at least considering block accounts. With these you purchase X gigs of
data for Y money (dollars/euro/yen/whatever), and can use it until it
runs out. No monthly charges. No expiration (unless of course you lose
track of the login info or the news provider goes belly up).

One of the interesting things about these sorts of accounts, besides not
having to hassle the monthly payments if you're not using that much, is
that it's actually in the provider's interest to make it easy for you to
use it up, so you have to purchase more.
As such, they don't tend to have NEARLY the restrictions that some of the
others do, and you'd very likely be able to login from separate IPs at
the same time, as long as both had valid login info, of course, because
all they care about is the bandwidth you use, and the sooner you use it
up, the sooner you have to buy more.

There's two providers I know of that offer this. Astraweb.com is one.
Blocknews.net is the other.

Astraweb has 25 GB for (US) $10, or 180 GB for $25. Header downloads,
etc, are NOT counted toward the block.

Blocknews has blocks ranging from 5 GB for $2.75 to the
Astraweb-comparable 25 GB for $8.50 and 200 GB for $21.59, to 500 GB for
$51.49, 1024 GB for $91.39, and a massive 3072 GB (3 TiB) for $239.99!
Headers *ARE* counted but traffic is discounted 10% to allow for headers.

Thus, if you tend to grab headers for use with other providers or do a
LOT of header downloading compared to bodies, Astraweb would be better,
but if you minimize your header downloads, between that, the 10% traffic
discount, and Blocknews' lower per-gig at the higher end (<7.82 cents/gig
for the 3 TiB pkg, 10-11 cents a gig for the 200 and 500 GB pkgs, just
under 9 cents a gig for the 1 TiB), you would be better off there.

Astraweb doesn't make their server list public, but Blocknews has two
server (farms), iad (Washington DC area), US, and Amsterdam area,
Netherlands. FWIW, the Amsterdam area is home to MANY European news
providers, apparently due to a rather friendlier legal situation for news
there than most other locations, Europe or North America.

Consider that those prices are unexpiring blocks, and talk to your friend
about how much both of you download. Given the prices, unless you're
downloading > 25 gigs/mo, it's very likely to be cheaper getting the
blocks, and you'll probably more or less break-even thru a hundred gigs
or so.
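
For anyone wanting to redo the per-gig arithmetic as prices change,
here's a small sketch. cents_per_gb is a made-up helper name, the figures
are simply the ones quoted above, and Blocknews' 10% header allowance is
not factored in.

```shell
#!/bin/sh
# cents_per_gb GB DOLLARS: print the block's cost in cents per GB, one decimal.
# Hypothetical helper for illustration; LC_ALL=C keeps "." as the decimal point.
cents_per_gb() {
    LC_ALL=C awk -v gb="$1" -v usd="$2" 'BEGIN { printf "%.1f\n", usd * 100 / gb }'
}

cents_per_gb 25 10         # Astraweb   25 GB -> 40.0
cents_per_gb 180 25        # Astraweb  180 GB -> 13.9
cents_per_gb 200 21.59     # Blocknews 200 GB -> 10.8
cents_per_gb 500 51.49     # Blocknews 500 GB -> 10.3
cents_per_gb 1024 91.39    # Blocknews   1 TiB -> 8.9
cents_per_gb 3072 239.99   # Blocknews   3 TiB -> 7.8
```

Dividing a monthly fee by the gigs actually downloaded in a month gives
the comparable figure for an unlimited account.
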
If you're paying for giganews now, you'd probably be saving even more
with the block accounts, unless their prices have come down
substantially, recently, but giganews /is/ widely acknowledged as the
gold standard of news providers and thus gets away with charging more for
it. Whether it's worth it could be argued either way, but some people
definitely consider it so.

Of course, if you're downloading half a TiB or more a month, the
unlimited monthly accounts are likely well worth it. But, that's a *LOT*
of traffic for an individual, and if you're doing that, the account has
likely already been flagged for TOS-abuse-watch.

As they say, YMMV, but if I can save you a bit and decrease your chances
of being TOSed at the same time...

> Speaking only for myself, I can say that Duncan definitely did not waste
> his time. That strace tutorial is going in my 'how to do stuff I could
> never remember without a cheat-sheet' file ;-)
>
> Once again, Duncan has taught me to do things for which I previously had
> only 'nodding knowledge'. Thanks Duncan.

If I'd have known you were going to do that, I might have thrown in a
paragraph or two dealing with the other -eXXXX options.

FWIW, -efile can be useful, giving all file actions not just file-opens,
but that gets a bit more difficult to read as well, since the opens list
the file names but the other file actions generally don't; they use the
file-numbers (the result of the open, = 6 in my example) instead. But
that lets you see how long the file is open and what other files are
opened before it's closed (tho if the file number isn't increasing, that
means the same number is being reused repeatedly to open and close many
files in sequence, and that's visible from the opens only), what the app
actually reads/writes to the file and seek behavior within the file, etc.

The other system calls are memory and etc; not really as useful to
non-programmers as the file actions tend to be.
Well, the network class of calls can be interesting, but by far the most
oft used here is -efile, and within it, -eopen suffices MOST of the time,
since generally what I'm after is something to do with files, since
they're exposed enough for the information to be useful to me as a user
and admin, even if I'm not a programmer (unless you include shell
scripting in "programmer", it's more a sysadmin type skill to me, and I
believe most, tho it can often sort of do the job of a program, for one
skilled enough at it).

Meanwhile, I too generally have to work out grep's OR behavior, each
time. (I know to enclose it in '' to keep the shell from interfering, but
I always seem to forget whether I still have to back-slash-escape the |
ors as \| , or not. But at least I know enough to try it one way, and if
it doesn't work, try it the other, without having to go to the manual for
it each time, just try it both ways.) Otherwise, I'll often simply pipe a
bunch of grep -v single-term-commands together using shell pipes, since I
know how they work. And it took me awhile to remember the 2>&1 bit as
well, but I've apparently done it enough now that it's beginning to sink
in. =:^)

But if it makes a useful cheat-sheet that's far simpler than the manpages
while hitting all of the most-used bits, thus well demonstrating the
80/20 rule[2], have at it! =:^)

---
[1] I have a dedicated cache partition for my binary pan instance (I run
multiple pan instances, each with its own config, using the PAN_HOME var
to point each at its config with a wrapper script). The rather
long-winded explanation can be found in the list archives, probably
several times as I believe I've posted it more than once over the years.

[2] http://en.wikipedia.org/wiki/80/20_rule

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."
Richard Stallman