Beartooth posted on Sun, 21 Apr 2013 13:57:10 +0000 as excerpted:

> My .pan2 is running close to 400 MB, and I'm sure most of it is an aged
> accretion of cruft; I'd like to edit it down to somewhere between a
> tenth and a quarter of that. Is there an easy way, that will do no harm?

Well, "easy" is relative... and "do no harm" is relative as well, but yes, there's a way.

FWIW, my pan text-instance directory (.pan2, except that here it's pointed elsewhere) is a gigabyte, but that's because I deliberately set no expiration on my various text groups and a multi-gig cache size, so nothing expires in those groups. I have messages in some groups going back years, some of them on servers or in groups that no longer publicly exist.

What I'd recommend doing first is using a graphical tool such as filelight or fsview (both KDE tools, but GNOME and others probably have similar ones; pysize looks like a more universal tool of this sort) and opening it on your ~/.pan2 dir. These tools show a graphical representation of files and (nested) directories by size, so it's dead easy to see which specific files are taking up the most room, and just how much room they are taking up as a percentage of the whole.

For instance, here, filelight tells and shows me that the article-cache subdir is taking up 92% of all the space used by my pan text instance data dir, 975 MB out of that gig I mentioned. The groups subdir is taking up another 6% (67 MB), leaving 2% for the small stuff. But it's the groups subdir that has the largest files, with the largest single file being groups/gmane.linux.gentoo.devel, which is taking up 27+ MB on its own, about 2% of that 1-gig total and nearly half of the groups subdir, all by itself! The four biggest files following that are 5-7 MB each, before they get too small for filelight to show them unless I dive into the groups subdir itself, making it the working dir on which percentages are based, etc.

It's thus immediately obvious that with the article-cache being 92% of the total, if I wanted to reduce the total substantially, I'd *HAVE* to shrink my article cache. But of course as I said I'm not doing that here, as I'm effectively archiving those articles in pan. The story for most people should be quite different.

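If none of those graphical tools is handy, the same size-by-percentage view can be roughed out in a few lines of Python. This is only a sketch, not anything pan ships: the function names are my own, it assumes the data dir really is ~/.pan2 (adjust the path if yours is pointed elsewhere), and it reports apparent file sizes rather than blocks actually used on disk.

```python
import os
from pathlib import Path


def tree_size(path: Path) -> int:
    """Apparent size in bytes of a file, or of all regular files under a directory."""
    if path.is_symlink():
        return 0  # don't follow or count symlinks
    if path.is_file():
        return path.stat().st_size
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = Path(root) / name
            if not fp.is_symlink():
                total += fp.stat().st_size
    return total


def size_report(base: Path):
    """Return [(name, bytes, percent_of_total)] per top-level entry, largest first."""
    sizes = [(entry.name, tree_size(entry)) for entry in base.iterdir()]
    total = sum(size for _, size in sizes) or 1  # avoid dividing by zero on an empty dir
    sizes.sort(key=lambda item: item[1], reverse=True)
    return [(name, size, 100.0 * size / total) for name, size in sizes]


if __name__ == "__main__":
    base = Path.home() / ".pan2"  # assumption: default location of pan's data dir
    if base.is_dir():
        for name, size, pct in size_report(base):
            print(f"{size / 1e6:10.1f} MB  {pct:5.1f}%  {name}")
```

To dive into a subdir the way filelight lets you, point size_report at that subdir instead (for instance the groups subdir), and the percentages are then relative to it.
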
Pan's default cache size is 10 MB. So unless you set pan's cache to something well over the default, or there's a bug and pan's not deleting files when the cache gets too big, deleting the entire 10 MB default cache won't do you much good if, as you say, your .pan2 dir is 400-ish MB.

Which is where the graphical filesize/directorysize tools help out, as it becomes immediately obvious what's taking up the space. You can then either ask about that, or simply do a backup, then delete the working copy and see if its loss fits your idea of "do no harm", or not. (If you find it harmful, you can simply restore from that backup you made before the delete, thus my specific mention of the backup.)

Alternatively, here's a functional description of the various files and subdirs and what they do, so you can figure out for yourself whether losing each one will be a big deal or not:

Subdirs:

article-cache: This is where pan stores the whole articles it has downloaded. By default, this cache is limited to 10 MB in size, so articles are stored here only temporarily. If you do primarily text groups, 10 MB might be a few days to a couple months worth of articles in cache. If you do primarily huge binaries, ISO images and the like, obviously 10 MB won't hold much at all, just the parts pan's downloading and assembling to decode and save right then. (The control for cache size is in pan prefs, near the bottom of the behavior tab, in the article-cache section. However, if your pan is old enough, you won't have it there, and will have to edit it directly in preferences.xml using a text editor.)

article-drafts: This holds draft articles you saved before sending (and with new enough pan, an autosave as well, but it gets reused with every article you compose, so...). It could be quite big if you saved a bunch of them and haven't cleaned it up recently, and thus might be a candidate for cleaning.
If there are lots of files in here, try ordering them by date or size and deleting either the oldest or largest.

downloaded-attachments: AFAIK, this dir (if you have it at all) is an old one that should be safe to delete, as pan no longer uses it by default. But to be sure, back it up before deleting, just in case.

encode-cache: This one's used by pan as temporary workspace for the (relatively) new binary-upload feature. If your pan is too old to have that, you shouldn't have this dir, either. It should be empty or nearly empty unless pan crashed in the middle of an encode step, as pan should clean it out when it's done. If it's not empty (and you're not in the middle of a binary upload session), you should be able to delete the files here without damage.

groups: This subdir IS IMPORTANT, as files within it contain pan's header cache, one file per group. These files MAY get somewhat large -- as I mentioned, that's where my largest individual files are located in my pan text instance data dir. But as long as you don't do like me and set unexpiring, they shouldn't grow without limit (unless you have filesystem corruption or something).

**HOWEVER**, they **MAY** be QUITE large for the most active binary groups, particularly on servers with decent binary retention (into the months or years). I'd not be surprised to see the groups subdir files for active binary groups exceeding 100 MB in size, unless you have expiry set short enough to counteract that.

But of course you CAN delete the groups subdir files for groups you no longer visit and are no longer subscribed to, without issue, since they're just wasting space...

Also of special mention is the relatively new Sent file, corresponding to the "pseudogroup" within current pan. If you send a lot of messages, this file could get pretty big over time.

It's worth noting that you can open these files in a text editor and look around if you're curious.
They are well commented at the top with an explanation of what is there and its format. Just don't save any changes unless you know what you are doing... or are prepared to lose the header data for that group (maybe with a backup, just in case) if you screw up the edit.

ssl_certs: This subdir will likely contain very small hash-data files for each of your servers that you have configured to use SSL. These should be small indeed, only a few bytes each. (Of course it's worth noting that on many filesystems, a file takes space in "blocksize" chunks, with "blocksize" often being either 1024 or 4096 bytes (1 or 4 KB). So these very small files, six bytes each here, will normally still take 1024 or 4096 bytes of space on most filesystems including ext*. Still, it'd take either a big bug or a *LOT* of configured servers in order to make this dir big enough to get out of the noise at all.)

That's the subdirs, here's the files appearing in .pan2 itself:

Score: scorefile. If you use scores you don't want to delete this. If you only assign scores using pan's GUI and you do it a lot, this file could be pretty big, as pan's GUI isn't very efficient at storing the scores it creates. It's possible to manually edit the file to make it far more efficient, without losing any scores, but that's beyond the scope of this message. In any event, I'd suggest given past history that you leave it alone unless it's getting to be a REAL problem, because I know it's more complex than you're normally prepared to deal with.

accels.txt: The "old-style" keyboard-accels file. It's possible but difficult to hand-edit: while it's a text file, it's a machine-ordered menu dump that has little/no human logic to it. AFAIK it's still honored if pan finds it, but I believe pan prefers the pan.hotkeys file (new-style) now, so it can probably be deleted without issue if you have the new file.
(But as usual, if you've customized your hotkeys, make a backup first before trying the delete, just in case.)

downloads.stats: This should be a small file consisting of a comment line and a number, that number being the bytes downloaded since the last stats reset. The file will only exist with newer pan, since the feature that uses it is still relatively new.

group-preferences.xml: This file contains a record of most or all groups you've visited, since doing so sets some group prefs for that group. While you probably don't want to delete the file itself, as doing so would delete all your group prefs, hand editing should be possible as long as you're careful, and may be desirable, since you can remove entries for groups you no longer visit and don't care to retain the preferences for.

newsgroups.dsc: This file contains the newsgroup descriptions as downloaded from your servers whenever you refresh the group list. However, most groups don't have a good description anyway, so the descriptions list is of limited value. Once you have your set of subscribed groups and don't change them much or visit unsubscribed groups much any more, this is a good deletion candidate. It'll probably reappear when you next update your group list. But of course if you seldom do that, since you already have your list of subscribed groups and aren't generally interested in new ones anyway, the file might stay gone for quite some time.

newsgroups.xov: IMPORTANT! This file contains a record of the groups you've visited and a per-server listing of the highest article number pan knows about for each group. Thus, you don't want to disturb the entries for groups you actively visit. However, the format is simple enough, one group per line, that you can delete whole lines for groups that you're no longer interested in, if you want.

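For anyone who'd rather script that newsgroups.xov cleanup than do it in a text editor, here's a rough Python sketch of the backup-then-delete-whole-lines idea. The function name and the .bak suffix are made up for illustration. It also ASSUMES the group name is the first whitespace-separated field on each line and that lines starting with # are comments; I haven't verified either against pan's source, so look at your own file first. I'd also only run something like this with pan closed, on the assumption that pan may rewrite the file when it exits.

```python
import shutil
from pathlib import Path
from typing import Iterable


def prune_group_lines(path: Path, unwanted: Iterable[str]) -> int:
    """Back up `path`, then drop every line whose first field is an unwanted group.

    Returns the number of lines removed. ASSUMPTION: one group per line, with the
    group name as the first whitespace-separated field.
    """
    drop = {name.encode("utf-8") for name in unwanted}

    # Keep a copy next to the original so a bad edit can simply be restored.
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)

    kept = []
    removed = 0
    # Work in bytes so the file's encoding and line endings pass through untouched.
    for line in path.read_bytes().splitlines(keepends=True):
        fields = line.split()
        if fields and not line.startswith(b"#") and fields[0] in drop:
            removed += 1
        else:
            kept.append(line)

    path.write_bytes(b"".join(kept))
    return removed


if __name__ == "__main__":
    xov = Path.home() / ".pan2" / "newsgroups.xov"  # assumption: default data dir
    if xov.is_file():
        # "alt.example.abandoned" is a placeholder; list your own stale groups here.
        print(prune_group_lines(xov, ["alt.example.abandoned"]), "line(s) removed")
```

If pan misbehaves afterward, copying newsgroups.xov.bak back over newsgroups.xov restores the previous state.
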
newsgroups.ynm: Semi-important: This file tracks per-group posting permissions: posting allowed (default/y), not allowed/read-only (n), or a moderated group (m). I believe pan rebuilds this file when you update the group list, so it's not irreplaceable, but you don't want to go randomly deleting it either, as pan could then get quite mixed up if you try to post to a moderated or read-only group, until you do update the group list again.

newsrc*: IMPORTANT! There should be one of these files per server. They track read messages. If a newsrc file for a server goes missing, pan will lose this information and will show all messages on that server as unread once again (tho it's actually a bit more complex than that, since the read status from multiple servers carrying the same groups interact). It's possible to manually edit the newsrc files without /too/ much trouble if you're careful. Unsubscribed groups will have an exclamation point (!) appended, while subscribed groups will have a colon (:) appended. If you've visited the group, there will be a space, then the article numbers for that group and server that you have marked as read. Some people may be interested in removing the tracking for groups they no longer visit, by removing the space and number sequence.

pan.hotkeys: This is the new-style keyboard-accels file. It's easier to hand-edit if desired, as there are comments and it's actually logically ordered, but changing the assignment in pan prefs is preferred. If you have custom keyboard-accels configured you'll want to keep this file, but you might consider removing accels.txt, above, if you have both. This new-style version is relatively recent, however, so older pan installations may not have this file, only the older one.

posting.xml: IMPORTANT! This file contains your posting profiles. Obviously you don't want to remove it unless you don't care about them, but it's reasonably easy to hand-edit, if you're careful not to break the XML.
However, it should remain reasonably sized unless you go hog wild with hundreds/thousands of profiles.

preferences.xml: IMPORTANT! This file contains pan's general preferences, including the cache size preference mentioned above. It's reasonably easy to edit as long as you're careful not to break the XML. This file should remain pretty close to the same size (near 9 KB) always, tho individual changes will change it by a few bytes.

servers.xml: IMPORTANT! This file contains your server configuration. Again, it's reasonably easy to edit as long as you don't break the XML, and indeed, hand-editing this file is the only way to get some settings. (It's possible to set an arbitrary server rank here, for instance, while pan's GUI is limited to primary and backup. Similarly, per-server expiry can be set to an arbitrary number of days, instead of the far more limited options the GUI gives you. Finally, it's possible to set an arbitrary number of connections that pan will try to use if the server allows it, while due to GNKSA, the GUI limits the maximum number of connections to 4. That can be useful for paid accounts that allow 20, 30, 50... connections, altho once you get into the double-digits, unless you're lucky enough to have a gigabit link to the internet, it becomes increasingly likely that more connections simply increase overhead and thus slow you down, instead of increasing download speed.) Again, this file should remain reasonably small, unless you go hog wild configuring hundreds/thousands of servers...

tasks.nzb: This is a standard *.nzb file, containing pan's list of uncompleted downloads. (It only stores Message-IDs, not group refresh task data, which I believe is lost when pan exits.) The file will thus be larger when you have a long list of downloads queued up, but should shrink to pretty small (just the standard nzb xml schema info, basically, 194 bytes, here) when there aren't any articles queued for download.

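Since several of the files above come with the warning not to break the XML, a quick well-formedness check after a hand edit (and before starting pan again) can save some grief. The following is a minimal Python sketch using only the standard library's parser. check_xml is my own helper name, the ~/.pan2 path is an assumption, and it only tests that the file parses as XML at all, not that pan will accept the values you put in it.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def check_xml(path: Path):
    """Return None if the file parses as well-formed XML, else the parser's error text."""
    try:
        ET.parse(path)
    except ET.ParseError as err:
        return str(err)  # includes the line and column where parsing failed
    return None


if __name__ == "__main__":
    base = Path.home() / ".pan2"  # assumption: default location of pan's data dir
    names = ("posting.xml", "preferences.xml", "servers.xml",
             "group-preferences.xml", "tasks.nzb")
    for name in names:
        target = base / name
        if target.is_file():
            problem = check_xml(target)
            status = "OK" if problem is None else "BROKEN: " + problem
            print(f"{name}: {status}")
```

If a file comes back BROKEN, the reported line and column are usually a good hint as to which tag got mangled, and restoring from the pre-edit backup is the fallback.
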
I'd certainly investigate any files or subdirs other than those listed above, since they're likely to be from something other than pan...

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman