K Shen posted on Fri, 19 Oct 2012 07:08:57 +0100 as excerpted:

> Hi,

Hello. =:^)

Before we get into the message, let me remind you to please turn off the HTML. Being a pan user you probably already know how annoying it can be, seeing that in pan... which many here use for this list, via gmane.org's list2news service.

> I am using pan newsreader for Windows to read news for several years
> now, but in the past month or so, I have started to see regular crashes
> of pan when reading a newgroup with a large number of articles.[...]
> without such problems previously [...] traffic [may] have increased
> [...]
>
> fault module name is libcairo-2.dll. After a few crashes, I have
> noticed that the crash happens when the memory used by pan.exe is
> around 1,800,000KB. [...]
>
> I have just had another crash, while reading in the headers. [T]he
> Commit memory for pan.exe was 1,896,692KB.
>
> I have been using a 32 bit x86 Windows XP laptop with 2G of real memory
> up to 3-4 months ago, which was replaced by a 64 bit x86-64 Windows 7
> laptop with 4G of real memory. This was about 1-2 months before I
> noticed the crash problem, and I don't know if this new configuration is
> important for the crashes (I have not seen the crashes on the old
> laptop).
>
> Does anyone know if the crash is caused by the amount of memory
> used/number of headers? Is there any known reason why the crash seem to
> happen when the memory used by the process is around 1.8-1.9G?

Short version: You're very likely running into the infamous 32-bit memory limits that are the reason the computing world is moving to 64-bit.

Rather longer version: In general, the single-byte-addressable flat-address-space limit of a 32-bit system is 4 GB.
However, this is the total of the "virtual" address space, which must be split between several different uses, primarily between user-space and kernel-space, with the most common split on MS systems being 2:2 user/kernel, two gigs each, user low, kernel high.[1]

AFAIK, many MS 32-bit consumer/home/pro kernels have a hard 2G/2G split (tho the server editions generally use PAE[2] mode as Linux does, and thus have a far higher physical-memory limit, even for 32-bit). Linux 32-bit x86 kernels default to a 3G/1G user/kernel split instead, but of course it's source available and can be rebuilt using one of the other available options. Over the years these have included a 2G/2G split, a 4G/4G option that actually dedicates a separate 4 gigs to each and switches between them every time it switches user-mode/kernel-mode (lower efficiency, but if your 32-bit app needs >3 gigs...), and the 64-gig max PAE mode[2], also less efficient due to the additional layer of indirection it uses.

Switching to a 64-bit kernel does allow the /kernel/ to natively access memory above the 4 GB barrier, but if you're running 32-bit apps, they're still limited to their old sub-4-gig, and possibly sub-2-gig, size.

I don't know enough about Windows to know how it manages user-space limits. In theory, I /believe/ 32-bit apps should have access to a full 4 gigs of virtual userspace (they do on Linux when running on a 64-bit kernel), but it's very likely that there remains either an MS kernel enforced 2-gig barrier, or the default compiler options used when building an app make that assumption, maybe both.

Getting back to pan, on large groups with many millions of headers, pan does unfortunately use gigs of memory, because at present, it builds a tree of all that header and threading data in memory. This is actually rather better than it used to be... I remember when pan would run into trouble at 100k-200k headers!
One of the things done to help manage memory usage since then is that it now does string-combining for repeated strings such as author and subject. It keeps only one copy of an author name string in memory and reduces the others to references to the first, for instance, and it keeps only one copy of the subject line for multi-part posts, which it auto-combines and displays as a single entry.

For many years (since well before Charles left), there has been talk of switching to a database backend of some type, perhaps sqlite-based, to track all this data, so only a relatively small bit of it would need to be in memory at once. However, Charles left as lead dev before it was ever implemented. I suspect he wasn't familiar with coding for databases, and they're notoriously hard to get correct for the inexperienced, with crash and data-loss bugs being extremely common, so he was hesitant.

Then pan was basically abandoned code for a couple years, then adopted by someone who could maintain it but didn't have the time to really add new features, and only recently (a year or so ago) has Heinrich Mueller come along, with all the new features he has implemented at such a furious pace! And he's working on the disk-backed database backend, but as I said, databases are notoriously HARD to get right the first time, so even when he does have something out to test, it's quite likely it'll be some time before that code is actually reasonably stable.

Meanwhile, you appear to still be running a 32-bit pan on your 64-bit MS kernel. Once pan hits 1.8-1.9 gigs, various other overhead pushes it over the 2-gig barrier, into what would often be kernel space on a 32-bit system and is apparently still reserved, unavailable to your 32-bit pan, on your now 64-bit system.
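
To put rough numbers on that barrier, here's a minimal Python sketch of the arithmetic. The function and variable names are just made up for illustration, and it assumes the "KB" figures Windows reports are 1024-byte units and that the commit size is a fair stand-in for how much of the address space the process has used, which is only approximately true.

```python
KIB = 1024
GIB = 1024 ** 3

# Total flat address space reachable with 32-bit byte addresses.
ADDRESS_SPACE = 2 ** 32  # 4 GiB


def user_space_bytes(kernel_gib):
    """User-space share of a 32-bit address space for a given kernel reservation."""
    return ADDRESS_SPACE - kernel_gib * GIB


def headroom_kib(commit_kib, user_limit_bytes):
    """KiB left before a process of this commit size reaches the user-space limit."""
    return user_limit_bytes // KIB - commit_kib


# Commit size quoted at the time of the crash (assumed to be in 1024-byte KB).
reported_commit_kib = 1_896_692

for label, kernel_gib in (("2G/2G", 2), ("3G/1G", 1)):
    limit = user_space_bytes(kernel_gib)
    left = headroom_kib(reported_commit_kib, limit)
    print(f"{label}: user limit {limit // GIB} GiB, headroom {left:,} KiB")
```

With a 2G/2G split that leaves only about 200,000 KB (a bit under 200 MB) of headroom at the reported commit size, and since that space has to cover loaded DLLs, stacks and fragmentation too, a single large allocation can easily fail before the counter ever reads a full 2 gigs.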
Actually, for the biggest groups on servers with a high retention (giganews is known for this, some of the others have the problem too on their heaviest traffic groups), even a 64-bit 8-gig system can run into problems trying to get and process ALL headers. Someone calculated what it would take to handle them and posted the results at one point, and IIRC, it was something over 16 gigs, 17-ish, I think. FWIW I have 16 gig now (tho I haven't done binary groups in years), so it'd push even my system into swap some.

So you're kind of between a rock and a hard place. Until Heinrich comes out with that database backend I've seen him mention a few times, your options include switching to a 64-bit pan, continuing with the N-days header thing, or trying something else that HAS implemented a database backend. It's /possible/ there are some options you can tweak to let you access a full 4 gigs with a 32-bit pan, but that's ultimately likely to run into the same issues as well.

You REALLY need either the still-being-coded pan database backend (Heinrich would have to tell you its status, he could be barely started, or just about ready to pop the announcement, I simply don't know), or a 64-bit pan and likely 8 or 16 gigs RAM, or another news harvesting alternative that already has such a database backend. It's really that simple.

Of course since you didn't post the server and group name (not that I blame you, the group name can be... rather private info to be posting), it's also possible that it's not that big after all, and that you're running into some other problem. However, pan /is/ known to have this problem, especially on 32-bit, and that close to the 2-gig barrier on a group you did say was heavy traffic, chances are it really /is/ the memory barrier you're hitting. Unfortunately...
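
For anyone wondering what a disk-backed header store buys you, here's a minimal Python sketch of the general idea using the standard sqlite3 module. To be clear, this is NOT pan's code or Heinrich's design. The table layout and the function names (open_header_store, add_headers, page) are hypothetical, picked only to show headers living in a database while just one screenful at a time is pulled into memory.

```python
import sqlite3


def open_header_store(path=":memory:"):
    """Open a hypothetical header table; the schema is illustrative only."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS headers ("
        " article_no INTEGER PRIMARY KEY,"
        " subject TEXT NOT NULL,"
        " author TEXT NOT NULL)"
    )
    return db


def add_headers(db, rows):
    """Insert (article_no, subject, author) tuples in a single transaction."""
    with db:
        db.executemany("INSERT OR REPLACE INTO headers VALUES (?, ?, ?)", rows)


def page(db, offset, limit):
    """Fetch one screenful of headers, ordered by article number."""
    cur = db.execute(
        "SELECT article_no, subject, author FROM headers"
        " ORDER BY article_no LIMIT ? OFFSET ?",
        (limit, offset),
    )
    return cur.fetchall()


db = open_header_store()  # pass a file path instead to keep the data on disk
add_headers(db, ((n, f"subject {n}", f"author {n % 3}") for n in range(1, 1001)))
print(page(db, 10, 3))
```

The point is only that page() touches a handful of rows however many headers the table holds. Doing threading, scoring and crash-safe updates on top of that is where the real work (and the bugs) would be.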
---
[1] It's not relevant here but complicating matters further is the fact that the top of the 32-bit space, often the half-gig to gig, with high-graphics-memory machines it can near two gigs, is reserved for legacy 32-bit PCI device hardware I/O address usage, even on 64-bit machines. For machines with 3+ gigs of physical RAM, this presents a problem as the PCI hardware I/O area masks any physical memory located at these reserved addresses. The solution is to remap this otherwise hidden physical memory above the 4 GB barrier, but for a number of years many BIOSes didn't come with this option, and people with these machines who upgraded to 4 GB simply wasted between a quarter gig and a full gig of RAM, as it was hidden behind the PCI hardware I/O area and thus unusable. That's also why say an 8-gig physical-RAM machine will often count up to 9 or 10 gigs in its POST (power-on self-test): it's remapping up to two gigs up above the PCI hardware I/O memory hole.

http://en.wikipedia.org/wiki/3_GB_barrier

[2] PAE, Physical Address Extension:

http://en.wikipedia.org/wiki/Physical_Address_Extension

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

_______________________________________________
Pan-users mailing list
Pan-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/pan-users