walt posted on Fri, 15 Mar 2013 19:35:02 -0700 as excerpted:

> Well (as usual for a dedicated nerd) I've changed so many things at the
> same time that I can't tell which change may (if there is a bug) have
> caused this behavior:

LOL. I'm fighting one of those ATM as well. (FWIW totally unrelated to pan; apparently either kde 4.10.[0,1] or the live-git kernel 3.9-pre I'm running has a resource leak of some sort and at some point will no longer start any new processes, altho current processes continue to run fine -- it's not main memory as that continues to be normal, and I'm on 64-bit, so the memory zones that could be the problem on 32-bit shouldn't be an issue. It /might/ only trigger after a suspend (to ram) and resume, but I'm not sure... and it takes a day or two to trigger, so while I haven't seen it the last couple days I'm not sure whether that's because I've been rebooting into new kernels every day to try to see if it's fixed, or just because I've not been giving it time to trigger...)

> I'm running the latest git [d7bd6aa1] as usual. I was fetching a
> gazillion headers from my primary news server when I decided to test
> pan's recent header-compression feature.

... Which has certainly had its share of "development issues", but seems to finally be working fine for me the last couple weeks...

> I opened the 'edit servers' dialog and selected the 'XZVER' option (I
> already know this particular server supports XZVER) and clicked 'okay'.
>
> Well, pan's entire gui interface locked up instantly, including the
> 'edit servers' dialog box, and the network traffic halted at the same
> instant.
>
> However, I saw that the pan process was still consuming 100% of (one)
> CPU even while the pan gui remained completely unresponsive.
>
> Well, I thought, strace should give me some info, so I sicked strace on
> the pan process -- and strace gave me absolutely nothing even while the
> pan process was consuming 100% of (one) CPU.
>
> WTF?

[The following may well be well known info for you, but for the benefit of others for whom it isn't, and for me too, since the process of explaining it solidifies my own grasp of the concept... =:^) ]

The thing to remember with strace is how it works...
by inserting itself between the normal application and the kernel in order to trace system calls (the reason for that "s" in strace... "system" here generally referring to kernel). If the currently executing logic makes no such system calls, either because it's in a tight userspace-only loop or because the currently executing logic simply doesn't make any such calls, there's no "s" to strace!

While that might seem perfectly obvious to a coder who knows all about the services provided by the system and when and how they're called, to a normal user, or even a relatively advanced gentoo sysadmin user such as myself, wrapping one's head around that does take a bit. In part that's because we're so used to seeing the thousands of system calls going by that a normal app typically invokes, typically fast enough that the process of printing them out itself is the bottleneck in an straced process. Not being a coder particularly familiar with the process, it's easy enough to fall into the trap of thinking what we're seeing is the activity of the whole app, NOT just the system calls that are the only thing we're ACTUALLY seeing reported.

Sometimes, just for perspective, it's interesting to strace only open calls (-feopen is commonly used here) of a typical desktop process. Seeing just how many library, font, icon, config... files a typical X-based process attempts to open, and how fast it actually happens, is mind-blowing in itself. And that's just the tip of the system-call iceberg, which itself is just the tip of the iceberg of all the app is doing, which given a modern multi-tasking system, is just the tip of the iceberg of all the system is doing!

It really gives a person some perspective on how fast a system really operates these days... sort of like looking up at the night sky in a dark area and realizing that the vastness one sees is only a hint... after seeing some of the images produced by Hubble, etc...
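
To make the strace point concrete, here's a minimal Python sketch (nothing taken from pan; the function names spin and chatter are made up for illustration) of two loops that both keep a CPU busy. The first stays entirely in userspace, so an strace attached to it should show next to nothing. The second asks the kernel for a write on every pass, so strace should show a steady stream of write() lines.

```python
import os


def spin(iterations):
    """Pure userspace work: arithmetic only, no kernel services requested."""
    total = 0
    for i in range(iterations):
        # Keep the value bounded so the loop does the same small work each pass.
        total = (total + i * i) % 1000003
    return total


def chatter(iterations):
    """Comparable loop, but every pass makes one write() system call."""
    fd = os.open(os.devnull, os.O_WRONLY)
    try:
        written = 0
        for _ in range(iterations):
            # os.write() goes straight to the kernel, so strace sees each one.
            written += os.write(fd, b".")
        return written
    finally:
        os.close(fd)
```

Run either function with a large iteration count from an interpreter, attach with strace -p and the process ID from another terminal, and compare. The busy-but-silent case is the one that matches what walt saw.
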
We humans are just the dust mite on the speck of dust that is the earth in the solar system, itself a speck of dust in a galaxy, itself a speck of dust... while computer processes are arguably a few recursions short, in its own way that vastness of scale is similarly mindblowing to think about.

But what's REALLY mindblowing is to realize that nevertheless, individual humans still actually program all those apps, and it's both possible and in some contexts routine to reverse engineer the machine code back into assembler, and step thru the functionality at an individual machine instruction level, instruction by instruction.

But back to present contextual reality...

Taking that high level theory back down to our particular case, however, the lack of such system calls in our pan instance while it's in theory downloading a bunch of headers is still both alarming and an important clue as to the problem. Typically during the header download there'd at minimum be the usual network access calls as well as memory allocation activity going on (and probably more), so an entire lack of strace activity really *DOES* indicate a serious problem, in the form of a userspace-only loop that's tight enough it's not making any system calls at all!

> Next I sicked gdb on the pan process and found that pan was evidently
> stuck in some infinite loop involving the glib sockets and istream libs,
> but I lack the expertise to take this any further.

... But that loop, while involving glib sockets and istream libs, is all userspace... no "s" calls to strace!

> As a postscript, I examined my ~/.pan2/servers.xml and found that pan
> had saved my 'XZVER' choice correctly before freezing up.
>
> I'm thinking (maybe?) that pan needs to open a new socket for the
> compressed-header istream instead of trying to read from the old
> (uncompressed) socket connection?

That has been my experience as well, altho I didn't try while it was actually DOWNLOADING headers.
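
For what it's worth, the "open a new socket" idea can be sketched roughly as below. This is illustrative Python only: pan itself is C++, and WireSettings, Connection and ConnectionPool are hypothetical names, not pan's classes. The point is simply that when an option that changes what travels over the wire is toggled, the old connections get closed and fresh ones negotiated, rather than reused.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WireSettings:
    """Options that change what travels over the socket (hypothetical names)."""
    use_ssl: bool = False
    compressed_headers: bool = False


class Connection:
    """Stand-in for one server connection; a real one would own a socket."""

    def __init__(self, settings):
        self.settings = settings
        self.open = True

    def close(self):
        self.open = False


class ConnectionPool:
    """Holds a fixed number of connections that all share one WireSettings."""

    def __init__(self, settings, size=2):
        self.settings = settings
        self.size = size
        self.connections = [Connection(settings) for _ in range(size)]

    def apply_settings(self, new_settings):
        """Close and recreate every connection if a wire-level option changed."""
        if new_settings == self.settings:
            return False  # nothing changed, keep the existing connections
        for conn in self.connections:
            conn.close()
        self.settings = new_settings
        self.connections = [Connection(new_settings) for _ in range(self.size)]
        return True
```

A settings dialog would call apply_settings() when the user clicks OK. Connections that are busy with a transfer at that moment would need more care than this sketch gives them.
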
In my case, some weeks ago when I was working on getting back to binaries and was taking it a step at a time (uncompressed plain-text connection working -> compressed header plain-text connection working -> ssl-encrypted connection working), I noted (I think in the ssl context at that point) that pan continued using the existing open connections as it had; and setting pan offline and back online didn't fully break existing idle connections, so pan stayed in clear-text until I actually quit and restarted it.

So that would presently be a bug, less serious in my case as pan didn't lock up because I didn't attempt the change while it was actually DOWNLOADING something, but still a problem.

The ultimate fix, therefore, would be to have pan specifically terminate existing connections and renegotiate them, whenever either the ssl/cleartext or compressed/uncompressed headers options are toggled. (Obviously, if pan's doing full message download at the time, not header download, compressed-headers should in theory be togglable without terminating the existing connections. However, it may be simpler to simply terminate and restart all connections whenever such a setting changes, regardless.)

Meanwhile, easy workaround for this bug! As the saying goes, if it hurts when you bang your head against the wall, QUIT DOING THAT! =:^) Pan may not be able to handle it automatically just yet, but that doesn't mean you have to change the settings when pan's active. Ensure it's idle, change the settings, restart pan to be sure, THEN use the new settings. =:^)

> This is all beyond my ken and I'm off to bed now...

Yeah, when there's an active bug to trace I can't sleep properly either. Finally getting to sleep properly after tracing it down as far as I know how... is nice! The sleep of someone who worked hard to attain a goal, finally attained it, and can now sleep in peace without it nagging at him any more! =:^)

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

_______________________________________________
Pan-users mailing list
Pan-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/pan-users