Dear folks,

I am trying to set up Clustermatic 5 (CM5) on a custom-built 2.6.9 kernel backported to a stock Mandriva 2006 install (with all of the latest patches applied as of Saturday).

There is no problem with the headnode kernel or with the CM5 host utils booting. However, the slaves *intermittently* fail to copy the libs over properly, and I get

    vmadump: mmap failed: /lib/tls/libc.so.6

or

    vmadump: mmap failed: /lib/tls/libc-2.3.5.so

Now, the first is a symlink to the second. Also, strace on a simple binary (e.g. mkdir) shows that it is indeed trying to load *that* version of the C lib first.

I've messed around with taking that directory out of the path and linking libc.so.6 to various other libc*so*'s in /usr/lib or /lib, with the same results: it will sometimes boot, sometimes not.

This looks like a random library ordering issue. Or perhaps it is a timing issue, where something being called in the C lib is causing vmadump to burp. When it happens, it happens in the node_up stage. *Sometimes* the nodes will boot OK.

----

Note: I have a happily running CM5 setup on several other machines with FC4 as the core OS and basically the same custom CM5 kernel on top, so it's something funky with the M2006 C libraries AFAICS. Threading, perhaps? Not sure.

I have other reasons for going with M2006. I didn't fancy backporting the basic bproc code to a 2.6.12* or 2.6.15 kernel, so I simply used the 2.6.9 kernel from CM5 (custom rebuilt, the same as on the FC4 clusters).

Do let me know if you have any ideas. Thanks!

Kind regards,
Derek Jones

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf