On Friday 14 June 2013, Niko Tyni wrote:
> On Sun, Jun 09, 2013 at 11:23:01PM +0300, Niko Tyni wrote:
> > On Fri, Jun 07, 2013 at 02:23:43PM +0300, Niko Tyni wrote:
> > > I can reproduce the SIGSEGV at the end of the main test suite
> > > (#711213) on amd64. The armel problem might well be related,
> > > as the log ends at the same point.
> >
> > I'm somewhat further now: what happens is that
> > register_auth_provider() in modperl_util.c calls
> >
> >     apr_pool_pre_cleanup_register(pool, NULL,
> >                                   cleanup_perl_global_providers);
> >
> > once in the parent process, then another time in a child. For
> > some reason that I do not understand yet, the
> > cleanup_perl_global_providers() function resides at a different
> > memory location (with a 0x2c000 offset or so) on the second
> > time. The first location has at that point become an invalid
> > memory address, resulting in a SIGSEGV when libapr calls the
> > registered cleanup functions and jumps into the old location.
>
> Another progress report. I now mostly understand what's happening.
> Contrary to the above, all the interesting stuff happens inside the
> parent process.
>
> Cc'ing the apache2 maintainers; any ideas? See below.
> (The jump to an invalid address is crashing armel buildds so it's a
> rather big problem ATM. See #711167, where this has diverged.)
>
> First, apache2 main() calls read_config() (from main.c:624), which
> loads all the modules. Loading mod_perl installs the pre_cleanup
> hook cleanup_perl_global_providers() as above.
>
> Then, there's a loop starting at main.c:704 that has this comment:
>
>     /* This is a hack until we finish the code so that it only reads
>      * the config file once and just operates on the tree already in
>      * memory.  rbb
>      */
>
> and calls apr_pool_clear(pconf), which unloads the modules and
> should do all the cleanup AIUI. A bit later, at main.c:724,
> ap_read_config() is called again, and under some conditions (when
> stack limit is 'unlimited' and the number of modules is
> suitable?), mod_perl gets loaded at a different place than the
> first time. However, the earlier installed pre_cleanup hook is
> still in place, so we jump into an out-of-bounds location (where
> cleanup_perl_global_providers() used to reside) in the end when
> the cleanups are actually called.
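
To make the sequence above concrete, here is a toy model in plain C.
It does not use APR or httpd at all, and every name in it (toy_pool,
toy_load_module, simulate_stale_cleanups, ...) is made up for
illustration. It only records which "load generation" of a module
registered a cleanup, and counts a cleanup as stale if it would be run
when that generation is no longer loaded. For contrast, it can also
register the hook on the short-lived config pool. It assumes that the
long-lived pool's hooks run at shutdown while the second load is still
mapped, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types are NOT the APR pool API. */
#define MAX_CLEANUPS 8

typedef struct {
    int registered_gen;  /* load generation that registered this hook */
} toy_cleanup;

typedef struct {
    toy_cleanup items[MAX_CLEANUPS];
    size_t count;
} toy_pool;

static int current_gen;  /* 0 means "module not loaded" */
static int next_gen;
static int stale_calls;  /* hooks that would jump into unloaded code */

static void toy_pool_register(toy_pool *p)
{
    assert(p->count < MAX_CLEANUPS);
    p->items[p->count++].registered_gen = current_gen;
}

/* Run and drop all hooks, counting those whose code is no longer mapped. */
static void toy_pool_clear(toy_pool *p)
{
    for (size_t i = 0; i < p->count; i++) {
        if (p->items[i].registered_gen != current_gen)
            stale_calls++;
    }
    p->count = 0;
}

/* Loading the module starts a new generation and registers its hook. */
static void toy_load_module(toy_pool *target)
{
    current_gen = ++next_gen;
    toy_pool_register(target);
}

/* Clearing the config pool runs its hooks first, then unloads the module. */
static void toy_clear_conf(toy_pool *conf)
{
    toy_pool_clear(conf);
    current_gen = 0;
}

/* Returns how many hooks were run after their code had been unloaded. */
int simulate_stale_cleanups(int use_conf_pool)
{
    toy_pool process = {0}, conf = {0};
    toy_pool *target = use_conf_pool ? &conf : &process;

    current_gen = next_gen = stale_calls = 0;

    toy_load_module(target);   /* first config read */
    toy_clear_conf(&conf);     /* stands in for apr_pool_clear(pconf) */
    toy_load_module(target);   /* second config read, new generation */

    /* Shutdown (toy assumption): long-lived hooks run, then conf goes. */
    toy_pool_clear(&process);
    toy_clear_conf(&conf);
    return stale_calls;
}
```

In this model, simulate_stale_cleanups(0) (hook on the long-lived pool)
reports one stale call, left over from the first load, while
simulate_stale_cleanups(1) (hook on the config pool) reports none,
because each hook is run before its generation is unloaded.
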

The problem is that MP_CMD_SRV_DECLARE2(authz_provider) and
MP_CMD_SRV_DECLARE2(authn_provider) register the cleanup against
parms->server->process->pool, which lives longer than the pconf pool
and therefore longer than the mod_perl shared object stays loaded. It
should probably use parms->pool (which is pconf) instead.

In general, everything mod_perl does should be undone by the
clearing/destruction of pconf, because the .so will be unloaded after
that. server->process->pool can be used to store things that need to
be preserved across the unloading/reloading of the .so; however, there
is now also a higher-level API for that (ap_retained_data_create).
Registering a cleanup on server->process->pool from a module is always
a bad idea, because the module's code may have moved by the time the
cleanup runs.

Now, if there is a good reason that the above functions use
server->process->pool, we need to figure out a way to fix that. But
the original commit of that code has no comment about the pool
requirement. Therefore I think it may simply be a bug, and you should
first test it with the cleanup registered against pconf.

> So I suppose mod_perl should somehow register a "module uninstall
> hook" that calls apr_pool_cleanup_run(...,
> cleanup_perl_global_providers, ...) [or apr_pool_cleanup_kill(),
> not sure] to remove the to-be-unloaded pre_cleanup hook. I haven't
> found a way to do that yet.

If you register a pool cleanup with pconf, it will be called before
the .so is unloaded.

Cheers,
Stefan