Edit report at https://bugs.php.net/bug.php?id=63380&edit=1

 ID:                 63380
 Updated by:         tstarl...@php.net
 Reported by:        tstarl...@php.net
 Summary:            Allocation via libxml does not use PHP's per-request
                     allocator
 Status:             Assigned
 Type:               Bug
 Package:            XML related
 Operating System:   Linux
 PHP Version:        master-Git-2012-10-29 (Git)
 Assigned To:        tstarling
 Block user comment: N
 Private report:     N

 New Comment:

Do you know of a specific case where request-local allocations could bleed into 
mod_perl and cause memory corruption?

I have reviewed all of libxml's global variables and ensured that they cannot 
be made to hold pointers to request-local allocations. The hooks are disabled 
at post-deactivate via a thread-local variable, so a Perl request will not get 
request-local pointers from xmlMalloc(), whether it runs concurrently in its 
own thread or in the same thread as PHP at a later time. TSRM_FETCH() should 
give default global variables, with request-local allocation disabled, even if 
it is called from a thread where PHP has never been used.
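
To illustrate the gating described above, here is a minimal, self-contained 
sketch in C. It is not the code from the pull request: every name in it 
(request_activate, request_post_deactivate, hook_malloc, hook_free, 
block_header) is hypothetical, and plain malloc() stands in for both PHP's 
per-request allocator and the system allocator. The per-block owner tag is 
also an assumption of the sketch, not something the patch is known to do. The 
hooks have the same shape as libxml2's xmlMallocFunc and xmlFreeFunc, which is 
what xmlMemSetup() expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not the code from the patch. */

enum owner { OWNER_SYSTEM = 1, OWNER_REQUEST = 2 };

/* Header stored in front of each block so the free hook knows its owner. */
typedef union {
    struct { enum owner owner; size_t size; } info;
    max_align_t align; /* keeps the user pointer suitably aligned */
} block_header;

/* Thread-local state: a thread that never ran a request sees the zero
 * defaults, so request-local allocation is off there. */
static _Thread_local int request_active = 0;
static _Thread_local size_t request_bytes = 0;

void request_activate(void) { request_active = 1; }
void request_post_deactivate(void) { request_active = 0; }
size_t request_usage(void) { return request_bytes; }

/* Same shape as xmlMallocFunc: void *(*)(size_t). */
void *hook_malloc(size_t size) {
    block_header *h = malloc(sizeof *h + size);
    if (h == NULL) return NULL;
    h->info.owner = request_active ? OWNER_REQUEST : OWNER_SYSTEM;
    h->info.size = size;
    if (request_active) request_bytes += size; /* counted only in a request */
    return h + 1;
}

/* Same shape as xmlFreeFunc: void (*)(void *).
 * Assumes a request block is freed on the thread that allocated it. */
void hook_free(void *ptr) {
    if (ptr == NULL) return;
    block_header *h = (block_header *)ptr - 1;
    assert(h->info.owner == OWNER_REQUEST || h->info.owner == OWNER_SYSTEM);
    if (h->info.owner == OWNER_REQUEST) request_bytes -= h->info.size;
    free(h);
}
```

In this sketch, a caller that allocates before request_activate() or after 
request_post_deactivate() gets a block tagged OWNER_SYSTEM that is never 
counted against the request, which is the property the paragraph above relies 
on for other libxml2 users in the same process.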

Either way, I would be happy to make this configurable, off by default, since 
the robustness of the solution depends on details of global pointer storage in 
libxml which may change in the future. So my patch does introduce a maintenance 
burden, with a risk of dangling pointers if that maintenance is not kept up to 
date. I'm just not keen on having the documentation say that there are known 
issues with interaction with other Apache modules unless that is truly the case.


Previous Comments:
------------------------------------------------------------------------
[2012-12-05 11:08:41] rricha...@php.net

There is a major problem with doing this, and it is why I didn't end up tying 
into the PHP memory allocator. Depending on the setup, it is extremely likely 
to hit memory corruption and/or mix memory allocations between modules. For 
example, using mod_perl and mod_php together will cause PHP to override the 
libxml memory handling functions (which are global) and bleed into mod_perl 
(or any other module that uses libxml2), causing any number of problems 
(crashes, security issues, etc.). The only way to do something like this would 
be to make it a compile-time option that is disabled by default, so that those 
who know their environment intimately can use it at their own risk. I don't 
know if you want to write a patch for that or not. Otherwise I don't see any 
way this could safely be added.

------------------------------------------------------------------------
[2012-10-29 21:55:03] tstarl...@php.net

https://github.com/php/php-src/pull/223

------------------------------------------------------------------------
[2012-10-29 03:25:17] tstarl...@php.net

Description:
------------
Allocation via libxml does not use PHP's per-request allocator. So any memory 
used by libxml will not be accounted against memory_get_usage() or memory_limit.

At Wikimedia we use libxml DOM trees to store wikitext parse trees, because 
they are more compact in memory than the equivalent pure-PHP data structures. 
When these parse trees are cached, the memory requirements can become 
excessive, and the memory is typically not returned to the system after request 
termination. Using xmlMemSetup() to set hook functions which use PHP's 
per-request allocation functions will allow us to more effectively monitor and 
limit the use of libxml in production.

I've developed a patch and will submit it to GitHub as a pull request.

Test script:
---------------
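<?php
// Build a large DOM tree; each node is allocated by libxml.
$doc = new DOMDocument;
for ( $i = 0; $i < 1000000 ; $i++ ) {
    $doc->appendChild($doc->createElement('foo'));
}
// Print the memory usage as PHP sees it.
print memory_get_usage()."\n";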


Expected result:
----------------
312331440 (with debug and ZTS)

Actual result:
--------------
694256


------------------------------------------------------------------------


