Ext3h opened a new issue, #48190: URL: https://github.com/apache/arrow/issues/48190
### Describe the enhancement requested

Right now there are integrations for `jemalloc` and `mimalloc` as alternative memory pools, but both of them only use the ***global*** heap APIs of the corresponding libraries. While that does work in general, there is still a significant performance benefit to using explicitly isolated heaps. For example, when writing 100+ Parquet files in parallel in a single process on a system with 100+ logical processors, even `mimalloc` still performs >10% better with a dedicated heap per thread than with a single shared global heap.

The main benefit comes from the fact that this library is already carefully tuned to avoid any allocation pattern that could not be served by the backing arena, but those optimizations no longer apply universally when too many threads interfere with each other. This does come at the cost of a slight increase in total memory consumption, but in return it significantly reduces page table updates (from hitting the system allocator as well as from page faults), which become quite costly with growing core counts.

Given that `BaseMemoryPoolImpl` is hidden in the implementation, implementing a new heap with correct alignment, reporting etc. is unnecessarily hard for users of the external API. A counterpart of `MimallocAllocator` that explicitly uses the `mi_heap_*` instead of the `mi_*` allocation functions should be provided directly by Arrow. Unlike `MimallocAllocator`, which is intended to be used as a shared or singleton-like global instance, the intended use of this pool is to explicitly create a distinct pool per NUMA region, or potentially even per `ParquetFileWriter`. A rough sketch of what such a pool could look like is attached after the component list below.

### Component(s)

C++, Parquet
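---

To make the request more concrete, here is a minimal sketch of what an isolated-heap pool could look like, assuming a recent Arrow release where `arrow::MemoryPool` exposes alignment-aware `Allocate`/`Reallocate`/`Free` plus the statistics accessors overridden below (the exact set of pure-virtual methods differs between releases), and using only public `mi_heap_*` functions from mimalloc. The class name `IsolatedMimallocPool` and the ad-hoc atomic counters are placeholders; an implementation inside Arrow would presumably reuse the internal accounting of `BaseMemoryPoolImpl` instead.

```cpp
// Sketch only: an arrow::MemoryPool backed by a dedicated mi_heap_t, so that
// allocations never contend with other threads' heaps. Note that mimalloc
// heaps are intended to be allocated from only by the thread that created
// them; freeing from other threads is fine via mi_free().
#include <atomic>
#include <cstdint>
#include <string>

#include <mimalloc.h>

#include <arrow/memory_pool.h>
#include <arrow/status.h>

class IsolatedMimallocPool : public arrow::MemoryPool {
 public:
  IsolatedMimallocPool() : heap_(mi_heap_new()) {}

  // mi_heap_delete() migrates any still-outstanding blocks to the default
  // heap instead of freeing them, so destroying the pool does not invalidate
  // buffers that are still alive.
  ~IsolatedMimallocPool() override { mi_heap_delete(heap_); }

  arrow::Status Allocate(int64_t size, int64_t alignment, uint8_t** out) override {
    *out = static_cast<uint8_t*>(mi_heap_malloc_aligned(
        heap_, static_cast<size_t>(size), static_cast<size_t>(alignment)));
    if (*out == nullptr && size > 0) {
      return arrow::Status::OutOfMemory("mi_heap_malloc_aligned failed");
    }
    bytes_.fetch_add(size, std::memory_order_relaxed);
    total_.fetch_add(size, std::memory_order_relaxed);
    allocations_.fetch_add(1, std::memory_order_relaxed);
    return arrow::Status::OK();
  }

  arrow::Status Reallocate(int64_t old_size, int64_t new_size, int64_t alignment,
                           uint8_t** ptr) override {
    uint8_t* p = static_cast<uint8_t*>(mi_heap_realloc_aligned(
        heap_, *ptr, static_cast<size_t>(new_size), static_cast<size_t>(alignment)));
    if (p == nullptr && new_size > 0) {
      return arrow::Status::OutOfMemory("mi_heap_realloc_aligned failed");
    }
    *ptr = p;
    bytes_.fetch_add(new_size - old_size, std::memory_order_relaxed);
    if (new_size > old_size) {
      total_.fetch_add(new_size - old_size, std::memory_order_relaxed);
    }
    allocations_.fetch_add(1, std::memory_order_relaxed);
    return arrow::Status::OK();
  }

  void Free(uint8_t* buffer, int64_t size, int64_t /*alignment*/) override {
    // mi_free() routes the block back to whichever heap allocated it.
    mi_free(buffer);
    bytes_.fetch_sub(size, std::memory_order_relaxed);
  }

  int64_t bytes_allocated() const override {
    return bytes_.load(std::memory_order_relaxed);
  }
  int64_t total_bytes_allocated() const override {
    return total_.load(std::memory_order_relaxed);
  }
  int64_t num_allocations() const override {
    return allocations_.load(std::memory_order_relaxed);
  }
  int64_t max_memory() const override { return -1; }  // peak not tracked in this sketch
  std::string backend_name() const override { return "mimalloc-isolated-heap"; }

 private:
  mi_heap_t* heap_;
  std::atomic<int64_t> bytes_{0};
  std::atomic<int64_t> total_{0};
  std::atomic<int64_t> allocations_{0};
};
```

With something along these lines, each writer thread could get its own pool, e.g. by passing it through `parquet::WriterProperties::Builder::memory_pool()` or to `parquet::arrow::FileWriter::Open()`, so that all buffering for one file stays within a single arena.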
