Nice job!
As a database, we really should manage memory usage more carefully
ourselves.


--
Best Regards
陈明雨 Mingyu Chen

Email:
[email protected]



At 2019-09-10 22:16:43, "Zhao Chun" <[email protected]> wrote:
>Hi all
>
>I want to add a chunk allocator. The design follows.
>
>## Motivation
>In high-concurrency tests, many threads wait on memory allocation and
>release, and a large share of them are releasing Chunks in MemPool. One
>reason is that MemPool is used everywhere in the code. Another is that
>these Chunks are relatively large, 4K to 512K. Allocations of this size
>easily exceed the free memory TCMalloc reserves for each thread, so the
>requests have to go to its central memory.
>Therefore, I implemented a demo ChunkAllocator that keeps released Chunks
>instead of frequently allocating from and releasing to TCMalloc. With this
>demo, throughput in the same high-concurrency case more than doubled,
>from 280 QPS to 650 QPS. Based on this, I want to implement a
>ChunkAllocator that reduces Chunk allocation and release operations
>against the system allocator, and thus improves the performance of the
>system.
>## Design
>How should free Chunks be managed? Chunk sizes are powers of two, so we can
>maintain a separate free-chunk list for each size. When a Chunk is no
>longer used, it is placed in the free list for its size. When allocating
>a new Chunk, we first look in the free list for the requested size. If
>none is found there, we allocate a new Chunk from the system allocator.
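
A minimal sketch of the per-size free lists described above. The class name `SizeClassFreeLists`, the use of `log2(size)` as the list index, and the `malloc` fallback are illustrative assumptions, not the proposed implementation; locking is left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct Chunk {
    uint8_t* data;
    size_t size;
};

// Hypothetical sketch: one free list per power-of-two chunk size.
class SizeClassFreeLists {
public:
    // Index of the free list for a power-of-two size, i.e. log2(size).
    static int size_class(size_t size) {
        assert(size != 0 && (size & (size - 1)) == 0);
        int idx = 0;
        while (size > 1) {
            size >>= 1;
            ++idx;
        }
        return idx;
    }

    // Reuse a cached chunk of exactly this size, else fall back to malloc.
    Chunk allocate(size_t size) {
        std::vector<Chunk>& list = _lists[size_class(size)];
        if (!list.empty()) {
            Chunk chunk = list.back();
            list.pop_back();
            return chunk;
        }
        return Chunk{static_cast<uint8_t*>(std::malloc(size)), size};
    }

    // Put a chunk that is no longer used into the free list for its size.
    void release(const Chunk& chunk) {
        _lists[size_class(chunk.size)].push_back(chunk);
    }

    // Hand everything still cached back to the system allocator.
    ~SizeClassFreeLists() {
        for (auto& list : _lists) {
            for (auto& chunk : list) std::free(chunk.data);
        }
    }

private:
    // 64 lists cover every power of two a 64-bit size_t can hold.
    std::vector<Chunk> _lists[64];
};
```

Because every list holds chunks of a single size, a hit needs no searching or splitting: the allocator pops the most recently released chunk of that size.
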
>To keep lock contention in the Chunk Allocator from hurting system
>performance, we need to shrink the contention domain. The idea here is
>to maintain a Chunk Arena for each CPU core. An allocation first tries
>the Chunk Arena of the core it is running on.
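
The per-core arena idea could look roughly like the sketch below, where each arena has its own lock so threads on different cores rarely contend. `Arena`, `ArenaSet` and `current_index()` are hypothetical names. As a portable stand-in for the real core id, the sketch hashes the thread id; on Linux a call such as `sched_getcpu()` could supply the actual core.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch: a free-pointer cache guarded by its own lock.
class Arena {
public:
    void push(void* p) {
        std::lock_guard<std::mutex> guard(_lock);
        _free.push_back(p);
    }

    // Returns nullptr when this arena has nothing cached.
    void* pop() {
        std::lock_guard<std::mutex> guard(_lock);
        if (_free.empty()) return nullptr;
        void* p = _free.back();
        _free.pop_back();
        return p;
    }

private:
    std::mutex _lock;
    std::vector<void*> _free;
};

// One arena per core, so a lock is only shared by threads on the same core.
class ArenaSet {
public:
    explicit ArenaSet(size_t num_cores) : _arenas(num_cores) {
        assert(num_cores > 0);
    }

    // Assumption: the thread-id hash stands in for the real core id.
    size_t current_index() const {
        std::hash<std::thread::id> hasher;
        return hasher(std::this_thread::get_id()) % _arenas.size();
    }

    Arena& current() { return _arenas[current_index()]; }
    Arena& at(size_t i) { return _arenas[i]; }
    size_t size() const { return _arenas.size(); }

private:
    std::vector<Arena> _arenas;
};
```
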
>For memory limits, there are two options. One is to limit the total
>amount of memory that can be allocated; the other is to limit the maximum
>amount of free memory that is kept in reserve. To stay compatible with
>the current system behavior, I intend to limit only the total amount of
>reserved memory. Allocation then fails only when the system memory is
>completely exhausted, which is consistent with the current behavior. A
>larger reserved-memory limit gives a better cache hit rate, but it also
>leaves more memory sitting idle, which makes it harder for other modules
>to allocate memory.
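
A sketch of how a limit on reserved (idle) memory might be tracked, assuming a single atomic byte counter. `ReservedBytesLimiter` and its method names are hypothetical. Only caching is ever refused; allocation from the system is not limited, which matches the option chosen above.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical sketch: caps how many bytes may sit idle in the free lists.
class ReservedBytesLimiter {
public:
    explicit ReservedBytesLimiter(size_t limit) : _limit(limit), _reserved(0) {}

    // Called on release. True means the chunk may be cached; false means
    // it should be returned to the system allocator instead.
    bool try_reserve(size_t bytes) {
        size_t cur = _reserved.load();
        // _reserved never exceeds _limit, so the subtraction cannot wrap.
        while (bytes <= _limit - cur) {
            // On failure, cur is refreshed and the limit check is repeated.
            if (_reserved.compare_exchange_weak(cur, cur + bytes)) return true;
        }
        return false;
    }

    // Called when a cached chunk is handed out again.
    void unreserve(size_t bytes) { _reserved.fetch_sub(bytes); }

    size_t reserved() const { return _reserved.load(); }

private:
    const size_t _limit;
    std::atomic<size_t> _reserved;
};
```
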
>Which system allocator should be used, malloc or mmap? Currently, malloc
>is used. If we switch to mmap without changing the system parameter
>vm.max_map_count, memory allocation may fail even when memory is
>available. We can implement both types of system allocator and add a
>configuration option to choose between them, with malloc as the
>default.
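
One way the two backends could sit behind a single switch is sketched below. `SystemAllocatorKind` and `SystemAllocator` are hypothetical names and the real configuration mechanism is not specified in the proposal. The sketch assumes a POSIX system where `MAP_ANONYMOUS` is available.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical config switch; malloc is the default, as proposed.
enum class SystemAllocatorKind { kMalloc, kMmap };

class SystemAllocator {
public:
    explicit SystemAllocator(SystemAllocatorKind kind = SystemAllocatorKind::kMalloc)
            : _kind(kind) {}

    // Returns nullptr on failure for both backends.
    void* allocate(size_t size) {
        if (_kind == SystemAllocatorKind::kMalloc) return std::malloc(size);
        // mmap can fail with ENOMEM once the process holds too many
        // mappings (vm.max_map_count), even if memory is still free.
        void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return p == MAP_FAILED ? nullptr : p;
    }

    // munmap needs the mapping length, so the chunk size is passed back in.
    void release(void* p, size_t size) {
        if (_kind == SystemAllocatorKind::kMalloc) {
            std::free(p);
        } else {
            munmap(p, size);
        }
    }

private:
    SystemAllocatorKind _kind;
};
```

Since every Chunk already records its size, passing the size back on release costs nothing extra and lets both backends share one interface.
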
>Future work:
>All large memory allocations in the system could go through the Chunk
>Allocator, so that its limit can change from a reserved-memory limit to
>a limit on total allocated memory.
>## Structure
>```
>#include <cstddef>
>#include <cstdint>
>
>class Status;  // assumed to be the project's existing status type
>
>struct Chunk {
>    uint8_t* data;
>    size_t size;
>    // core id from which this chunk was allocated
>    int core_id;
>};
>
>// Keeps free chunks for each CPU core
>class ChunkArena {
>public:
>    // Pop a free chunk from the corresponding free list.
>    // Returns true on success, with the valid chunk saved in "chunk".
>    bool pop_free_chunk(size_t size, Chunk* chunk);
>
>    // Push a free chunk into this arena for later use
>    void push_free_chunk(const Chunk& chunk);
>};
>
>class ChunkAllocator {
>public:
>    // Allocate memory of the given size; size must be a power of two.
>    // Returns Status::OK() on success, with the allocated chunk info
>    // saved in "chunk".
>    Status allocate(size_t size, Chunk* chunk);
>
>    // Give a chunk back to the allocator
>    void free(const Chunk& chunk);
>};
>```
>Allocate process:
>1. Get the current core_id.
>2. Try to get a free Chunk from the corresponding Arena. If successful,
>return that Chunk.
>3. Try to get a free Chunk from the Arenas of other cores. If successful,
>return that Chunk.
>4. Allocate a Chunk from the system allocator.
>Release process:
>1. Check whether there is enough cache capacity. If so, place the Chunk
>in the free list of the corresponding Arena.
>2. Otherwise, call the system release function to free the memory.
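
The allocate and release steps above can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the actual implementation: `ChunkAllocatorSketch` is a hypothetical name, the core id is passed in by the caller instead of being read from the OS, `allocate` returns `bool` rather than `Status`, and a released chunk goes to the arena of the core recorded in `core_id`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <map>
#include <mutex>
#include <vector>

struct Chunk {
    uint8_t* data;
    size_t size;
    int core_id;  // core that allocated this chunk
};

class ChunkAllocatorSketch {
public:
    ChunkAllocatorSketch(int num_cores, size_t reserve_limit)
            : _arenas(num_cores), _reserve_limit(reserve_limit), _reserved(0) {
        assert(num_cores > 0);
    }

    ~ChunkAllocatorSketch() {
        for (auto& arena : _arenas)
            for (auto& entry : arena.free_lists)
                for (uint8_t* p : entry.second) std::free(p);
    }

    // Allocate step 1 is left to the caller, who passes core_id in.
    bool allocate(int core_id, size_t size, Chunk* chunk) {
        const int n = static_cast<int>(_arenas.size());
        // Steps 2 and 3: offset 0 is the caller's arena, the rest are others.
        for (int i = 0; i < n; ++i) {
            uint8_t* cached = pop((core_id + i) % n, size);
            if (cached != nullptr) {
                *chunk = Chunk{cached, size, core_id};
                return true;
            }
        }
        // Step 4: nothing cached anywhere, ask the system allocator.
        uint8_t* fresh = static_cast<uint8_t*>(std::malloc(size));
        if (fresh == nullptr) return false;
        *chunk = Chunk{fresh, size, core_id};
        return true;
    }

    void free(const Chunk& chunk) {
        // Release step 1: cache the chunk if the reserved limit allows it.
        size_t cur = _reserved.load();
        while (chunk.size <= _reserve_limit - cur) {
            if (_reserved.compare_exchange_weak(cur, cur + chunk.size)) {
                Arena& arena = _arenas[chunk.core_id];
                std::lock_guard<std::mutex> guard(arena.lock);
                arena.free_lists[chunk.size].push_back(chunk.data);
                return;
            }
        }
        // Release step 2: over the limit, return the memory to the system.
        std::free(chunk.data);
    }

    size_t reserved_bytes() const { return _reserved.load(); }

private:
    struct Arena {
        std::mutex lock;
        std::map<size_t, std::vector<uint8_t*>> free_lists;  // keyed by size
    };

    // Take one cached pointer of the given size, or nullptr if none.
    uint8_t* pop(int arena_id, size_t size) {
        Arena& arena = _arenas[arena_id];
        std::lock_guard<std::mutex> guard(arena.lock);
        auto it = arena.free_lists.find(size);
        if (it == arena.free_lists.end() || it->second.empty()) return nullptr;
        uint8_t* p = it->second.back();
        it->second.pop_back();
        _reserved.fetch_sub(size);
        return p;
    }

    std::vector<Arena> _arenas;
    const size_t _reserve_limit;
    std::atomic<size_t> _reserved;
};
```

With two arenas and an 8192-byte limit, a 4096-byte chunk freed on core 0 can be picked up by a later allocation on core 1 (step 3), and a third 4096-byte release while two are already cached goes straight back to the system.
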
>
>I created an issue on GitHub [1] and look forward to your feedback.
>
>1. https://github.com/apache/incubator-doris/issues/1776
