Hi Julien,

> On 9 Jun 2022, at 09:30, Julien Grall <[email protected]> wrote:
>
> From: Hongyan Xia <[email protected]>
>
> The idea is to split the range into multiple aligned power-of-2 regions
> which only needs to call free_heap_pages() once each. We check the least
> significant set bit of the start address and use its bit index as the
> order of this increment. This makes sure that each increment is both
> power-of-2 and properly aligned, which can be safely passed to
> free_heap_pages(). Of course, the order also needs to be sanity checked
> against the upper bound and MAX_ORDER.
>
> Testing on a nested environment on c5.metal with various amount
> of RAM. Time for end_boot_allocator() to complete:
>             Before         After
>     - 90GB: 1426 ms        166 ms
>     -  8GB:  124 ms         12 ms
>     -  4GB:   60 ms          6 ms

On an arm64 Neoverse N1 system with 32GB of RAM I have:
- 1180 ms before
- 63 ms after

and my internal tests are passing on arm64.

Great optimisation :-)

(I will do a full review of the code in a second step).

>
> Signed-off-by: Hongyan Xia <[email protected]>
> Signed-off-by: Julien Grall <[email protected]>

Cheers
Bertrand
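
For reference, the splitting described in the quoted commit message can be sketched roughly as below. This is only a standalone illustration under stated assumptions, not the code from the patch: split_range(), chunk_fn, lowest_set_bit(), floor_log2() and SKETCH_MAX_ORDER are hypothetical names, the value 18 is an arbitrary cap rather than Xen's MAX_ORDER, and the callback stands in for the single free_heap_pages() call made per chunk. The range is expressed in page frame numbers, [start, end).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cap for this sketch only; not taken from Xen. */
#define SKETCH_MAX_ORDER 18u

/* Hypothetical callback standing in for one free_heap_pages() call. */
typedef void (*chunk_fn)(unsigned long start, unsigned int order, void *ctx);

/* Bit index of the least significant set bit; x must be non-zero. */
static unsigned int lowest_set_bit(unsigned long x)
{
    unsigned int i = 0;

    while ( !(x & 1UL) )
    {
        x >>= 1;
        i++;
    }
    return i;
}

/* Largest order such that (1UL << order) <= n; n must be non-zero. */
static unsigned int floor_log2(unsigned long n)
{
    unsigned int i = 0;

    while ( n >>= 1 )
        i++;
    return i;
}

/*
 * Split [start, end) into naturally aligned power-of-2 chunks and hand
 * each one to fn (if non-NULL). Returns the number of chunks produced.
 */
unsigned int split_range(unsigned long start, unsigned long end,
                         chunk_fn fn, void *ctx)
{
    unsigned int chunks = 0;

    while ( start < end )
    {
        unsigned int order = SKETCH_MAX_ORDER;
        unsigned int fit = floor_log2(end - start);

        /* Alignment bound: start == 0 has no set bit and is aligned to any order. */
        if ( start != 0 )
        {
            unsigned int align = lowest_set_bit(start);

            if ( align < order )
                order = align;
        }

        /* Upper bound: the chunk must not run past end. */
        if ( fit < order )
            order = fit;

        /* The chunk is a power of 2 in size and aligned to that size. */
        assert((start & ((1UL << order) - 1)) == 0);

        if ( fn )
            fn(start, order, ctx);

        start += 1UL << order;
        chunks++;
    }
    return chunks;
}
```

With these definitions, the range [1, 8) is split into three chunks of order 0, 1 and 2, i.e. [1, 2), [2, 4) and [4, 8), instead of seven single-page calls.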
