On Tue, Sep 27, 2022 at 04:14:21PM +0200, Jan Beulich wrote:
> extract_lsb_from_nodes() accumulates "memtop" from all PDXes one past
> the covered ranges. Hence the maximum address which can validly by used
> to index the node map is one below this value, and we may currently set
> up a node map with an unused (and never initialized) trailing entry. In
> boundary cases this may also mean we dynamically allocate a page when
> the static (64-entry) map would suffice.
> 
> While there also correct the comment ahead of the function, for it to
> match the actual code: Linux commit 54413927f022 ("x86-64:
> x86_64-make-the-numa-hash-function-nodemap-allocation fix fix") removed
> the ORing in of the end address before we actually cloned their code.
> 
> Signed-off-by: Jan Beulich <[email protected]>

Acked-by: Roger Pau Monné <[email protected]>

> ---
> Really the shift value may end up needlessly small when there's
> discontiguous memory. Within a gap, any address can be taken for the
> node boundary, and hence neither the end of the lower range nor the
> start of the higher range necessarily is the best address to use. For
> example with these two node ranges (numbers are frame addresses)
> 
> [10000,17fff]
> [28000,2ffff]
> 
> we'd calculate the shift as 15 when 16 or even 17 (because the start of
> the 1st range can also be ignored) would do. I haven't tried to properly
> prove it yet, but it looks to me as if the top bit of the XOR of lower
> range (inclusive) end and higher range start would be what would want
> accumulating (of course requiring the entries to be sorted, or to be
> processed in address order). This would then "naturally" exclude lowest
> range start and highest range end.

I'm not familiar with the logic in the NUMA code, seems like a
possible optimization.  It might be good to include in which way a
bigger shift is beneficial.

Thanks, Roger.

Reply via email to