On 12/4/25 12:45, David Hildenbrand (Red Hat) wrote:
On 12/4/25 20:36, Linus Torvalds wrote:
On Thu, 4 Dec 2025 at 09:40, Shuah Khan <[email protected]> wrote:

This commit has an impact on all architectures; it is not a narrowly
scoped, powerpc-only thing. It enables HAVE_GIGANTIC_FOLIOS on x86_64
and changes the common code that determines MAX_FOLIO_ORDER in
include/linux/mm.h.

So I suspect your bisection might not have worked out, and there might
be two different things going on.

In particular, hugepages were broken in 6.18-rc6 due to commit
adfb6609c680 ("mm/huge_memory: initialise the tags of the huge zero
folio").

That was then fixed for rc7 (and obviously final 6.18) by commit
5bebe8de19264 ("mm/huge_memory: Fix initialization of huge zero
folio"), but the breakage up until that time was a bit random.


Both my systems were running rc6 - I was stuck in a state
where I was able to rebase to rc7 and then 6.18, but could
never build either one.

End result: if you ever ended up bisecting into that broken range
between those two commits, you would get failures on some loads (but
not reliably), and your bisection would end up pointing to some random
thing.

But as mentioned, that particular problem would have been fixed in rc7
and in final 6.18, so any issues you saw with the final build would
have been due to something else.

Can I ask you to try to re-do the bisection, but with that commit
5bebe8de19264 applied by hand - if it wasn't already there - every
time you build a kernel that has adfb6609c680?
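
A rough sketch of how that per-step check could be automated, assuming the
bisection is run from a kernel git checkout. The helper names (needs_fix,
is_ancestor, prepare_tree) are hypothetical and only for illustration; the two
commit IDs are the ones quoted above. It relies on
"git merge-base --is-ancestor" and "git cherry-pick --no-commit", and is not
a tested bisection script:

```python
import subprocess

BROKEN = "adfb6609c680"  # commit that broke hugepages in 6.18-rc6 (from this thread)
FIX = "5bebe8de19264"    # commit that fixed it for rc7 (from this thread)


def needs_fix(has_broken: bool, has_fix: bool) -> bool:
    """The fix has to be applied by hand only when the tree already
    contains the breaking commit but does not yet contain the fix."""
    return has_broken and not has_fix


def is_ancestor(commit: str, repo: str = ".") -> bool:
    """Return True if `commit` is reachable from HEAD in `repo`.

    `git merge-base --is-ancestor` exits with status 0 when it is.
    """
    result = subprocess.run(
        ["git", "merge-base", "--is-ancestor", commit, "HEAD"],
        cwd=repo,
    )
    return result.returncode == 0


def prepare_tree(repo: str = ".") -> bool:
    """Apply the fix to the working tree if this bisection step needs it.

    Returns True when the fix was applied.
    """
    if not needs_fix(is_ancestor(BROKEN, repo), is_ancestor(FIX, repo)):
        return False
    # --no-commit leaves HEAD where bisect put it; the change only lives in
    # the index and working tree and must be discarded before the next step.
    subprocess.run(
        ["git", "cherry-pick", "--no-commit", FIX],
        cwd=repo,
        check=True,
    )
    return True
```

The idea would be to call prepare_tree() before each build, then run
"git reset --hard" before marking the step good or bad so the next checkout
starts clean.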

When I suspected rc6 to be the problem, I booted rc5 and compiled 6.18
after reverting 39231e8d6ba based on config file changes between rc5
and rc6.


Right, that's what I also proposed in [1].

I cannot make sense of how 39231e8d6ba could possibly trigger it given that it
only affects the value of MAX_FOLIO_ORDER, which is primarily used for
safety checks and snapshot_page(), nothing that could explain changed
application behavior, really.

But while Shuah is retesting, I'll go have yet another look.

I retested on both systems on 6.18 making sure I have 5bebe8de19264
and 39231e8d6ba in there. I also cloned linux-next and built it on both.

I didn't see any problems on 6.18. Having said that, it might make
sense to hold off on including 39231e8d6ba in 6.18 so there is more
time to test beyond 2 rc cycles. That is for you all to decide.

thanks,
-- Shuah
