On Sun, 4 May 2025, John Paul Adrian Glaubitz wrote:

> > > What exactly is broken with the QEMU emulation in Alpha? I don't know of
> > > any bugs, but it could be that you have run into the nasty stack alignment
> > > issue in the kernel that was fixed in Linux 6.14.
> > 
> >  This was with QEMU in the user emulation mode, causing intermittent 
> > failures across the GCC testsuite, so unrelated to any Linux kernel 
> > issues.  Perhaps the system emulation mode works better, but the GCC 
> > testsuite doesn't rely much on syscall emulation and the nature of the 
> > failures didn't indicate this aspect of the user emulation mode mattered 
> > here.
> 
> From my personal experience, qemu-user has various issues that don't exist
> on qemu-system. So, if you're experiencing a qemu-related bug in qemu-user,
> it's always worth verifying it with qemu-system.

 No time to verify odd configurations here.  However, of all the syscalls, 
most GCC test cases only rely on exit_group(2) and kill(2), and the ones 
that did fail intermittently were purely arithmetic, so honestly I doubt 
the emulation mode matters.

 Yes, I know what the shortcomings of QEMU are, having worked with and 
contributed to the project for 15+ years now.  My interest in the 
project has faded though, after a series of arguments with a short-lived 
MIPS backend maintainer.

 NB it was me who diagnosed the stack alignment bug in the Linux kernel 
(Ivan made the fixes), so I've been fairly aware of its existence.

> >  I have reported it at the time and this has led to Magnus being kind 
> > enough, following your request, to let me use his BWX Alpha system for 
> > verification instead, where no intermittent failures were observed, so 
> > again no Linux kernel bugs mattered here (this was last year, well before 
> > the fix) and it was QEMU clearly at fault.
> 
> Could you point me to the bug report in question? I would like to look into
> it and see if it is alpha-specific.

 No actual bug report, just a mention in a discussion, but I'm fairly 
sure you were cc-ed, so you should be able to chase it.  It was around, if 
not along with, my GCC patch submission for the `-msafe-partial' option 
back in Nov last year.  Please feel free to turn it into a proper bug 
report against QEMU.  Somehow I feel there won't be a rush of volunteers 
to fix it though.

> > > >  What I was not aware of is the situation with the Alpha backend and
> > > > the need to put out fires there.  That non-BWX issue with Linux kernel's
> > > > RCU algorithms was a nasty surprise to me, one I could have dealt with
> > > > before with less time pressure if I knew about it.
> > > 
> > > What RCU issue are you talking about? I can only stress that to use Linux
> > > on Alpha, you *must* use kernel 6.14 or later with CONFIG_COMPACTION disabled
> > > otherwise you will run into all kinds of issues.
> > 
> >  The very RCU issue that prompted the removal of non-BWX support from the 
> > kernel last year and then this whole effort of mine.
> 
> Aha, I wasn't aware that the original cause for the removal of non-BWX support
> was due to issue with RCU. I thought the original motivation was that non-BWX
> Alpha doesn't support byte-access which Linus called a design mistake.

 The lack of byte accesses in the architecture isn't itself a problem, 
though indeed an engineering challenge.  The problem has been that the 
original replacement RMW sequences GCC produced cause data races, which 
were triggered by RCU code.  It was discussed at the time of non-BWX Alpha 
support removal from the Linux kernel, which you raised an objection 
against.
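
 As a rough illustration of this class of race (a sketch only, written in 
portable C and not the actual instruction sequence GCC emits): without 
byte stores, writing one byte means loading the containing 64-bit word, 
replacing the byte, and storing the whole word back.  The helper names 
below are hypothetical, and the interleaving of two CPUs is replayed by 
hand in a single thread so that the outcome is deterministic.

```c
#include <stdint.h>

/* Hypothetical helper: return `word` with byte `idx` (0..7) replaced by
   `val`, i.e. the "modify" step of an emulated byte store. */
static uint64_t insert_byte(uint64_t word, unsigned idx, uint8_t val)
{
    uint64_t mask = (uint64_t)0xff << (idx * 8);

    return (word & ~mask) | ((uint64_t)val << (idx * 8));
}

/* Replay one bad interleaving of two emulated byte stores that target
   different bytes of the same 64-bit word, and return the final word. */
static uint64_t lost_update_demo(void)
{
    uint64_t mem = 0;               /* shared word, initially all zeros */

    uint64_t a = mem;               /* CPU A: load the whole word */
    uint64_t b = mem;               /* CPU B: load the same word */

    a = insert_byte(a, 0, 0x11);    /* CPU A: set byte 0 in its copy */
    b = insert_byte(b, 1, 0x22);    /* CPU B: set byte 1 in its copy */

    mem = a;                        /* CPU A: store the whole word back */
    mem = b;                        /* CPU B: store; byte 0 reverts to 0 */

    return mem;
}
```

 Here lost_update_demo() returns 0x2200 rather than 0x2211: the store to 
byte 0 is silently undone even though the two writers never touched the 
same byte, which is something code written for byte-addressable machines 
does not expect.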

 HTH,

  Maciej
