Hi Richard, thanks again for the feedback. We have sent an updated version, which removes the redundant store operations (https://gcc.gnu.org/pipermail/gcc-patches/2025-May/684096.html). This one is a series as we needed to update sbitmap with a function that checks if a range in the bitmap is set. It could be done in other ways, but this seems cleaner.
On Thu, May 8, 2025 at 5:55 PM Richard Sandiford <richard.sandif...@arm.com> wrote: > > Konstantinos Eleftheriou <konstantinos.elefther...@vrull.eu> writes: > > During the base register initialization, when we are eliminating the load > > instruction, we were calling `emit_move_insn` on registers of the same > > size but of different mode in some cases, causing an ICE. > > > > This patch uses `lowpart_subreg` for the base register initialization, > > instead of zero-extending it. We had tried this solution before, but > > we were leaving undefined bytes in the upper part of the register. > > This shouldn't be happening as we are supposed to write the whole > > register when the load is eliminated. This was occurring when having > > multiple stores with the same offset as the load, generating a > > register move for all of them, overwriting the bit inserts that > > were inserted before them. With this patch we are generating a register > > move only for the first store of this kind, using a bit insert for the > > rest of them. > > That feels wrong though. If there are multiple stores to the same offset > then it becomes a question of which bytes of which stores survive until > the load. E.g. for a QI store followed by an HI store followed by an SI > store, the final SI store wins and the previous ones should be ignored. > If it's QI, SI, HI, then for little endian, the low two bytes come from > the HI and the next two bytes come from the SI. The QI store should > again be ignored. > > So I would expect this to depend on which store is widest, with ties > broken by picking later stores (i.e. those earlier in the list). > > I'm also not sure why this is only a problem with using lowparts. > Wouldn't the same issue apply when using zero_extend? The bytes are > fully-defined for zero_extend, but not necessarily to the right values. > This is an issue with zero_extend too. Zero-extending was a wrong fix in the first place, it just masked the issue that we had in an older PR. Thanks, Konstantinos