Eric Botcazou <ebotca...@adacore.com> writes: >> This isn't intended to change the behaviour, just rewrite the >> existing logic in a different (and hopefully clearer) way. > > Yes, I agree that it's an improvement. A few remarks below. > >> diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c >> index ca6cced..7c0acf5 100644 >> --- a/gcc/rtlanal.c >> +++ b/gcc/rtlanal.c >> @@ -3601,31 +3601,28 @@ subreg_get_info (unsigned int xregno, machine_mode >> xmode, unsigned int offset, machine_mode ymode, >> struct subreg_info *info) >> { >> - int nregs_xmode, nregs_ymode; >> - int mode_multiple, nregs_multiple; >> - int offset_adj, y_offset, y_offset_adj; >> - int regsize_xmode, regsize_ymode; >> - bool rknown; >> + unsigned int nregs_xmode, nregs_ymode; >> >> gcc_assert (xregno < FIRST_PSEUDO_REGISTER); >> >> - rknown = false; >> + unsigned int xsize = GET_MODE_SIZE (xmode); >> + unsigned int ysize = GET_MODE_SIZE (ymode); >> + bool rknown = false; >> >> /* If there are holes in a non-scalar mode in registers, we expect >> - that it is made up of its units concatenated together. */ >> + that it is made up of its units concatenated together. Each scalar >> + unit occupies at least one register. */ > > Why using "scalar unit" while the first sentence uses "unit"? Are they > different units?
No, the same. I should probably have updated both. I added "scalar" to make it clear that we were talking about units in the GET_MODE_NUNITS sense rather than the UNITS_PER_WORD sense (i.e. number of scalar values rather than "unit" as an abstraction of "byte"). > What's the status of the new sentence? Assertion? > Expectation? Assertion. It was a necessary but not sufficient condition to satisfy the pre-patch: gcc_assert (hard_regno_nregs[xregno][xmode] == (hard_regno_nregs[xregno][xmode_unit] * GET_MODE_NUNITS (xmode))); > I'd also make the first sentence grammatical. Well, I think it's probably grammatical, but how about: If the register representation of a non-scalar mode has holes in it, we expect the scalar units to be concatenated together, with the holes distributed evenly among the scalar units. Each scalar unit must occupy at least one register. >> @@ -3651,18 +3646,17 @@ subreg_get_info (unsigned int xregno, machine_mode >> xmode, nregs_ymode = hard_regno_nregs[xregno][ymode]; >> >> /* Paradoxical subregs are otherwise valid. */ >> - if (!rknown >> - && offset == 0 >> - && GET_MODE_PRECISION (ymode) > GET_MODE_PRECISION (xmode)) >> + if (!rknown && offset == 0 && ysize > xsize) >> { >> info->representable_p = true; >> /* If this is a big endian paradoxical subreg, which uses more >> actual hard registers than the original register, we must >> return a negative offset so that we find the proper highpart >> of the register. */ >> - if (GET_MODE_SIZE (ymode) > UNITS_PER_WORD >> - ? REG_WORDS_BIG_ENDIAN : BYTES_BIG_ENDIAN) >> - info->offset = nregs_xmode - nregs_ymode; >> + if (REG_WORDS_BIG_ENDIAN != WORDS_BIG_ENDIAN && ysize > >> UNITS_PER_WORD + ? REG_WORDS_BIG_ENDIAN >> + : byte_lowpart_offset (ymode, xmode) != 0) >> + info->offset = (int) nregs_xmode - (int) nregs_ymode; >> else >> info->offset = 0; >> info->nregs = nregs_ymode; > > This part is not OK, it turns straightforward code into convoluted code. This actually was one of the more important changes :-) In combination with later patches, the idea is to move away from UNITS_PER_WORD tests when endianness is regular (all big or all little) and only do them when the distinction between bytes and words makes a real difference. The specific motivating examples were SVE predicate registers, which occupy VL*2 bytes for some runtime VL. They are smaller than a word when VL<4, word-sized when VL==4, and bigger than a word when VL>4. We therefore can't calculate: GET_MODE_SIZE (ymode) > UNITS_PER_WORD at compile time. This is one of the patches that avoids forcing the issue unless the answer really matters. >> @@ -3717,7 +3703,7 @@ subreg_get_info (unsigned int xregno, machine_mode >> xmode, info->representable_p = true; >> info->nregs = nregs_ymode; >> info->offset = offset / regsize_ymode; >> - gcc_assert (info->offset + info->nregs <= nregs_xmode); >> + gcc_assert (info->offset + nregs_ymode <= nregs_xmode); >> return; >> } >> } > > This part is not OK, the assertion is intended to be on INFO. This was a cheap way of avoiding having to add a cast. I'll add the cast instead. >> @@ -3736,47 +3722,39 @@ subreg_get_info (unsigned int xregno, machine_mode >> xmode, } >> } >> >> - /* This should always pass, otherwise we don't know how to verify >> - the constraint. These conditions may be relaxed but >> - subreg_regno_offset would need to be redesigned. */ >> - gcc_assert ((GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)) == 0); >> + /* Set NUM_BLOCKS to the number of independently-representable YMODE >> + values there are in (reg:XMODE XREGNO). We can view the register >> + as consisting of this number of independent "blocks", where each >> + block occupies NREGS_YMODE registers and contains exactly one >> + representable YMODE value. */ >> gcc_assert ((nregs_xmode % nregs_ymode) == 0); >> + unsigned int num_blocks = nregs_xmode / nregs_ymode; > > I find the "exactly" slightly confusing here, because it can make you think > that it contains the number of bytes of a YMODE value; a possible better > wording would be "a single" instead of "exactly one". OK. >> - if (WORDS_BIG_ENDIAN != REG_WORDS_BIG_ENDIAN >> - && GET_MODE_SIZE (xmode) > UNITS_PER_WORD) >> - { >> - HOST_WIDE_INT xsize = GET_MODE_SIZE (xmode); >> - HOST_WIDE_INT ysize = GET_MODE_SIZE (ymode); >> - HOST_WIDE_INT off_low = offset & (ysize - 1); >> - HOST_WIDE_INT off_high = offset & ~(ysize - 1); >> - offset = (xsize - ysize - off_high) | off_low; >> - } >> - /* The XMODE value can be seen as a vector of NREGS_XMODE >> - values. The subreg must represent a lowpart of given field. >> - Compute what field it is. */ >> - offset_adj = offset; >> - offset_adj -= subreg_lowpart_offset (ymode, >> - mode_for_size (GET_MODE_BITSIZE > (xmode) >> - / nregs_xmode, >> - MODE_INT, 0)); >> - >> - /* Size of ymode must not be greater than the size of xmode. */ >> - mode_multiple = GET_MODE_SIZE (xmode) / GET_MODE_SIZE (ymode); >> - gcc_assert (mode_multiple != 0); >> - >> - y_offset = offset / GET_MODE_SIZE (ymode); >> - y_offset_adj = offset_adj / GET_MODE_SIZE (ymode); >> - nregs_multiple = nregs_xmode / nregs_ymode; >> - >> - gcc_assert ((offset_adj % GET_MODE_SIZE (ymode)) == 0); >> - gcc_assert ((mode_multiple % nregs_multiple) == 0); >> + /* Calculate the number of bytes in each block. This must always >> + be exact, otherwise we don't know how to verify the constraint. >> + These conditions may be relaxed but subreg_regno_offset would >> + need to be redesigned. */ >> + gcc_assert ((xsize % num_blocks) == 0); >> + unsigned int bytes_per_block = xsize / num_blocks; >> + >> + /* Get the number of the first block that contains the subreg and the >> byte + offset of the subreg from the start of that block. */ >> + unsigned int block_number = offset / bytes_per_block; >> + unsigned int subblock_offset = offset % bytes_per_block; >> >> if (!rknown) >> { >> - info->representable_p = (!(y_offset_adj % (mode_multiple / >> nregs_multiple))); + /* Only the lowpart of each block is >> representable. */ >> + info->representable_p >> + = (subblock_offset >> + == subreg_size_lowpart_offset (ysize, bytes_per_block)); >> rknown = true; >> } >> - info->offset = (y_offset / (mode_multiple / nregs_multiple)) * >> nregs_ymode; + > > This part is OK. > >> + if (WORDS_BIG_ENDIAN != REG_WORDS_BIG_ENDIAN) >> + info->offset = (num_blocks - block_number - 1) * nregs_ymode; >> + else >> + info->offset = block_number * nregs_ymode; >> info->nregs = nregs_ymode; >> } > > Why was the test on the mode size dropped? For WORDS_BIG_ENDIAN != REG_WORDS_BIG_ENDIAN targets? In practice the old code didn't handle the case in which a single word spans more than one register: if xmode was bigger than a word, ymode was smaller than a word, and the number of registers in a ymode was smaller than the number of registers in a word, we would need to take "normal" endianness into account to resolve the subword register offset while using REG_WORDS_BIG_ENDIAN for the word component. Instead the old code reversed the endianness relative the size of ymode, regardless of whether ymode was bigger than a word or smaller than a word. In other words, the assumption seems to have been that REG_WORDS_BIG_ENDIAN is effectively "endianness across multiple registers" and there is no need to subdivide register offsets into words and subwords. In practice that was OK, since AFAICT no target with WORDS_BIG_ENDIAN != REG_WORDS_BIG_ENDIAN had subword-sized registers. This in turn means that "block endianness" is always word endianness for these targets. But I suppose we should have an assert, even though it was already an implicit assumption... :-) Thanks, Richard