Re: [PATCH] rs6000, update vec_ld, vec_lde, vec_st and vec_ste, documentation

Carl Love Wed, 03 Jul 2024 10:44:52 -0700


On 7/3/24 2:36 AM, Kewen.Lin wrote:

Hi Carl,

on 2024/6/27 01:05, Carl Love wrote:

GCC maintainers:

The following patch updates the user documentation for the vec_ld, vec_lde, 
vec_st and vec_ste built-ins to make it clearer that there are data alignment 
requirements for these built-ins.  If the data alignment requirements are not 
followed, the data loaded or stored by these built-ins will be wrong.

Please let me know if this patch is acceptable for mainline.  Thanks.

                           Carl

----------------------------------------------------
rs6000, update vec_ld, vec_lde, vec_st and vec_ste documentation

Use of the vec_ld and vec_st built-ins require that the data be 16-byte
aligned to work properly.  Add some additional text to the existing
documentation to make this clearer to the user.

Similarly, the vec_lde and vec_ste built-ins also have data alignment
requirements based on the size of the vector element.  Update the
documentation to make this clear to the user.

gcc/ChangeLog:
        * doc/extend.texi: Add clarification for the use of the vec_ld
        vec_st, vec_lde and vec_ste built-ins.
---
  gcc/doc/extend.texi | 15 +++++++++++----
  1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index ee3644a5264..55faded17b9 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -22644,10 +22644,17 @@ vector unsigned char vec_xxsldi (vector unsigned char,
  @end smallexample

Note that the @samp{vec_ld} and @samp{vec_st} built-in functions always

-generate the AltiVec @samp{LVX} and @samp{STVX} instructions even
-if the VSX instruction set is available.  The @samp{vec_vsx_ld} and
-@samp{vec_vsx_st} built-in functions always generate the VSX @samp{LXVD2X},
-@samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
+generate the AltiVec @samp{LVX}, and @samp{STVX} instructions.  The

This change removed "even if the VSX instruction set is available.", I think 
it's
not intentional?  vec_ld and vec_st are well defined in PVIPR, this paragraph is
not to document them IMHO.  Since we document vec_vsx_ld and vec_vsx_st here, it
aims to note the difference between these two pairs.  But I'm not opposed to add
more words to emphasis the special masking off, I prefer to use the same words 
to
PVIPR "ignoring the four low-order bits of the calculated address".  And IMHO we
should not say "it requires the data to be 16-byte aligned to work properly" in
case the users are aware of this behavior well and have some no 16-byte aligned
data and expect it to behave like that, it's arguable to define "it" as not work
properly.

Yea, probably should have left "even if the VSX instruction set isavailable."

I was looking to make it clear that if the data is not 16-bye alignedyou may not get the expected data loaded/stored.


So how about the following instead:

   Note that the @samp{vec_ld} and @samp{vec_st} built-in functions always
   generate the AltiVec @samp{LVX}, and @samp{STVX} instructions even
   if the VSX
   instruction set is available. The instructions mask off the lower
   4-bits of
   the calculated address. The use of these instructions on data that
   is not
   16-byte aligned may result in unexpected bytes being loaded or stored.

+instructions mask off the lower 4 bits of the effective address thus requiring
+the data to be 16-byte aligned to work properly.  The @samp{vec_lde} and
+@samp{vec_ste} built-in functions operate on vectors of bytes, short integer,
+integer, and float.  The corresponding AltiVec instructions @samp{LVEBX},
+@samp{LVEHX}, @samp{LVEWX}, @samp{STVEBX}, @samp{STVEHX}, @samp{STVEWX} mask
+off the lower bits of the effective address based on the size of the data.
+Thus the data must be aligned to the size of the vector element to work
+properly.  The @samp{vec_vsx_ld} and @samp{vec_vsx_st} built-in functions
+always generate the VSX @samp{LXVD2X}, @samp{LXVW4X}, @samp{STXVD2X}, and
+@samp{STXVW4X} instructions.

As above, there was a reason to mention vec_ld and vec_st here, but not one for
vec_lde and vec_ste IMHO, so let's not mention vec_lde and vec_ste here and 
users
should read the description in PVIPR instead (it's more recommended).

The goal of mentioning the vec_lde and vec_ste built-ins was to give theuser a pointer to built-ins that will work as expected on unaligneddata. It will probably save them a lot of time an frustration if theyare given a hint of what built-ins they should look at. So, how aboutthe following:


   See the PVIPR description of the vec_lde and vec_ste for loading and
   storing
   data that is not 16-byte aligned.

                   Carl

Re: [PATCH] rs6000, update vec_ld, vec_lde, vec_st and vec_ste, documentation

Reply via email to