pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.

LGTM.



================
Comment at: clang/lib/Headers/avx2intrin.h:3474
+///   IF __M[j+31] == 1
+///     result[j+31:j] := Load32(__X+(i*4))
+///   ELSE
----------------
probinson wrote:
> pengfei wrote:
> > probinson wrote:
> > > pengfei wrote:
> > > > probinson wrote:
> > > > > pengfei wrote:
> > > > > > A more intrinsic guide format is `MEM[__X+j:j]`
> > > > > LoadXX is the syntax in the gather intrinsics, e.g. 
> > > > > _mm_mask_i32gather_pd. I'd prefer to be consistent.
> > > > I think the problem here is the measurement is easily confusing.
> > > > From C point of view, `__X` is a `int` pointer, so we should `+ i` 
> > > > rather than `i * 4`
> > > > From the other part of the code, we are measuring in bits, but here `i 
> > > > * 4` is a byte offset.
> > > Well, the pseudo-code is clearly not C. If you look at the gather code, 
> > > it computes a byte address using an offset multiplied by an explicit 
> > > scale factor. I am doing exactly the same here.
> > > 
> > > The syntax `MEM[__X+j:j]` is mixing a byte address with a bit offset, 
> > > which I think is more confusing. To be fully consistent, using `[]` with 
> > > bit offsets only, it should be
> > > ```
> > > k := __X*8 + i*32
> > > result[j+31:j] := MEM[k+31:k]
> > > ```
> > > which I think obscures more than it explains.
> > Yeah, it's not C code here. But we are easy to fall into C concepts, e.g., 
> > why assuming __X is measuring in bytes?
> > That's why I think it's clear to make both in bits.
> > I made a mistake here, I wanted to propose `MEM[__X+j+31: __X+j]`. It 
> > matches with [[ 
> > https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=4057,4058,4059,4053,665,3890,5959,5910,3870,4280&text=_mm256_maskload_epi32
> >  | Intrinsic Guide ]].
> > 
> We assume `__X` is in bytes because that's how addresses work on X86. Adding 
> a bit offset to a byte address makes no sense. I see that is how existing 
> Intel documentation works, which does not make it correct.
> 
> To "make both in bits" means multiplying `__X` by 8, as in the example in my 
> previous comment. Or coming up with a different syntax that makes the 
> difference clear.
> `MEM(__X)[j+31:j]` or even `MEM[__X][j+31:j]` would be preferable.
My intention is to match with Intrinsic Guide as possible. Multiplying by 8 
cannot achive it, but I cannot deny `__X` in bytes makes sense.
So I'm fine to use a different syntax.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to