On Tue, Sep 26, 2023 at 09:30:21AM +0200, Richard Biener via Gcc wrote:
> On Mon, Sep 25, 2023 at 5:17 PM Sylvain Noiry via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > Hi,
> >
> > We had very interesting discussions during our presentation with Paul on the
> > support of complex numbers in gcc at the Cauldron.
> >
> > Thank you all for your participation!
> >
> > Here is a small summary from our viewpoint:
> >
> > - Replace CONCAT with a backend-defined internal representation in RTL
> > --> No particular problems
> >
> > - Allow backends to write patterns for operations on complex modes
> > --> No particular problems
> >
> > - Conditional lowering depending on whether a pattern exists or not
> > --> Concerns when vectorized split complex operations perform better
> >     than non-vectorized unified complex operations
> >
> > - Centralize complex lowering in cplxlower
> > --> No particular problems if it doesn't prevent IEEE compliance and
> >     optimizations (like const folding)
> >
> > - Vectorization of complex operations
> > --> 2 representations (interleaved and separated real/imag): cannot impose
> >     one if some machines prefer the other
> > --> Complex modes are composite modes; the vectorizer assumes that the inner
> >     mode is scalar to do some optimizations (which ones?)
> > --> Mixed split/unified complex operations cannot be vectorized easily
> > --> Assuming that the inner representation of complex vectors is left to
> >     target backends, the vectorizer doesn't know it, which prevents some
> >     optimizations (which ones?)
> >
> > - Explicit vectors of complex
> > --> Cplxlower cannot lower them, and moving veclower before cplxlower is a
> >     bad idea as it prevents some optimizations
> > --> Teaching cplxlower how to deal with vectors of complex seems to be a
> >     reasonable alternative
> > --> Concerns about ABI or indexing if the internal representation is left to
> >     the backend and differs from the representation in memory
> >
> > - Impact of the current SLP pattern matching of complex operations
> > --> Only with -ffast-math
> > --> It can match user-defined operations (not C99) that can be simplified
> >     with a complex instruction
> > --> Dedicated opcode and real vector type chosen VS standard opcode and
> >     complex mode in our implementation
> > --> Need to preserve SLP pattern matching as too many applications redefine
> >     complex and bypass the C99 standard.
> > --> So need to harmonize with our implementation
> >
> > - Support of the pure imaginary type (_Imaginary)
> > --> Still not supported by gcc (and llvm), nor in our implementation
> > --> Issues come from the fact that an imaginary is not a complex with the
> >     real part set to 0
> > --> The same issue arises with complex multiplication by a real (which is
> >     split in the frontend, and our implementation hasn't changed it yet)
> > --> Idea: Add an attribute to the Tree complex type which specifies pure
> >     real / pure imaginary / full complex?
> >
> > - Fast pattern for IEEE-compliant emulated operations
> > --> Not enough time to discuss it
> >
> > Don't hesitate to add something or to clarify any point if you want.
> >
> > As I said at the end of the presentation, we have written a paper which
> > explains our implementation in detail. You can find it on the wiki page of
> > the Cauldron
> > (https://gcc.gnu.org/wiki/cauldron2023talks?action=AttachFile&do=view&target=Exposing+Complex+Numbers+to+Target+Back-ends+%28paper%29.pdf).
> 
> Thanks for the detailed presentation at the Cauldron.
> 
> My personal summary is that I'm less convinced delaying lowering is
> the way to go.

This is not only delayed lowering: if the SPNs (standard pattern names)
are there, there is no lowering at all.

> I do think that if targets implement complex optabs we should use them but
> eventually re-discovering complex operations from lowered form is going to be
> more useful.

I would not be opposed to rediscovering complex operations, but while
rediscovering a + b and a - b is easy and a * b would still be doable,
a / b will be hard.  That said, I doubt we will see a hardware complex
division, but who knows.  However, once lowered, re-associating
a * b * c and more complex expressions is going to be hard.
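
To make the difference in shape concrete, here is a minimal sketch in C.
All names in it (cplx_pair, mul_unified, mul_lowered, div_lowered,
self_check) are hypothetical and not taken from GCC; the lowered forms
are the textbook formulas and deliberately ignore the NaN/infinity
handling of C99 Annex G, so they only approximate what a compiler would
emit, roughly in the spirit of -ffast-math.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical user-defined complex type, as many applications declare. */
typedef struct { double re, im; } cplx_pair;

/* Unified form: a single multiply on a _Complex operand. */
static double _Complex mul_unified(double _Complex a, double _Complex b)
{
    return a * b;
}

/* Lowered multiply: four scalar multiplies, one subtract, one add.
   The Annex G NaN/infinity recovery is left out on purpose. */
static cplx_pair mul_lowered(cplx_pair a, cplx_pair b)
{
    cplx_pair r = { a.re * b.re - a.im * b.im,
                    a.re * b.im + a.im * b.re };
    return r;
}

/* Lowered textbook division: already a larger expression tree, and a
   careful implementation adds scaling to avoid overflow on top of it. */
static cplx_pair div_lowered(cplx_pair a, cplx_pair b)
{
    double d = b.re * b.re + b.im * b.im;
    cplx_pair r = { (a.re * b.re + a.im * b.im) / d,
                    (a.im * b.re - a.re * b.im) / d };
    return r;
}

/* Checks that both multiply forms agree on a small, exactly
   representable input.  __real__/__imag__ are GNU extensions. */
static void self_check(void)
{
    double _Complex u = mul_unified(1.0 + 2.0 * I, 3.0 + 4.0 * I);
    cplx_pair a = { 1.0, 2.0 }, b = { 3.0, 4.0 };
    cplx_pair l = mul_lowered(a, b);
    assert(__real__ u == l.re && __imag__ u == l.im);
}
```

A pass that rediscovers complex operations would have to recognise the
bodies of mul_lowered and div_lowered as one operation each; a chain
such as a * b * c multiplies the number of scalar operations to match.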

> That's because, as you said, use of _Complex is limited and people are
> inventing their own representation.

Yes, this would be a step back at first, but proper support for
_Complex would probably be an incentive for library writers to take
it into account.

> SLP vectorization can discover some ops
> already with the limiting factor being that we don't specifically search for
> only complex operations (plus we expose the result as vector operations,
> requiring target support for the vector ops rather than [SD]Cmode operations).

Our only concern with SLP is that it only works within loops.  If we
want to re-discover complex numbers we could either add a
dedicated pass before the SLP vectorizer or rely on match.pd?

> 
> There's the gimple-isel.cc or the widen-mul pass that perform instruction
> selection which could be enhanced to discover scalar [SD]Cmode operations.

We'll have another look there.

Thanks,
Paul
> 
> Richard.
> 
> > Sylvain