On Tue, Sep 26, 2023 at 09:28:08AM +0000, Tamar Christina wrote:
> > -----Original Message-----
> > From: Gcc <gcc-bounces+tamar.christina=arm....@gcc.gnu.org> On Behalf
> > Of Paul Iannetta via Gcc
> > Sent: Tuesday, September 26, 2023 9:54 AM
> > To: Richard Biener <richard.guent...@gmail.com>
> > Cc: Sylvain Noiry <sno...@kalrayinc.com>; gcc@gcc.gnu.org;
> > sylvain.no...@hotmail.fr
> > Subject: Re: Complex numbers support: discussions summary
> >
> > On Tue, Sep 26, 2023 at 09:30:21AM +0200, Richard Biener via Gcc wrote:
> > > On Mon, Sep 25, 2023 at 5:17 PM Sylvain Noiry via Gcc <gcc@gcc.gnu.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > We had very interesting discussions during our presentation with
> > > > Paul on the support of complex numbers in gcc at the Cauldron.
> > > >
> > > > Thank you all for your participation !
> > > >
> > > > Here is a small summary from our viewpoint:
> > > >
> > > > - Replace CONCAT with a backend defined internal representation in RTL
> > > >   --> No particular problems
> > > >
> > > > - Allow backend to write patterns for operation on complex modes
> > > >   --> No particular problems
> > > >
> > > > - Conditional lowering depending on whether a pattern exists or not
> > > >   --> Concerns when the vectorization of split complex operations
> > > >       performs better than not vectorized unified complex operations
> > > >
> > > > - Centralize complex lowering in cplxlower
> > > >   --> No particular problems if it doesn't prevent IEEE compliance and
> > > >       optimizations (like const folding)
> > > >
> > > > - Vectorization of complex operations
> > > >   --> 2 representations (interleaved and separated real/imag): cannot
> > > >       impose one if some machines prefer the other
> > > >   --> Complex are composite modes, the vectorizer assumes that the
> > > >       inner mode is scalar to do some optimizations (which ones ?)
> > > >   --> Mixed split/unified complex operations cannot be vectorized easily
> > > >   --> Assuming that the inner representation of complex vectors is
> > > >       left to target backends, the vectorizer doesn't know it, which
> > > >       prevents some optimizations (which ones ?)
> > > >
> > > > - Explicit vectors of complex
> > > >   --> Cplxlower cannot lower it, and moving veclower before cplxlower
> > > >       is a bad idea as it prevents some optimizations
> > > >   --> Teaching cplxlower how to deal with vectors of complex seems to
> > > >       be a reasonable alternative
> > > >   --> Concerns about ABI or indexing if the internal representation is
> > > >       left to the backend and differs from the representation in memory
> > > >
> > > > - Impact of the current SLP pattern matching of complex operations
> > > >   --> Only with -ffast-math
> > > >   --> It can match user defined operations (not C99) that can be
> > > >       simplified with a complex instruction
> > > >   --> Dedicated opcode and real vector type chosen VS standard opcode
> > > >       and complex mode in our implementation
> > > >   --> Need to preserve SLP pattern matching as too many applications
> > > >       redefine complex and bypass C99 standard.
> > > >   --> So need to harmonize with our implementation
> > > >
> > > > - Support of the pure imaginary type (_Imaginary)
> > > >   --> Still not supported by gcc (and llvm), neither in our
> > > >       implementation
> > > >   --> Issues come from the fact that an imaginary is not a complex
> > > >       with real part set to 0
> > > >   --> The same issue with complex multiplication by a real (which is
> > > >       split in the frontend, and our implementation hasn't changed it
> > > >       yet)
> > > >   --> Idea: Add an attribute to the Tree complex type which specifies
> > > >       pure real / pure imaginary / full complex ?
> > > >
> > > > - Fast pattern for IEEE compliant emulated operations
> > > >   --> Not enough time to discuss about it
> > > >
> > > > Don't hesitate to add something or bring more precision if you want.
> > > >
> > > > As I said at the end of the presentation, we have written a paper
> > > > which explains our implementation in detail. You can find it on the
> > > > wiki page of the Cauldron
> > > > (https://gcc.gnu.org/wiki/cauldron2023talks?action=AttachFile&do=view&target=Exposing+Complex+Numbers+to+Target+Back-ends+%28paper%29.pdf).
> > >
> > > Thanks for the detailed presentation at the Cauldron.
> > >
> > > My personal summary is that I'm less convinced delaying lowering is
> > > the way to go.
> >
> > This is not only delayed lowering, if the SPN are there, there is no
> > lowering at all.
> >
> > > I do think that if targets implement complex optabs we should use them
> > > but eventually re-discovering complex operations from lowered form is
> > > going to be more useful.
> >
> > I would not be opposed to rediscovering complex operations but I think
> > that even though rediscovering a + b, a - b is easy, a * b would still
> > be doable, but even a / b will be hard.  Even though, I doubt we will
> > see a hardware complex division but who knows.  However, once lowered,
> > re-associating a * b * c and more complex expressions is going to be
> > hard.
> >
> > > That's because as you said, use of _Complex is limited and people
> > > inventing their own representation.
> >
> > Yes, this would be a step back at first, but, proper support for
> > _Complex would probably be an incentive for library writers to take
> > them into account.
> >
> > > SLP vectorization can discover some ops already with the limiting
> > > factor being that we don't specifically search for only complex
> > > operations (plus we expose the result as vector operations, requiring
> > > target support for the vector ops rather than [SD]Cmode operations).
> >
> > Our only concern with SLP is that it only works within loops.  If we
> > want to re-discover complex numbers we could either add a dedicated
> > pass before the SLP vectorizer or rely on match.pd?
>
> SLP doesn't work in just loops.  SLP works on scalar statements inside
> BBs starting from sink (constructors, stores, reductions etc).
> I think you're confusing Loop-Aware SLP and SLP (in GCC these are two
> different passes that share much common code).
>
Indeed, we conflated both.  Thanks for pointing this out!

Paul

> Tamar
>
> > >
> > > There's the gimple-isel.cc or the widen-mul pass that perform
> > > instruction selection which could be enhanced to discover scalar
> > > [SD]Cmode operations.
> >
> > We'll have another look there.
> >
> > Thanks,
> > Paul
> >
> > >
> > > Richard.
> > >
> > > > Sylvain
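
For readers following the thread, here is a minimal sketch in C of the two source-level forms the discussion keeps contrasting: a "unified" C99 _Complex multiplication, and the hand-written "split" multiplication that applications tend to define on their own struct type. The names my_cplx, my_cmul, c99_cmul and cmul_forms_agree are hypothetical and only serve the illustration; nothing here is taken from the patches or the paper, and it says nothing about which form a given GCC version will actually vectorize or pattern-match.

```c
#include <assert.h>   /* only needed for the check shown in the usage note */
#include <complex.h>

/* Hypothetical user-defined complex type, standing in for the kind of
   application-specific representation mentioned in the thread.  */
typedef struct { float re, im; } my_cplx;

/* "Split" form: the multiplication is already written out on the real
   and imaginary parts, so the compiler only sees scalar float ops.  */
static my_cplx my_cmul(my_cplx a, my_cplx b)
{
    my_cplx r;
    r.re = a.re * b.re - a.im * b.im;
    r.im = a.re * b.im + a.im * b.re;
    return r;
}

/* "Unified" form: a single C99 complex multiplication that the compiler
   is free to lower or to keep as one complex operation.  */
static float _Complex c99_cmul(float _Complex a, float _Complex b)
{
    return a * b;
}

/* Returns 1 when both forms give the same result for finite operands
   whose products are exactly representable in float.  */
static int cmul_forms_agree(float ar, float ai, float br, float bi)
{
    my_cplx s = my_cmul((my_cplx){ar, ai}, (my_cplx){br, bi});
    float _Complex u = c99_cmul(ar + ai * I, br + bi * I);
    return s.re == crealf(u) && s.im == cimagf(u);
}
```

As a usage example, assert(cmul_forms_agree(1.0f, 2.0f, 3.0f, 4.0f)) should hold, since (1 + 2i)(3 + 4i) = -5 + 10i is exactly representable. The two forms can differ for infinities and NaNs, where the C99 multiplication follows the Annex G rules unless options such as -ffast-math relax them.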