On Tue, Sep 26, 2023 at 09:30:21AM +0200, Richard Biener via Gcc wrote:
> On Mon, Sep 25, 2023 at 5:17 PM Sylvain Noiry via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > Hi,
> >
> > We had very interesting discussions during our presentation with Paul on
> > the support of complex numbers in gcc at the Cauldron.
> >
> > Thank you all for your participation!
> >
> > Here is a small summary from our viewpoint:
> >
> > - Replace CONCAT with a backend defined internal representation in RTL
> >   --> No particular problems
> >
> > - Allow backend to write patterns for operation on complex modes
> >   --> No particular problems
> >
> > - Conditional lowering depending on whether a pattern exists or not
> >   --> Concerns when the vectorization of split complex operations
> >       performs better than not vectorized unified complex operations
> >
> > - Centralize complex lowering in cplxlower
> >   --> No particular problems if it doesn't prevent IEEE compliance and
> >       optimizations (like const folding)
> >
> > - Vectorization of complex operations
> >   --> 2 representations (interleaved and separated real/imag): cannot
> >       impose one if some machines prefer the other
> >   --> Complex are composite modes, the vectorizer assumes that the inner
> >       mode is scalar to do some optimizations (which ones?)
> >   --> Mixed split/unified complex operations cannot be vectorized easily
> >   --> Assuming that the inner representation of complex vectors is left
> >       to target backends, the vectorizer doesn't know it, which prevents
> >       some optimizations (which ones?)
> >
> > - Explicit vectors of complex
> >   --> Cplxlower cannot lower it, and moving veclower before cplxlower is
> >       a bad idea as it prevents some optimizations
> >   --> Teaching cplxlower how to deal with vectors of complex seems to be
> >       a reasonable alternative
> >   --> Concerns about ABI or indexing if the internal representation is
> >       left to the backend and differs from the representation in memory
> >
> > - Impact of the current SLP pattern matching of complex operations
> >   --> Only with -ffast-math
> >   --> It can match user defined operations (not C99) that can be
> >       simplified with a complex instruction
> >   --> Dedicated opcode and real vector type chosen VS standard opcode
> >       and complex mode in our implementation
> >   --> Need to preserve SLP pattern matching as too many applications
> >       redefine complex and bypass the C99 standard.
> >   --> So need to harmonize with our implementation
> >
> > - Support of the pure imaginary type (_Imaginary)
> >   --> Still not supported by gcc (and llvm), neither in our
> >       implementation
> >   --> Issues come from the fact that an imaginary is not a complex with
> >       real part set to 0
> >   --> The same issue with complex multiplication by a real (which is
> >       split in the frontend, and our implementation hasn't changed it
> >       yet)
> >   --> Idea: Add an attribute to the Tree complex type which specifies
> >       pure real / pure imaginary / full complex?
> >
> > - Fast pattern for IEEE compliant emulated operations
> >   --> Not enough time to discuss it
> >
> > Don't hesitate to add something or bring more precision if you want.
> >
> > As I said at the end of the presentation, we have written a paper which
> > explains our implementation in detail. You can find it on the wiki page
> > of the Cauldron
> > (https://gcc.gnu.org/wiki/cauldron2023talks?action=AttachFile&do=view&target=Exposing+Complex+Numbers+to+Target+Back-ends+%28paper%29.pdf).
>
> Thanks for the detailed presentation at the Cauldron.
>
> My personal summary is that I'm less convinced delaying lowering is
> the way to go.

This is not only delayed lowering: if the SPN are there, there is no
lowering at all.

> I do think that if targets implement complex optabs we should use them but
> eventually re-discovering complex operations from lowered form is going to be
> more useful.

I would not be opposed to rediscovering complex operations, but I think
that, although rediscovering a + b and a - b is easy and a * b would still
be doable, a / b will be hard. Then again, I doubt we will see a hardware
complex division, but who knows.

However, once lowered, re-associating a * b * c and more complex
expressions is going to be hard.

> That's because as you said, use of _Complex is limited and people
> inventing their own representation.

Yes, this would be a step back at first, but proper support for _Complex
would probably be an incentive for library writers to take them into
account.

> SLP vectorization can discover some ops
> already with the limiting factor being that we don't specifically search for
> only complex operations (plus we expose the result as vector operations,
> requiring target support for the vector ops rather than [SD]Cmode operations).

Our only concern with SLP is that it only works within loops. If we want to
re-discover complex numbers, we could either add a dedicated pass before
the SLP vectorizer or rely on match.pd?

>
> There's the gimple-isel.cc or the widen-mul pass that perform
> instruction selection
> which could be enhanced to discover scalar [SD]Cmode operations.

We'll have another look there.

Thanks,

Paul

>
> Richard.
>
> > Sylvain
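
To make the re-discovery point above concrete, here is a minimal sketch in C. It is not taken from the thread or from the paper. The names `my_cplx`, `my_mul`, `my_mul3`, `c99_mul` and `self_check` are hypothetical, and the hand-written multiply is assumed to be the plain four-multiply form, without the NaN/infinity recovery that a C99 Annex G compliant multiply may need.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical user-defined complex type, as found in code that
   bypasses C99 _Complex. */
struct my_cplx { double re, im; };

/* "Lowered" multiply: four real multiplies, one subtract, one add.
   A re-discovery pass would have to match this whole shape to
   recover a single complex multiply. */
struct my_cplx my_mul(struct my_cplx a, struct my_cplx b)
{
    struct my_cplx r;
    r.re = a.re * b.re - a.im * b.im;
    r.im = a.re * b.im + a.im * b.re;
    return r;
}

/* Chained product a * b * c in lowered form: after inlining, the
   scalar operations of both multiplies are mixed together, which is
   where re-association becomes harder to see. */
struct my_cplx my_mul3(struct my_cplx a, struct my_cplx b, struct my_cplx c)
{
    return my_mul(my_mul(a, b), c);
}

/* "Unified" multiply: one operation on a complex value. */
double complex c99_mul(double complex a, double complex b)
{
    return a * b;
}

/* Both forms agree on a finite input: (1 + 2i) * (3 + 4i) = -5 + 10i. */
void self_check(void)
{
    struct my_cplx a = { 1.0, 2.0 }, b = { 3.0, 4.0 };
    struct my_cplx r = my_mul(a, b);
    double complex z = c99_mul(1.0 + 2.0 * I, 3.0 + 4.0 * I);

    assert(r.re == creal(z));
    assert(r.im == cimag(z));
}
```

In `c99_mul` the multiply stays a single operation on a complex value, which is the form a backend pattern could consume directly. In `my_mul` and `my_mul3` the same computation is only visible as a group of scalar operations.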