https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88284
--- Comment #8 from Michael_S ---
(In reply to sandra from comment #7)
> While Intel has revived the "Altera" name, the Nios II processor is still
> listed as discontinued. I see they are offering ARM-based FPGA products
> again instead.
>
Arm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88284
--- Comment #4 from Michael_S ---
Deprecation of Nios2 was pushed by Intel, which appears to have a love affair
with RISC-V. But now Altera is spun off, and Intel is no longer involved in
the technical side of their business.
So, maybe, before purging all
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105617
--- Comment #21 from Michael_S ---
(In reply to Mason from comment #20)
> Doh! You're right.
> I come from a background where overlapping/aliasing inputs are heresy,
> thus got blindsided :(
>
> This would be the optimal code, right?
>
> add4i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105617
--- Comment #19 from Michael_S ---
(In reply to Mason from comment #18)
> Hello Michael_S,
>
> As far as I can see, massaging the source helps GCC generate optimal code
> (in terms of instruction count, not convinced about scheduling).
>
> #in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279
--- Comment #24 from Michael_S ---
(In reply to Michael_S from comment #22)
> (In reply to Michael_S from comment #8)
> > (In reply to Thomas Koenig from comment #6)
> > > And there will have to be a decision about 32-bit targets.
> > >
> >
> >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279
--- Comment #23 from Michael_S ---
(In reply to Jakub Jelinek from comment #19)
> So, if stmxcsr/vstmxcsr is too slow, perhaps we should change x86
> sfp-machine.h
> #define FP_INIT_ROUNDMODE \
> do {
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279
--- Comment #22 from Michael_S ---
(In reply to Michael_S from comment #8)
> (In reply to Thomas Koenig from comment #6)
> > And there will have to be a decision about 32-bit targets.
> >
>
> IMHO, 32-bit targets should be left in their current
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279
--- Comment #16 from Michael_S ---
(In reply to Jakub Jelinek from comment #15)
> libquadmath is not needed nor useful on aarch64-linux, because long double
> type there is already IEEE 754 quad.
That's good to know. Thank you.
If you are here
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279
--- Comment #12 from Michael_S ---
(In reply to Thomas Koenig from comment #10)
> What we would need for incorporation into gcc is to have several
> functions, which would then called depending on which floating point
> options are in force at t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279
--- Comment #11 from Michael_S ---
(In reply to Thomas Koenig from comment #9)
> Created attachment 54273 [details]
> matmul_r16.i
>
> Here is matmul_r16.i from a relatively recent trunk.
Thank you.
Unfortunately, I was not able to link it wit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279
--- Comment #8 from Michael_S ---
(In reply to Thomas Koenig from comment #6)
> (In reply to Michael_S from comment #5)
> > Hi Thomas
> > Are you in or out?
>
> Depends a bit on what exactly you want to do, and if there is
> a chance that what
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279
--- Comment #7 from Michael_S ---
Either here or my yahoo e-mail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279
--- Comment #5 from Michael_S ---
Hi Thomas
Are you in or out?
If you are still in, I can use your help on several issues.
1. Torture.
See if the Invalid Operand exception is raised properly now, and whether there
are still remaining problems with NaN.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279
--- Comment #4 from Michael_S ---
(In reply to Jakub Jelinek from comment #2)
> From what I can see, they are certainly not portable.
> E.g. the relying on __int128 rules out various arches (basically all 32-bit
> arches,
> ia32, powerpc 32-bit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832
--- Comment #22 from Michael_S ---
(In reply to Alexander Monakov from comment #21)
> (In reply to Michael_S from comment #19)
> > > Also note that 'vfnmadd231pd 32(%rdx,%rax), %ymm3, %ymm0' would be
> > > 'unlaminated' (turned to 2 uops before r
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832
--- Comment #20 from Michael_S ---
(In reply to Richard Biener from comment #17)
> (In reply to Michael_S from comment #16)
> > On unrelated note, why loop overhead uses so many instructions?
> > Assuming that I am as misguided as gcc about load-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832
--- Comment #19 from Michael_S ---
(In reply to Alexander Monakov from comment #18)
> The apparent 'bias' is introduced by instruction scheduling: haifa-sched
> lifts a +64 increment over memory accesses, transforming +0 and +32
> displacements t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832
--- Comment #16 from Michael_S ---
On an unrelated note, why does the loop overhead use so many instructions?
Assuming that I am as misguided as gcc about load-op combining, I would write
it as:
sub %rax, %rdx
.L3:
vmovupd (%rdx,%rax), %ymm1
vmovupd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832
--- Comment #14 from Michael_S ---
I tested a smaller test bench from Comment 3 with gcc trunk on godbolt.
The issue appears to be only partially fixed.
The -Ofast result is no longer the horror that it was before, but it is still not as
good as -O3 or -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105617
--- Comment #15 from Michael_S ---
(In reply to Richard Biener from comment #14)
> (In reply to Michael_S from comment #12)
> > On related note...
> > One of the historical good features of gcc relatively to other popular
> > compilers was absen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106220
--- Comment #3 from Michael_S ---
-march=haswell is not very important.
I added it only because, in the absence of the BMI extension, the issue is somewhat
obscured by the need to keep the shift count in the CL register.
-O2 is also not important. -O3 is the same. An
: 12.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
I am reporting about right shift issue, but left shift has
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101
--- Comment #23 from Michael_S ---
(In reply to jos...@codesourcery.com from comment #22)
> On Mon, 13 Jun 2022, already5chosen at yahoo dot com via Gcc-bugs wrote:
>
> > > The function should be sqrtf128 (present in glibc 2.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101
--- Comment #21 from Michael_S ---
(In reply to jos...@codesourcery.com from comment #20)
> On Sat, 11 Jun 2022, already5chosen at yahoo dot com via Gcc-bugs wrote:
>
> > On MSYS2 _Float128 and __float128 appears to be mostly th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101
--- Comment #19 from Michael_S ---
(In reply to jos...@codesourcery.com from comment #18)
> libquadmath is essentially legacy code. People working directly in C
> should be using the C23 _Float128 interfaces and *f128 functions, as in
> curre
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101
--- Comment #17 from Michael_S ---
(In reply to Jakub Jelinek from comment #15)
> From what I can see, it is mostly integral implementation and we already
> have one such in GCC, so the question is if we just shouldn't use it (most
> of the sou
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101
--- Comment #16 from Michael_S ---
(In reply to Thomas Koenig from comment #14)
> @Michael: Now that gcc 12 is out of the door, I would suggest we try to get
> your code into the gcc tree for gcc 13.
>
> It should follow the gcc style guideline
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105617
--- Comment #12 from Michael_S ---
On a related note...
Historically, one of the good features of gcc relative to other popular
compilers was the absence of auto-vectorization at -O2.
When did you decide to change it and why?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105617
--- Comment #11 from Michael_S ---
(In reply to Richard Biener from comment #10)
> (In reply to Hongtao.liu from comment #9)
> > (In reply to Hongtao.liu from comment #8)
> > > (In reply to Hongtao.liu from comment #7)
> > > > Hmm, we have speci
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105617
--- Comment #6 from Michael_S ---
(In reply to Michael_S from comment #5)
>
> Even scalar-to-scalar or vector-to-vector moves that are hoisted at renamer
> does not have a zero cost, because quite often renamer itself constitutes
> the narrowes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105617
--- Comment #5 from Michael_S ---
(In reply to Richard Biener from comment #3)
> We are vectorizing the store it dst[] now at -O2 since that appears
> profitable:
>
> t.c:10:10: note: Cost model analysis:
> r0.0_12 1 times scalar_store costs 12
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105617
--- Comment #4 from Michael_S ---
(In reply to Andrew Pinski from comment #1)
> This is just the vectorizer still being too aggressive right before a return.
> It is a cost model issue and it might not really be an issue in the final
> code just
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
It took many years until gcc caught up with MSVC and LLVM/clang in generation
of code for chains of Intel's _addcarry_u64() intrinsic calls. But your
fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105468
--- Comment #4 from Michael_S ---
Created attachment 52925
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52925&action=edit
build script
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105468
--- Comment #3 from Michael_S ---
Created attachment 52924
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52924&action=edit
Another test bench that shows lower impact on Zen3, but higher impact on some
Intel CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105468
--- Comment #2 from Michael_S ---
Created attachment 52923
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52923&action=edit
test bench that shows lower impact on Zen3, but higher impact on some Intel
CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105468
--- Comment #1 from Michael_S ---
Created attachment 52922
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52922&action=edit
test bench that demonstrates maximal impact on Zen3
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
Created attachment 52921
--> ht
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101
--- Comment #13 from Michael_S ---
It turned out that on all micro-architectures that I care about (and the majority
of those that I don't care about) double precision floating point division is quite
fast.
It's so fast that it easily beats my clever reci
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101
--- Comment #12 from Michael_S ---
(In reply to Michael_S from comment #11)
> (In reply to Michael_S from comment #10)
> > BTW, the same ideas as in the code above could improve speed of division
> > operation (on modern 64-bit HW) by factor of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101
--- Comment #11 from Michael_S ---
(In reply to Michael_S from comment #10)
> BTW, the same ideas as in the code above could improve speed of division
> operation (on modern 64-bit HW) by factor of 3 (on Intel) or 2 (on AMD).
Did it.
On Intel i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101
--- Comment #10 from Michael_S ---
BTW, the same ideas as in the code above could improve speed of division
operation (on modern 64-bit HW) by factor of 3 (on Intel) or 2 (on AMD).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101
--- Comment #9 from Michael_S ---
(In reply to Michael_S from comment #4)
> If you want quick fix for immediate shipment then you can take that:
>
> #include
> #include
>
> __float128 quick_and_dirty_sqrtq(__float128 x)
> {
> if (isnanq(x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101
Michael_S changed:
What|Removed |Added
CC||already5chosen at yahoo dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832
--- Comment #10 from Michael_S ---
I lost track of what you're talking about a long time ago.
But that's o.k.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832
--- Comment #3 from Michael_S ---
(In reply to Richard Biener from comment #2)
> It's again reassociation making a mess out of the natural SLP opportunity
> (and thus SLP discovery fails miserably).
>
> One idea worth playing with would be to ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79173
--- Comment #9 from Michael_S ---
Despite what I wrote above, I did take a look at the trunk on godbolt with the same
old code from a year ago, because it was so easy. And indeed the trunk looks a lot
better.
But until it's released I wouldn't know if i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79173
--- Comment #8 from Michael_S ---
(In reply to Jakub Jelinek from comment #7)
> (In reply to Michael_S from comment #5)
> > I agree with regard to "other targets", first of all, aarch64, but x86_64
> > variant of gcc already provides requested fu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79173
--- Comment #6 from Michael_S ---
(In reply to Marc Glisse from comment #1)
> We could start with the simpler:
>
> void f(unsigned*__restrict__ r,unsigned*__restrict__ s,unsigned a,unsigned
> b,unsigned c, unsigned d){
> *r=a+b;
> *s=c+d+(*r
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79173
Michael_S changed:
What|Removed |Added
CC||already5chosen at yahoo dot com
--- Comment
: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
I am reporting under 'target' because AVX2+FMA is the only 256-bit SIMD
platform I have to play with. If it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97428
--- Comment #9 from Michael_S ---
Hopefully, you did regression tests for all main AoS<->SoA cases.
I.e.
typedef struct { double re, im; } dcmlx_t;
void soa2aos(double* restrict dstRe, double* restrict dstIm, const dcmlx_t
src[], int nq)
{
for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97428
--- Comment #6 from Michael_S ---
(In reply to Richard Biener from comment #4)
>
> while the lack of cross-lane shuffles in AVX2 requires a
>
> .L3:
> vmovupd (%rsi,%rax), %xmm5
> vmovupd 32(%rsi,%rax), %xmm6
> vinsertf1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97428
--- Comment #5 from Michael_S ---
(In reply to Richard Biener from comment #4)
> I have a fix that, with -mavx512f generates just
>
> .L3:
> vmovupd (%rcx,%rax), %zmm0
> vpermpd (%rsi,%rax), %zmm1, %zmm2
> vpermpd %zmm0,
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
This is my next example of bad handling of the AoSoA layout by the gcc
optimizer/vectorizer.
For discussion of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97343
--- Comment #2 from Michael_S ---
(In reply to Richard Biener from comment #1)
> All below for Part 2.
>
> Without -ffast-math you are seeing GCC using in-order reductions now while
> with -ffast-math the vectorizer gets a bit confused about rea
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
Let's continue our complex dot product series started here
https://gcc.gnu.org/bugzilla/show_bug.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97127
--- Comment #15 from Michael_S ---
(In reply to Hongtao.liu from comment #14)
> > Still I don't understand why compiler does not compare the cost of full loop
> > body after combining to the cost before combining and does not come to
> > conclusi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97127
--- Comment #13 from Michael_S ---
(In reply to Hongtao.liu from comment #11)
> (In reply to Michael_S from comment #10)
> > (In reply to Hongtao.liu from comment #9)
> > > (In reply to Michael_S from comment #8)
> > > > What are values of gcc "l
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97127
--- Comment #10 from Michael_S ---
(In reply to Hongtao.liu from comment #9)
> (In reply to Michael_S from comment #8)
> > What are values of gcc "loop" cost of the relevant instructions now?
> > 1. AVX256 Load
> > 2. FMA3 ymm,ymm,ymm
> > 3. AVX2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97127
--- Comment #8 from Michael_S ---
What are the values of the gcc "loop" cost for the relevant instructions now?
1. AVX256 Load
2. FMA3 ymm,ymm,ymm
3. AVX256 Regmove
4. FMA3 mem,ymm,ymm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97127
--- Comment #6 from Michael_S ---
Why do you see it as the addition of a peephole pattern?
I see it as removal. Like, "do what's written in the source and don't try to be
tricky".
Probably, I am too removed from how compilers work :(
Or, maybe, handl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97127
--- Comment #3 from Michael_S ---
(In reply to Alexander Monakov from comment #2)
> Richard, though register moves are resolved by renaming, they still occupy a
> uop in all stages except execution, and since renaming is one of the
> narrowest po
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
The following clever gcc transformation leads to generation of slower code than
the non-transformed original:
a = *mem;
a = a + b * c;
where both b and c are
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96854
--- Comment #15 from Michael_S ---
Thank you.
That does not sound too different from what I assumed in the post above.
10.1.0 is a release, expected to be used by "normal" people.
10.1.1 was for the purpose of developing 10.2.0. Since release of 10.2.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96854
--- Comment #13 from Michael_S ---
I don't follow gcc versioning policy all that closely.
What is the function of "micro" versions now? For internal use and experimentation
only, but not for public release?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96854
--- Comment #11 from Michael_S ---
Just to understand
Will 10.1 and 10.2 be fixed?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96854
--- Comment #4 from Michael_S ---
Note that it's not just AVX.
'-mavx2 -mfma -Ofast' generates different code, but in the end gives the same
wrong result.
Unfortunately, I have no AVX512 hardware to test, but wouldn't be surprised if
it'
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
'-Ofast -mavx -march=ivybridge' miscompiles this simple loop:
double complex foo(double complex acc, const double complex *x, const double
complex* y, int N)
{
: target
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
Created attachment 45131
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45131&action=edit
demonstration of bad scheduling
Compiler generates bad schedul
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86965
--- Comment #3 from Michael_S ---
(In reply to sandra from comment #1)
> I'm not sure what command-line options you were using, but with -O2 the bad2
> case now generates the expected code.
>
With 8.2.0 the problem exists both with -O2 and with
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80283
--- Comment #25 from Michael_S ---
Just a reminder 16 months later:
x86-64 case - both 8.2 and trunk are as bad as they were.
ARM-Neon case - 8.2 appears to be worse (by 5%) than either 6.x or 7.x. I
didn't check trunk.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87047
--- Comment #11 from Michael_S ---
Sorry for intervening, but IMHO a new __builtin is long overdue.
__builtin
(In reply to Jakub Jelinek from comment #9)
> (In reply to Alexander Monakov from comment #8)
> > Well, original_costs is already initia
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87031
--- Comment #7 from Michael_S ---
Done. The new report is 87079.
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
Created attachment 44586
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44586&action=edit
5.3->8.3 regressio
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87031
--- Comment #5 from Michael_S ---
It's fine that you moved the 2nd case to 'tree-optimization'. I suppose that's
where it belongs.
But I just saw the second case by chance in the process of reduction of the
first case to bare minimum. For me it (
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87031
--- Comment #4 from Michael_S ---
It's fine that you moved the 2nd case to 'tree-optimization'. I suppose that's
where it belongs.
But I just saw the second case by chance in the process of reduction of the
first case to bare minimum. For me it (
rmal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
Created attachment 44570
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44570&action=edit
demonstrate performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87031
--- Comment #2 from Michael_S ---
After playing with the 2nd case on godbolt I found that it's not target
specific.
The regression occurred on all targets between gcc6 and gcc7.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87031
--- Comment #1 from Michael_S ---
Created attachment 44564
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44564&action=edit
second case - loop unrolled
: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
Created attachment 44563
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44563&action=edit
first case -
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
On the MIPS and Nios2 architectures, logical immediate instructions (andi, ori)
zero-extend the immediate field. It means that on these targets
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
Created attachment 44545
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44545&action=edit
source code that demonstrates
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83528
Michael_S changed:
What|Removed |Added
Status|RESOLVED|UNCONFIRMED
Resolution|WONTFIX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83528
--- Comment #5 from Michael_S ---
Created attachment 42944
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42944&action=edit
good asm output (gcc 4.8.3)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83528
--- Comment #4 from Michael_S ---
Created attachment 42943
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42943&action=edit
bad asm output (gcc 5.3.0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83528
--- Comment #3 from Michael_S ---
Well, the guideline here https://gcc.gnu.org/bugs/ specifically tells me that
it's one of the things that you don't want ;)
But yes, I can.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83528
--- Comment #1 from Michael_S ---
I did a little more research and found out that it is a relatively recent
regression introduced in gcc version 4.9.2 (Altera 15.1 Build 185).
gcc version 4.8.3 20140320 (prerelease) (Altera 14.1 Build 186) still g
: c
Assignee: unassigned at gcc dot gnu.org
Reporter: already5chosen at yahoo dot com
Target Milestone: ---
Created attachment 42942
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42942&action=edit
example of bad code generation for Nios2 target
In the loop over a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80283
--- Comment #18 from Michael_S ---
O.k. Not a back end.
The part of the compiler that is responsible for binding local variables to
registers or to stack locations. I am assuming that such a part exists in gcc and
acts after tree-ter phase, but before
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80283
--- Comment #14 from Michael_S ---
Created attachment 41293
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41293&action=edit
another case of bad vector register allocation
Here is another case of bad allocation of SIMD register that hopefu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80283
--- Comment #11 from Michael_S ---
Created attachment 41128
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41128&action=edit
ARMv7 case
ARMv7 - very similar to x64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80283
--- Comment #10 from Michael_S ---
Created attachment 41127
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41127&action=edit
bad reg allocation despite no-tree-ter
No problems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80283
Michael_S changed:
What|Removed |Added
CC||already5chosen at yahoo dot com
--- Comment