On 08/09/2016 12:33 AM, shmeel gutl wrote:
On 03-Aug-16 12:10 AM, Vladimir Makarov wrote:
On 08/02/2016 04:41 PM, shmeel gutl wrote:
I am trying to enable lra for a proprietary backend. I ran into one
problem that I can't solve. In lra-constraints.c:split_reg
lra_create_new_reg can be called w
On 06/09/2010 06:45 AM, Amker.Cheng wrote:
Hi :
I am studying IRA right now; there is the following code in change_loop
if (parent_allocno == NULL
|| REGNO (ALLOCNO_REG (parent_allocno)) == REGNO (original_reg))
{
if (internal_flag_ira_verbose > 3 &&
On 05/31/2010 08:17 PM, H.J. Lu wrote:
On Mon, May 31, 2010 at 12:31 PM, Vladimir Makarov wrote:
H.J. Lu wrote:
Hi,
I am working on generating vzeroupper to avoid AVX->SSE transition
penalty.
I have generated vzeroupper on function return as well as function
call. I am working on
On 05/28/2010 12:38 PM, H.J. Lu wrote:
Hi,
I want to generate vzeroupper when I know upper 128bits aren't used. I
can't find
a way to mark a pattern which zeros upper 128bits. So I added
;; Clear the upper 128bits of AVX registers, equivalent to a NOP.
;; This should be used only when the uppe
On 05/17/2010 02:44 AM, Maxim Kuvyrkov wrote:
CodeSourcery is working on improving performance for Intel's Core 2
and Core i7 families of processors.
CodeSourcery plans to add support for unaligned vector instructions,
to provide fine-tuned scheduling support and to update instruction
selecti
On 03/19/2010 12:47 PM, Ian Bolton wrote:
I mention all this because I was wondering which other architectures
have turned off sched1 for -Os? More importantly, I was wondering
if anyone else had considered creating some kind of clever hybrid
that only uses sched1 when it will increase performan
On 03/19/2010 12:09 PM, Ian Bolton wrote:
Hi folks!
I've moved on from register allocation (see Understanding IRA thread)
and onto scheduling.
In particular, I am investigating the effectiveness of the sched1
pass on our architecture and the associated interblock-scheduling
optimisation.
Let'
Jan Hubicka wrote:
IP RA as currently implemented in IRA propagates info only down in
topological order. But a good IP RA (e.g. Minimal cost inter-procedural
register allocator http://citeseer.ist.psu.edu/kurlander96minimum.html)
needs to propagate info up and down.
But I am quite skeptic
Jan Hubicka wrote:
I have to admit that i had not considered the issue of having the later
stages of compilation of one function feed into the early stages of
another. The problem with this is that it implies a serial ordering of
the compilation and that is not going to yield a usable system f
Here is my benchmarking of the current LTO branch on 2.66GHz Core2
under RHEL 5 in 64- and 32-bit mode. The vortex violates type
aliasing rules, therefore it should be compiled with
-fno-strict-aliasing. Perlbmk crashed in tree.c::build2_stat in
32-bit mode when LTO was used. LTO currently gen
Kenneth Zadeck wrote:
Andrew MacLeod wrote:
On Fri, 2007-08-17 at 12:01 -0400, Kenneth Zadeck wrote:
In any case IRA can not use UREC because UREC is needed before IRA
calculates reg class info and the reg class info is needed for
calculation of UREC. If you manage to use LIVE inst
Kenneth Zadeck wrote:
Vladimir N. Makarov wrote:
Kenneth Zadeck wrote:
Vladimir N. Makarov wrote:
Kenneth Zadeck wrote:
it looks like the backwards scan is not getting "enough" interferences
to make reload/global happy.
the case comes about beca
Kenneth Zadeck wrote:
Vladimir N. Makarov wrote:
Kenneth Zadeck wrote:
it looks like the backwards scan is not getting "enough" interferences
to make reload/global happy.
the case comes about because of the way local_alloc is preassigning regs for
pseudos that would map into m
Kenneth Zadeck wrote:
it looks like the backwards scan is not getting "enough" interferences
to make reload/global happy.
the case comes about because of the way local_alloc is preassigning regs for
pseudos that would map into more than 1 hardreg.
pseudos are as wide as they need to be. When loc
H. J. Lu wrote:
On Fri, Jun 15, 2007 at 06:21:53PM -0700, Ian Lance Taylor wrote:
This is hardly a new thought, but I believe that for the i386 gcc is
handicapped by reload. No matter how smart we are before reload, it
just takes one poor decision by reload in an inner loop and we've lost
al
Ian Lance Taylor wrote:
Ian, maybe I am wrong, but I see a problem in that some things important
to the whole GCC community are discussed only on IRC. Not all people are
on IRC. Moreover, some people avoid IRC for various reasons.
There will always be private conversations about GCC.
Ian Lance Taylor wrote:
These charts are certainly discouraging. On the other hand, for some
real code we're seeing each new version of gcc produce an incremental
runtime improvement. So I'm not sure what to make of it.
This is hardly a new thought, but I believe that for the i386 gcc is
han
Eric Botcazou wrote:
No, GCC hit a fundamental wall because its backend was not modern.
The code we generate out of tree-ssa is in general, as good or better
than other compilers generate out of their middle ends.
The problem remaining is that they have much better backends than us.
Until data
Eric Botcazou wrote:
Please, just look at those charts
https://vmakarov.108.redhat.com/nonav/spec/comparison.html
The decrease in compilation speed without a performance improvement (at
least for the default case) is really scary.
Right, I also found those charts a bit depressing, given the
Ian Lance Taylor wrote:
"Vladimir N. Makarov" <[EMAIL PROTECTED]> writes:
Ian Lance Taylor wrote:
I've been lobbying for some time, on IRC, for more people to be able
to fill in the holes in the maintainership patterns. Most of the
existing global maintainers
Ian Lance Taylor wrote:
I've been lobbying for some time, on IRC, for more people to be able
to fill in the holes in the maintainership patterns. Most of the
existing global maintainers are inactive. There are areas of the code
which are not covered by the other maintainership groupings. Thus
Joe Buck wrote:
On Fri, Jun 15, 2007 at 08:50:49AM -0400, Vladimir N. Makarov wrote:
Looking at the last SC announcement, it is probably easy to get the
impression that the SC has shrunk to David Edelsohn, maybe Mark Mitchell
and Gerald Pfeifer.
That would be a mistake. Different SC
Richard Guenther wrote:
On 6/15/07, Vladimir N. Makarov <[EMAIL PROTECTED]> wrote:
Just to summarize what we test at SUSE currently:
Richard, thanks for the information.
- SPEC2000 is tested in various variants on AMD x86_64, including
32bit results and FDO runs.
- SPEC2000 is tes
Richard Kenner wrote:
Looking at the last SC announcement, it is probably easy to get the
impression that the SC has shrunk to David Edelsohn, maybe Mark Mitchell
and Gerald Pfeifer.
Those three people are indeed the ones that usually *speak for* the SC,
but you have absolutely no way of kn
People from the GCC community noticed that GCC performance tracking at
RedHat stopped after Diego left RedHat. As I understand it, this was
helpful for some of them. Therefore we decided to resume GCC
Performance Tracking on GCC. This work is based on Diego Novillo's
scripts (Diego, thanks for the good
Sorry, my first reaction to latest SC announcements was to write
immediately. But I took time to think more about the situation (now
seeing a discussion about "Non-Autopoiesis Maintainers" I am more
convinced in my decision). Here are my thoughts. I apologize in advance
if somebody feels offended
Prasad, Kamal R wrote:
Hi!
Has there been any contribution from HP at all on itanium specific
optimizations? I am referring to instruction scheduling and stuff at
that level.
Please ask Steve Ellcey [EMAIL PROTECTED] about this. He is the most
active GCC developer from HP.
Our interfac
Prasad, Kamal R wrote:
Hello,
Can someone tell me the back-end optimizations available for itanium
(IA64)?
We (HP) may be able to contribute to this from our side.
Sorry, it is an ambiguous question. There are a lot of optimizations in
GCC. Most of them are available for Itanium.
If you a
Ken Zadeck asked me to do another round of DF branch benchmarking on
SPEC2000. There is progress on compilation speed (0.5%-1%) since
my last benchmarking on practically all platforms. Now, on average,
code size and SPEC scores are practically the same for the mainline
and the branch. To be ho
Steven Bosscher wrote:
On 4/20/07, Vladimir N. Makarov <[EMAIL PROTECTED]> wrote:
> Yes, you have complained that you believe the data structure of DF is
> too fat. I guess that is a valid complaint. I don't see the "rtl info
> duplication" though. You'v
Steven Bosscher wrote:
On 4/20/07, Vladimir N. Makarov <[EMAIL PROTECTED]> wrote:
Didn't I write several times that the data structure of DF is too fat
(because of rtl info duplication) and that is probably the problem?
Yes, you have complained that you believe the data structure of
Steven Bosscher wrote:
On 4/20/07, Vladimir N. Makarov <[EMAIL PROTECTED]> wrote:
I am afraid that merging it earlier will stop progress on the df
infrastructure (e.g. Ken will work only on LTO)
There's nothing holding you, and many others, back from helping out,
other than that
Ian Lance Taylor wrote:
4) A discussion of dataflow. Ken Zadeck described the current state
of dataflow branch. It seems stable, and just about within the
compilation time guidelines set by the SC. He will do more testing
and retesting this weekend, and hopes to commit it to mainline
Steven Bosscher wrote:
On 4/12/07, Steven Bosscher <[EMAIL PROTECTED]> wrote:
On 4/12/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote:
> An interesting observation is that the more hard registers the processor
> has, the bigger the slowdown is. Although it might be a coincidence.
Yes, I noticed
Zuxy Meng wrote:
"Mike Stump" <[EMAIL PROTECTED]>
wrote in message: [EMAIL PROTECTED]
On Apr 8, 2007, at 2:37 AM, Uros Bizjak wrote:
My docs say that "INC/DEC does not change the carry flag".
Personally, I'm having a hard time envisioning how the semantics of the
instruction are relevant
Mike Stump wrote:
I was wondering, if:
/* X86_TUNE_USE_INCDEC */
~(m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC),
is correct. Should it be:
/* X86_TUNE_USE_INCDEC */
~(m_PENT4 | m_NOCONA | m_GENERIC),
?
In the original patch in:
2006-11-18 Vladimir Makarov <[EMAIL PROTECTED]>
J.C. wrote:
Vladimir, you forgot a good book:
o Y.N. Srikant P.Shankar.
The Compiler Design Handbook: Optimizations and Machine Code
Generation.
CRC Press 2003. Up to page 916.
Thanks for the reminder. I know about this book but I have not read it. It
looks very interesting but it is e
jimmy wrote:
Steven Bosscher wrote:
Hi,
I found this old patch
(http://gcc.gnu.org/ml/gcc-patches/2003-06/msg01669.html) that refers
to pages 202-214 of Muchnick's "Advanced Compiler Design and
Implementation" book. That book still is not in my own compiler books
collection because of its pri
Andrew Pinski wrote:
On 3/4/07, Vladimir N. Makarov <[EMAIL PROTECTED]> wrote:
Another important thing to do is to make the 1st scheduler register
pressure sensitive.
I don't know how many times this has to be said, no this is not the
correct approach to fix that issue. The cor
Maxim Kuvyrkov wrote:
Hi.
I want to share some of my thoughts and doings on improving / cleaning
up current GCC instruction scheduler (Haifa) - most of them are just
small obvious improvements.
I have semi-ready patches for about a half of them and would appreciate
any early suggestion or comm
Jan Hubicka wrote:
I am running SPEC on both AMD and Intel machines quite commonly and I
must say that there seems to be difference in between those two. For P4
and Core I get results within something like 1-2 SPEC point (0.1%) of overall
SPEC score, for Athlon I was never able to get so close,
Serge Belyshev wrote:
"Vladimir N. Makarov" <[EMAIL PROTECTED]> writes:
I run SPEC2000 several times per week and always look at 3 runs (to be
sure that nothing went wrong) but I never saw such big
"confidence" intervals (as I understand that is differenc
Serge Belyshev wrote:
I have compared 4.1.2 release (r121943) with three revisions of 4.2 on spec2k
on a 2GHz AMD Athlon64 box (in 64bit mode), detailed results are below.
In short, current 4.2 performs just as well as 4.1 on this target
with the exception of a huge 80% win on 178.galgel. All ot
Here is the comparison of 4.1 and 4.2 branches (as of day before
yesterday) on SPEC2000 for ppc64.
In brief, gcc4.2 generates 3% faster code for SPECFP2000 and the same
code for SPECInt2000. On average the generated SPECInt2000 code size
is 0.5% smaller for gcc4.2. The SPECFp2000 code is 1.7% b
Mark Mitchell wrote:
Vladimir N. Makarov wrote:
Here is the comparison of 4.1 branch and 4.2 branch. In brief, 4.2
has 0.47% better performance in SPECInt2000 and 2.2% better
performance in SPECFP2000.
Thanks!
I assume that is with the aliasing safety patches turned on, i.e
Here is the comparison of 4.1 branch and 4.2 branch. In brief, 4.2
has 0.47% better performance in SPECInt2000 and 2.2% better
performance in SPECFP2000.
As I remember, this increase in SPECFP performance comes mostly from the
implementation of Itanium speculation support for scheduling by ISP
RAS (m
Mark Mitchell wrote:
Vladimir Makarov wrote:
On Sunday I accidentally had a chat about the df infrastructure on
IRC. I've got some thoughts which I'd like to share.
I have liked the df infrastructure code from day one for its clarity.
Unfortunately users don't see it and probably don't care abo
Steven Bosscher wrote:
On 2/12/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote:
I have liked the df infrastructure code from day one for its clarity.
Unfortunately users don't see it and probably don't care about it.
From my point of view the df infrastructure has a design flaw. It
extracts a lo
Richard Kenner wrote:
Vladimir Makarov writes:
Vlad> I especially did not like David Edelsohn's phrase "and no new
Vlad> private dataflow schemes will be allowed in gcc passes". It was not
Vlad> his first such statement. Such phrases are killing competition which
Vlad> is bad
Vlad,
I think that different people can have different perspectives.
You have been working on improving the register allocation for several
years, but very little has come of it because the reload
infrastructure does not suit itself to being integrated with modern
register allocators. You ha
tbp wrote:
On 1/28/07, Richard Guenther <[EMAIL PROTECTED]> wrote:
On 1/28/07, tbp <[EMAIL PROTECTED]> wrote:
> objdump -wdrfC --no-show-raw-insn $1|perl -pe 's/^\s+\w+:\s+//'|perl
> -ne 'printf "%4d\n", hex($1) if /sub\s+\$(0x\w+),%esp/'|sort -r| head
> -n 10
>
> msvc:2196 2100 1772 1692 1688
Rajkishore Barik wrote:
Hi,
Thanks very much. I still have doubts on your suggestion:
AFAIK, the back-end pass consists of (in order) : reg move -> insn sched
-> reg class -> local alloc -> global alloc -> reload -> post-reload.
There are edges from reg move to reg class and reload back to gl
H. J. Lu wrote:
If an instruction has latency 3 and throughput 1, should I write it as
(define_insn_reservation "simple" 3
(eq_attr "memory" "none")
"p0")
or
(define_insn_reservation "simple" 3
(eq_attr "memory" "none")
"p0,nothing*2")
Are they equivalent?
Yes.
What happens when ther
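The equivalence confirmed in the reply above can be illustrated with a toy model. The sketch below is not GCC's genautomata; `parse_reservation` and `busy_cycles` are hypothetical helpers that handle only a tiny subset of the reservation syntax (commas, unit names, `nothing`, and `*N` repetition). It assumes that `nothing` occupies no functional unit, so trailing `nothing` cycles do not change which units are busy.

```python
def parse_reservation(regexp):
    """Expand a tiny subset of reservation syntax into per-cycle unit sets.

    Hypothetical helper: supports comma-separated cycles, plain unit names,
    'nothing', and 'x*N' repetition only.
    """
    cycles = []
    for part in regexp.split(","):
        name, _, count = part.strip().partition("*")
        name = name.strip()
        repeat = int(count) if count else 1
        # Assumption: 'nothing' reserves no unit in that cycle.
        units = frozenset() if name == "nothing" else frozenset([name])
        cycles.extend([units] * repeat)
    return cycles


def busy_cycles(regexp):
    """Map cycle index to the units reserved, skipping idle cycles."""
    return {i: u for i, u in enumerate(parse_reservation(regexp)) if u}


# Both forms occupy p0 in cycle 0 only, so they block the same issue slots.
print(busy_cycles("p0") == busy_cycles("p0,nothing*2"))
```

In this model the latency value (3 in both definitions) is separate from the reservation string, which is why padding with `nothing*2` adds no new constraint.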
Andrew MacLeod wrote:
Those are the 4 actions/projects we left the summit with that I am aware
of. With any luck at all, one or more of these will have a significant
impact on our register allocator. Often projects like these proceed in
virtual silence until they are mostly done. Perhaps I'll
Andrew MacLeod wrote:
o register pressure relief through live range splitting and/or
rematerialization. We have no accurate information here, because
after that there are passes which change the pressure like insn
Sure, I'm not suggesting that RABLET will reduce the register pressure
Steven Bosscher wrote:
Every time some RTL optimizer is re-re-re-re-re-evaluated, it turns
out we lose without it. Good luck to you, but I think you're seriously
underestimating the complexity of things here.
It's clearly not as good as a new register allocator would be, but the
effort to be
Steven Bosscher wrote:
Hello,
When fwprop gets approved, CSE path following will disappear. We lose
virtually no optimization opportunities. Still, it should not be very
hard to make CSE work on extended basic blocks (but without rescanning
like path following does). All that would be needed
[EMAIL PROTECTED] wrote:
I decided to look into the Yara branch to see if it could even be
bootstrapped on PPC (with Yara turned on by default).
Thanks for the information. It is even a surprise to me that some
tests work correctly for ppc. Last time when I had time and checked ppc
stat
Paolo Bonzini wrote:
Michael Matz wrote:
Hi Vladimir,
On Sat, 18 Mar 2006, Vladimir N. Makarov wrote:
What I am going to do in the short term is
o work on code quality of some SPECINT tests (e.g. reload is doing
a better job for crafty with many multi-registers than YARA)
The
Michael Matz wrote:
Hi Vladimir,
On Sat, 18 Mar 2006, Vladimir N. Makarov wrote:
What I am going to do in the short term is
o work on code quality of some SPECINT tests (e.g. reload is doing
a better job for crafty with many multi-registers than YARA)
I haven't looked at th
Paul Brook wrote:
On Saturday 18 March 2006 17:56, Vladimir N. Makarov wrote:
I've created a branch for my allocator project which is called Yet
Another Register Allocator (or YARA - yet another recursive acronym).
I think I have reached the point where my work on a public branch c
I've created a branch for my allocator project which is called Yet
Another Register Allocator (or YARA - yet another recursive acronym).
I think I have reached the point where my work can be done on a
public branch. I am focused only on x86 right now. So YARA will work only
be made. I am focused only on x86 right now. So YARA will work only
for x86 and probably
[EMAIL PROTECTED] wrote:
Although many of us are involved, or have been involved, with other
compiler projects, the focus of the Gelato GCC Improvement Group is to
work *with* the GCC community *and* the GCC community *process* to
improve GCC for Itanium.
Some of the other projects which indi
On Fri, 2005-07-01 at 22:20 -0300, Rafael Ávila de Espíndola wrote:
> The finite automaton used in the pipeline hazard recognizer uses a cycle
> advancing arc in every state to represent a clock pulse. Bala(1) uses a
> different technique: All states where an instruction issue is not possible are
Steven Bosscher wrote:
Hi,
We have this charming move_insn function in haifa-sched.c:
/* Move INSN. Reemit notes if needed.
Return the last insn emitted by the scheduler, which is the
return value from the first call to reemit_notes. */
static rtx
move_insn (rtx insn, rtx last)
{
rtx retval