Some thoughts and quetsions about the data flow infrastracture

2007-02-12 Thread Vladimir Makarov
On Sunday I accidentally had a chat about the df infrastructure on IRC. I've got some thoughts which I'd like to share. I have liked the df infrastructure code from day one for its clarity. Unfortunately users don't see it and probably don't care about it. From my point of view the df infrastructur

Re: Some thoughts and quetsions about the data flow infrastracture

2007-02-12 Thread Vladimir Makarov
David Edelsohn wrote: Vladimir Makarov writes: Third, I am disappointed that you chose to make this argument personal. David, I really apologize for making it personal. We are all one community and we all want to make gcc a better compiler.

Re: Some thoughts and quetsions about the data flow infrastracture

2007-02-13 Thread Vladimir Makarov
Wow, I got so many emails. I'll try to answer them in one email in order not to repeat myself. Mark Mitchell wrote: I was not trying to suggest that DF is necessarily as sweeping a change as TREE-SSA. Certainly, it's not a complete change to the representation. It is not as sweeping a change as Tree

Re: Some thoughts and quetsions about the data flow infrastracture

2007-02-13 Thread Vladimir Makarov
David Edelsohn wrote: Do you realize how confrontational your emails sound? Have you considered asking about the technical reasoning and justification instead of making unfounded assertions? Do you want everyone to refute your incorrect facts point by point? David, I am sorry if yo

Re: Some thoughts and quetsions about the data flow infrastracture

2007-02-13 Thread Vladimir Makarov
Steven Bosscher wrote: On 2/13/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote: Wow, I got so many emails. I'll try to answer them in one email in Let us look at major RTL optimizations: combiner, scheduler, RA. ...PRE, CPROP,SEE, RTL loop optimizers, if-conversion, ... It

Re: Some thoughts and quetsions about the data flow infrastracture

2007-02-13 Thread Vladimir Makarov
David Edelsohn wrote: Vladimir Makarov writes: Vlad> I am just trying to convince that the proposed df infrastructure is not Vlad> ready and might create serious problems for this release and future Vlad> development because it is slow. Danny is saying that the beau

Re: Some thoughts and quetsions about the data flow infrastracture

2007-02-13 Thread Vladimir Makarov
Paolo Bonzini wrote: The problem is that to use the modern approach you need another description of insns (with one pattern - one machine insn relation) in tree representation with given cost for the tree. And it is a huge work to rewrite current machine descriptions even only for this. Th

Re: Some thoughts and quetsions about the data flow infrastracture

2007-02-13 Thread Vladimir Makarov
Steven Bosscher wrote: On 2/13/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote: > Why is it unacceptable for it to mature further on mainline like >Tree-SSA? > > Two releases one after another to avoid. No one real experiment to try to rewrite an RTL optimization to

Re: Some thoughts and quetsions about the data flow infrastracture

2007-02-13 Thread Vladimir Makarov
David Edelsohn wrote: Vladimir Makarov writes: Vlad> I did investigate the current status of the infrastructure on future Vlad> mainstream processor Core2 (> 11% slower compiler, worse code and bigger Vlad> code size). That is the reason why I started this.

Re: GCC 4.2.0 Status Report (2007-02-19)

2007-02-20 Thread Vladimir Makarov
Mark Mitchell wrote: I've spent some time today looking at GCC 4.2. I've heard various comments about whether or not it's worth doing a 4.2 release at all. For example: [Option 1] Instead of 4.2, we should backport some functionality from 4.2 to the 4.1 branch, and call that 4.2. [Option 2]

Re: GCC 4.2.0 Status Report (2007-02-19)

2007-02-20 Thread Vladimir Makarov
Richard Guenther wrote: On 2/20/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote: As for the proposal to revert the aliasing fixes on 4.2, IMHO aliasing bugs are pretty nasty: it is hard to find an option to work around them because alias info is used in many optimizations. All bugs we are t

Re: GCC 4.2.0 Status Report (2007-02-19)

2007-02-20 Thread Vladimir Makarov
Steven Bosscher wrote: On 2/20/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote: >[Option 1] Instead of 4.2, we should backport some functionality from >4.2 to the 4.1 branch, and call that 4.2. > >[Option 2] Instead of 4.2, we should skip 4.2, stabilize 4.3, and call >t

Re: GCC 4.2.0 Status Report (2007-02-19)

2007-02-20 Thread Vladimir Makarov
Paolo Bonzini wrote: In terms of ports, yes, I agree. As for the performance, even with Paolo's latest patches (some changes could be applied to the mainline too, so it is not only about df), the branch compiler is still 8.7% slower for SPECint2000 compilation on a 2.66GHz Core2 with --enable-chec

Re: GCC 4.2.0 Status Report (2007-02-19)

2007-02-20 Thread Vladimir Makarov
Paolo Bonzini wrote: I agree, benchmarking is evil. My favorite benchmark is gcc because it is my tool and I work on it. Compilation of gcc from Spec2000 is >12% slower with df, including Paolo's latest patches (which are a really big improvement). Interestingly enough, 176.gcc has a pre

Re: GCC 4.2.0 Status Report (2007-02-19)

2007-02-22 Thread Vladimir Makarov
Richard Guenther wrote: On 2/21/07, Mark Mitchell <[EMAIL PROTECTED]> wrote: To be honest, my instinct for the FSF is to take the 4% hit and get rid of this nasty class of bugs. Users measure compiler quality by more than just floating-point benchmarks; FP code is a relatively small (albeit

Re: 40% performance regression SPEC2006/leslie3d on gcc-4_2-branch

2007-02-22 Thread Vladimir Makarov
Jan Hubicka wrote: Grigory Zagorodnev wrote: Mark Mitchell wrote: Excellent question; I should have asked for that as well. If 4.2 has gained on 4.1 in other respects, the 4.7% drop might represent a smaller regression relative to 4.1. There is the 4.2 (r120817) vs. 4.1

Re: 40% performance regression SPEC2006/leslie3d on gcc-4_2-branch

2007-02-22 Thread Vladimir Makarov
Mark Mitchell wrote: Vladimir Makarov wrote: I remember nocona tuning gave a 30% improvement on SPECFp2000 for Intel nocona in 64-bit mode in comparison with the default x86_64 gcc tuning (for k8). So such a big improvement is definitely mostly from the new -mtune=generic. Well, then, let's

Re: Improvements of the haifa scheduler

2007-03-05 Thread Vladimir Makarov
Andrey Belevantsev wrote: Vladimir N. Makarov wrote: Good aliasing is very important for the scheduler. But I'd look at this more broadly. We need good aliasing for many RTL optimizations. What happened to the ISP RAS aliasing patch propagating SSA info to RTL? Why is it stalled? We'll p

Re: Compiling without the use of static registers in IA-64

2007-03-13 Thread Vladimir Makarov
[EMAIL PROTECTED] wrote: All- I was wondering if anyone knew how I could modify gcc to not use static general purpose registers on an IA-64 machine? Specifically, I only want the compiler to allocate registers from the register stack engine (RSE) and the system defined registers (e.g., stack po

Re: Application for Google Summer of Code with GCC.

2007-03-26 Thread Vladimir Makarov
Dmitry Zhurikhin wrote: Hello, I want to propose a project for Google Summer of Code titled "New static scheduling heuristic". I hope that Vlad Makarov from Red Hat or Andrey Belevantsev from ISP RAS will mentor this application. I will appreciate any feedback and will try to answer any questi

Recent dataflow branch SPEC2000 benchmarking

2007-04-11 Thread Vladimir Makarov
DF has made great progress, especially with Ken Zadeck's recent DCE/DSE improvements. I think dataflow benchmarking will be interesting to people. Here is a comparison of the dataflow-branch as of Apr 7 with the mainline at the last merge point (r123656), done by Daniel Berlin on Apr 7. Compilers f

Re: Recent dataflow branch SPEC2000 benchmarking

2007-04-12 Thread Vladimir Makarov
Steven Bosscher wrote: On 4/12/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote:

SPECFp2000 compilation time (user time):

machine   mainline   branch    change
x86_64    104.8s     117.7s    +12.3%
ppc64     312.3s     367.8s    +17.8%
ia64      377.6s     502.9s
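The "change" figures quoted in this message appear to be the relative increase in user time of the branch over the mainline. A minimal sketch of that arithmetic, assuming this is how the percentages were derived (the function name `percent_change` is illustrative and not from the thread); only the two rows whose change value survives in the snippet are used:

```python
def percent_change(mainline_s: float, branch_s: float) -> float:
    """Relative increase of branch time over mainline time, in percent."""
    return (branch_s - mainline_s) / mainline_s * 100.0


# User times in seconds, as quoted in the message.
timings = {
    "x86_64": (104.8, 117.7),
    "ppc64": (312.3, 367.8),
}

for machine, (mainline_s, branch_s) in timings.items():
    print(f"{machine}: {percent_change(mainline_s, branch_s):+.1f}%")
```

Rounded to one decimal place, the two rows reproduce the quoted +12.3% and +17.8%.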

Re: Recent dataflow branch SPEC2000 benchmarking

2007-04-12 Thread Vladimir Makarov
Steven Bosscher wrote: On 4/12/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote: > Thanks for testing this. Do you also have per benchmark compilation > times, perhaps? > Not really. I don't do that because runtest startup is about 0.4s (on ppc64) and a few fp tests are co

DF-branch benchmarking on SPEC2000

2007-04-23 Thread Vladimir Makarov
I promised to make a more thorough and accurate comparison of the df-branch and the mainline at the last merge point to the branch. The df-branch compiler does not include Steven's Sunday patch, which uses a separate obstack for df bitmaps. It does not change code but it can speed up the df-branch compile

Re: DF-branch benchmarking on SPEC2000

2007-04-23 Thread Vladimir Makarov
Vladimir Makarov wrote: I promised to make a more thorough and accurate comparison of the df-branch and the mainline at the last merge point to the branch. The df-branch compiler does not include Steven's Sunday patch, which uses a separate obstack for df bitmaps. It does not change

Re: Difference between stack pointer and frame address during codegeneration

2005-03-02 Thread Vladimir Makarov
Øyvind Harboe wrote: Is there an existing technique to figure out the difference between the frame address and the stack pointer during code generation of arbitrary instructions? You can get the current offset through INITIAL_ELIMINATION_OFFSET. But this offset is not final until reload pass i

Re: Difference between stack pointer and frame address during codegeneration

2005-03-02 Thread Vladimir Makarov
Øyvind Harboe wrote: On Wed, 2005-03-02 at 10:03 -0500, Vladimir Makarov wrote: Øyvind Harboe wrote: Is there an existing technique to figure out the difference between the frame address and the stack pointer during code generation of arbitrary instructions? You can get the current

Re: Difference between stack pointer and frame address during codegeneration

2005-03-02 Thread Vladimir Makarov
Øyvind Harboe wrote: Does something along the lines of a "FINAL_ELIMINATION_OFFSET()" exist, that can be called from code-generation in the back-end? Sorry, I don't understand what you mean by "code-generation in the back-end". As I wrote, there is no "FINAL_ELIMINATION_OFFSET" before

Re: Difference between stack pointer and frame address during codegeneration

2005-03-02 Thread Vladimir Makarov
Øyvind Harboe wrote: What I'm wondering about is whether it is possible, in the code that gets invoked in a "define_insn" to generate the actual assembly, to find out the difference between the SP and the frame address. You should have asked this the first time. I'll try to be more spe

Re: problem with the scheduler in gcc-4.0-20040911

2005-03-08 Thread Vladimir Makarov
Kunal Parmar wrote: Following is the debugging dump by the scheduler - ** ;; == ;; -- basic block 1 from 17 to 89 -- after reload ;; == ;;

Re: problem with dependencies in gcc-4.0-20040911

2005-03-08 Thread Vladimir Makarov
Kunal Parmar wrote: Hello, I am working with a VLIW processor and GCC-4.0-20040911. There is a problem in the dependency calculation of GCC. GCC is giving write-after-read a higher priority than write-after-write. Thus, as in the following code, GCC gives a write-after-read dependency between the 2
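The question turns on which kind of dependence is recorded when a pair of instructions is related by both a write-after-read and a write-after-write. As a rough illustration only (this is not GCC's dependency analysis; the `Insn` class and register names are hypothetical), the three classic dependence kinds between two instructions can be derived from their read and write sets:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Insn:
    """Hypothetical instruction: a name plus the registers it reads and writes."""
    name: str
    reads: frozenset = field(default_factory=frozenset)
    writes: frozenset = field(default_factory=frozenset)


def dependences(first: Insn, second: Insn) -> set:
    """Return the dependence kinds that `second` has on the earlier `first`."""
    deps = set()
    if first.writes & second.reads:
        deps.add("true")    # read-after-write
    if first.reads & second.writes:
        deps.add("anti")    # write-after-read
    if first.writes & second.writes:
        deps.add("output")  # write-after-write
    return deps


# A pair related by both a write-after-read (r1) and a write-after-write (r2).
a = Insn("a", reads=frozenset({"r1"}), writes=frozenset({"r2"}))
b = Insn("b", reads=frozenset({"r3"}), writes=frozenset({"r1", "r2"}))
print(sorted(dependences(a, b)))
```

When a scheduler keeps a single dependence per instruction pair, it has to choose one kind for a pair like this, which is the situation the message describes.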

Re: How to "disable" register allocation?

2005-04-07 Thread Vladimir Makarov
Øyvind Harboe wrote: Is there an option(compile/build time?) to tell GCC to use as few registers as possible? Basically I want to run some tests where the assumption is that stack slots generate faster/smaller code than registers. Any pointers to where I should look in the GCC source, would be grea

Re: Store scheduling with DFA scheduler

2005-04-26 Thread Vladimir Makarov
Jon Beniston wrote: Hi, I'm trying to get the DFA scheduler in GCC 4.0.0 to schedule loads and stores, but I can only get it to work for loads. I have an automaton defined as follows: (define_automaton "cpu") (define_cpu_unit "x" "cpu") (define_cpu_unit "m" "cpu") (define_insn_reservation "arith" 1

Re: Store scheduling with DFA scheduler

2005-04-26 Thread Vladimir Makarov
Jon Beniston wrote: Hi Vlad, There is not enough information to say what is wrong. It would be better if you send gcc output when -fsched-verbose=10 is used. Cheers, Jon ;; Ready list (t = 10):32 28 24 ;; 10--> 24 [`y']=r43 :x,m*2 ;; Ready

Re: The VLIW bundle output questions

2005-05-19 Thread Vladimir Makarov
Ling-hua Tseng wrote: > > For example: > ===[top] > mov .risc0 r1, #25 \\ > ldw .risc0 r2, [fp, #30] \\ > addub .mac0 d0, d4, d3 \\ > subub .mac1 d11, d7, d4 > add .risc0 r3, r1, r5 > ===[end]==

Re: Pro64-based GPLed compiler

2005-06-29 Thread Vladimir Makarov
Marc Gonzalez-Sigler wrote: Hello everyone, I've taken PathScale's source tree (they've removed the IA-64 code generator, and added an x86/AMD64 code generator), and tweaked the Makefiles. I thought some of you might want to take a look at the compiler. http://www-rocq.inria.fr/~gonzalez/

Re: Scheduler questions (related to PR17808)

2005-06-30 Thread Vladimir Makarov
Steven Bosscher wrote: Notice how the conditional sets of r14 and r17 in insns 9 and 10 have been moved past insn 14, which uses these registers. Shouldn't there be true dependencies on insns 9 and 10 for insn 14? In theory, yes. But the scheduler (it is ebb scheduler as I understand) fre

Re: Pro64-based GPLed compiler

2005-06-30 Thread Vladimir Makarov
James E Wilson wrote: Daniel Berlin wrote: A bunch of random code #ifdef KEY'd FYI Pathscale was formerly known as Key Research. So the KEY probably wouldn't mean anything special here, it is likely just a marker for local changes. I heard a lot about this compiler and expected a better r

Re: Scheduler questions (related to PR17808)

2005-06-30 Thread Vladimir Makarov
Andrey Belevantsev wrote: Vladimir Makarov wrote: I'll look at this PR today. We looked at this issue today. We think the problem is that the proposed patch to sched_get_condition() treats conditional jumps like COND_EXECs, but it doesn't fix other places in sch

Re: Scheduler questions (related to PR17808)

2005-07-05 Thread Vladimir Makarov
Andrey Belevantsev wrote: Vladimir Makarov wrote: I'll look at this PR today. We looked at this issue today. We think the problem is that the proposed patch to sched_get_condition() treats conditional jumps like COND_EXECs, but it doesn't fix other places in sch

Re: Introduction of GCC improvement work for Itanium via Gelato Federation

2005-09-14 Thread Vladimir Makarov
Steven Bosscher wrote: On Wednesday 14 September 2005 10:53, Robert Dewar wrote: Gerald Pfeifer wrote: (If so, I'm wondering what it's going to buy the interested parties, because I have a hard time seeing one of the large GNU/Linux distributors switching to a compiler different from F

Re: gcc feature request / RFC: extra clobbered regs

2015-07-01 Thread Vladimir Makarov
On 06/30/2015 05:37 PM, Jakub Jelinek wrote: On Tue, Jun 30, 2015 at 02:22:33PM -0700, Andy Lutomirski wrote: I'm working on a massive set of cleanups to Linux's syscall handling. We currently have a nasty optimization in which we don't save rbx, rbp, r12, r13, r14, and r15 on x86_64 before ca

Re: gcc feature request / RFC: extra clobbered regs

2015-07-01 Thread Vladimir Makarov
On 07/01/2015 11:31 AM, Jakub Jelinek wrote: On Wed, Jul 01, 2015 at 11:23:17AM -0400, Vladimir Makarov wrote: (I'm not necessarily suggesting that we do this for the syscall bodies themselves. I want to do it for the entry and exit helpers, so we'd still lose the five cycles i

Re: gcc feature request / RFC: extra clobbered regs

2015-07-01 Thread Vladimir Makarov
On 07/01/2015 11:27 AM, Andy Lutomirski wrote: On Wed, Jul 1, 2015 at 8:23 AM, Vladimir Makarov wrote: On 06/30/2015 05:37 PM, Jakub Jelinek wrote: On Tue, Jun 30, 2015 at 02:22:33PM -0700, Andy Lutomirski wrote: I'm working on a massive set of cleanups to Linux's syscall ha

Re: gcc feature request / RFC: extra clobbered regs

2015-07-01 Thread Vladimir Makarov
On 07/01/2015 01:43 PM, Jakub Jelinek wrote: On Wed, Jul 01, 2015 at 01:35:16PM -0400, Vladimir Makarov wrote: Actually it raises a question for me. If we describe that a function clobbers more than the calling convention and then use it as a value (assigning a variable or passing as an argument

Re: Consideration of Cost associated with SEME regions.

2015-07-02 Thread Vladimir Makarov
On 07/02/2015 07:06 AM, Ajit Kumar Agarwal wrote: Sorry for the typo error. I meant exits instead of exists. The below is corrected. The Cost Calculation for a candidate to Spill in the Integrated Register Allocator(IRA) considers only the SESE regions. The Cost Calculation in the IRA should

Re: Allocation of hotness of data structure with respect to the top of stack.

2015-07-09 Thread Vladimir Makarov
On 2015-07-05 7:11 AM, Ajit Kumar Agarwal wrote: All: I am wondering whether allocating hot data structures closer to the top of the stack increases the performance of the application. The data structures are identified as hot or cold, and all the data structures are sorted in decrea

Re: Proposal to postpone release of 5.2 for a week [Was: Re: patch to fix PR66782]

2015-07-10 Thread Vladimir Makarov
On 07/10/2015 04:09 AM, Richard Biener wrote: On Thu, 9 Jul 2015, Uros Bizjak wrote: Hello! The patch was bootstrapped and tested on x86/x86-64. Committed as rev. 225618. 2015-07-09 Vladimir Makarov PR rtl-optimization/66782 * lra-int.h (struct lra_insn_recog_data

Re: porting to lra

2015-08-24 Thread Vladimir Makarov
On 08/24/2015 02:43 PM, shmeel gutl wrote: are there any guidelines as to what needs to be done in the backend to enable lra for 5.2? Unfortunately, switching from reload to LRA can be a difficult task. Reload pass is driven by many machine target hooks. As LRA uses different algorithms these

Re: LRA reloads of subregs

2015-09-04 Thread Vladimir Makarov
On 09/03/2015 06:33 PM, David Miller wrote: I'm working on converting sparc to LRA, and thanks probably to the work the powerpc folks did this is going much better than when I last tried this. Thanks for working on this, David. The first major stumbling block I've run into is when LRA forces a

Re: LRA reloads of subregs

2015-09-04 Thread Vladimir Makarov
On 09/04/2015 09:02 PM, David Miller wrote: From: David Miller Date: Fri, 04 Sep 2015 11:27:31 -0700 (PDT) From: Vladimir Makarov Date: Fri, 4 Sep 2015 10:00:54 -0400 I don't think we should add a new LRA code calling process_address before adding insns for further processing. LRA

Re: Combined top-down and bottom-up instruction scheduler

2015-09-08 Thread Vladimir Makarov
On 09/08/2015 02:51 PM, Jeff Law wrote: On 09/08/2015 12:39 PM, Aditya K wrote: IIUC, in the haifa-sched.c, the default scheduling algorithm seems to be top-down (before reload). Is there a way to schedule the other way (bottom up), or both ways? Not that I'm aware of. Note that region scheduli

Re: Understanding GCC test results published by SUSE

2015-10-13 Thread Vladimir Makarov
On 10/11/2015 04:18 PM, Mikhail Maltsev wrote: Hi! SUSE performs periodic testing of GCC and publishes the results on their site: http://gcc.opensuse.org/ (many thanks for this great job!). I hope https://vmakarov.fedorapeople.org/spec/ could be useful for your purposes too. I've changed in

Re: [RFC] Cse reducing performance of register allocation with -O2

2015-10-13 Thread Vladimir Makarov
On 10/13/2015 01:06 PM, Jeff Law wrote: On 10/13/2015 07:12 AM, Dominik Vogt wrote: In some cases, the work of the cse1 pass is counterproductive, as we noticed on s390x. The effect described below is present since at least 4.8.0. Note that this may not become manifest in a performance issue p

Re: [RFC] Cse reducing performance of register allocation with -O2

2015-10-22 Thread Vladimir Makarov
On 10/22/2015 06:05 AM, Dominik Vogt wrote: On Tue, Oct 13, 2015 at 05:03:36PM -0400, Vladimir Makarov wrote: [snip] I checked my article ftp://ftp.uvsq.fr/pub/gcc/summit/2004/Fighting%20Register%20Pressure.pdf and GVN gave mostly 0.2% on eon only. The current environment is quite different

Re: out of bounds access in insn-automata.c

2016-03-23 Thread Vladimir Makarov
On 03/23/2016 02:32 AM, Aldy Hernandez wrote: Howdy! I'm working on enhancements to our out-of-bounds warnings in VRP, such that we can warn and isolate conditionally out-of-bound accesses (similar to what we do in gimple-ssa-isolate-paths.c for NULL accesses). With my WIP I have found the f

Re: [PATCH] clean up insn-automata.c

2016-05-11 Thread Vladimir Makarov
On 05/11/2016 01:39 AM, Alexander Monakov wrote: On Wed, 30 Mar 2016, Bernd Schmidt wrote: On 03/25/2016 04:43 AM, Aldy Hernandez wrote: If Bernd is fine with this, I'm happy to retract my patch and any possible followups. I'm just interested in having no path causing a possible out of bounds

Re: Help with lra

2016-08-02 Thread Vladimir Makarov
On 08/02/2016 04:41 PM, shmeel gutl wrote: I am trying to enable lra for a proprietary backend. I ran into one problem that I can't solve. In lra-constraints.c:split_reg lra_create_new_reg can be called with a hard-coded rclass of NO_REGS. It then queues a move instruction of the type set TYPE:

Re: Live range shrinkage in pre-reload scheduling

2014-05-14 Thread Vladimir Makarov
On 2014-05-13, 6:27 AM, Kyrill Tkachov wrote: Hi all, In haifa-sched.c (in rank_for_schedule) I notice that live range shrinkage is not performed when SCHED_PRESSURE_MODEL is used and the comment mentions that it results in much worse code. Could anyone elaborate on this? Was it just empiricall

Re: Live range shrinkage in pre-reload scheduling

2014-05-14 Thread Vladimir Makarov
On 2014-05-14, 12:38 PM, Richard Sandiford wrote: Vladimir Makarov writes: On 2014-05-13, 6:27 AM, Kyrill Tkachov wrote: Hi all, In haifa-sched.c (in rank_for_schedule) I notice that live range shrinkage is not performed when SCHED_PRESSURE_MODEL is used and the comment mentions that it

Re: Live Range Splitting in Integrated Register Allocator

2014-05-14 Thread Vladimir Makarov
On 2014-05-14, 1:33 PM, Ajit Kumar Agarwal wrote: Hello All: I am planning to implement the Live range splitting based on the following cases in the Integrated Register Allocator. For a given Live range that spans from the outer region to the inner region of the loop. Such Live ranges which a

Re: Live range shrinkage in pre-reload scheduling

2014-05-15 Thread Vladimir Makarov
On 05/15/2014 02:46 AM, Ramana Radhakrishnan wrote: > On Wed, May 14, 2014 at 5:38 PM, Richard Sandiford > wrote: >> Vladimir Makarov writes: >>> On 2014-05-13, 6:27 AM, Kyrill Tkachov wrote: >>>> Hi all, >>>> >>>> In haifa-sched.c (in r

Re: Live Range Splitting in Integrated Register Allocator

2014-05-15 Thread Vladimir Makarov
On 05/15/2014 03:28 AM, Ajit Kumar Agarwal wrote: > > On 2014-05-14, 1:33 PM, Ajit Kumar Agarwal wrote: > >> Hello All: >> >> I am planning to implement the Live range splitting based on the following >> cases in the Integrated Register Allocator. >> >> For a given Live range that spans from

Re: Using particular register class (like floating point registers) as spill register class

2014-05-16 Thread Vladimir Makarov
On 2014-05-16, 6:23 AM, Kugan wrote: I would like to know if there is anyway we can use registers from particular register class just as spill registers (in places where register allocator would normally spill to stack and nothing more), when it can be useful. In AArch64, in some cases, compilin

Re: negative latencies

2014-05-20 Thread Vladimir Makarov
On 05/19/2014 02:13 AM, shmeel gutl wrote: > Are there hooks in gcc to deal with negative latencies? In other > words, an architecture that permits an instruction to use a result > from an instruction that will be issued later. > Could you explain more on *an example* what are you trying to achiev

Re: negative latencies

2014-05-21 Thread Vladimir Makarov
On 2014-05-20, 5:18 PM, shmeel gutl wrote: On 20-May-14 06:13 PM, Vladimir Makarov wrote: On 05/19/2014 02:13 AM, shmeel gutl wrote: Are there hooks in gcc to deal with negative latencies? In other words, an architecture that permits an instruction to use a result from an instruction that will

Re: Reducing Register Pressure through Live range Shrinking through Loops!!

2014-05-22 Thread Vladimir Makarov
On 05/21/2014 12:25 AM, Ajit Kumar Agarwal wrote: > Hello All: > > Simpson does the Live range shrinking and reduction of register pressure by > using the computation that are not load and store but the arithmetic > computation. The computation > where the operands and registers are live at the e

Re: negative latencies

2014-05-23 Thread Vladimir Makarov
On 2014-05-23, 3:49 AM, shmeel gutl wrote: On 21-May-14 06:30 PM, Vladimir Makarov wrote: I am just curious what happens when you put insn2, insn1. and insn2 uses a result of insn1 in 6 cycles and insn1 producing the result in 3 cycles, but there are not ready functional units (e.g
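One way to read the numbers in this exchange: if the consumer reads its operand 6 cycles after it issues and the producer delivers its result 3 cycles after it issues, then the consumer may issue up to 3 cycles before the producer. A minimal sketch of that arithmetic, assuming this reading of the truncated message and assuming a result can be read in the same cycle it becomes available (the function names are illustrative and not GCC hooks):

```python
def dependence_latency(result_ready_after: int, operand_read_after: int) -> int:
    """Minimum issue distance from producer to consumer; may be negative.

    result_ready_after: cycles after issue at which the producer's result exists.
    operand_read_after: cycles after issue at which the consumer reads the operand.
    """
    return result_ready_after - operand_read_after


def consumer_earliest_issue(producer_issue: int, latency: int) -> int:
    """Earliest cycle at which the consumer can issue and still see the result."""
    return producer_issue + latency


latency = dependence_latency(result_ready_after=3, operand_read_after=6)
print(latency)                               # negative: consumer may go first
print(consumer_earliest_issue(10, latency))  # producer issued at cycle 10
```

With the producer issued at cycle 10, the result exists at cycle 13; a consumer issued at cycle 7 reads its operand at cycle 13, so the pair is still correct even though the consumer issued first.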

Re: Reducing Register Pressure based on Instruction Scheduling and Register Allocator!!

2014-06-06 Thread Vladimir Makarov
On 2014-06-06, 10:48 AM, Ajit Kumar Agarwal wrote: Hello All: I was looking further the aspect of reducing register pressure based on Register Allocation and Instruction Scheduling and the Following observation being made on reducing register pressure based on the existing papers on reducing r

Re: Register Pressure guided Unroll and Jam in GCC !!

2014-06-16 Thread Vladimir Makarov
On 2014-06-16, 10:14 AM, Ajit Kumar Agarwal wrote: Hello All: I have worked on the Open64 compiler where the Register Pressure Guided Unroll and Jam gave a good amount of performance improvement for the C and C++ Spec Benchmark and also Fortran benchmarks. The Unroll and Jam increases the re

Re: Register Pressure guided Unroll and Jam in GCC !!

2014-06-16 Thread Vladimir Makarov
On 2014-06-16, 2:25 PM, Aaron Sawdey wrote: On Mon, 2014-06-16 at 14:14 +, Ajit Kumar Agarwal wrote: Hello All: I have worked on the Open64 compiler where the Register Pressure Guided Unroll and Jam gave a good amount of performance improvement for the C and C++ Spec Benchmark and also F

Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM

2014-06-24 Thread Vladimir Makarov
A few people asked me about a new performance comparison of the latest GCC and LLVM. So I've finished it and put it on my site http://vmakarov.fedorapeople.org/spec/ The comparison is accessible from the 2014 link and the links under it in the left frame. These pages are also accessible as http://vmak

Re: Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM

2014-06-24 Thread Vladimir Makarov
On 06/24/2014 10:36 AM, Ramana Radhakrishnan wrote: > > > On 24/06/14 15:11, Vladimir Makarov wrote: >>A few people asked me about new performance comparison of latest GCC >> and LLVM. So I've finished it and put it on my site >> >> http://vmakar

Re: Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM

2014-06-24 Thread Vladimir Makarov
On 06/24/2014 10:42 AM, Renato Golin wrote: > On 24 June 2014 15:11, Vladimir Makarov wrote: >> A few people asked me about new performance comparison of latest GCC >> and LLVM. So I've finished it and put it on my site >> >> http://vmakarov.fedorapeople.org

Re: Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM

2014-06-24 Thread Vladimir Makarov
On 06/24/2014 10:57 AM, Ramana Radhakrishnan wrote: > > The ball-park number you have probably won't change much. > >>> >> Unfortunately, that is the configuration I can use on my system because >> of lack of libraries for other configurations. > > Using --with-fpu={neon / neon-vfpv4} shouldn't cau

Re: Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM

2014-06-25 Thread Vladimir Makarov
On 2014-06-25, 5:32 AM, Renato Golin wrote: On 25 June 2014 10:26, Bingfeng Mei wrote: Why is GCC code size so much bigger than LLVM? Does -Ofast have more unrolling on GCC? It doesn't seem that increasing code size helps performance (164.gzip & 197.parser). Are there comparisons for O2? I guess that

Re: Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM

2014-06-25 Thread Vladimir Makarov
On 2014-06-24, 10:57 AM, Ramana Radhakrishnan wrote: The ball-park number you have probably won't change much. I don't think Neon can improve the score for SPECInt2000 significantly but maybe I am wrong. It probably won't improve the overall score by a large amount but some individual benchmar

Re: Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM

2014-06-25 Thread Vladimir Makarov
On 2014-06-25, 10:02 AM, Richard Biener wrote: On Wed, Jun 25, 2014 at 4:00 PM, Vladimir Makarov wrote: On 2014-06-25, 5:32 AM, Renato Golin wrote: On 25 June 2014 10:26, Bingfeng Mei wrote: Why is GCC code size so much bigger than LLVM? Does -Ofast have more unrolling on GCC? It doesn't

Re: Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM

2014-06-25 Thread Vladimir Makarov
On 2014-06-25, 10:37 AM, Marc Glisse wrote: On Wed, 25 Jun 2014, Vladimir Makarov wrote: Maybe. But in this case LLVM did the right thing. The variable addressing was through a restrict pointer. Ah, gcc implements (on purpose?) a weak version of restrict, where it only considers that 2

Re: Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM

2014-06-25 Thread Vladimir Makarov
On 2014-06-25, 10:01 AM, Vladimir Makarov wrote: On 2014-06-24, 10:57 AM, Ramana Radhakrishnan wrote: I've tried these options too. As I guessed, it resulted in a GCC improvement of eon only by 6%, which improved the overall score by less than 0.5%. No change for LLVM though. Eon is more fp benc

Re: combination of read/write and earlyclobber constraint modifier

2014-07-02 Thread Vladimir Makarov
On 2014-07-01, 3:27 PM, Tom de Vries wrote: Vladimir, There are a few patterns which use both the read/write constraint modifier (+) and the earlyclobber constraint modifier (&): So my question is: is the combination of '&' and '+' supported ? If so, what is the exact semantics ? If not, shou

Re: Enable EBX for x86 in 32bits PIC code

2014-08-25 Thread Vladimir Makarov
On 2014-08-22 8:21 AM, Ilya Enkovich wrote: Hi, At Cauldron 2014 we had a couple of talks about relaxation of ebx usage in 32bit PIC mode. It was decided that the best approach would be to not fix the ebx register, use a pseudo register for the GOT base address and let the allocator do the rest. This sho

Re: Enable EBX for x86 in 32bits PIC code

2014-08-26 Thread Vladimir Makarov
On 08/26/2014 04:57 AM, Ilya Enkovich wrote: > 2014-08-26 11:49 GMT+04:00 Ilya Enkovich : >> 2014-08-25 19:08 GMT+04:00 Vladimir Makarov : >>> On 2014-08-22 8:21 AM, Ilya Enkovich wrote: >>>> Hi, >>>> >>>> On Cauldron 2014 we had a couple of

Re: Enable EBX for x86 in 32bits PIC code

2014-08-27 Thread Vladimir Makarov
On 2014-08-26 5:42 PM, Ilya Enkovich wrote: Hi, Here is a patch I tried. I apply it over revision 214215. Unfortunately I do not have a small reproducer but the problem can be easily reproduced on SPEC2000 benchmark 175.vpr. The problem is in read_arch.c:701 where float value is compared w

Re: Enable EBX for x86 in 32bits PIC code

2014-09-03 Thread Vladimir Makarov
On 2014-08-29 2:47 AM, Ilya Enkovich wrote: Seems your patch doesn't cover all cases. Attached is a modified patch (with your changes included) and a test where double constant is wrongly rematerialized. I also see in ira dump that there is still a copy of PIC reg created: Initialization of or

Re: Optimized Allocation of Argument registers

2014-11-17 Thread Vladimir Makarov
On 2014-11-17 8:13 AM, Ajit Kumar Agarwal wrote: Hello All: I was looking at the optimized usage and allocation to argument registers. There are two aspects to it as follows. 1. We need to specify the argument registers as followed by ABI in the target specific code. Based on the function a

Re: Optimized Allocation of Argument registers

2014-11-25 Thread Vladimir Makarov
On 11/24/2014 06:47 AM, Ajit Kumar Agarwal wrote:
> All:
>
> The optimization of reducing save and restore of the callee and caller saved
> register has been the attention Of
> increasing the performance of the benchmark. The callee saved registers is
> saved at the entry and restore at the
> ex

Re: A Question About LRA/reload

2014-12-09 Thread Vladimir Makarov
On 12/09/2014 04:37 AM, lin zuojian wrote:
> Hi,
> I have read ira/lra code for a while, but still fails to understand
> their relationship. The main question is why ira do color so early?
> lra pass will do the assignment anyway. Sorry if I mess up coloring
> and hard register assi

Re: bug in lra-constraints.c (simple_move_p register_move_cost)

2014-12-18 Thread Vladimir Makarov
On 2014-12-16 9:53 AM, BELBACHIR Selim wrote: Hi, I may have found a bug when I was trying to port my private backend to new LRA pass (using gcc 4.9.2+patches). In lra-constraints.c, in function simple_move_p, the target hook targetm.register_move_cost is called with two badly swapped paramet

Re: Allocating some Loop allocno in memory

2015-01-12 Thread Vladimir Makarov
On 2015-01-12 6:33 AM, Ajit Kumar Agarwal wrote: -Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Monday, January 12, 2015 2:33 PM To: Ajit Kumar Agarwal Cc: vmaka...@redhat.com; l...@redhat.com; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhu

Re: IRA : Changes in the cost of putting allocno into memory.

2015-01-12 Thread Vladimir Makarov
On 2015-01-12 2:25 PM, Jeff Law wrote: On 01/08/15 04:00, Ajit Kumar Agarwal wrote: Hello Vladimir: We have made the changes in the ira-color.c in ira_loop_edge_freq and move_spill_restore. The main motivation behind the change is to reduce the memory instruction with respect to the Loops.

Re: LRA and CANNOT_CHANGE_MODE_CLASS

2015-01-16 Thread Vladimir Makarov
On 2015-01-16 12:30 PM, Andreas Krebbel wrote: Hi, on S/390 I see invalid subregs being generated by LRA although CANNOT_CHANGE_MODE_CLASS is supposed to prevent these. The reason appears to be the code you've added with: commit c6a6cdaaea571860c94f9a9fe0f98c597fef7c81 Author: vmakarov Date

Re: Rematerialization and Live Range Splitting on Region Frequency

2015-01-26 Thread Vladimir Makarov
On 2015-01-25 4:55 AM, Ajit Kumar Agarwal wrote: Hello All: It looks like live range splitting and rematerialization are connected to each other. If the boundary of live range splitting is in a high-frequency region, then the moves connecting the split live ranges are inside the High fr

Re: Optimal Coalescing with respect to move instruction for Live range splitting

2015-01-26 Thread Vladimir Makarov
On 2015-01-18 12:37 AM, Ajit Kumar Agarwal wrote: Register allocation with a two-phase approach does optimal coalescing after spilling. Sometimes live range splitting makes the coalescing non-optimal. The split live ranges are connected by move instructions. Thus the live range splitting and

Re: Possible typo in LRA

2015-02-06 Thread Vladimir Makarov
On 2015-02-05 4:36 PM, sa...@hederstierna.com wrote: Hi When reviewing some code from LRA, I just saw some lines that looked a bit strange, could it be a possible typo perhaps? The file "lra.c" from GC5 master branch current date Line 469: /* Try x = index_scale; x = x + disp;

Re: Proposal for path splitting for reduction in register pressure for Loops.

2015-03-09 Thread Vladimir Makarov
On 2015-03-09 8:10 PM, Ajit Kumar Agarwal wrote: -Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: Monday, March 09, 2015 11:01 PM To: Richard Biener Cc: Ajit Kumar Agarwal; vmaka...@redhat.com; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; N

Re: More methods of reducing the register pressure

2015-04-19 Thread Vladimir Makarov
On 19/04/15 01:11 PM, Jeff Law wrote: On 04/19/2015 09:28 AM, Ajit Kumar Agarwal wrote: Hello All: To reduce the register pressure, I am proposing the following methods of reducing the registers. 1. Assigning same registers or sharing same register for the logical registers having the sam

Re: More methods of reducing the register pressure

2015-04-19 Thread Vladimir Makarov
On 19/04/15 11:28 AM, Ajit Kumar Agarwal wrote: Hello All: To reduce the register pressure, I am proposing the following methods of reducing the registers. 1. Assigning same registers or sharing same register for the logical registers having the same value. To determine the logical register

Re: IRA preferencing issues

2015-04-20 Thread Vladimir Makarov
On 17/04/15 09:26 AM, Matthew Fortune wrote: Wilco Dijkstra writes: While investigating why the IRA preferencing algorithm often chooses incorrect preferences from the costs, I noticed this thread: https://gcc.gnu.org/ml/gcc/2011-05/msg00186.html I am seeing the exact same issue on AArch64 -

Re: PR63633: May middle-end come up with hard regs for insn expanders?

2015-04-20 Thread Vladimir Makarov
On 17/04/15 05:58 AM, Georg-Johann Lay wrote: I took the liberty of CCing Vladimir; maybe he can propose how the backend can describe an efficient, constraint-based solution. The problem is about expanders producing insns with non-fixed hard-regs as in/out operands or clobbers. This includes move in
