Re: Help understand the may_be_zero field in loop niter information

2014-06-12 Thread Zdenek Dvorak
Hi, > > I noticed there is below code/comments about may_be_zero field in loop > > niter desc: > > > > tree may_be_zero;/* The boolean expression. If it evaluates to true, > >the loop will exit in the first iteration (i.e. > >its latch will not be executed),
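
Editor's sketch of what the quoted comment describes, under stated assumptions: the struct and function names below are hypothetical and are not GCC's actual niter descriptor or API. The loop modelled is `i = 0; do { i++; } while (i < n);` with 32-bit unsigned arithmetic. The symbolic iteration count `n - 1` is only meaningful when the separate "may be zero" condition is false; when it is true, the latch is never executed.

```c
#include <stdint.h>

/* Hypothetical stand-in for a niter descriptor; NOT GCC's real struct. */
struct niter_sketch {
  uint32_t niter;    /* latch executions, valid only if may_be_zero is 0 */
  int may_be_zero;   /* nonzero: the loop exits in its first iteration */
};

/* Describe the loop:  i = 0; do { i++; } while (i < n);  */
struct niter_sketch describe_loop(uint32_t n)
{
  struct niter_sketch d;
  d.niter = n - 1u;          /* wraps to 2^32-1 when n == 0 */
  d.may_be_zero = (n == 0);  /* guards the wrapped value above */
  return d;
}

/* A consumer must test may_be_zero before trusting niter. */
uint32_t latch_executions(struct niter_sketch d)
{
  return d.may_be_zero ? 0u : d.niter;
}

/* Reference: run the loop and count how often the latch is executed. */
uint32_t count_latch(uint32_t n)
{
  uint32_t i = 0, latch = 0;
  for (;;) {
    i++;
    if (!(i < n))
      break;       /* exit test fails: leave without executing the latch */
    latch++;       /* back edge taken */
  }
  return latch;
}
```

For `n == 0` the raw `niter` field holds 2^32-1, yet the loop body runs once and the latch never runs; ignoring the flag is what makes the count look absurdly large.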

Re: How do I modify SSA and copy basic blocks?

2013-04-25 Thread Zdenek Dvorak
Hi, > On Tue, 2013-04-23 at 15:24 -0600, Jeff Law wrote: > > > Well, you have to copy the blocks, adjust the edges and rewrite the SSA > > graph. I'd use duplicate_block to help. > > > > You really want to look at tree-ssa-threadupdate.c. There's a nice big > > block comment which gives the

Re: A question about loop ivopt

2012-05-15 Thread Zdenek Dvorak
Hi, > > > > Why can't we replace function force_expr_to_var_cost directly with > function > > > > computation_cost in tree-ssa-loop-ivopt.c? > > > > > > > > Actually I think it is inaccurate for the current recursive algorithm > in > > > > force_expr_to_var_cost to estimate expr cost. Instead > co

Re: A question about loop ivopt

2012-05-15 Thread Zdenek Dvorak
Hi, > > Why can't we replace function force_expr_to_var_cost directly with function > > computation_cost in tree-ssa-loop-ivopt.c? > > > > Actually I think it is inaccurate for the current recursive algorithm in > > force_expr_to_var_cost to estimate expr cost. Instead computation_cost can > > cou

Re: reverse conditionnal jump

2012-01-06 Thread Zdenek Dvorak
Hi, > I'm still developing a new private target backend (gcc4.5.2) and I noticed > something strange in the assembler generated for conditional jump. > > > The compiled C code source is : > > void funct (int c) { > int a; > a = 7; > if (c < 0) > a = 4; > return a; > } >

Re: Simplification of relational operations (was [patch for PR18942])

2011-12-02 Thread Zdenek Dvorak
Hi, > I'm looking at a missed optimizations in combine and it is similar to the one > you've fixed in PR18942 > (http://thread.gmane.org/gmane.comp.gcc.patches/81504). > > I'm trying to make GCC optimize > (leu:SI > (plus:SI (reg:SI) (const_int -1)) > (const_int 1)) > > into > > (leu:SI >

Re: Loop-iv.c ICEs on subregs

2010-11-25 Thread Zdenek Dvorak
Hi, > I'm investigating an ICE in loop-iv.c:get_biv_step(). I hope you can shed > some light on what the correct fix would be. > > The ICE happens when processing: > == > (insn 111 (set (reg:SI 304) >(plus (subreg:SI (reg:DI 251) 4) > (const_int 1 > > (

Re: non-algorithmic maintainers

2010-11-15 Thread Zdenek Dvorak
> On Mon, Nov 15, 2010 at 10:00 PM, Paolo Bonzini wrote: > > We currently have 3 non-algorithmic maintainers: > > > > loop optimizer          Zdenek Dvorak           o...@ucw.cz > > loop optimizer          Daniel Berlin           dber...@dberlin.org > > l

Re: Question about Doloop

2010-09-06 Thread Zdenek Dvorak
Hi, > Doloop optimization fails to be applied on the following inner loop > when compiling for PowerPC (GCC -r162294) due to: > > Doloop: number of iterations too costly to compute. strength reduction is performed in ivopts, introducing new variable: for (p = inptr; p < something; p += 3) ..
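
Editor's sketch, assuming an original loop that indexes `inptr[3 * i]` (the thread's actual source loop is not shown in this snippet, so that shape is a guess). It contrasts the indexed form with the strength-reduced pointer form mentioned in the reply, and shows one plausible reason the trip count becomes more expensive to recover: in the pointer form it has to be derived with a division by the step. All function names are hypothetical.

```c
#include <stddef.h>

/* Indexed form: the counter i is explicit, so the trip count is simply n. */
long sum_indexed(const int *inptr, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += inptr[3 * i];
  return sum;
}

/* Strength-reduced form: the multiplication is replaced by stepping a
   pointer by 3.  "end" plays the role of "something" in the snippet.
   Assumes inptr points to at least 3 * n elements. */
long sum_strength_reduced(const int *inptr, size_t n)
{
  long sum = 0;
  const int *end = inptr + 3 * n;
  for (const int *p = inptr; p < end; p += 3)
    sum += *p;
  return sum;
}

/* Trip count recovered from the pointer form: ceil((end - inptr) / 3).
   Assumes end >= inptr. */
size_t trip_count_from_pointers(const int *inptr, const int *end)
{
  return (size_t)(end - inptr + 2) / 3;
}
```

Both loops visit the same elements; only the second one hides the iteration count behind pointer arithmetic.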

Re: Irreducible loops in generated code

2010-08-19 Thread Zdenek Dvorak
Hi, > > I'm working on decompiling x86-64 binary programs, using branches to rebuild > > a control-flow graph and looking for loops. I've found a significant number > > of irreducible loops in gcc-produced code (irreducible loops are loops with > > more than one entry point), especially in -O3 opt

Re: A question about loop-unroll

2009-12-17 Thread Zdenek Dvorak
Hi, > > Is there a way to pass to the unroller the maximum number of iterations > > of the loop such that it can decide to avoid unrolling if > > the maximum number  is small. > > > > To be more specific, I am referring to the following case: > > After the vectorizer decides to peel for alignment

Re: Turning off unrolling to certain loops

2009-10-15 Thread Zdenek Dvorak
Hi, > I faced a similar issue a while ago. I filed a bug report > (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36712) In the end, > I implemented a simple tree-level unrolling pass in our port > which uses all the existing infrastructure. It works quite well for > our purpose, but I hesitated t

Re: Turning off unrolling to certain loops

2009-10-14 Thread Zdenek Dvorak
Hi, > Ok, I've actually gone a different route. Instead of waiting for the > middle end to perform this, I've directly modified the parser stage to > unroll the loop directly there. I think this is a very bad idea. First of all, getting the information needed to decide at this stage whether unro

Re: Turning off unrolling to certain loops

2009-10-08 Thread Zdenek Dvorak
Hi, > 2) I was using a simple example: > > #pragma unroll 2 > for (i=0;i<6;i++) > { > printf ("Hello world\n"); > } > > If I do this, instead of transforming the code into : > for (i=0;i<3;i++) > { > printf ("Hello world\n"); >
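
Editor's sketch of the transformation the poster expected from an unroll factor of 2, written out by hand in C. The function names and the call counter are hypothetical and serve only to compare the two forms; this says nothing about what the poster's modified compiler actually produced.

```c
#include <stdio.h>

/* The loop as written in the example: six iterations, one call each. */
int run_rolled(void)
{
  int calls = 0;
  for (int i = 0; i < 6; i++) {
    printf("Hello world\n");
    calls++;
  }
  return calls;
}

/* Unrolled by a factor of 2: three iterations, two copies of the body. */
int run_unrolled_by_2(void)
{
  int calls = 0;
  for (int i = 0; i < 3; i++) {
    printf("Hello world\n");
    calls++;
    printf("Hello world\n");
    calls++;
  }
  return calls;
}
```

Because 6 is divisible by 2, no remainder loop is needed here; with a trip count that is not a multiple of the factor, leftover iterations would have to be handled separately.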

Re: Scev analysing number of loop iterations returns (2^32-1) instead of -1

2009-10-07 Thread Zdenek Dvorak
> On Wed, Oct 7, 2009 at 7:21 PM, Tobias Grosser > wrote: > > On Wed, 2009-10-07 at 18:30 +0200, Tobias Grosser wrote: > >> On Wed, 2009-10-07 at 17:44 +0200, Richard Guenther wrote: > >> > On Wed, Oct 7, 2009 at 5:35 PM, Tobias Grosser > >> > wrote: > >> > > On Wed, 2009-10-07 at 17:23 +0200, Ri

Re: Scev analysing number of loop iterations returns (2^32-1) instead of -1

2009-10-07 Thread Zdenek Dvorak
Hi, > > Ah, indeed. Sorry for being confused. Is tree-niter-desc->assumptions > > or ->may_be_zero non-NULL? > > Yes both. I attached the gdb content for both. you need to check may_be_zero, which in your case should contain something like N <= 49. If this condition is true, the number of ite

Re: Turning off unrolling to certain loops

2009-10-05 Thread Zdenek Dvorak
Hi, > I was wondering if it was possible to turn off the unrolling to > certain loops. Basically, I'd like the compiler not to consider > certain loops for unrolling but fail to see how exactly I can achieve > that. > > I've traced the unrolling code to multiple places in the code (I'm > working

Re: RFC: missed loop optimizations from loop induction variable copies

2009-09-23 Thread Zdenek Dvorak
Hi, > IVOpts cannot identify start_26, start_4 and ivtmp_32_7 to be copies. > The root cause is that expression 'i + start' is identified as a common > expression between the test in the header and the index operation in the > latch. This is unified by copy propagation or FRE prior to loop > opt

Re: M32C vs PR tree-optimization/39233

2009-04-15 Thread Zdenek Dvorak
Hi, > Can we somehow make this fix contingent on ports that have suitable > integral modes? yes; however, maybe it would be easier to wait till Richard finishes the work on not representing the overflow semantics in types (assuming that's going to happen say in a few weeks?), which should make th

Re: New no-undefined-overflow branch

2009-02-27 Thread Zdenek Dvorak
Hi, > > I obviously thought about this. The issue with using a flag is > > that there is no convenient place to stick it and that it makes > > the distinction between the two variants less visible. Consider > > the folding routines that take split trees for a start. > > > > IMHO using new tree-

Re: New no-undefined-overflow branch

2009-02-27 Thread Zdenek Dvorak
Hi, > > introducing new codes seems like a bad idea to me. There are many > > places that do not care about the distinction between PLUS_EXPR and > > PLUSV_EXPR, and handling both cases will complicate the code (see eg. > > the problems caused by introducing POINTER_PLUS_EXPR vs PLUS_EXPR > > dis

Re: New no-undefined-overflow branch

2009-02-26 Thread Zdenek Dvorak
Hi, in general, I like this proposal a lot. However, > As a start there will be no-overflow variants of NEGATE_EXPR, > PLUS_EXPR, MINUS_EXPR, MULT_EXPR and POINTER_PLUS_EXPR. > > The sizetypes will simply be operated on in no-overflow variants > by default (by size_binop and friends). > > Nami

Re: Warning when compiling dwarf2out.c with -ftree-parallelize-loops=4

2008-11-25 Thread Zdenek Dvorak
Hi, > As far as I get it, there is no real failure here. > Parloop, unaware of the array's upper bound, inserts the 'enough > iterations' condition (i>400-1), and thereby > makes the last iteration range from 400 upwards. > VRP now has a constant it can compare to the array's upper bound. > Cor

Re: query regarding adding a pass to undo final value replacement.

2008-10-15 Thread Zdenek Dvorak
Hi, > > but you only take the hash of the argument of the phi node (i.e., the > > ssa name), not the computations on that it is based > > Is this something like what you had in mind ? > > gen_hash (stmt) > { > > if (stmt == NULL) > return 0; > > use_operand_p use_p; > ssa_op_

Re: query regarding adding a pass to undo final value replacement.

2008-10-14 Thread Zdenek Dvorak
Hi, > >> >> So if the ssa_names are in fact reused they won't be the same > >> >> computations. > >> > > >> > do you also check this for ssa names inside the loop (in your example, > >> > D.10_1? > >> > >> If we have to reinsert for a = phi (B) . We do the following checks. > >> > >> 1. If the edge

Re: query regarding adding a pass to undo final value replacement.

2008-10-13 Thread Zdenek Dvorak
Hi, > [Sorry about dropping the ball on this. I've had some trouble with > internet connectivity and was on vacation for a few days. ] > > On Thu, Oct 2, 2008 at 2:56 AM, Zdenek Dvorak <[EMAIL PROTECTED]> wrote: > > Hi, > > > >> >> b) If a

Re: query regarding adding a pass to undo final value replacement.

2008-10-01 Thread Zdenek Dvorak
Hi, > > I would disagree on that. Whether a final value replacement is > > profitable or not largely depends on whether it makes further > > optimization of the loop possible or not; this makes it difficult > > to find a good cost model. I think undoing FVR is a good approach > > to solve this p

Re: query regarding adding a pass to undo final value replacement.

2008-10-01 Thread Zdenek Dvorak
Hi, > >> b) If any PHI node has count zero it can be inserted back and its > >> corresponding computations removed, iff the argument of the PHI > >> node > >> still exists as an SSA variable. This means that we can insert > >> a_1 = PHI if D.10_1 still exists and hasnt b

Re: query regarding adding a pass to undo final value replacement.

2008-10-01 Thread Zdenek Dvorak
Hi, > On Wed, Oct 1, 2008 at 3:59 PM, Richard Guenther > <[EMAIL PROTECTED]> wrote: > > On Wed, Oct 1, 2008 at 3:22 PM, Ramana Radhakrishnan <[EMAIL PROTECTED]> > > wrote: > >> Hi , > >> > >> Based on the conversation in the thread at > >> http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've t

Re: query regarding adding a pass to undo final value replacement.

2008-10-01 Thread Zdenek Dvorak
Hi, > > Based on the conversation in the thread at > > http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a > > pass trying to undo final value replacement going. The initial > > implementation was done by Pranav Bhandarkar when he was employed at > > Azingo as part of work spons

Re: query regarding adding a pass to undo final value replacement.

2008-10-01 Thread Zdenek Dvorak
Hi, > b) If any PHI node has count zero it can be inserted back and its > corresponding computations removed, iff the argument of the PHI node > still exists as an SSA variable. This means that we can insert > a_1 = PHI if D.10_1 still exists and hasn't been removed by >

Re: Using cfglayout mode in the selective scheduler

2008-08-11 Thread Zdenek Dvorak
Hi, > > I am probably missing something: > > > >> The basic idea is enabling cfglayout mode and then ensuring that insn > >> stream and control flow are in sync with each other at all times. This > >> is required because e.g. on Itanium the final bundling happens right > >> after scheduling, and a

Re: Using cfglayout mode in the selective scheduler

2008-08-11 Thread Zdenek Dvorak
Hi, I am probably missing something: > The basic idea is enabling cfglayout mode and then ensuring that insn > stream and control flow are in sync with each other at all times. This > is required because e.g. on Itanium the final bundling happens right > after scheduling, and any extra jumps emit

Re: Out of ssa form only for some basic block

2008-04-16 Thread Zdenek Dvorak
Hi, > I'm trying to add a simple function to the callgraph using > cgraph_add_new_function() ( new function body is obtained by function > actually processed) . > I put my pass in pass_tree_loop.sub as first pass just after > pass_tree_loop_init pass, but I have some problems because the code > th

Re: Moving statements from one BB to other BB.

2008-04-15 Thread Zdenek Dvorak
Hi, > > > To clarify what Richard means, your assertion that "you have updated > > > SSA information" is false. > > > If you had updated the SSA information, the error would not occur :). > > > > > > How exactly are you updating the ssa information? > > > > > > The general way to update SSA

Re: Moving statements from one BB to other BB.

2008-04-15 Thread Zdenek Dvorak
Hi, > To clarify what Richard means, your assertion that "you have updated > SSA information" is false. > If you had updated the SSA information, the error would not occur :). > > How exactly are you updating the ssa information? > > The general way to update SSA for this case would be: > > For

Re: Fusing two loops

2008-04-10 Thread Zdenek Dvorak
Hi, > The error is rectified. The bug is in the function that calls fuse_loops(). > Now I am trying to transfer all the statements, using code - > > /* The following function fuses two loops. */ > > void > fuse_loops (struct loop *loop_a, struct loop *loop_b) > { > debug_loop (loop_a, 10); >

Re: Fusing two loops

2008-04-04 Thread Zdenek Dvorak
Hi, > I am trying to fuse two loops in tree level. For that, I am trying to > transfer statements in the header of one loop to the header of the > other one. > The code " http://rafb.net/p/fha0IG57.html " contains the 2 loops. > After moving a statement from one BB to another BB, do I need to

Re: [PATCH][RFC] Statistics "infrastructure"

2008-03-15 Thread Zdenek Dvorak
Hi, > > > A statistics event consists of a function (optional), a statement > > > (optional) and the counter ID. I converted the counters from > > > tree-ssa-propagate.c as an example, instead of > > > > > > prop_stats.num_copy_prop++; > > > > > > you now write > > > > > >

Re: [PATCH][RFC] Statistics "infrastructure"

2008-03-15 Thread Zdenek Dvorak
Hi, > This is an attempt to provide (pass) statistics collection. The > goal is to provide infrastructure to handle the current (pass specific) > statistics dumping that is done per function and per pass along the > regular tree/rtl dumps as well as to allow CU wide "fancy" analysis. > > The mos

Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-11 Thread Zdenek Dvorak
Hi, > On 03/10/08 08:24, Richard Guenther wrote: > > >You could either do > > > >GIMPLE_ASSIGN > > But 'cond' would be an unflattened tree expression. I'm trying to avoid > that. > > >or invent COND_GT_EXPR, COND_GE_EXPR, etc. (at least in GIMPLE > >we always have a comparison in COND_EXPR_C

Re: The effects of closed loop SSA and Scalar Evolution Const Prop.

2008-03-11 Thread Zdenek Dvorak
Hi, > Now tree scalar evolution goes over PHI nodes and realises that > aligned_src_35 has a scalar evolution {aligned_src_22 + 16, +, 16}_1) > where aligned_src_22 is > (const long int *) src0_12(D) i.e the original src pointer. Therefore > to calculate aligned_src_62 before the second loop comp

Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Zdenek Dvorak
Hi, > On 3/9/08 3:24 PM, Zdenek Dvorak wrote: > > >however, it would make things simpler. Now, we need to distinguish > >three cases -- SINGLE, UNARY and BINARY; if we pretended that > >GIMPLE_COPY is a unary operator, this would be reduced just > >to UNARY and B

Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Zdenek Dvorak
Hi, > >>So, what about adding a GIMPLE_COPY code? The code would have 0 > >>operands and used only for its numeric value. > > > >another possibility would be to make GIMPLE_COPY a unary operator, and > >get rid of the SINGLE_RHS case altogether (of course, unlike any other > >unary operator, it

Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Zdenek Dvorak
Hi, > So, what about adding a GIMPLE_COPY code? The code would have 0 > operands and used only for its numeric value. another possibility would be to make GIMPLE_COPY a unary operator, and get rid of the SINGLE_RHS case altogether (of course, unlike any other unary operator, it would not requir

Re: [tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-09 Thread Zdenek Dvorak
Hi, > On Sun, Mar 9, 2008 at 2:17 PM, Diego Novillo <[EMAIL PROTECTED]> wrote: > > On Sun, Mar 9, 2008 at 08:15, Richard Guenther > > <[EMAIL PROTECTED]> wrote: > > > > > What is GIMPLE_SINGLE_RHS after all? > > > > Represents a "copy" operation, an operand with no operator (e.g., a = 3, b >

[tuples] gimple_assign_subcode for GIMPLE_SINGLE_RHS

2008-03-08 Thread Zdenek Dvorak
Hi, I just noticed an error in a part of the code that I converted, that looks this way: switch (gimple_assign_subcode (stmt)) { case SSA_NAME: handle_ssa_name (); break; case PLUS_EXPR: handle_plus (); break; default: something (); } The problem of course is that for

Re: GCC loop optimizations

2008-02-29 Thread Zdenek Dvorak
Hi, > I'd like to know your experiences with the gcc loop optimizations. > > What loop optimizations (in your opinion) can be applied to a large > number of programs and yield a (significant) improvement of the > program run-time? in general, I would say invariant motion, load/store motion, stre

Re: Assigning a value to a temp var in GIMPLE SSA

2008-02-22 Thread Zdenek Dvorak
Hi, > I'm trying to add a simple statement to GIMPLE code adding a new pass, > that I put in pass_tree_loop.sub as last pass just before > pass_tree_loop_done pass. Just as test I'd like to add a call like: > > .palp = shmalloc (16); > > This is the code I'm using: > > t = build_fu

Re: [tuples] tree-tailcall.c

2008-02-21 Thread Zdenek Dvorak
Hi, > Zdenek, you committed changes to tree-tailcall.c but you didn't fully > convert the file. Was that a mis-commit? The file does not compile and > uses PHI_RESULT instead of gimple_phi_result. the file compiles for me; it indeed uses PHI_RESULT, but since that is equivalent to DEF_FROM_PT

Re: [tuples] Call for help converting passes

2008-02-11 Thread Zdenek Dvorak
Hi, > Everything else should work well enough for passes to be converted. > If anyone has some free cycles and are willing to put up with various > broken bits, would you be willing to help converting passes? There is > a list of the passes that need conversion in the tuples wiki > (http://gcc.gn

Re: Tree-SSA and POST_INC address mode inompatible in GCC4?

2007-11-03 Thread Zdenek Dvorak
Hi, > >> I believe that this is something new and is most likely fallout from > >> diego's reworking of the tree to rtl converter. > >> > >> To fix this will require a round of copy propagation, most likely in > >> concert with some induction variable detection, since the most > >> profitable plac

Re: Tree-SSA and POST_INC address mode inompatible in GCC4?

2007-11-03 Thread Zdenek Dvorak
Hi, > I believe that this is something new and is most likely fallout from > diego's reworking of the tree to rtl converter. > > To fix this will require a round of copy propagation, most likely in > concert with some induction variable detection, since the most > profitable place for this will b

Re: optimising recursive functions

2007-10-27 Thread Zdenek Dvorak
Hi, > > > So I am guessing the Felix version is lucky there are > > > no gratuitous temporaries to be saved when this happens, > > > and the C code is unlucky and there are. > > > > > > Maybe someone who knows how the optimiser works can comment? > > > > One problem with departing from the ABI eve

Re: problem with iv folding

2007-10-26 Thread Zdenek Dvorak
Hi, > traceback, tt, and ops follow. Why is this going wrong? > [ gdb ] call debug_tree(arg0) > type > [ gdb ] call debug_tree(arg1) > type

Re: From SSA back to GIMPLE.

2007-10-22 Thread Zdenek Dvorak
compilers in general (so that what you say makes some sense)? While I was mildly annoyed by your previous "contributions" to the discussion in the gcc mailing list, I could tolerate those. But answering a seriously meant question of a beginner by this confusing and completely irrelevant drivel is another thing. Sincerely, Zdenek Dvorak

Re: Question on GGC

2007-09-27 Thread Zdenek Dvorak
Hello, > I have several global variables which are of type rtx. They are used > in flow.c ia64.c and final.c. As stated in the internal doc with > types. I add GTY(()) marker after the keyword 'extern'. for example: > extern GTY(()) rtx a; > these 'extern's are added in regs.h which is in

Re: Re[2]: [GSoC: DDG export][RFC] Current status

2007-09-04 Thread Zdenek Dvorak
Hello, > An important missing piece is correction of exported information for > loop unrolling. As far as I can tell, for loop unrolled by factor N we > need to clone MEM_ORIG_EXPRs and datarefs for newly-created MEMs, create > no-dependence DDRs for those pairs, for which original DDR was > no-d

Re: question about rtl loop-iv analysis

2007-08-28 Thread Zdenek Dvorak
Hello, > And finally at the stage of rtl unrolling it looks like this: > [6] r186 = r2 + C; > r318 = r186 + 160; > loop: > r186 = r186 + 16 > if (r186 != r318) then goto loop else exit > > Then, in loop-unroll.c we call iv_number_of

Re: question about rtl loop-iv analysis

2007-08-28 Thread Zdenek Dvorak
Hello, > >> And finally at the stage of rtl unrolling it looks like this: > >> [6] r186 = r2 + C; > >> r318 = r186 + 160; > >> loop: > >> r186 = r186 + 16 > >> if (r186 != r318) then goto loop else exit > >> > >> Then, in loop-unroll.c we call iv_number_of_iterations, whi

Re: question about rtl loop-iv analysis

2007-08-28 Thread Zdenek Dvorak
Hello, > And finally at the stage of rtl unrolling it looks like this: > [6] r186 = r2 + C; > r318 = r186 + 160; > loop: > r186 = r186 + 16 > if (r186 != r318) then goto loop else exit > > Then, in loop-unroll.c we call iv_number_of_iterations, which eventually > calls i
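
Editor's sketch of the arithmetic behind the quoted RTL loop, assuming 32-bit wrapping registers. The loop adds 16 to `r186` and exits once it equals `r318`, which was set to the initial `r186 + 160`. The helper names are hypothetical and this is not how `iv_number_of_iterations` is implemented; it only shows the count one would expect for a `!=` exit test when the step divides the distance.

```c
#include <stdint.h>

/* Closed form for:  do { r += step; } while (r != bound);
   Assumes step != 0, bound != start, and that step divides
   (bound - start) modulo 2^32.  The subtraction is done in unsigned
   arithmetic, so it stays correct if the register wraps around. */
uint32_t body_executions_ne(uint32_t start, uint32_t bound, uint32_t step)
{
  return (bound - start) / step;
}

/* Reference: actually run the loop and count body executions. */
uint32_t simulate_ne_loop(uint32_t start, uint32_t bound, uint32_t step)
{
  uint32_t r = start, count = 0;
  do {
    r += step;     /* r186 = r186 + 16 */
    count++;
  } while (r != bound);   /* if (r186 != r318) goto loop */
  return count;
}
```

With a distance of 160 and a step of 16 the body runs 10 times, whatever the value of `r2 + C` is, including values close enough to 2^32 that `r186` wraps during the loop.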

Re: GCC 4.3.0 Status Report (2007-08-09)

2007-08-12 Thread Zdenek Dvorak
Hello, > > Are there any folks out there who have projects for Stage 1 or Stage 2 > > that they are having trouble getting reviewed? Any comments > > re. timing for Stage 3? > > Zadeck has the parloop branch patches, which I've been reviewing. I am > not sure how many other patches are left, bu

Re: RFC: Rename Non-Autpoiesis maintainers category

2007-07-27 Thread Zdenek Dvorak
Hello, > I liked the idea of 'Reviewers' more than any of the other options. > I would like to go with this patch, unless we find a much better > option? to cancel this category of maintainers completely? I guess it was probably discussed before (I am too lazy to check), but the existence of non

Re: Loop optimizations cheatsheet

2007-07-20 Thread Zdenek Dvorak
Hello, > Can you send out your presentation too? the slides and the example code are at http://kam.mff.cuni.cz/~rakdver/slides-gcc2007.pdf http://kam.mff.cuni.cz/~rakdver/diff_reverse.diff Zdenek

Loop optimizations cheatsheet

2007-07-20 Thread Zdenek Dvorak
Hello, you can find the cheatsheet I used during my loop optimizations tutorial on gccsummit at http://kam.mff.cuni.cz/~rakdver/loopcheat.ps Zdenek

Re: Re[2]: [GSoC: DDG export][RFC] Current status

2007-07-15 Thread Zdenek Dvorak
Hello, > Testing on tree-vectorizer testsuite and some of the GCC source files > showed that frequent source of apparent loss of exported information > were passes that performed basic block reordering or jump threading. > The verifier asserted that number of loops was constant and the order > the

Re: Does unrolling prevents doloop optimizations?

2007-06-30 Thread Zdenek Dvorak
Hello, > > > It doesn't seem that the number of iterations analysis from loop-iv.c > deals > > > with EQ closing branches. > > > > loop-iv works just fine for EQ closing branches. > > > > Thanks for the clarification (I didn't see EQ in iv_number_of_iterations's > switch (cond)). that is because

Re: Does unrolling prevents doloop optimizations?

2007-06-30 Thread Zdenek Dvorak
ed form here. */ > + > + return 0; > +} > /* Return nonzero if the loop specified by LOOP is suitable for >the use of special low-overhead looping instructions. DESC >describes the number of iterations of the loop. */ > Index: modulo-sched.c > =====

Re: Does unrolling prevents doloop optimizations?

2007-06-30 Thread Zdenek Dvorak
Hello, > It doesn't seem that the number of iterations analysis from loop-iv.c deals > with EQ closing branches. loop-iv works just fine for EQ closing branches. Zdenek > One option is for sms to use > doloop_condition_get/loop-iv analysis in their current form, and if failed > check (on our ow

Re: Does unrolling prevents doloop optimizations?

2007-06-29 Thread Zdenek Dvorak
Hello, > By "this change" I mean just commenting out the check in > doloop_condition_get. After applying the patch that introduced DOLOOP > patterns for SPU (http://gcc.gnu.org/ml/gcc-patches/2007-01/msg01470.html) > we needed this hack in order to be able to use the doloop_condition_get to > retu

Re: Does unrolling prevents doloop optimizations?

2007-06-29 Thread Zdenek Dvorak
anyway, you cannot submit new changes for 4.1). Zdenek > Thanks, > Vladimir > > On 6/12/07, Zdenek Dvorak <[EMAIL PROTECTED]> wrote: > > > >Hello, > > > >> To make sure I understood you correctly, does it mean that the change > >> (below in /*

Re: [tuples] Accessors for RHS of assignments

2007-06-20 Thread Zdenek Dvorak
Hello, > So, I think I am still not convinced which way we want to access the RHS > of a GS_ASSIGN. > > Since GS_ASSIGN can have various types of RHS, we originally had: > > gs_assign_unary_rhs (gs) <- Access the only operand on RHS > gs_assign_binary_rhs1 (gs)<- Access the 1st RHS oper

Re: machine learning for loop unrolling

2007-06-15 Thread Zdenek Dvorak
Hello, > Of course, instead of clock(), I'd like to use a non-intrusive > mechanism. However, my research on this topic didn't lead to anything > but perfsuite, which doesn't work very well for me (should it?). > > So here are the questions > > - how can I actually insert the code (I need to do

Re: Does unrolling prevents doloop optimizations?

2007-06-12 Thread Zdenek Dvorak
mewhere else. Zdenek > Thanks, > Vladimir > > > On 6/12/07, Zdenek Dvorak <[EMAIL PROTECTED]> wrote: > >Hello, > > > >> In file loop_doloop.c function doloop_condition_get makes sure that > >> the condition is GE or NE > >> otherwise

Re: Does unrolling prevents doloop optimizations?

2007-06-12 Thread Zdenek Dvorak
Hello, > In file loop_doloop.c function doloop_condition_get makes sure that > the condition is GE or NE > otherwise it prevents doloop optimizations. This caused a problem for > a loop which had NE condition without unrolling and EQ if unrolling > was run. actually, doloop_condition_get is not a

Re: Help understanding tree-affine.c

2007-06-11 Thread Zdenek Dvorak
Hello, > I am trying to understand the usage of some functions in tree-affine.c > file and I appreciate your help. > > For example; for the two memory accesses > arr[b+8].X and arr[b+9].X, how does their affine combinations > will look like after executing the following sequence of operation?

Re: machine learning for loop unrolling

2007-06-08 Thread Zdenek Dvorak
Hello, > The number of floating point ops. in loop body. > The number of memory ops. in loop body. > The number of operands in loop body. > The number of implicit instructions in loop body. > The number of unique predicates in loop body. > The number of indirect references in loop body. > The numb

Re: [rfc] Moving bbs back to pools

2007-06-08 Thread Zdenek Dvorak
Hello, > > The problem is, that it does not give any speedups (it is almost > > completely compile-time neutral for compilation of preprocessed > > gcc sources). I will check whether moving also edges to pools > > changes anything, but so far it does not seem very promising :-( > > Well, the ben

Re: [rfc] Moving bbs back to pools

2007-06-07 Thread Zdenek Dvorak
Hello, > Ian Lance Taylor <[EMAIL PROTECTED]> writes: > > > Zdenek Dvorak <[EMAIL PROTECTED]> writes: > > > > > The problem is, that it does not give any speedups (it is almost > > > completely compile-time neutral for compilation of preprocessed

[rfc] Moving bbs back to pools

2007-06-07 Thread Zdenek Dvorak
Hello, as discussed in http://gcc.gnu.org/ml/gcc-patches/2007-05/msg01133.html, it might be a good idea to try moving cfg to alloc pools. The patch below does that for basic blocks (each function has a separate pool from which its basic blocks are allocated). At the moment, the patch breaks preco

Re: Predictive commoning miscompiles 482.sphinx3 in SPEC CPU 2006

2007-06-01 Thread Zdenek Dvorak
Hello, > > Because the patch had other effects like adding a DCE after Copyprop > > in the loop optimizer section. > > > > Disable DCE after Copyprop in the loop optimizer section fixes my > problem. Any idea why? no, not really; it could be anything (it may even have nothing to do with dce, pe

Re: A problem with the loop structure

2007-05-03 Thread Zdenek Dvorak
Hello, > ii) > In loop_version there are two calls to loop_split_edge_with > 1. loop_split_edge_with (loop_preheader_edge (loop), NULL); > 2. loop_split_edge_with (loop_preheader_edge (nloop), NULL); > nloop is the versioned loop, loop is the original. > > loop_split_edge_with has the following

Re: A problem with the loop structure

2007-04-28 Thread Zdenek Dvorak
Hello, > (based on gcc 4.1.1). now that is a problem; things have changed a lot since then, so I am not sure how much I will be able to help. > 1. The problem was unveiled by compiling a testcase with dump turned > on. The compilation failed while calling function get_loop_body from > flow_loop_

Re: GCC 4.2.0 Status Report (2007-04-24)

2007-04-25 Thread Zdenek Dvorak
Hello, > 4. PR 31360: Missed optimization > > I don't generally mark missed optimization bugs as P1, but not hoisting > loads of zero out of a 4-instruction loop is bad. Zdenek has fixed this > on mainline. Andrew says that patch has a bug. So, what's the story here? I found the problem, I wi

Re: GCC mini-summit - compiling for a particular architecture

2007-04-22 Thread Zdenek Dvorak
Hello, > On Sun, 2007-04-22 at 14:44 +0200, Richard Guenther wrote: > > On 4/22/07, Laurent GUERBY <[EMAIL PROTECTED]> wrote: > > > > > but also does not make anyone actually use the options. Nobody reads > > > > > the documention. Of course, this is a bit overstatement, but with a > > > > > few

Re: GCC mini-summit - compiling for a particular architecture

2007-04-22 Thread Zdenek Dvorak
> Look from what we're starting: > > << > @item -funroll-loops > @opindex funroll-loops > Unroll loops whose number of iterations can be determined at compile > time or upon entry to the loop. @option{-funroll-loops} implies > @option{-frerun-cse-after-loop}. This option makes code larger, > and

Re: GCC mini-summit - compiling for a particular architecture

2007-04-20 Thread Zdenek Dvorak
Hello, > Steve Ellcey wrote: > > >This seems unfortunate. I was hoping I might be able to turn on loop > >unrolling for IA64 at -O2 to improve performance. I have only started > >looking into this idea but it seems to help performance quite a bit, > >though it is also increasing size quite a bi

Re: adding dependence from prefetch to load

2007-04-12 Thread Zdenek Dvorak
Hello, > Well, the target architecture is actually quite peculiar, it's a > parallel SPMD machine. The only similarity with MIPS is the ISA. The > latency I'm trying to hide is somewhere around 24 cycles, but because it > is a parallel machine, up to 1024 threads have to stall for 24 cycles in

Re: adding dependence from prefetch to load

2007-04-12 Thread Zdenek Dvorak
Hello, > 2. Right now I am inserting a __builtin_prefetch(...) call immediately > before the actual read, getting something like: > D.1117_12 = &A[D.1101_14]; > __builtin_prefetch (D.1117_12, 0, 1); > D.1102_16 = A[D.1101_14]; > > However, if I enable the instruction scheduler pass, it doesn

Re: Proposal: changing representation of memory references

2007-04-05 Thread Zdenek Dvorak
Hello, > >Remarks: > >-- it would be guaranteed that the indices of each memory reference are > > independent, i.e., that &ref[idx1][idx2] == &ref[idx1'][idx2'] only > > if idx1 == idx1' and idx2 == idx2'; this is important for dependency > > analysis (and for this reason we also need to reme

Re: Proposal: changing representation of memory references

2007-04-04 Thread Zdenek Dvorak
Hello, > >> >> That is, unless we could share most of the index struct (upper, > >> >> lower, step) among expressions that access them (IE make index be > >> >> immutable, and require unsharing and resharing if you want to modify > >> >> the expression). > >> > > >> >That appears a bit dangerous

Re: Proposal: changing representation of memory references

2007-04-04 Thread Zdenek Dvorak
Hello, > >at the moment, any pass that needs to process memory references is > >complicated (or restricted to handling just a limited set of cases) by > >the need to interpret the quite complex representation of memory > >references that we have in gimple. For example, there are about 1000 >

Re: Proposal: changing representation of memory references

2007-04-04 Thread Zdenek Dvorak
Hello, > > -- base of the reference > > -- constant offset > > -- vector of indices > > -- type of the accessed location > > -- original tree of the memory reference (or another summary of the > > structure of the access, for aliasing purposes) > > -- flags > > What do you do with Ada COMPO

Re: Proposal: changing representation of memory references

2007-04-04 Thread Zdenek Dvorak
Hello, > >> >-- flags > >> > > >> >for each index, we remember > >> >-- lower and upper bound > >> >-- step > >> >-- value of the index > >> > >> This seems a lot; however, since most of it can be derived from the > >> types, why are we also keeping it in the references? > > > >The lower bound and

Re: Proposal: changing representation of memory references

2007-04-04 Thread Zdenek Dvorak
Hello, > >Proposal: > > > >For each memory reference, we remember the following information: > > > >-- base of the reference > >-- constant offset > >-- vector of indices > >-- type of the accessed location > >-- original tree of the memory reference (or another summary of the > > structure o

Re: Proposal: changing representation of memory references

2007-04-04 Thread Zdenek Dvorak
Hello, > >> This looks like a very complicated (though very generic) way of > >> specifying a memory > >> reference. Last time we discussed this I proposed to just have BASE, > >OFFSET > >> and accessed TYPE (and an alias tag of the memory reference). I realize > >> this > >> doesn't cover acce

Re: Proposal: changing representation of memory references

2007-04-04 Thread Zdenek Dvorak
Hello, > This looks like a very complicated (though very generic) way of > specifying a memory > reference. Last time we discussed this I proposed to just have BASE, OFFSET > and accessed TYPE (and an alias tag of the memory reference). I realize > this > doesn't cover accesses to multi-dimensi

Re: Proposal: changing representation of memory references

2007-04-04 Thread Zdenek Dvorak
Hello, > This looks like a very complicated (though very generic) way of > specifying a memory > reference. Last time we discussed this I proposed to just have BASE, OFFSET > and accessed TYPE (and an alias tag of the memory reference). I realize > this > doesn't cover accesses to multi-dimensi

Proposal: changing representation of memory references

2007-04-04 Thread Zdenek Dvorak
Hello, at the moment, any pass that needs to process memory references is complicated (or restricted to handling just a limited set of cases) by the need to interpret the quite complex representation of memory references that we have in gimple. For example, there are about 1000 lines of quite

Re: Valid gimple for MEM_REF

2007-03-04 Thread Zdenek Dvorak
Hello, > > only gimple_vals (name or invariant). However, the expressions are > >matched in final_cleanup dump (after out-of-ssa and ter), so this no > >longer is the case. I think just the regular expressions need to be > >updated. > > Then IV-OPTs has an issue too but IV-OPTs dump gives: > D
