Thanks Revital; it may actually be better to include the scheduling of the branch even where the doloop pattern is simple, as it rightfully accounts for its resources.
> OK for mainline? Yes. I have only the following minor comments: +/* Bump the SCHED_TIMEs of all nodes by AMOUNT. Set the values of + SCHED_ROW and SCHED_STAGE. */ please clarify that, e.g., instruction scheduled on cycle AMOUNT will move to cycle zero. + /* The calculation of stage count is done adding the number + of stages before cycle zero and after cycle zero. */ + sc_until_cycle_zero = CALC_STAGE_COUNT (-1, new_min_cycle, ii); + + if (SCHED_TIME (u) < 0) + { + stage = CALC_STAGE_COUNT (-1, SCHED_TIME (u), ii); + SCHED_STAGE (u) = sc_until_cycle_zero - stage; + } + else + { + stage = CALC_STAGE_COUNT (SCHED_TIME (u), 0, ii); + SCHED_STAGE (u) = sc_until_cycle_zero + stage - 1; + } shouldn't normalized_time be used here instead of SCHED_TIME (u)? In any case, an alternative way of calculating the stage count of u can be (iinm): CALC_STAGE_COUNT (normalized_time, new_zero_kelvin, ii); where new_zero_kelvin = new_min_cycle - SMODULO (new_min_cycle, ii); As you've pointed out, this change may effect code size (by increasing the stage count) and/or performance (by increasing the ii, although the branch might not have had an available slot originally). So it may be interesting to share comparative measurements when available. Ayal. Revital Eres <revital.e...@linaro.org> wrote on 08/05/2011 07:37:07 AM: > From: Revital Eres <revital.e...@linaro.org> > To: Ayal Zaks/Haifa/IBM@IBMIL > Cc: gcc-patches@gcc.gnu.org, Patch Tracking <patc...@linaro.org> > Date: 08/05/2011 07:37 AM > Subject: [PATCH, SMS 1/3] Support closing_branch_deps (second try) > > Hello, > > The attached patch includes enhancements for SMS to support targets > that their doloop part is not decoupled from the rest of the loop's > instructions, as SMS currently requires. In this case, the branch can > not be placed wherever we want (as is currently done) due to the fact > it must honor dependencies and thus we schedule the branch instruction > with the rest of the loop's instructions and rotate it to be in row > ii-1 at the end of the scheduling procedure to make sure it's the last > instruction in the iteration. > > The attached patch changes the current implementation to always schedule > the branch in order to support the above case. > > As explained in http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00250.html > by always scheduling the branch the code size might be effected due to > the fact SC can be increased by 1, which means adding instructions from > at most one iteration to the prologue and epilogue. Also, it might > be that ii will be increased by one due to resources constraints -- > unavailability of free slots to schedule the branch. > > The patch was tested together with the rest of the patches in this series > and on top of the patch to support do-loop for ARM (not yet in mainline, > but approved http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01718.html). > On ppc64-redhat-linux regtest as well as bootstrap with SMS flags > enabling SMS also on loops with stage count 1. Regtested on SPU. > On arm-linux-gnueabi regtseted on c,c++. Bootstrap c language with SMS > flags enabling SMS also on loops with stage count 1. > > OK for mainline? > > Thanks, > Revital > > ChangeLog: > > * ddg.c (create_ddg_dep_from_intra_loop_link): If a true dep edge > enters the branch create an anti edge in the opposite direction > to prevent the creation of reg-moves. > * modulo-sched.c: Adjust comment to reflect the fact we are > scheduling closing branch. > (PS_STAGE_COUNT): Rename to CALC_STAGE_COUNT and redefine. > (stage_count): New field in struct partial_schedule. > (calculate_stage_count): New function. > (normalize_sched_times): Rename to reset_sched_times and handle > incrementing the sched time of the nodes by a constant value > passed as parameter. > (duplicate_insns_of_cycles): Skip closing branch. > (sms_schedule_by_order): Schedule closing branch. > (ps_insn_find_column): Handle closing branch. > (sms_schedule): Call reset_sched_times and adjust the code to > support scheduling of the closing branch. > (ps_insert_empty_row): Update calls to normalize_sched_times > and rotate_partial_schedule functions. > > testsuite Changlog: > > * gcc.target/arm/sms-9.c: New file. > * gcc.target/arm/sms-10.c: New file. > [attachment "patch_final_linaro_6_5.txt" deleted by Ayal Zaks/Haifa/IBM] > [attachment "sms-9.c" deleted by Ayal Zaks/Haifa/IBM] [attachment "sms-10.c" > deleted by Ayal Zaks/Haifa/IBM]