Re: [PATCH RFC]Pair load store instructions using a generic scheduling fusion pass

Jeff Law Thu, 30 Oct 2014 12:43:54 -0700

On 10/10/14 21:32, Bin.Cheng wrote:

Mike already gave great answers, here are just some of my thoughts on
the specific questions.  See embedded below.

Thanks to both of you for your answers.

Fundamentally, what I see is this scheme requires us to be able to come up with a key based solely on information in a particular insn. To get fusion, another insn has to have the same or a closely related key.

This implies that the two candidates for fusion are related, even if there isn't a data dependency between them. The canonical example would be two loads with reg+d addressing modes. If they use the same base register and the displacements differ by a word, then we don't have a data dependency between the insns, but the insns are closely related by their address computations and we can compute a key to ensure those two related insns end up consecutive. At any given call to the hook, the only context we can directly see is the current insn.
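To make that concrete, here is a minimal sketch in C of the reg+d case. It is a toy model, not the patch's code: `toy_load`, `toy_key`, `toy_fusion_key`, `toy_sort_by_key` and `toy_is_pair` are hypothetical names, and the choice of base register as the grouping key and displacement as the ordering key is my assumption about how a target might fill in such a key. The point it illustrates is that the key is computed from one insn at a time, yet sorting on it still puts the related loads next to each other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a reg+d load; not GCC's rtx_insn. */
struct toy_load {
    int dest_reg;   /* register being loaded */
    int base_reg;   /* base register of the address */
    int offset;     /* displacement in bytes */
};

/* Hypothetical key: computed from one insn only, with no view of its neighbours. */
struct toy_key {
    int fusion_pri; /* groups related insns: here, the base register */
    int pri;        /* orders insns inside a group: here, the displacement */
};

static struct toy_key toy_fusion_key(const struct toy_load *ld)
{
    struct toy_key k;
    k.fusion_pri = ld->base_reg;
    k.pri = ld->offset;
    return k;
}

/* qsort comparator: order by fusion_pri first, then by pri. */
static int toy_key_cmp(const void *a, const void *b)
{
    struct toy_key ka = toy_fusion_key(a);
    struct toy_key kb = toy_fusion_key(b);
    if (ka.fusion_pri != kb.fusion_pri)
        return (ka.fusion_pri > kb.fusion_pri) - (ka.fusion_pri < kb.fusion_pri);
    return (ka.pri > kb.pri) - (ka.pri < kb.pri);
}

/* Reorder the loads so that insns with related keys end up consecutive. */
static void toy_sort_by_key(struct toy_load *v, size_t n)
{
    assert(v != NULL || n == 0);
    if (n > 1)
        qsort(v, n, sizeof *v, toy_key_cmp);
}

/* Nonzero if a and b share a base and b's displacement is one word above a's. */
static int toy_is_pair(const struct toy_load *a, const struct toy_load *b,
                       int word_size)
{
    return a->base_reg == b->base_reg && b->offset - a->offset == word_size;
}
```

After sorting, a later pass only has to look at neighbouring insns with something like `toy_is_pair` to find pairing candidates. Two totally independent insns of different static types have nothing comparable to put in such a key.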

I'm pretty sure if I were to tweak the ARM bits ever-so-slightly it could easily model the load-load or store-store special case on the PA7xxx[LC] processors. Normally a pair of loads or stores can't dual issue. But if the two loads (or two stores) hit the upper and lower half of a double-word object, then the instructions can dual issue.
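As a rough illustration of that condition, the check below is a sketch under stated assumptions and not a description of the real pipeline rules: it assumes 4-byte words, 8-byte double-words, and a base register that is itself double-word aligned. `toy_mem_op` and `toy_same_doubleword` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified memory operation; not a GCC data structure. */
struct toy_mem_op {
    int is_store;   /* 0 for a load, 1 for a store */
    int base_reg;   /* base register of the address */
    int offset;     /* displacement in bytes */
};

/* Nonzero if a and b are both loads or both stores and touch the two
   halves of one double-word.  Assumes the base register holds an
   8-byte-aligned address, so alignment can be judged from the offset.  */
static int toy_same_doubleword(const struct toy_mem_op *a,
                               const struct toy_mem_op *b)
{
    const int word = 4, dword = 8;   /* assumed sizes */
    int lo, hi;

    assert(a != NULL && b != NULL);
    if (a->is_store != b->is_store || a->base_reg != b->base_reg)
        return 0;
    lo = a->offset < b->offset ? a->offset : b->offset;
    hi = a->offset < b->offset ? b->offset : a->offset;
    return hi - lo == word && lo % dword == 0;
}
```

The only difference from the plain adjacent-word pairing test is the extra alignment requirement on the lower displacement, which is the kind of small tweak I have in mind.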

I'd forgotten about that special case scheduling opportunity until I started looking at some unrelated enhancement for prefetching.

Your code would also appear to allow significant cleanup of the old caller-save code that had a fair amount of bookkeeping added to issue double-word memory loads/stores rather than single word operations. This *greatly* improved performance on the old sparc processors which had no call-saved FP registers.

However, your new code doesn't handle fusing instructions which are totally independent and of different static types. There just isn't a good way to compute a key that I can see. And this is OK -- that case, if we cared to improve it, would be best handled by the SCHED_REORDER hooks.


I guess another way to ask the question, are fusion priorities static based on 
the insn/alternative, or can they vary?  And if they can vary, can they vary 
each tick of the scheduler?


Though this pass works on predefined fusion types and priorities now,
there might be two possible fixes for this specific problem.
1) Introduce another exclusive_pri, now it's like "fusion_pri,
priority, exclusive_pri".  The first one is assigned to mark
instructions belonging to same fusion type.  The second is assigned to
fuse each pair/consecutive instructions together.  The last one is
assigned to prevent specific pair of instructions from being fused,
just like "BC" mentioned.
2) Extend the idea by using hook function
TARGET_SCHED_REORDER/TARGET_SCHED_REORDER2.  Now we can assign
fusion_pri at the first place, making sure instructions in same fusion
type will be adjacent to each other, then we can change priority (thus
reorder the ready list) at back-end's wish even per each tick of the
scheduler.

#2 would be the best solution for the case I was pondering, but I don't think solving that case is terribly important given the processors for which it was profitable haven't been made for a very long time.


Jeff
