On 10/10/14 21:32, Bin.Cheng wrote:
Mike already gave great answers, here are just some of my thoughts on
the specific questions.  See embedded below.
Thanks to both of you for your answers.

Fundamentally, what I see is that this scheme requires us to be able to come up with a key based solely on information in a particular insn. To get fusion, another insn has to have the same or a closely related key.

This implies that the two candidates for fusion are related, even if there isn't a data dependency between them. The canonical example would be two loads with reg+d addressing modes. If they use the same base register and the displacements differ by a word, then we don't have a data dependency between the insns, but the insns are closely related by their address computations and we can compute a key to ensure those two related insns end up consecutive. At any given call to the hook, the only context we can directly see is the current insn.
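
As a rough illustration of the key idea, here is a toy Python model (not GCC code). The `Load` type, the `fusion_key` function and the 4-byte word size are all assumptions made up for this sketch, and it ignores real dependencies, which an actual scheduler cannot do. The point is only that a key computed from one insn at a time is enough to make related loads end up consecutive.

```python
from dataclasses import dataclass

WORD_SIZE = 4  # assumed word size in bytes, for illustration only


@dataclass(frozen=True)
class Load:
    """Toy stand-in for a load insn with reg+d addressing (hypothetical type)."""
    dest: str
    base: str
    offset: int


def fusion_key(insn):
    """Compute a key from this insn alone: (base register, displacement)."""
    return (insn.base, insn.offset)


def order_for_fusion(insns):
    """Sort insns by key. Ignores data dependencies; a real scheduler cannot."""
    return sorted(insns, key=fusion_key)


def adjacent_word_pairs(ordered):
    """Return consecutive pairs sharing a base with displacements one word apart."""
    return [
        (a, b)
        for a, b in zip(ordered, ordered[1:])
        if a.base == b.base and b.offset - a.offset == WORD_SIZE
    ]


insns = [
    Load("r1", "sp", 8),
    Load("r2", "r9", 0),
    Load("r3", "sp", 4),
    Load("r4", "r9", 16),
]
ordered = order_for_fusion(insns)
pairs = adjacent_word_pairs(ordered)
```

Here the two `sp`-based loads end up next to each other even though neither key computation looked at the other insn.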

I'm pretty sure if I were to tweak the ARM bits ever-so-slightly it could easily model the load-load or store-store special case on the PA7xxx[LC] processors. Normally a pair of loads or stores can't dual issue. But if the two loads (or two stores) hit the upper and lower half of a double-word object, then the instructions can dual issue.
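
The pairing condition described above might be sketched as a predicate like the following. This is a toy Python model, not taken from any GCC port: the function name, the 4-byte word and 8-byte double-word sizes, and the requirement that the lower displacement be double-word aligned are assumptions for illustration, since the exact hardware rule isn't spelled out here.

```python
WORD = 4          # assumed word size in bytes
DOUBLE_WORD = 8   # assumed double-word size in bytes


def can_dual_issue(base_a, off_a, base_b, off_b):
    """True if two same-kind memory ops cover both halves of one double word.

    Assumes the base register itself holds a double-word-aligned address,
    which this toy model has no way to check.
    """
    if base_a != base_b:
        return False
    lo, hi = sorted((off_a, off_b))
    # The two accesses must be one word apart, and the lower one must
    # start on a double-word boundary so they do not straddle two objects.
    return hi - lo == WORD and lo % DOUBLE_WORD == 0
```

Under these assumptions, displacements 0 and 4 off the same base qualify in either order, while 4 and 8 do not because they straddle a double-word boundary.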

I'd forgotten about that special case scheduling opportunity until I started looking at some unrelated enhancement for prefetching.

Your code would also appear to allow significant cleanup of the old caller-save code that had a fair amount of bookkeeping added to issue double-word memory loads/stores rather than single word operations. This *greatly* improved performance on the old sparc processors which had no call-saved FP registers.

However, your new code doesn't handle fusing instructions which are totally independent and of different static types. There just isn't a good way to compute a key that I can see. And this is OK -- that case, if we cared to improve it, would be best handled by the SCHED_REORDER hooks.


I guess another way to ask the question: are fusion priorities static based on
the insn/alternative, or can they vary?  And if they can vary, can they vary
each tick of the scheduler?

Though this pass works on predefined fusion types and priorities now,
there might be two possible fixes for this specific problem.
1) Introduce another exclusive_pri, now it's like "fusion_pri,
priority, exclusive_pri".  The first one is assigned to mark
instructions belonging to the same fusion type.  The second is assigned to
fuse each pair of consecutive instructions together.  The last one is
assigned to prevent a specific pair of instructions from being fused,
just like "BC" mentioned.
2) Extend the idea by using the hook functions
TARGET_SCHED_REORDER/TARGET_SCHED_REORDER2.  Now we can assign
fusion_pri in the first place, making sure instructions of the same fusion
type will be adjacent to each other, then we can change priority (thus
reordering the ready list) as the back end wishes, even on each tick of the
scheduler.
#2 would be the best solution for the case I was pondering, but I don't think solving that case is terribly important given the processors for which it was profitable haven't been made for a very long time.
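
For what it's worth, the per-tick reordering idea in #2 can be pictured with a toy Python model. This is not the TARGET_SCHED_REORDER interface: `toy_reorder`, `schedule`, `can_pair` and the insn names are all hypothetical, the model issues from the front of the list for simplicity, and it ignores dependencies entirely. It only shows how a hook that sees the ready list on every tick can pair independent insns of different types, which a static per-insn key cannot.

```python
def toy_reorder(ready, last_issued, can_pair):
    """Hypothetical per-tick hook: move a partner of last_issued to the front."""
    if last_issued is None:
        return ready
    for i, insn in enumerate(ready):
        if can_pair(last_issued, insn):
            return [insn] + ready[:i] + ready[i + 1:]
    return ready


def schedule(insns, can_pair):
    """Issue one insn per tick, calling the hook before each issue."""
    ready = list(insns)
    issued = []
    last = None
    while ready:
        ready = toy_reorder(ready, last, can_pair)
        last = ready.pop(0)
        issued.append(last)
    return issued


def can_pair(prev, cur):
    """Made-up pairing rule: an fmul is best followed by an iadd."""
    return prev.startswith("fmul") and cur.startswith("iadd")


order = schedule(["fmul1", "fmul2", "iadd1", "iadd2"], can_pair)
```

With this made-up rule, each `fmul` is followed by an `iadd` even though the two share no operands and no key relates them.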

Jeff
