This is also interesting for the ARC700 processor. There is also an issue if the flag for the conditionalized instruction is set in the immediately preceding instruction, and the result of the conditionalized instruction is required in the immediately following instruction. In addition, when using a conditional branch with a short offset, there is the opportunity to combine a comparison or bit test with the branch.
Moreover, since the ARCompact architecture has a lot more registers than x86, if you don't use a frame pointer, there are also realistic opportunities to use conditional function returns.
Already back when I was an SH maintainer, I was annoyed that there is only one BRANCH_COST. We should really have different ones for predictable and unpredictable/mispredicted branches. Also, it would make sense if the cost could be modified according to whether the compiler thinks it will be able to schedule a delay slot instruction. Ideally alignment could also be taken into account, but that would require doing register allocation first, so there appears to be no viable pass ordering within the gcc infrastructure to make this work.
For exact modeling, we should actually have three branch costs, distinguishing the cost of having no prediction from that of having a wrong prediction. However, in 'hot' code we can assume we have some prediction - either right or wrong, and 'cold' code would typically not matter, unless you have a humongous program with very poor locality. For these reasons I think that COLD_BRANCH_COST is a misnomer, and could also prompt port writers to put the wrong value there, since it's the mispredicted branches we are interested in. MISPREDICTED_BRANCH_COST would be more descriptive.
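To illustrate why separate costs for predicted and mispredicted branches matter, here is a rough sketch in C of how the two values could be combined into an expected cost and used to decide whether conditional execution is worthwhile. PREDICTED_BRANCH_COST, the numeric values, and both function names are assumptions made up for this illustration; this is not how gcc computes anything, and it ignores the delay slot and alignment effects mentioned above.

```c
/* Hypothetical per-target cost values, in arbitrary cycle-like units.
   The names and the numbers are illustrative assumptions only.  */
#define PREDICTED_BRANCH_COST 1
#define MISPREDICTED_BRANCH_COST 8

/* Expected cost of a branch, scaled by 100 to stay in integer
   arithmetic.  MISPREDICT_PERCENT is the estimated chance (0..100)
   that the branch is mispredicted.  */
int
expected_branch_cost_x100 (int mispredict_percent)
{
  return (mispredict_percent * MISPREDICTED_BRANCH_COST
          + (100 - mispredict_percent) * PREDICTED_BRANCH_COST);
}

/* Return nonzero if replacing the branch with conditionalized
   instructions that cost EXTRA_INSN_COST looks cheaper than the
   expected cost of keeping the branch.  */
int
prefer_conditional_execution (int mispredict_percent, int extra_insn_cost)
{
  return extra_insn_cost * 100 < expected_branch_cost_x100 (mispredict_percent);
}
```

With these made-up numbers, a conditionalized sequence costing 3 is preferred for a branch that is mispredicted half of the time, but not for one that is mispredicted only 5% of the time. A single BRANCH_COST cannot express that difference.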