On Wed, Feb 29, 2012 at 9:11 AM, Aurelien Buhrig <aurelien.buhrig....@gmail.com> wrote:
> Hi,
>
> I'm porting a gcc backend (4.6.1) for a 16-bit MCU with PSI Pmode and
> SI ptr_mode.
>
> I have a QoR problem with loops: the chosen IVs are often not good.
> I looked at tree-ssa-loop-ivopts.c but it is hard to understand that
> code. So sorry if my questions are a bit confused, but I would like to
> understand what happens.
>
> First of all, I checked many times and the rtx_cost function is right.
>
> It seems that the choice of IVs is done according to the cost of the IV
> candidates themselves, but also their uses, register pressure (...), so
> it is difficult for me to understand why one candidate is preferred
> over another.
> But what I "feel" is that gcc tries to use "important" candidates to
> satisfy all uses. For example, in a simple copy from one int array to
> another ( for (i=0; i<N; i++) M1[i] = M2[i]; ), i is extended to SI
> (ptr_mode), addresses are computed in SImode from i, and then truncated
> to PSImode. When modifying the code so that the IV is made explicit as a
> pointer (ex: for (ptr1=M1; ptr1<XXX;) *ptr1++=*ptr2++;) the code can be
> reduced by 20%.
>
> Moreover, in loop-intensive computations, setting
> iv-max-considered-uses=2 (so preventing optimization on complex loops)
> can make code size much better (at -Os), up to a 30% reduction! So it
> seems that, in such test cases, trying to optimize loops is worse than
> doing nothing.
>
>
> Here are my questions:
>
> - Is there a probable explanation for such behavior when optimizing loops?
>
> - Is there a document (other than gccint) describing loops and their
> optimization?
>
> - It seems that keeping computations and IVs in PSI is often preferable,
> but there is no Pmode in the tree representation, right? So when/where is
> the choice of mode for pointer operations made (ptr_mode vs Pmode)?
>
> - PSImode is used as Pmode in only a very few backends (m32c). Is its use
> really optimized by the middle-end algorithms/heuristics?
>
> - Looking at the code, it seems there are different sets of IVs, for
> instance in find_optimal_iv_set with origset and set. Sometimes,
> forcing one (often origset) generates better code. But what is the
> difference between origset and set?
>
> - And finally, is there something I can do from the back-end to make
> loop code better?

The issue is most probably that on GIMPLE we only deal with ptr_mode,
not Pmode, and IVOPTs thinks that pointer induction variables will have
ptr_mode. To fix this, the cost computation would need to take into
account ptr_mode to Pmode conversions _and_ would need to consider
Pmode IVs in the first place (I'm not sure that will be easy).

Richard.

> Thank you in advance!
> Aurélien
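
For reference, the two loop shapes discussed in this thread can be written out as below. This is only a sketch of the source-level difference: the function names, the array length N and the test data are made up for illustration, and whether the pointer form really comes out smaller depends on the target and on what IVOPTs does with it.

```c
#include <assert.h>
#include <string.h>

#define N 8  /* hypothetical array length, not from the original report */

/* Index form: a single integer IV `i`; both addresses are derived from it.
   On the port described above, `i` is widened to ptr_mode before the
   address arithmetic. */
static void copy_indexed(int *m1, const int *m2, int n)
{
    int i;
    for (i = 0; i < n; i++)
        m1[i] = m2[i];
}

/* Pointer form: the IVs are the pointers themselves, and the exit test
   compares the destination pointer against a precomputed end address. */
static void copy_pointers(int *m1, const int *m2, int n)
{
    int *ptr1 = m1;
    const int *ptr2 = m2;
    int *end = m1 + n;

    while (ptr1 < end)
        *ptr1++ = *ptr2++;
}

/* Returns 1 when both forms produce the same destination contents. */
int copies_agree(void)
{
    int src[N], a[N], b[N];
    int i;

    for (i = 0; i < N; i++)
        src[i] = 3 * i + 1;   /* arbitrary test data */

    copy_indexed(a, src, N);
    copy_pointers(b, src, N);

    assert(memcmp(a, src, sizeof src) == 0);
    return memcmp(a, b, sizeof a) == 0;
}
```

Both functions do the same copy, so any difference in the generated code comes from how the induction variables are chosen and in which mode the address arithmetic is carried out.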