Abt definition for structure tree
Hello all,

Can anyone tell me where I can find the definition of tree? Some structure is typedef-ed to tree, but I can't find that structure. I have been hunting for it for some time. Can someone help me?

Thanks in advance.

Regards,
Shafi
abt compiler flags
Hello all,

During regression tests, if I want to disable a feature such as trampolines, I can pass -DNO_TRAMPOLINES as a compiler flag. Are there similar flags for profiling and PIC?

Thanks in advance.

Regards,
Shafi
Abt SIMD Emulation
Hello all,

For targets which don't have SIMD hardware support, like fr30, is SIMD emulated? Are there flags or macros in gcc to indicate that? How is it done in other targets which don't have the hardware support?

Thanks in advance.

Regards,
Shafi.
Re: Abt SIMD Emulation
First, thanks for the reply.

I want to know what can be done in the back end of a target to indicate that SIMD operations should be emulated all the way. __attribute__ ((vector_size (NN))) is something that can be done in programs. Are there any target macros or hooks available for that? Will the target hook TARGET_VECTOR_MODE_SUPPORTED_P help me to indicate that?

I guess this is the right mailing list for my question.

Thanks in advance.

Regards,
Shafi.

----- Original Message -----
From: Ian Lance Taylor <[EMAIL PROTECTED]>
To: Mohamed Shafi <[EMAIL PROTECTED]>
Cc: gcc@gcc.gnu.org
Sent: Friday, October 13, 2006 8:01:11 PM
Subject: Re: Abt SIMD Emulation

Mohamed Shafi <[EMAIL PROTECTED]> writes:

This question is more appropriate for the gcc-help mailing list than for the gcc mailing list.

> For targets which doesn't have simd hardware support like fr30 , simd stuff
> is emulated?

Yes, if you use __attribute__ ((vector_size (NN))) for a target which does not support vector registers of that size, gcc will emulate the vector handling.

> Is there some flags/macros in gcc to indicate that?

To indicate what?

> How is it done in other targets which deosnt have the hardware support?

In the obvious tedious way: as a loop over the elements.

Ian
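
To make Ian's last answer concrete, here is a minimal sketch in C of what "a loop over the elements" amounts to. The type name v4si and both function names are made up for this illustration, and the loop only mimics the effect of the emulation; it is not the code GCC actually generates.

```c
/* Four 32-bit ints packed into a 16-byte vector (GCC vector extension).
   The typedef name v4si is an assumption made for this sketch. */
typedef int v4si __attribute__ ((vector_size (16)));

/* On a target with suitable vector registers this can become a single
   vector add; otherwise the compiler has to emulate it. */
v4si add_vector (v4si a, v4si b)
{
  return a + b;
}

/* Hypothetical hand-written equivalent of the emulated form:
   one scalar add per element. */
void add_emulated (const int *a, const int *b, int *out)
{
  for (int i = 0; i < 4; i++)
    out[i] = a[i] + b[i];
}
```

Both functions compute the same four sums; the only difference is whether the work is expressed as one vector operation or as four scalar ones.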
Abt RTL expression
Hello all,

Sorry for asking this kind of question. It might seem weird to most of you, but I am new to GCC. Can somebody tell me how to analyze the instruction pattern below?

(insn 8 6 9 1 (parallel [
            (set (reg/f:SI 32)
                (symbol_ref:SI ("t") ))
            (clobber (reg:CC 21 cc))
        ]) -1 (nil)
    (nil))

Will I be able to find this pattern in the .md files?
What does insn 8 6 9 1 mean?
What does reg/f mean?
For a variable declaration, why is it necessary to clobber CC?

I hope somebody will help me. Thanks in advance.

Regards,
Shafi
Re: Abt RTL expression
Thanks a lot for the pointer.

> Probably. Look for a pattern with "movsi" in the name.

In the document
http://gcc.gnu.org/onlinedocs/gccint/Insns.html#Insns
it is said that

"An integer that says which pattern in the machine description matches this insn, or −1 if the matching has not yet been attempted. Such matching is never attempted and this field remains −1 on an insn whose pattern consists of a single use, clobber "

The integer in my pattern is -1. In fact, the integer is -1 for every insn my target generates for a program (980602-2.c) like

    struct {
        unsigned bit : 30;
    } t;

    int main()
    {
        if (!(t.bit++))
            exit (0);
        else
            abort ();
    }

Can you explain this?

Thanks in advance.

Regards,
Shafi

----- Original Message -----
From: Rask Ingemann Lambertsen <[EMAIL PROTECTED]>
To: Mohamed Shafi <[EMAIL PROTECTED]>
Cc: gcc@gcc.gnu.org
Sent: Monday, October 16, 2006 7:28:42 PM
Subject: Re: Abt RTL expression

On Mon, Oct 16, 2006 at 05:20:44AM -0700, Mohamed Shafi wrote:
> hello all,
>
> Sorry i am asking this kind of question.This might be weird to most of you
> but i am new to GCC.
> Can somebody tell me how to analyze the below instruction pattern
>
> (insn 8 6 9 1 (parallel [
>             (set (reg/f:SI 32)
>                 (symbol_ref:SI ("t") ))
>             (clobber (reg:CC 21 cc))
>         ]) -1 (nil)
>     (nil))
>
> Will i be able to find this pattern in .md files?

Probably. Look for a pattern with "movsi" in the name.

> what does insn 8 6 9 1 mean?

8 is the number of the insn, 6 is the number of the previous insn, 9 is the number of the next insn and 1 is the number of the basic block to which the insn belongs.

> reg/f ?

This is actually documented (look for REG_POINTER):
http://gcc.gnu.org/onlinedocs/gccint/Flags.html#Flags

> for varible declaration why is it needed to clobber CC?

It depends on the target. Some targets modify the condition codes when copying a value into a register while other targets don't. A third possibility is that of the m68k, where storing a value in a data register sets the condition codes while storing a value in an address register leaves the condition codes unmodified. A fourth possibility is that of the PowerPC, where this is optional on a per insn basis, but then you wouldn't normally include the (clobber (reg:CC 21 cc)) version in the machine description.

--
Rask Ingemann Lambertsen.
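
The four numbers Rask describes can be pulled out of a dump line mechanically. The sketch below is only an illustration: struct insn_header and parse_insn_header are hypothetical names, not GCC functions, and the parser assumes the four-number header layout shown in this thread.

```c
#include <stdio.h>

/* Hypothetical holder for the header fields of a dumped insn. */
struct insn_header
{
  int uid;          /* number of this insn */
  int prev;         /* number of the previous insn */
  int next;         /* number of the next insn */
  int basic_block;  /* basic block the insn belongs to */
};

/* Parse the start of a line such as "(insn 8 6 9 1 (parallel [".
   Returns 1 on success and 0 if the line has a different shape. */
int parse_insn_header (const char *line, struct insn_header *h)
{
  return sscanf (line, "(insn %d %d %d %d",
                 &h->uid, &h->prev, &h->next, &h->basic_block) == 4;
}
```

For the insn in the question this yields uid 8, prev 6, next 9 and basic block 1. Lines that do not start with "(insn", such as notes, are rejected.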
Re: Abt RTL expression
> It is because matching has not yet been attempted.

OK. So what is the option to get hold of an RTL dump after all the matching is done?

----- Original Message -----
From: Rask Ingemann Lambertsen <[EMAIL PROTECTED]>
To: Mohamed Shafi <[EMAIL PROTECTED]>
Cc: gcc@gcc.gnu.org; Revital1 Eres <[EMAIL PROTECTED]>
Sent: Tuesday, October 17, 2006 1:21:30 PM
Subject: Re: Abt RTL expression

On Mon, Oct 16, 2006 at 09:32:58PM -0700, Mohamed Shafi wrote:
>
> In th document
> http://gcc.gnu.org/onlinedocs/gccint/Insns.html#Insns
>
> it is said that "An integer that says which pattern in the machine
> description matches
> this insn, or -1 if the matching has not yet been attempted.Such matching is
> never attempted and this field remains -1 on an insn
> whose pattern consists of a single use, clobber "
>
> The integer in my pattern is -1. In fact integer for all the insn for a
> program (980602-2.c) like

It is quite normal for the first 6-7 dump files. It is because matching has not yet been attempted.

--
Rask Ingemann Lambertsen
Abt code generation
Hello,

For the code (20020611-1.c)

    int p;
    int k;
    unsigned int n;

    void x ()
    {
      unsigned int h;          //line 1
      h = n <= 30;             //line 2
      // printf("%u\n",h);
      if (h)
        p = 1;
      else
        p = 0;
      if (h)
        k = 1;
      else
        k = 0;
    }

    unsigned int n = 30;

    main ()
    {
      x ();
      if (p != 1 || k != 1)
        abort ();
      exit (0);
    }

Looking at the RTL dump generated with -fdump-rtl-rnreg at optimization level -Os, my target generates no code for lines 1 and 2. The generated code starts by checking the CC value for if (h). At other optimization levels it generates proper code. Also, if a printf statement is added below line 2 (commented out in the example above), then it generates proper code even at -Os.

1. Can anyone suggest the probable areas to look at for this kind of behaviour?
2. Which part of -Os deals with removing redundant code, and is there a way to disable it?

Thanks in advance.

Regards,
Shafi.
Abt gcsec-1.c testcase
Hello all,

Can anybody tell me the purpose of the testcase testsuite\gcc.dg\special\gcsec-1.c in the gcc testsuite? Is it related to garbage collection? What exactly does this testcase test?

Thanks in advance.

Regards,
Shafi.
Abt an RTL expression
Hello all,

Can anyone tell me what the expression below means?

(insn 38 37 40 4 (parallel [
            (asm_operands/v ("") ("") 0 [           //line 2
                    (reg:SI 32 [ s5.1 ])            //line 3
                ]
                 [
                    (asm_input:SI ("r"))            //line 6
                ] ("test55.c") 42)                  //line 7
            (clobber (mem:BLK (scratch) [0 A8]))    //line 8
        ]) -1 (nil)
    (nil))

In line 2, what is the 0 for?
What does line 3 mean? What is its purpose?
In line 7, test55.c is the file name. Why is it needed, and what is 42?
In line 8, what does [0 A8] mean?

Thanks in advance.

Regards,
Shafi
Abt long long support
Hello all,

Looking at the .md file of a backend, is there a way to know whether a target supports long long? Should I look for patterns with machine mode DI? Is there some other way?

Thanks in advance for the help.

Regards,
Shafi
Re: Abt long long support
Thanks for the reply.

My target (a private one, not in the gcc tree) fails for long long testcases, and there are cases (with long long) which get through but do not produce the right output. When I replace long long with long, the testcases run fine, even those that were giving wrong output. The target is not able to compile properly even simple statements like

    long long a = 10;

So when I looked into the .md file I saw no patterns with DI machine mode, which is used for long long (am I right?), except

    define_insn "adddi3"
    define_insn "subdi3"

The .md file says that this is to prevent gcc from synthesizing them, though I didn't understand what that means. That's when I started to doubt whether the backend provides support for long long. But if what Rask is saying is true, which I guess it has to be since you are the ones saying it, then the middle end should take care of synthesizing long long.

The 32-bit target has this defined in the .h file:

    LONG_TYPE_SIZE 32
    LONG_LONG_TYPE_SIZE 64

Is there anything else that I should provide in the back end to make sure that the rest of gcc is synthesizing long long properly? Any thoughts?

On 11/7/06, Rask Ingemann Lambertsen <[EMAIL PROTECTED]> wrote:
On Mon, Nov 06, 2006 at 10:52:00AM +0530, Mohamed Shafi wrote:
> Hello all,
>
> Looking at a .md file of a backend it there a way to know whether a
> target supports long long
> Should i look for patterns with machine mode DI?

No. For example, 8-bit, 16-bit and 32-bit targets should normally not define patterns such as anddi3, iordi3 and xordi3. It is possible that a target could have no patterns with mode DI but still support long long, although probably with significant slowdown. E.g. the middle end can synthesize adddi3 and subdi3 from SImode operations, but I think most targets can easily improve 10x in terms of speed and size on that code.

Watch out for targets where units are larger than 8 bits. An example is the c4x where a unit is 32-bits and HImode is 64-bits.

> Is there some other way?

This depends a lot on exactly what you mean when you say support, but grep for LONG_TYPE_SIZE and LONG_LONG_TYPE_SIZE in the .h file and compare the two.

--
Rask Ingemann Lambertsen
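
As a rough picture of what "synthesizing adddi3 and subdi3 from SImode operations" means, the sketch below adds and subtracts 64-bit values using only 32-bit arithmetic. It is a conceptual illustration in C, not the expansion GCC performs; struct di_pair, add_di and sub_di are hypothetical names.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit words, similar to a DImode value
   living in a pair of SImode registers.  Names are hypothetical. */
struct di_pair
{
  uint32_t lo;
  uint32_t hi;
};

/* Add two 64-bit values using only 32-bit operations. */
struct di_pair add_di (struct di_pair a, struct di_pair b)
{
  struct di_pair r;
  r.lo = a.lo + b.lo;               /* wraps modulo 2^32 */
  uint32_t carry = r.lo < a.lo;     /* 1 if the low add overflowed */
  r.hi = a.hi + b.hi + carry;
  return r;
}

/* Subtraction works the same way, with a borrow instead of a carry. */
struct di_pair sub_di (struct di_pair a, struct di_pair b)
{
  struct di_pair r;
  uint32_t borrow = a.lo < b.lo;    /* 1 if the low subtract wraps */
  r.lo = a.lo - b.lo;
  r.hi = a.hi - b.hi - borrow;
  return r;
}
```

A target with an add-with-carry instruction can do each of these in two instructions, which is one reason a hand-written adddi3 pattern usually beats the generic sequence.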
Re: Abt long long support
On 11/7/06, Mike Stump <[EMAIL PROTECTED]> wrote:
> On Nov 6, 2006, at 9:30 PM, Mohamed Shafi wrote:
> > My target (non gcc/private one) fails for long long testcases
>
> Does it work flawlessly otherwise, if not, fix all those problems
> first. After those are all fixed, then you can see if it then just
> works. In particular, you will want to ensure that 32 bit things work
> fine, first.

Well, the test cases fail for only one condition: when main calls a function, like llabs, to find the absolute value of a negative number, and the function performs the action with

    return (arg<0 ? -arg : arg );

The program works fine if I

1. pass a positive value,
2. use the -fomit-frame-pointer flag while compiling (with a negative value), or
3. use another variable in the function body to hold the return value, i.e.

    long long foo(long long x){
        long long k;
        k=(x<0 ? -x : x);
        return k;
    }

When I diff the RTL dumps for programs passing a negative value with and without the frame pointer, I find changes starting from file.greg. That's when the frame pointer issue kicks in.

This is a small test case which produces the bug:

    #include <stdio.h>

    long long fun(long long k)
    {
        return ( k>0 ? k : -k);
    }

    int main()
    {
        long long i= -1;
        if(fun(i) == 1)
            printf("\nsuccess \n");
        else
            printf("\nfailure \n");
    }

Here is the relevant RTL dump for the function fun from the .greg file:

; Hard regs used: 0 1 2 3 12 13 14 21

(note 2 0 9 NOTE_INSN_DELETED)

;; Start of basic block 0, registers live: 0 [d0] 1 [d1] 14 [a6] 15 [a7] 22 [vAP]
(note 9 2 4 0 [bb 0] NOTE_INSN_BASIC_BLOCK)

(insn 4 9 5 0 (parallel [ (set (reg/f:SI 13 a5 [31]) (plus:SI (reg/f:SI 14 a6) (const_int -8 [0xfff8]))) (clobber (reg:CC 21 cc)) ]) 29 {addsi3} (nil) (nil))

(insn 5 4 6 0 (set (mem/c/i:SI (reg/f:SI 13 a5 [31]) [0 k+0 S4 A32]) (reg:SI 0 d0 [ k ])) 16 {movsi_store} (nil) (nil))

(insn 6 5 7 0 (set (mem/c/i:SI (plus:SI (reg/f:SI 13 a5 [31]) (const_int 4 [0x4])) [0 k+4 S4 A32]) (reg:SI 1 d1 [orig:0 k+4 ] [0])) 16 {movsi_store} (nil) (nil))

(note 7 6 13 0 NOTE_INSN_FUNCTION_BEG)

(insn 13 7 14 0 (parallel [ (set (reg/f:SI 13 a5 [33]) (plus:SI (reg/f:SI 14 a6) (const_int -8 [0xfff8]))) (clobber (reg:CC 21 cc)) ]) 29 {addsi3} (nil) (nil))

(insn 14 13 63 0 (set (reg:SI 0 d0) (mem/c/i:SI (reg/f:SI 13 a5 [33]) [0 k+0 S4 A32])) 15 {movsi_load} (nil) (nil))

(insn 63 14 64 0 (set (reg:SI 12 a4) (const_int -16 [0xfff0])) 17 {movsi_short_const} (nil) (nil))

(insn 64 63 65 0 (parallel [ (set (reg:SI 12 a4) (plus:SI (reg:SI 12 a4) (reg/f:SI 14 a6))) (clobber (reg:CC 21 cc)) ]) 29 {addsi3} (nil) (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6) (const_int -16 [0xfff0])) (nil)))

(insn 65 64 15 0 (set (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S8 A32]) (reg:SI 0 d0)) 16 {movsi_store} (nil) (nil))

(insn 15 65 68 0 (set (reg:SI 0 d0) (mem/c/i:SI (plus:SI (reg/f:SI 13 a5 [33]) (const_int 4 [0x4])) [0 k+4 S4 A32])) 15 {movsi_load} (nil) (nil))

(insn 68 15 69 0 (set (reg:SI 12 a4) (const_int -12 [0xfff4])) 17 {movsi_short_const} (nil) (nil))

(insn 69 68 70 0 (parallel [ (set (reg:SI 12 a4) (plus:SI (reg:SI 12 a4) (reg/f:SI 14 a6))) (clobber (reg:CC 21 cc)) ]) 29 {addsi3} (nil) (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6) (const_int -12 [0xfff4])) (nil)))

(insn 70 69 73 0 (set (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S8 A32]) (reg:SI 0 d0)) 16 {movsi_store} (nil) (nil))

(insn 73 70 74 0 (set (reg:SI 12 a4) (const_int -16 [0xfff0])) 17 {movsi_short_const} (nil) (nil))

(insn 74 73 75 0 (parallel [ (set (reg:SI 12 a4) (plus:SI (reg:SI 12 a4) (reg/f:SI 14 a6))) (clobber (reg:CC 21 cc)) ]) 29 {addsi3} (nil) (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6) (const_int -16 [0xfff0])) (nil)))

(insn 75 74 17 0 (set (reg:SI 12 a4) (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S8 A32])) 15 {movsi_load} (nil) (nil))

(insn 17 75 18 0 (set (reg:CC 21 cc) (compare:CC (reg:SI 12 a4) (const_int 0 [0x0]))) 67 {*cmpsi_internal0} (nil) (nil))

(jump_insn 18 17 50 0 (set (pc) (if_then_else (gt:CC (reg:CC 21 cc) (const_int 0 [0x0])) (label_ref 32) (pc))) 41 {*branch_true} (nil) (nil))
;; End of basic block 0, registers live: 14 [a6] 15 [a7] 22 [vAP] 28

;; Start of basic block 2, registers live: 14 [a6] 15 [a7] 22 [vAP] 28
(note 50 18 78 2 [bb 2] NOTE_INSN_BASIC_BLOCK)

(insn 78 50 79 2 (set (reg:SI 13 a5) (const_int -16 [0xf
Re: Abt long long support
Thanks for the input and the questions.

> Did you examine:
>
>     long long l, k;
>     l = -k;
>
> for correctness by itself? Was it valid or invalid?

Yes, this is working.

> [ read ahead for spoilers, I'd rather you pull this information out of
> the dump and present it to us... ]
>
> A quick glance at the rtl shows that insn 95 tries to use [a4+4] but
> insn 94 clobbered a4 already, also d3 is used by insn 93, but there
> isn't a set for it.

It looks like you have found the problem, but I need to look into it more.

> The way the instructions are numbered suggests that the code went
> wrong before this point. You have to read and understand all the

The instructions are numbered randomly and not in increasing order, but looking at the diff of the working and non-working code I thought that was not an issue. Is this natural? Those are ids for insns, and each insn has a unique id. Is it wrong for the insn ids to be jumbled?

One more thing. In the dumps I noticed that before registers are used in DI mode they are all clobbered first, like

(insn 30 54 28 6 (clobber (reg:DI 34)) -1 (nil)
    (nil))

What is the use of these insns? Why do we need to clobber these registers before the use? After some pass they are no longer seen in the dump.

Regards,
Shafi.
Re: Abt long long support
On 11/10/06, Mike Stump <[EMAIL PROTECTED]> wrote:
> On Nov 9, 2006, at 6:39 AM, Mohamed Shafi wrote:
> > When i diff the rtl dumps for programs passing negative value with and
> > without frame pointer i find changes from file.greg .
>
> A quick glance at the rtl shows that insn 95 tries to use [a4+4] but
> insn 94 clobbered a4 already, also d3 is used by insn 93, but there
> isn't a set for it.

The following part of the RTL dump of the greg pass is the one which gives the wrong output.

(insn 90 29 91 6 (set (reg:SI 12 a4) (const_int -16 [0xfff0])) 17 {movsi_short_const} (nil) (nil))

(insn 91 90 94 6 (parallel [ (set (reg:SI 12 a4) (plus:SI (reg:SI 12 a4) (reg/f:SI 14 a6))) (clobber (reg:CC 21 cc)) ]) 29 {addsi3} (nil) (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6) (const_int -16 [0xfff0])) (nil)))

(insn 94 91 95 6 (set (reg:SI 12 a4) (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S4 A32])) 15 {movsi_load} (nil) (nil))

(insn 95 94 31 6 (set (reg:SI 13 a5 [orig:12+4 ] [12]) (mem/c:SI (plus:SI (reg:SI 12 a4) (const_int 4 [0x4])) [0 D.1863+4 S4 A32])) 15 {movsi_load} (nil) (nil))

(insn 31 95 87 6 (parallel [ (set (reg:DI 2 d2) (minus:DI (reg:DI 0 d0 [34]) (reg:DI 12 a4))) (clobber (reg:CC 21 cc)) ]) 33 {subdi3} (nil) (nil))

The setting of register d3 is actually done in insn 31:

    (set (reg:DI 2 d2)

Since this is in DI mode, it uses d2 and d3. Similarly, d0 and a4 are accessed in DI mode, so d1 and a5 are also used in this insn. Hence the negation is proper.

Just as Mike pointed out, insn 95 tries to use [a4+4] but insn 94 has already clobbered a4. The compiler should actually generate insns similar to insn 91 and 92 in between insn 94 and 95, but without using a4, or after saving a4. This is not happening.

Insns 90 to 94 are emitted only from the greg pass onwards. When I inserted the necessary assembly instructions corresponding to movsi_short_const and addsi3 between insns 91 and 92 in the assembly file, the program worked fine.

There is spill information for insn 31 at the beginning of the .greg file, but I can't understand any of it.

    Spilling for insn 31.
    Using reg 2 for reload 2
    Using reg 12 for reload 3
    Using reg 13 for reload 0
    Using reg 13 for reload 1

The same program works for the gcc 3.2 and gcc 3.4.6 ports of the same private target.

I am not sure whether this is because of the reload pass or global register allocation.

1. What could be the reason for this behavior?
2. How can this type of behavior be overcome?

Regards,
Shafi
Re: Abt long long support
First, thanks very much for your thoughts.

> If those two instructions appear for the first time in the .greg dump
> file, then they have been created by reload.

Yes, they appear for the first time in the .greg dump file.

> > 1. What could be the reason for this behavior?
>
> I'm really shooting in the dark here, but my guess is that you have a
> define_expand for movdi that is not reload safe. You can do this
> operation correctly, you just have to reverse the instructions: load
> a5 from (a4 + 4) before you load a4 from (a4). See, e.g.,
> mips_split_64bit_move in mips.c and note the use of
> reg_overlap_mentioned_p.

I have already mentioned earlier in this conversation that adddi3 and subdi3 are the only DI mode patterns in the .md file. Rask then pointed out that the middle end will synthesize other DI mode patterns from similar SI mode patterns in the backend. As this is the case, am I to assume that the synthesized movdi pattern is not safe for reload? Should I tweak the movsi pattern to correct this issue, or should I write an explicit movdi pattern?

With this in mind, how come this worked fine in the gcc 3.4.6 port of the target? Has the behavior of reload changed very much in gcc 4.1.1?

Regards,
Shafi
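
The ordering advice quoted above can be sketched with a toy register machine. Everything here is hypothetical (struct machine, load_double_word, a word-addressed memory so the second word is at offset 1 rather than 4), and the plain equality test stands in for the more general overlap check that reg_overlap_mentioned_p performs in GCC.

```c
#include <stdint.h>

/* Toy machine for illustration only: 16 registers and a small
   word-addressed memory.  All names here are hypothetical. */
struct machine
{
  uint32_t reg[16];
  uint32_t mem[32];
};

/* Load the two words at mem[reg[base]] and mem[reg[base] + 1] into
   reg[dest] and reg[dest + 1], picking an order that never overwrites
   the base register while it is still needed as an address. */
void load_double_word (struct machine *m, int dest, int base)
{
  if (dest == base)
    {
      /* High word first: reg[base] is overwritten only by the last load. */
      m->reg[dest + 1] = m->mem[m->reg[base] + 1];
      m->reg[dest] = m->mem[m->reg[base]];
    }
  else
    {
      /* Low word first; this order is also safe when dest + 1 == base. */
      m->reg[dest] = m->mem[m->reg[base]];
      m->reg[dest + 1] = m->mem[m->reg[base] + 1];
    }
}
```

With dest and base both equal to 12 (a4 in the dumps), the first branch corresponds to loading a5 from (a4 + 4) before loading a4 from (a4), which is the reverse of what insns 94 and 95 do.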
Re: Abt long long support
> I'm really shooting in the dark here, but my guess is that you have a
> define_expand for movdi that is not reload safe. You can do this
> operation correctly, you just have to reverse the instructions: load
> a5 from (a4 + 4) before you load a4 from (a4). See, e.g.,
> mips_split_64bit_move in mips.c and note the use of
> reg_overlap_mentioned_p.

Sir, the following is the part of the .lreg dump file which is changed in the .greg file.

(insn 29 28 31 6 (set (subreg:SI (reg:DI 34) 4) (const_int 0 [0x0])) 17 {movsi_short_const} (nil) (nil))

(insn 31 29 32 6 (parallel [ (set (reg:DI 28 [ D.1863 ]) (minus:DI (reg:DI 34) (reg:DI 28 [ D.1863 ]))) (clobber (reg:CC 21 cc)) ]) 33 {subdi3} (nil) (expr_list:REG_UNUSED (reg:CC 21 cc) (expr_list:REG_DEAD (reg:DI 34) (expr_list:REG_UNUSED (reg:CC 21 cc) (nil)

In the greg pass some instructions are inserted between insns 29 and 31. These instructions are inserted by reload. In the .greg file there is reload information for insn 31, which is given below.

Reloads for insn # 31
Reload 0: reload_in (SI) = (plus:SI (reg/f:SI 14 a6) (const_int -16 [0xfff0]))
    ADDR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine
    reload_in_reg: (plus:SI (reg/f:SI 14 a6) (const_int -16 [0xfff0]))
    reload_reg_rtx: (reg:SI 13 a5)
Reload 1: reload_in (SI) = (plus:SI (reg/f:SI 14 a6) (const_int -16 [0xfff0]))
    ADDR_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 2), can't combine
    reload_in_reg: (plus:SI (reg/f:SI 14 a6) (const_int -16 [0xfff0]))
    reload_reg_rtx: (reg:SI 12 a4)
Reload 2: reload_out (DI) = (mem/c:DI (plus:SI (reg/f:SI 14 a6) (const_int -16 [0xfff0])) [0 D.1863+0 S8 A32])
    GENERAL_REGS, RELOAD_OTHER (opnum = 0)
    reload_out_reg: (reg:DI 28 [ D.1863 ])
    reload_reg_rtx: (reg:DI 2 d2)
Reload 3: reload_in (DI) = (mem/c:DI (plus:SI (reg/f:SI 14 a6) (const_int -16 [0xfff0])) [0 D.1863+0 S8 A32])
    GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 2), can't combine
    reload_in_reg: (reg:DI 28 [ D.1863 ])
    reload_reg_rtx: (reg:DI 12 a4)

I didn't understand what these mean.
The following pattern is from the .greg file.

(insn 29 28 90 6 (set (reg:SI 1 d1 [orig:34+4 ] [34]) (const_int 0 [0x0])) 17 {movsi_short_const} (nil) (nil))

(insn 90 29 91 6 (set (reg:SI 12 a4) (const_int -16 [0xfff0])) 17 {movsi_short_const} (nil) (nil))

(insn 91 90 94 6 (parallel [ (set (reg:SI 12 a4) (plus:SI (reg:SI 12 a4) (reg/f:SI 14 a6))) (clobber (reg:CC 21 cc)) ]) 29 {addsi3} (nil) (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6) (const_int -16 [0xfff0])) (nil)))

(insn 94 91 95 6 (set (reg:SI 12 a4) (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S4 A32])) 15 {movsi_load} (nil) (nil))

(insn 95 94 31 6 (set (reg:SI 13 a5 [orig:12+4 ] [12]) (mem/c:SI (plus:SI (reg:SI 12 a4) (const_int 4 [0x4])) [0 D.1863+4 S4 A32])) 15 {movsi_load} (nil) (nil))

(insn 31 95 87 6 (parallel [ (set (reg:DI 2 d2) (minus:DI (reg:DI 0 d0 [34]) (reg:DI 12 a4))) (clobber (reg:CC 21 cc)) ]) 33 {subdi3} (nil) (nil))

As you can see, insns 90, 91, 94 and 95 are inserted in this pass, and the code goes wrong in insns 94/95: insn 94 overwrites a4 before insn 95 uses it as an address.

Why are these insns inserted in between? With only the subdi3 and adddi3 patterns available in the .md file, and no other define_split, define_insn or define_expand for DI mode, how can I control the instructions generated by reload?

Regards,
Shafi
Re: poisened macro definitions
From: Markus Franke <[EMAIL PROTECTED]>
To: gcc@gcc.gnu.org
Date: Tue, 05 Dec 2006 21:37:30 +0100
Subject: poisened macro definitions

> Dear GCC Developers,
>
> I want to port an existing backend (based on version gcc-2.7.2.3) on
> the most recent release (gcc-4.1.1). During compilation process I get
> several messages about some poisened macro definitions. The macros
> which make problems are listed below:
>
> ---snip---
> --snap---
>
> I read something about poisened macros and that they shouldn't be used
> anymore. But in fact I was not able to find any documentation about
> these macros. When were they declared as poisened and especially why?
> What should be done instead of using this macros? Just uncommenting
> everything can't be a solution. I was also looking in GCC-Internals
> manual without any success.

Most of the target macros in older versions of gcc have been converted into target hooks. The macros that were converted are now poisoned, so you will have to replace them with the corresponding target hooks. Some macros have been merged into a single target hook, while others have been replaced one-for-one by a target hook. You will have to look into the internals manual for 4.1.1 to find out which is which.

The following messages should help you out:

http://gcc.gnu.org/ml/gcc/2006-08/msg00451.html
http://gcc.gnu.org/ml/gcc-help/2006-08/msg00213.html

Hope this helps.

Regards,
Shafi
Defining cmd line symbolic literals
Hello all,

I am building a GCC compiler. I have some #ifdef checks in the compiler source code. If I define a symbol on the command line while compiling a sample program, I want the statements guarded by those #ifdef checks to be executed, e.g.

GCC source:

    #ifdef SHAFI_DEBUG
    printf("\n Shafi Debugging!!\n");
    #endif

Compiling 1.c:

    gcc -DSHAFI_DEBUG 1.c

Is there any way to do this?

Thanks in advance.

Regards,
Shafi
Arithmetic conversions between two different data types
Hello all,

In arithmetic expressions a conversion is needed when the operands are of different data types. In gcc 4.1.1, where does this process start? Is it in c-typeck.c, particularly in the function c_common_type?

Thanks in advance.

Regards,
Shafi.
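
For readers less familiar with the conversions being asked about, here is a small C sketch of two of them as seen from a program. It shows only the language-level behaviour and says nothing about which GCC function implements it; both function names are made up for the illustration.

```c
/* int and unsigned int: the int operand is converted to unsigned int,
   so -1 becomes UINT_MAX and the comparison is false. */
int minus_one_less_than_one_unsigned (void)
{
  int i = -1;
  unsigned int u = 1;
  return i < u;
}

/* int and double: the int operand is converted to double,
   so the division is carried out in floating point. */
double seven_halves (void)
{
  int i = 7;
  double d = 2.0;
  return i / d;
}
```

In both cases the compiler first has to pick a common type for the two operands, and that choice is the step the question is about.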
Strange behavior for scan-tree-dump testing
Hello all,

I added a few testcases to the existing testsuite in gcc 4.1.1 for a private target. After running the testsuite I found that all my test cases with scan-tree-dump testing failed in one particular situation. The values are scanned from the gimple tree dump, and the scan fails for cases like

    b4 = 6.3e+1
    c1 = 1.345286102294921875e+0

but it does not fail for other values in the same tree dump, such as

    some_identifier = x.xe-1
    some_identifier = x.xe0

The failures occur only when the exponent in the tree dump is positive and written with an explicit + sign, as in the first format above. I checked the tree dumps manually and found that all the values are correct, and the scan lines in the test cases look correct too. This is the way I used them:

    /* { dg-final { scan-tree-dump "b4 = 6.3e+1" "gimple" } } */

Why does this happen? For positive exponents, should I be writing the pattern in some other way?

One other question: I am getting "test for excess errors" failures for some cases which produce a lot of warnings but are otherwise correct.

Can anyone help me?

Thanks in advance.

Regards,
Shafi.
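
One likely explanation, assuming the scan pattern is treated as a regular expression (which is how the scan-tree-dump directives work), is that the unescaped + acts as a "one or more" quantifier on the preceding e instead of matching a literal plus sign. The sketch below reproduces the effect with POSIX extended regular expressions, used here as a stand-in for the Tcl regular expressions DejaGnu uses; regex_matches is a hypothetical helper written for this illustration.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if the extended regular expression PATTERN matches somewhere
   in TEXT, 0 if it does not, and -1 if PATTERN fails to compile. */
int regex_matches (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With this helper, the pattern "b4 = 6.3e+1" does not match the text "b4 = 6.3e+1", while the escaped pattern "b4 = 6\.3e\+1" does, and a pattern with a negative exponent such as "x = 2.5e-1" matches as written because - is not special outside brackets. That is consistent with only the e+ cases failing. In a dg-final directive the escaped form would probably be written with doubled backslashes, e.g. "b4 = 6.3e\\+1", but that quoting detail is worth checking against existing tests in the testsuite.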
Providing default option to GAS through gcc driver
Hello all,

I would like to know if there is any way to make the gcc driver program pass a default option to the assembler when the user gives no such option, say -march=arch1 if no -march option is provided by the user.

The macro TARGET_OPTION_TRANSLATE_TABLE does something like this, but with it one can only override an option that was given and cannot supply one when none is given.

Is there any way to do this?

Regards,
Shafi.
What to do when constraint doesn't match
Hello all,

Looking at the internals manual I couldn't find an answer to my problem.

I have a define_expand with the pattern name mov and a define_insn mov_store. The predicate in the define_expand is general_operand, so that all operands are matched. In the define_insn I have a predicate which allows only two classes of registers, say 'a' and 'b', but the constraint for the define_insn only allows registers of class 'b'. I also have a pattern for a register move from 'a' to 'b'; call it mova2b.

So if the constraint of the mov_store define_insn is not satisfied, why is the compiler not trying to satisfy it by generating a mova2b pattern? Is there something that I am missing here?

Regards,
Shafi
peephole patterns are not matching
Hello everyone,

I have the following two patterns, which are consecutive (from the shorten RTL dump file):

(insn 69 34 70 (set (reg:SQ 0 d0)
        (reg:SQ 18 f2)) 79 {movsq} (nil)
    (nil))

(insn 70 69 35 (set (reg:SQ 16 f0 [orig:38 D.3693 ] [38])
        (reg:SQ 0 d0)) 79 {movsq} (nil)
    (nil))

For the above pattern I wrote a peephole like this:

(define_peephole
  [(set (match_operand:SF 0 "data_reg" "=d")
        (match_operand:SF 1 "float_reg" "f"))
   (set (match_operand:SF 2 "float_reg" "=f")
        (match_operand:SF 3 "data_reg" "d"))]
  "REGNO(operands[0]) == REGNO(operands[3])"
  "movf\\t%1, %3"
)

I also wrote a define_peephole2 which is similar to the above. But the above patterns are not matched at all, even though I can find these insns in the RTL dumps.

What could be the reason for this behavior?

Regards,
Shafi
Re: peephole patterns are not matching
On 4/12/07, Andreas Schwab <[EMAIL PROTECTED]> wrote:
> "Mohamed Shafi" <[EMAIL PROTECTED]> writes:
>
> > hello everyone,
> >
> > I have the following 2 patterns which are consecutive. (from shorten
> > rtl dump file)
> >
> > (insn 69 34 70 (set (reg:SQ 0 d0)
> >        (reg:SQ 18 f2)) 79 {movsq} (nil)
> >    (nil))
> >
> > (insn 70 69 35 (set (reg:SQ 16 f0 [orig:38 D.3693 ] [38])
> >        (reg:SQ 0 d0)) 79 {movsq} (nil)
> >    (nil))
> >
> >
> > For the above pattern i wrote a peephole like this
> >
> > (define_peephole
> >  [(set (match_operand:SF 0 "data_reg" "=d")
> >        (match_operand:SF 1 "float_reg" "f"))
> >   (set (match_operand:SF 2 "float_reg" "=f")
> >        (match_operand:SF 3 "data_reg" "d"))]
>
> The patterns match mode SF, but the insns have mode SQ.

Sorry, the patterns are actually like this:

(insn 69 34 70 (set (reg:SF 0 d0)
        (reg:SF 18 f2)) 79 {movsf} (nil)
    (nil))

(insn 70 69 35 (set (reg:SF 16 f0 [orig:38 D.3693 ] [38])
        (reg:SF 0 d0)) 79 {movsf} (nil)
    (nil))

and the peephole is the same as above.

> Andreas.
>
> --
> Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
> SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
> PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
> "And now for something completely different."
How to control the offset for stack operation?
Hello all,

Depending on the machine mode, the compiler automatically generates the offset required for a stack operation, i.e. for a machine whose word size is 32, the offset for char type is 1, the offset for int type is 2, and so on.

Is there a way to control this? Say, for long long the offset is 4 if long long is mapped to TI mode, and I want to generate the offset such that it is 2.

Is there a way to do this in gcc?

Regards,
Shafi
Re: How to control the offset for stack operation?
On 4/16/07, J.C. Pizarro <[EMAIL PROTECTED]> wrote:
> 2007/4/16, Mohamed Shafi <[EMAIL PROTECTED]>:
> > hello all,
> >
> > Depending on the machine mode the compiler will generate automatically
> > the offset required for the stack operation i.e for a machine with
> > word size is 32, for char type the offset is 1, for int type the
> > offset is 2 and so on..
> >
> > Is there a way to control this ? i mean say for long long the offset
> > is 4 if long long is mapped to TI mode and i want the generate the
> > offset such that it is 2.
> >
> > Is there a way to do this in gcc ?
> >
> > Regards,
> > Shafi
>
> For a x86 machine, the stack's offset always is multiple of 4 bytes.
> long long is NOT 4 bytes, is 8 bytes!

I was not talking about the size of long long but about the offset, i.e. 4x32, required for the stack operation. I want gcc to generate the code such that the offset is 2 (64 bits) and not 4 (128 bits).

Is there a way to do this?

> Sincerely J.C. Pizarro :)
Re: How to control the offset for stack operation?
On 4/16/07, J.C. Pizarro <[EMAIL PROTECTED]> wrote:
> 2007/4/16, Mohamed Shafi <[EMAIL PROTECTED]> wrote:
> > > > Depending on the machine mode the compiler will generate automatically
> > > > the offset required for the stack operation i.e for a machine with
> > > > word size is 32, for char type the offset is 1, for int type the
> > > > offset is 2 and so on..
> >
> > I was not talking about the size of long long but the offset i.e
> > 4x32, required for stack operation.
> > I want gcc to generate the code such that the offset is 2 (64
> > bits) and not 4 (128 bits)
>
> Offset in bytes? Offset in 32-bit words? Please, define offset? You confuse.

Offset in 32-bit words.

> J.C. Pizarro
ICE in get_constraint_for_component_ref
Hi all,

I am trying to port a private target to GCC 4.5.1. The following are the properties of the target:

#define BITS_PER_UNIT 32
#define BITS_PER_WORD 32
#define UNITS_PER_WORD 1

#define CHAR_TYPE_SIZE 32
#define SHORT_TYPE_SIZE 32
#define INT_TYPE_SIZE 32
#define LONG_TYPE_SIZE 32
#define LONG_LONG_TYPE_SIZE 32

I am getting an ICE:

internal compiler error: in get_constraint_for_component_ref, at tree-ssa-structalias.c:3031

for the following testcase:

struct fb_cmap {
  int start;
  int len;
  int *green;
};

extern struct fb_cmap fb_cmap;

void directcolor_update_cmap(void)
{
  fb_cmap.green[0] = 34;
}

The following is the output of debug_tree for the argument that is given to the function get_constraint_for_component_ref:

unit size align 32 symtab 0 alias set -1 canonical type 0x2b6a4554a498 precision 32 min max pointer_to_this >
unsigned PQI size unit size align 32 symtab 0 alias set -1 canonical type 0x2b6a45559930>
arg 0 unit size align 32 symtab 0 alias set -1 canonical type 0x2b6a45602888 fields context chain >
used public external common BLK file pr28675.c line 7 col 23 size unit size align 32
chain public static QI file pr28675.c line 9 col 6 align 32 initial result
(mem:QI (symbol_ref:PQI ("directcolor_update_cmap") [flags 0x3] ) [0 S1 A32])
struct-function 0x2b6a455453f0>>
arg 1 unsigned PQI file pr28675.c line 4 col 7 size unit size align 32 offset_align 32
offset bit offset context >
pr28675.c:11:10>

I was wondering if this ICE is due to the fact that this is a 32-bit char target. Can somebody give me some pointers for debugging this issue?

Regards,
Shafi
Re: ICE in get_constraint_for_component_ref
On 10 February 2011 15:57, Richard Guenther wrote: > On Thu, Feb 10, 2011 at 6:23 AM, Mohamed Shafi wrote: >> [ ... ] >> I was wondering if this ICE is due to the fact that this is a 32bit >> char target ? Can somebody help me with pointers to debug this issue? 
> > Try fixing the * 8 in bitpos_of_field to use BITS_PER_UNIT. > That did the trick. Looking at the code, I assume that this is the proper fix and hence should be committed to trunk and the 4.5 branch. Will that be done? Shafi
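For readers following along, the suggested fix amounts to scaling a field's unit offset by the target's unit width instead of a hard-coded 8. The following is only a minimal sketch of that arithmetic, not the actual tree-ssa-structalias.c code; the function names and parameters are hypothetical, and BITS_PER_UNIT is defined locally to mimic the 32-bit-char target from this thread.

```c
#include <assert.h>

/* Local stand-in for the target macro (assumption: the 32-bit-char
   target described in this thread).  */
#define BITS_PER_UNIT 32

/* Bit position of a field, given its offset in addressable units and
   a residual bit offset.  Hypothetical helper for illustration.  */
static long
field_bit_position (long unit_offset, long bit_offset)
{
  assert (unit_offset >= 0 && bit_offset >= 0);
  return unit_offset * BITS_PER_UNIT + bit_offset;
}

/* The same computation with the unit width hard-coded to 8, which is
   only correct when BITS_PER_UNIT == 8.  */
static long
field_bit_position_hardcoded (long unit_offset, long bit_offset)
{
  assert (unit_offset >= 0 && bit_offset >= 0);
  return unit_offset * 8 + bit_offset;
}
```

With every type one unit wide, `green` in the testcase sits at unit offset 2, so the two helpers disagree (64 bits versus 16 bits), which is the kind of mismatch a consistency check on field positions would trip over.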
Re: ICE in get_constraint_for_component_ref
On 10 February 2011 17:16, Richard Guenther wrote: > On Thu, Feb 10, 2011 at 12:42 PM, Mohamed Shafi wrote: >> On 10 February 2011 15:57, Richard Guenther wrote: >>> On Thu, Feb 10, 2011 at 6:23 AM, Mohamed Shafi wrote: >>>> [ ... ] >>> >>> Try fixing the * 8 in bitpos_of_field to use BITS_PER_UNIT. >>> >> >> That did the trick. Looking at the code i assume that this is proper >> and hence should be committed in the trunk and 4.5 branch. Will that >> be done? > > I'll include it in one of my next bootstraps/tests and commit it. > Thanks Richard :) Shafi
Reloading an auto-increment addresses
Hello all, I am porting GCC 4.5.1 to a private target. For one particular test, the reload pass is asked to reload the following instruction: (insn 45 175 46 11 pr20601-1.c:90 (set (reg/f:PQI 3 g3 [70]) (mem/f:PQI (pre_inc:PQI (reg/f:PQI 1 g1 [orig:55 prephitmp.16 ] [55])) [2 S1 A32])) 9 {movpqi_op} (expr_list:REG_INC (reg/f:PQI 1 g1 [orig:55 prephitmp.16 ] [55]) (nil))) The address in this instruction is invalid: the base address must always be held in an address register. The instruction gets reloaded in the following manner: (insn 175 43 202 11 pr20601-1.c:90 (set (reg/f:PQI 1 g1 [orig:55 prephitmp.16 ] [55]) (reg/f:PQI 12 as0 [orig:49 e.4 ] [49])) 9 {movpqi_op} (nil)) (insn 202 175 203 11 pr20601-1.c:90 (set (reg/f:PQI 1 g1 [orig:55 prephitmp.16 ] [55]) (plus:PQI (reg/f:PQI 1 g1 [orig:55 prephitmp.16 ] [55]) (const_int 1 [0x1]))) 14 {addpqi3} (nil)) (insn 203 202 45 11 pr20601-1.c:90 (set (reg:PQI 28 a0) (reg/f:PQI 1 g1 [orig:55 prephitmp.16 ] [55])) 9 {movpqi_op} (nil)) (insn 45 203 46 11 pr20601-1.c:90 (set (reg/f:PQI 3 g3 [70]) (mem/f:PQI (reg:PQI 28 a0) [2 S1 A32])) 9 {movpqi_op} (nil)) The issue with this reload is that there is no move operation between GP registers and address registers, so insn 203 is invalid. I catch such moves with secondary reloads, but auto-increment addressing modes are not handled there. If I try to handle them in TARGET_SECONDARY_RELOAD, I get an assertion failure from reload1.c:emit_input_reload_insns() due to the following code: /* Auto-increment addresses must be reloaded in a special way. */ if (rl->out && ! rl->out_reg) { /* We are not going to bother supporting the case where a incremented register can't be copied directly from OLDEQUIV since this seems highly unlikely. */ gcc_assert (rl->secondary_in_reload < 0); How can I overcome this failure? Can someone suggest a solution? Thanks for the help. Regards, Shafi
Re: Reloading an auto-increment addresses
On 11 February 2011 15:28, Paulo J. Matos wrote: > > > On 11/02/11 09:46, Mohamed Shafi wrote: >> >> How can i overcome this failure? Can some one suggest a solution? >> > > > Have you defined TARGET_LEGITIMATE_ADDRESS_P and also BASE_REG_CLASS > correctly for your target? > > Yes, I have. The register allocator is allocating the wrong registers for the base registers. This is probably because address registers cannot be saved and restored directly; a secondary reload is required. There is also the restriction that there is no move operation between the address registers, so a secondary reload is required for that as well. (I know it's weird.) I am trying to figure out why the register allocator is not assigning a base register. But even then, reload could still be asked to reload an auto-increment address. Shafi
How to generate loop counter with a different mode ?
Hi all, I am trying to add support for hardware loops for a 32-bit target. On this target QImode is 32 bits. The loop counter used in the hardware loop construct is a 17-bit address register, which is represented using PQImode. Since the mode for the doloop pattern is determined after loop discovery, it need not always be PQImode. So what I did was convert the mode of the counter variable to PQImode and then emit a new pattern in PQImode, along with the other bells and whistles required by the target loop construct. I am able to generate assembly files with the proper loop initialization instructions and all. But the issue is that the loop counter is set to 0 in the body of the loop. In define_expand (in doloop_end and doloop_begin) I am converting to PQImode using the following construct: operands[0] = convert_to_mode (PQImode, operands[0], 0); The above construct results in an rtl pattern like: (insn 33 17 34 4 loop.c:52 (set (reg:PQI 50) (truncate:PQI (reg:QI 49))) -1 (nil)) But GCC extracts the loop counter from the doloop pattern generated by define_expand, which is in PQImode. (insn 33 17 34 4 loop.c:52 (set (reg:PQI 50) (truncate:PQI (reg:QI 49))) -1 (nil)) (jump_insn 34 33 20 4 loop.c:52 (parallel [ (set (pc) (if_then_else (ne (reg:PQI 50) (const_int 1 [0x1])) (label_ref:PQI 30) (pc))) (set (reg:PQI 50) (plus:PQI (reg:PQI 50) (const_int -1 [0x]))) (unspec [ (const_int 0 [0x0]) ] 3) (clobber (scratch:PQI)) ]) 62 {doloop_end_pqi} (expr_list:REG_BR_PROB (const_int 9100 [0x238c]) (nil)) -> 30) This is the counter value that gets used for doloop begin. Hence the original loop counter (reg:QI 49) never gets initialized. Because of this, the 'if-conversion' pass modifies the statement to: (insn 33 38 34 4 loop.c:52 (set (reg:PQI 50) (const_int 0 [0x0])) 9 {movpqi_op} (nil)) This results in the loop counter being set to 0 in the body of the loop. Can someone suggest a way out of this? Regards, Shafi
Issue with delay slot scheduling?
Hi, I am doing a private port in GCC 4.5.1. For the my target i see some strange behavior in delay slot scheduling. For my target the instruction in the delay slots gets executed irrespective of whether the branch is taken or not. I have generated the following code after commenting out the call to 'relax_delay_slots' in the function 'dbr_schedule'. RTL: (insn 97 42 51 del1.c:19 (sequence [ (jump_insn 61 42 38 del1.c:19 (set (pc) (if_then_else (ne (reg:CCF 34 CC) (const_int 0 [0x0])) (label_ref:PQI 86) (pc))) 56 {conditional_branch} (expr_list:REG_BR_PRED (const_int 5 [0x5]) (expr_list:REG_DEAD (reg:CCF 34 CC) (expr_list:REG_BR_PROB (const_int 5000 [0x1388]) (nil -> 86) (insn 38 61 43 (set (mem/s/j:QI (reg/f:PQI 28 a0 [orig:62 D.1955 ] [62]) [0 bytes S1 A32]) (reg:QI 1 g1 [orig:65 D.1938 ] [65])) 7 {movqi_op} (nil)) (insn 43 38 51 (set (reg:QI 1 g1 [75]) (ior:QI (reg:QI 1 g1 [orig:65 D.1938 ] [65]) (reg:QI 3 g3 [77]))) 31 {iorqi3} (expr_list:REG_EQUAL (ior:QI (reg:QI 1 g1 [orig:65 D.1938 ] [65]) (const_int 128 [0x80])) (nil))) ]) -1 (nil)) (code_label 51 97 52 1 "" [2 uses]) (note 52 51 73 [bb 4] NOTE_INSN_BASIC_BLOCK) (jump_insn 73 52 72 (return) 72 {return_rts} (expr_list:REG_BR_PRED (const_int 12 [0xc]) (nil))) (barrier 72 73 86) (code_label 86 72 41 5 "" [1 uses]) (note 41 86 45 [bb 5] NOTE_INSN_BASIC_BLOCK) (insn 45 41 44 del1.c:20 (set (reg:QI 2 g2 [orig:68 ivtmp.7 ] [68]) (plus:QI (reg:QI 2 g2 [orig:68 ivtmp.7 ] [68]) (const_int 1 [0x1]))) 13 {addqi3} (nil)) (insn 44 45 101 del1.c:20 (set (mem/s/j:QI (reg/f:PQI 28 a0 [orig:62 D.1955 ] [62]) [0 bytes S1 A32]) (reg:QI 1 g1 [75])) 7 {movqi_op} (expr_list:REG_DEAD (reg/f:PQI 28 a0 [orig:62 D.1955 ] [62]) (expr_list:REG_DEAD (reg:QI 1 g1 [75]) (nil (code_label 101 44 79 7 "" [1 uses]) Corresponding code: jmp.ne .L5; st [a0], g1; (INSN 38) or g1, g1, g3; (INSN 43) .L1: rts; nop; nop; .L5: add g2, g2, 1; (INSN 45) st [a0], g1;(INSN 44) -> deleted .L7: You can see that INSN 44 and INSN 38 are identical. 
In 'relax_delay_slots', while processing INSN 97, the second call to 'try_merge_delay_insns' deletes INSN 44, and the generated code then produces an unexpected result. /* If we own the thread opposite the way this insn branches, see if we can merge its delay slots with following insns. */ if (INSN_FROM_TARGET_P (XVECEXP (pat, 0, 1)) && own_thread_p (NEXT_INSN (insn), 0, 1)) try_merge_delay_insns (insn, next); else if (! INSN_FROM_TARGET_P (XVECEXP (pat, 0, 1)) && own_thread_p (target_label, target_label, 0)) try_merge_delay_insns (insn, next_active_insn (target_label)); Deleting INSN 44 would have been correct if the second delay slot insn had not modified g1. But looking at the comment on the function 'try_merge_delay_insns': /* Try merging insns starting at THREAD which match exactly the insns in INSN's delay list. If all insns were matched and the insn was previously annulling, the annul bit will be cleared. For each insn that is merged, if the branch is or will be non-annulling, we delete the merged insn. */ I think the REGOUT dependency on g1 between instructions 38 and 43 in the delay slots is not being considered by 'try_merge_delay_insns'. Is this a bug? Regards, Shafi
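The missing check described above can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not how reorg.c is structured, `struct toy_insn` and `safe_to_merge` are hypothetical names, and registers are modeled as plain bitmasks. The idea is that a delay-slot insn may only stand in for a textually identical insn in the thread if no later delay-slot insn overwrites a register it reads.

```c
#include <assert.h>

/* Toy model of an instruction: bitmasks of the registers it reads
   (uses) and writes (sets).  Hypothetical, for illustration only.  */
struct toy_insn
{
  unsigned uses;
  unsigned sets;
};

/* Return 1 if the delay-slot insn at index MATCHED can stand in for an
   identical insn found later in the thread, i.e. no delay-slot insn
   after it writes a register that it reads.  Return 0 otherwise.  */
static int
safe_to_merge (const struct toy_insn *slots, int n_slots, int matched)
{
  assert (matched >= 0 && matched < n_slots);
  for (int i = matched + 1; i < n_slots; i++)
    if (slots[i].sets & slots[matched].uses)
      return 0;
  return 1;
}
```

In the listing from this message, the first slot (`st [a0], g1`) reads a0 and g1 and the second slot (`or g1, g1, g3`) writes g1, so a check of this kind would refuse to treat the store at .L5 as redundant.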
Re: Issue with delay slot scheduling?
On 6 September 2011 20:50, Jeff Law wrote: > > On 09/06/11 08:46, Mohamed Shafi wrote: >> Hi, >> >> I am doing a private port in GCC 4.5.1. For the my target i see some >> strange behavior in delay slot scheduling. For my target the >> instruction in the delay slots gets executed irrespective of whether >> the branch is taken or not. I have generated the following code >> after commenting out the call to 'relax_delay_slots' in the function >> 'dbr_schedule'. > [ ... ] > It looks like you have found a bug. While reorg.c is supposed to work > with targets that have multiple delay slots, it's not something that has > been extensively tested. > >>> >> I think REGOUT dependency of g1 between instructions 38 and 43 in >> the delay slot is not being considered by 'try_merge_delay_insns'. > You're probably correct. > > Jeff How do I raise a bug report, given that mine is a private target? Regards, Shafi
Reloading going wrong. Bug in GCC?
Hi, I am working on a 32bit private target which has the following restriction 1. store/load can happen only through a general purpose register (GP_REGS) 2. base register should be an address register (AD_REGS) 3. moves between GP_REGS and AD_REGS can happen only through PT_REGS In a PRE_MODIFY instruction when both the base register and the output register gets spilled the reloading is going wrong. befor IRA pass ~~~ (insn 259 336 317 2 ../rld_bug.c:94 (set (reg:QI 234 [+1 ]) (mem/s/j/c:QI (pre_modify:PQI (reg/f:PQI 233) (plus:PQI (reg/f:PQI 233) (const_int 1 [0x1]))) [0+1 S1 A32])) 7 {movqi_op} (expr_list:REG_INC (reg/f:PQI 233) (nil))) after IRA pass ~~~ Reloads for insn # 259 Reload 0: GP_REGS, RELOAD_FOR_OPADDR_ADDR (opnum = 1), can't combine, secondary_reload_p reload_reg_rtx: (reg:PQI 11 g11) Reload 1: PT_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't combine, secondary_reload_p reload_reg_rtx: (reg:PQI 12 as0) secondary_in_reload = 0 Reload 2: GP_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't combine, secondary_reload_p reload_reg_rtx: (reg:PQI 11 g11) Reload 3: PT_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't combine, secondary_reload_p reload_reg_rtx: (reg:PQI 13 as1) secondary_out_reload = 2 Reload 4: reload_in (PQI) = (reg/f:PQI 233) reload_out (PQI) = (reg/f:PQI 233) AD_REGS, RELOAD_OTHER (opnum = 1) reload_in_reg: (reg/f:PQI 233) reload_out_reg: (reg/f:PQI 233) reload_reg_rtx: (reg:PQI 31 a3) secondary_in_reload = 1, secondary_out_reload = 3 Reload 5: reload_out (QI) = (reg:QI 234 [+1 ]) GP_REGS, RELOAD_FOR_OUTPUT (opnum = 0) reload_out_reg: (reg:QI 234 [+1 ]) reload_reg_rtx: (reg:QI 11 g11) (insn 744 336 745 2 ../rld_bug.c:94 (set (reg:PQI 11 g11) (mem/c:PQI (plus:PQI (reg/f:PQI 32 sp) (const_int -24 [0xffe8])) [99 %sfp+8 S1 A32])) 9 {movpqi_op} (nil)) (insn 745 744 746 2 ../rld_bug.c:94 (set (reg:PQI 12 as0) (reg:PQI 11 g11)) 9 {movpqi_op} (nil)) (insn 746 745 259 2 ../rld_bug.c:94 (set (reg:PQI 31 a3) (reg:PQI 12 as0)) 9 {movpqi_op} 
(nil)) (insn 259 746 747 2 ../rld_bug.c:94 (set (reg:QI 11 g11) (mem/s/j/c:QI (pre_modify:PQI (reg:PQI 31 a3) (plus:PQI (reg:PQI 31 a3) (const_int 1 [0x1]))) [0+1 S1 A32])) 7 {movqi_op} (expr_list:REG_INC (reg:PQI 31 a3) (nil))) (insn 747 259 748 2 ../rld_bug.c:94 (set (reg:PQI 13 as1) (reg:PQI 31 a3)) 9 {movpqi_op} (nil)) (insn 748 747 749 2 ../rld_bug.c:94 (set (reg:PQI 11 g11) (reg:PQI 13 as1)) 9 {movpqi_op} (nil)) (insn 749 748 750 2 ../rld_bug.c:94 (set (mem/c:PQI (plus:PQI (reg/f:PQI 32 sp) (const_int -24 [0xffe8])) [99 %sfp+8 S1 A32]) (reg:PQI 11 g11)) 9 {movpqi_op} (nil)) (insn 750 749 751 2 ../rld_bug.c:94 (set (mem/c:QI (plus:PQI (reg/f:PQI 32 sp) (const_int -29 [0xffe3])) [99 %sfp+3 S1 A32]) (reg:QI 11 g11)) 7 {movqi_op} (nil)) After the IRA pass, for insn 259 the modified address is first stored to its spill location and then the modified value is stored. As you can see from the instructions, the same register (g11) is used for both Reload 5 and Reload 2, so the modified value gets corrupted and the modified address is stored in its place (insn 749 and insn 750). I am not able to figure out where this goes wrong in the reload phase. I suspect that this is a GCC issue. Can someone give me some pointers to resolve it? Regards, Shafi
Function argument passing
Hello all, I am doing a port for a private target in GCC 4.4.0. It generates code for both little and big endian. The ABI for the target is as follows: 1. All arguments passed on the stack are passed according to their alignment constraints. Solution: For this to happen no argument promotion should be done. 2. Functions with a variable number of arguments pass the last fixed argument and all subsequent variable arguments on the stack. Such arguments of fewer than 4 bytes are located on the stack as if the argument had been promoted to 32 bits. Solution: For TARGET_STRICT_ARGUMENT_NAMING the internals manual says the following: This hook controls how the named argument to FUNCTION_ARG is set for varargs and stdarg functions. If this hook returns true, the named argument is always true for named arguments, and false for unnamed arguments. If it returns false, but TARGET_PRETEND_OUTGOING_VARARGS_NAMED returns true, then all arguments are treated as named. Otherwise, all named arguments except the last are treated as named. So I made both TARGET_STRICT_ARGUMENT_NAMING and TARGET_PRETEND_OUTGOING_VARARGS_NAMED return false. Is this correct? How can I make the varargs arguments be promoted to 32 bits when the normal arguments don't require promotion, as mentioned in point (1)? 3. A function returning a structure or union receives in D0 the address of the returned structure or union. The caller allocates space for the returned object. Solution: Used TARGET_FUNCTION_VALUE and returned a D0 reg_rtx for structures and unions. 4. A long long return value is returned in R6 and R7, R6 containing the most significant long word and R7 containing the least significant long word, regardless of the endianness mode. Solution: Used TARGET_RETURN_IN_MSB to return true when the mode is little endian. 5. If the first argument is a long long, it is passed in R6 and R7, R6 containing the most significant long word and R7 containing the least significant long word, regardless of the endianness mode. 
For the return value, I have done as mentioned in (4), but I am not sure how to control the argument passing so that R6 contains the msw and R7 contains the lsw, regardless of the endianness mode. Regards, Shafi
Output sections
Hello all, Is it possible to emit an assembler directive at the end of each section? Say, something like section_end. Is there any support for doing this in the back-end files, or do I need to make changes in the GCC sources? If so, does anyone know in which function it should happen? Regards, Shafi
current_function_outgoing_args_size
Hello all, The change logs say that current_function_outgoing_args_size is no longer available, but they don't say what it has been replaced with. Looking at the other targets, I find that it has been replaced with a field in a structure called crtl. Where is this defined/declared? I am working with GCC 4.4.0. I checked the mainline internals documentation; even there the references to these deleted variables have not been replaced. Could somebody please take care of this? Regards, Shafi
Re: current_function_outgoing_args_size
2009/7/18 Ian Lance Taylor : > Mohamed Shafi writes: > >> The change logs says that current_function_outgoing_args_size is no >> more available. But it doesnt say with what it is replaced. Looking at >> the other targets i find that its replaced with some field in a >> structure crtl. Where is this defined/declared. > > crtl is declared in function.h. > >> I am working in GCC 4.4.0. I checked with the mainline internals. Even >> there the references of these deleted variables are not replaced. >> Could somebody please take care of this. > And also references to "regs_ever_live". Regards, Shafi
Re: Output sections
2009/7/18 Dave Korn : > Mohamed Shafi wrote: >> Hello all, >> >> Is it possible to emit a assembler directive at the end of each sections? >> Say like section_end >> Is there any support for doing something like this in the back-end files? >> Or should i need to the make changes in the gcc sources? >> Is so do does anyone know in which function it should happen? > > There isn't really such a concept as 'end of a section' until you get to > final-link time and get all the contributions from different .o files to a > given section. During assembler output GCC treats sections as random access, > switching freely from one to another and back; it doesn't have any concept of > starting/stopping/opening/closing a section but just jumps into any one it > likes completely ad-hoc. > > Assuming you're happy with adding something to the end of each section in > each generated .s file, you could use the TARGET_ASM_FILE_END hook to output > directives that re-enter each used section and then output your new directive. > You may find it hard to know which sections have been used or not in a given > file - you can define TARGET_ASM_NAMED_SECTION and make a note of which > sections get invoked there, but I'm not sure if that gets called for all > sections e.g. init/fini, you may have to try it and see. > I am looking for adding something to the end of each section in the generated .s file. Using TARGET_ASM_NAMED_SECTION i will be able to keep track of the sections that are being emitted. But from TARGET_ASM_FILE_END hook how can i re-enter into each section. Are the sections stored in some global variable? Shafi
Re: Output sections
2009/8/1 Dave Korn : > Mohamed Shafi wrote: >> I am looking for adding something to the end of each section in the >> generated .s file. Using TARGET_ASM_NAMED_SECTION i will be able to >> keep track of the sections that are being emitted. But from >> TARGET_ASM_FILE_END hook how can i re-enter into each section. Are the >> sections stored in some global variable? > > I'm not sure I understand the question. You "enter a section" simply by > emitting the correct .section directive into the asm output. You re-enter it > by > the same method. > > cheers, > DaveK > Ok, Then i don't understand your solution. >> you could use the TARGET_ASM_FILE_END hook to output >> directives that re-enter each used section and then output your new >> directive. if i want to do the following in the assembly output section .code . . .. section_end you are saying that if i emit a section directive the compiler will switch to the previously emitted section and then i have to somehow seek to the end of that section and emit my 'section_end' directive? Shafi
Re: Output sections
2009/8/1 Dave Korn : > Mohamed Shafi wrote: >> 2009/8/1 Dave Korn : >>> Mohamed Shafi wrote: >>>> I am looking for adding something to the end of each section in the >>>> generated .s file. Using TARGET_ASM_NAMED_SECTION i will be able to >>>> keep track of the sections that are being emitted. But from >>>> TARGET_ASM_FILE_END hook how can i re-enter into each section. Are the >>>> sections stored in some global variable? >>> I'm not sure I understand the question. You "enter a section" simply by >>> emitting the correct .section directive into the asm output. You re-enter >>> it by >>> the same method. > >> Ok, Then i don't understand your solution. > > Ah, it looks like I didn't quite understand your problem. > >>>> you could use the TARGET_ASM_FILE_END hook to output >>>> directives that re-enter each used section and then output your new >>>> directive. >> >> if i want to do the following in the assembly output >> >> section .code >> . >> . >> .. >> section_end > > I thought you just wanted to have > > .section .code > section_end > .section .data > section_end > > ... etc. for all used sections, at the very end of the file; after all, all > the > contributions to a section get concatenated in the assembler. Now you seem to > be saying that you want to have multiple section_end directives throughout the > file, every time the current section changes. > >> you are saying that if i emit a section directive the compiler will >> switch to the previously emitted section and then i have to somehow >> seek to the end of that section and emit my 'section_end' directive? > > I think you may need to re-read the assembler manual about sections, you are > a > little confused about the concepts. The compiler doesn't really "switch" > anything; the compiler emits ".section" directives, in response to which the > *assembler* switches to emit code in the chosen section. 
The compiler doesn't > keep track of sections; it just randomly emits directives for whichever one it > wants the assembly output to go into at any given time, according to whether > it's generating the assembly for a function or a variable or other data > object. > OK. Will TARGET_ASM_NAMED_SECTION get invoked for the normal sections like text, data and bss? I tried to define this hook in my port, but execution never reaches it for those sections. Shafi
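The approach Dave describes (note which sections were used, then re-enter each one at the end of the file and emit the new directive) can be sketched in plain C. This is a hedged illustration only: `note_section` and `emit_section_ends` are hypothetical helpers, not GCC hooks, and the `section` / `section_end` spellings simply follow the directives quoted in this thread. In a real port the first helper would be called from wherever the back end emits a section switch and the second from the TARGET_ASM_FILE_END implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_SECTIONS 16

/* Names of the sections seen so far in this output file.  */
static const char *used_sections[MAX_SECTIONS];
static int n_used_sections;

/* Remember NAME the first time it is seen.  Returns 1 if it was newly
   recorded, 0 if it was already known.  */
static int
note_section (const char *name)
{
  for (int i = 0; i < n_used_sections; i++)
    if (strcmp (used_sections[i], name) == 0)
      return 0;
  assert (n_used_sections < MAX_SECTIONS);
  used_sections[n_used_sections++] = name;
  return 1;
}

/* Re-enter every recorded section and emit the end directive.
   Returns the number of sections handled.  */
static int
emit_section_ends (FILE *out)
{
  for (int i = 0; i < n_used_sections; i++)
    fprintf (out, "\tsection %s\n\tsection_end\n", used_sections[i]);
  return n_used_sections;
}
```

Because the assembler concatenates all contributions to a section, emitting the directive once per section at the end of the file places it after everything else that went into that section from this file.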
How to set the alignment
Hello all, I am doing a private port in GCC 4.4.0. For my target the alignment requirements are as follows: int - 4 bytes short - 2 bytes char - 1 byte pointer - 4 bytes stack pointer - 4 bytes I am not able to implement the alignment for short. These are the macros that I used: #define PARM_BOUNDARY 8 #define STACK_BOUNDARY 64 I have also defined STACK_SLOT_ALIGNMENT, but this does not affect the output. What should I be doing to get the required alignment? Regards, Shafi
Re: How to set the alignment
2009/8/3 Jim Wilson : > On 08/03/2009 02:14 AM, Mohamed Shafi wrote: >> >> short - 2 bytes >> i am not able to implement the alignment for short. >> The following is are the macros that i used for this >> #define PARM_BOUNDARY 8 >> #define STACK_BOUNDARY 64 > > You haven't explained what the actual problem is. Is there a problem with > global variables? Is the variable initialized or uninitialized? If it is > uninitialized, is it common? If this a local variable? Is this a function > argument or parameter? Is this a named or unnamed (stdarg) argument or > parameter? Etc. It always helps to include a testcase. > > You should also mention what gcc is currently emitting, why it is wrong, and > what the output should be instead. > > All this talk about stack and parm boundary suggests that it might be an > issue with function arguments, in which case you will probably have to > describe the calling conventions a bit so we can understand what you want. > This is the test case that I tried: short funs (int a, int b, char w,short e,short r) { return e+r; } The target is 32-bit. The first two parameters are passed in registers and the rest on the stack. For the parameters that are passed on the stack, the alignment is that of the data type. The stack pointer is 8-byte aligned. char is 1 byte, int is 4 bytes and short is 2 bytes. The code that gets generated is given below (-O0 -fomit-frame-pointer): funs: add 16,sp mov d0,(sp-16) mov d1,(sp-12) movh (sp-19),d0 movh d0,(sp-8) movh (sp-21),d0 movh d0,(sp-6) movh (sp-8),d1 movh (sp-6),d0 add d1,d0,d0 sub 16,sp ret From the above code you can see that some of the half-word accesses are not aligned on a 2-byte boundary. So where am I going wrong? I hope this info is enough. Regards, Shafi
Re: How to set the alignment
2009/8/5 Jim Wilson : > On Tue, 2009-08-04 at 11:09 +0530, Mohamed Shafi wrote: >> >> i am not able to implement the alignment for short. >> >> The following is are the macros that i used for this >> >> #define PARM_BOUNDARY 8 >> >> #define STACK_BOUNDARY 64 >> The target is 32bit . The first two parameters are passed in registers >> and the rest in stack. For the parameters that are passed in stack the >> alignment is that of the data type. The stack pointer is 8 byte >> aligned. char is 1 byte, int is 4 byte and short is 2 byte. The code >> that is getting generated is give below (-O0 -fomit-frame-pointer) > > Er, wait. You set PARM_BOUNDARY to 8. This means all arguments will be > padded to at most an 8-bit boundary, which means that yes, a short after > a char will have only 1 byte alignment. If you want all arguments to > have 2-byte alignment, then you need to set PARM_BOUNDARY to 16. But > you probably want a value of 32 here so that 4-byte ints get 4-byte > alignment. This will allocate a minimum 4-byte stack slot for every > argument. I don't know the calling convention, so I don't know exactly > how you want arguments arranged on the stack. > > If you are pushing arguments, then you can lie in the PUSH_ROUNDING > macro. You could say for instance that one byte pushes always push 2 > bytes. This ensures that the stack always has 2-byte alignment while > pushing arguments. If your push instruction doesn't actually do this, > then you need to modify the pushqi pattern to emit two pushes or use a > HImode push to get the right behaviour. > > Try looking at the code in store_one_arg in calls.c, and emit_push_insn > in expr.c. > What I did was define the FUNCTION_ARG_BOUNDARY macro to return the alignment as per the requirement, i.e. 8 bits for char, 16 bits for short and 32 bits for int, and keep PARM_BOUNDARY at 8. Now the compiler is emitting the alignment properly. Is this OK? Regards, Shafi
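The effect of a per-argument boundary can be checked with a little arithmetic. The sketch below is not GCC code; `round_up` and `place_arg` are hypothetical helpers that lay out stack arguments at increasing offsets, each rounded up to its own alignment in bytes, which is roughly what returning a per-type value from FUNCTION_ARG_BOUNDARY asks for.

```c
#include <assert.h>

/* Round OFFSET up to the next multiple of ALIGN, where ALIGN is a
   power of two expressed in bytes.  */
static unsigned
round_up (unsigned offset, unsigned align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (offset + align - 1) & ~(align - 1);
}

/* Place an argument of SIZE bytes needing ALIGN-byte alignment at or
   after *OFFSET.  Returns the argument's offset and advances *OFFSET
   past it.  */
static unsigned
place_arg (unsigned *offset, unsigned size, unsigned align)
{
  unsigned at = round_up (*offset, align);
  *offset = at + size;
  return at;
}
```

For the stack arguments of `funs` (char w, short e, short r), this gives offsets 0, 2 and 4, so both shorts land on 2-byte boundaries, whereas a uniform 1-byte boundary would put them at 1 and 3.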
Restrictive addressing mode
Hello all, I am trying to port a 32-bit target to GCC 4.4.0. Of the addressing modes allowed by my target, the (base register + offset) mode is restricted in QImode: if the base register is not the stack pointer, this kind of address cannot appear in a load instruction, only in a store instruction. How can I implement this? Should I write a define_expand for movqi and force the address into a register when I get this addressing mode? Please let me know your thoughts on this. Regards, Shafi
Re: About feasibility of implementing an instruction
2009/7/3 Ian Lance Taylor :
> Mohamed Shafi writes:
>
>> I just want to know about the feasibility of implementing an
>> instruction for a port in gcc 4.4
>> The target has 40 bit register where the normal load/store/move
>> instructions will be able to access the 32 bits of the register. In
>> order to move data into the rest of the register [b32 to b39] the data
>> has to be stored into a 32bit memory location. The data should be
>> stored in such a way that if it is stored for 0-7 in memory the data
>> can be moved to b32-b39 of a even register and if the data in the
>> memory is stored in 16-23 of the memory word then it can be moved to
>> b32-b39 of a odd register. Hope i make myself clear.
>>
>> Will it be possible to implement this in the gcc back-end so that the
>> particular instruction is supported?
>
> In general, the gcc backend can do anything, so, yes, this can be
> supported. It sounds like this is not a general purpose register, so I
> would probably do it using a builtin function. If you need to treat it
> as a general purpose register (i.e., the register is managed by the
> register allocator) then you will need a secondary reload to handle
> this.
>
This is a general-purpose register. All 40 bits are used only for fixed-point data types. When the register holds a fixed-point value, all operations except initialization are done through built-in functions. For initialization the immediate value has to go through memory, i.e. there is no immediate load when the data is 40 bits wide. So I am planning to control this using the LEGITIMATE_CONSTANT_P macro.
But then I have a question. If all the operations are done through intrinsics, will the variables used in the built-in functions ever need to be spilled? If so, then depending on whether the spilled register is even or odd, bits [b32 to b39] of the register get stored in memory at [b0 to b7] or [b16 to b23] respectively. Will I be able to keep track of the spilling so that I can reload into the proper register? Hope I am clear. Regards, Shafi
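A minimal sketch of the even/odd placement rule described in this thread, written as plain C rather than as back-end code. The function names and the idea of identifying a register by a small integer are assumptions made for illustration; only the bit positions (0-7 for even registers, 16-23 for odd registers) come from the message.

```c
#include <stdint.h>

/* Hypothetical helpers: names and register numbering are illustrative
   assumptions, not part of any real port. */

/* Bit position inside the 32-bit memory word that holds the extension
   byte (b32..b39) of register REGNO: 0 for even registers, 16 for odd. */
unsigned ext_shift(unsigned regno)
{
    return (regno & 1u) ? 16u : 0u;
}

/* Memory word that a spill of the extension byte would write. */
uint32_t ext_spill_word(unsigned regno, uint8_t ext)
{
    return (uint32_t)ext << ext_shift(regno);
}

/* Extension byte recovered when reloading into register REGNO. */
uint8_t ext_reload_byte(unsigned regno, uint32_t word)
{
    return (uint8_t)((word >> ext_shift(regno)) & 0xFFu);
}
```

The sketch makes the concern in the question concrete: a word spilled from an odd register reads back as zero if it is reloaded into an even register, so the spill slot and the register parity have to stay paired.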
DI mode and endianness
Hi, I am trying to port a 32-bit target in GCC 4.4.0. My target supports both big and little endian, selected using a target switch, so I have defined the macro
    #define WORDS_BIG_ENDIAN (TARGET_BIG_ENDIAN)
Currently I have written patterns only for SImode moves, so GCC synthesizes the DImode patterns for me. The problem is that GCC generates the same code for both big and little endian. For the following code
    extern long long h;
    extern long long j;
    extern long long k;
    int temp()
    {
      k = j+h;
      return 0;
    }
the compiler generates the following code, irrespective of the endianness:
    section .text local
    ALIGN 16
    GLOBAL _temp
    _temp:
    mov _h,d4
    mov _h+4,d5
    mov _j,d2
    mov _j+4,d3
    add d4,d2
    adc d5,d3
    mov d2,_k
    mov d3,_k+4
    ret
    SIZE _temp,*-_temp
What could I be missing here? Should I add anything specific for this in the back-end? Regards, Shafi
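As an illustration of what the two word orders would be expected to change, here is a small C model of a 64-bit add performed as two 32-bit steps. The function names are hypothetical and `words_big_endian` simply stands in for the value `WORDS_BIG_ENDIAN` would have; nothing here is GCC code.

```c
#include <stdint.h>

/* Byte offset of the least significant 32-bit word of a 64-bit value. */
unsigned low_word_offset(int words_big_endian)
{
    return words_big_endian ? 4u : 0u;
}

/* Byte offset of the most significant 32-bit word of a 64-bit value. */
unsigned high_word_offset(int words_big_endian)
{
    return words_big_endian ? 0u : 4u;
}

/* 64-bit add as an add on the low words followed by an add-with-carry
   on the high words; a, b and out are the two memory words of each value. */
void add64_by_words(const uint32_t a[2], const uint32_t b[2],
                    uint32_t out[2], int words_big_endian)
{
    unsigned lo = low_word_offset(words_big_endian) / 4u;
    unsigned hi = high_word_offset(words_big_endian) / 4u;
    uint32_t low = a[lo] + b[lo];
    uint32_t carry = low < a[lo];   /* unsigned wrap-around means a carry */

    out[lo] = low;
    out[hi] = a[hi] + b[hi] + carry;
}
```

In the assembly posted above, the plain add uses the words at offset 0 and the add-with-carry uses the words at offset 4, which matches the little-endian word order in this model; with big-endian word order the carry chain would be expected to start at offset 4.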
Re: Function argument passing
2009/7/16 Richard Henderson : > On 07/13/2009 07:35 AM, Mohamed Shafi wrote: >> >> So i made both TARGET_STRICT_ARGUMENT_NAMING and >> PRETEND_OUTGOING_VARARGS_NAMED to return false. Is this correct? > > Yes. > >> How to make the varargs argument to be promoted to 32bits when the >> normal argument don't require promotion as mentioned in point (1) ? > > There is no way at present. You'll have to extend the promote_function_args > hook to accept a "bool named" parameter. > >> 4. A long long return value is returned in R6 and R7, R6 containing >> the most significant long word and R7 containing the least >> significant long word, regardless of the endianess mode. >> Solution: Used TARGET_RETURN_IN_MSB to return true when the mode is >> little endian > > I don't believe this is correct. RETURN_IN_MSB is supposed to be handling > the case where the data to be returned is smaller than the register in which > it is returned -- e.g. a 3 byte structure returned in a 32-bit register. I > believe you should be using... > >> 5. If the first argument is a long long , it is passed in R6 and R7, >> R6 containing the most significant long word and R7 containing the >> least significant long word, regardless of the endianess mode. >> For return value, i have done as mentioned in (4) but I am not sure >> how to control the argument passing so that R6 contains the msw and R7 >> contains lsw, regardless of the endianess mode. > > For both return values and arguments, we support a PARALLEL which allows the > target to indicate where each piece of the value is located. It's also true > that the generated rtl is more complicated, so you'd want to avoid this > solution in big-endian mode, when it isn't needed. 
>
> So here you would do
>
>   if (WORDS_BIG_ENDIAN)
>     return gen_rtx_REG (DImode, 6);
>   else
>     {
>       rtx r6, r7, par;
>
>       r7 = gen_rtx_REG (SImode, 7);
>       r7 = gen_rtx_EXPR_LIST (SImode, r7, GEN_INT (0));
>       r6 = gen_rtx_REG (SImode, 6);
>       r6 = gen_rtx_EXPR_LIST (SImode, r6, GEN_INT (4));
>       par = gen_rtx_PARALLEL (DImode, gen_rtvec (2, r7, r6));
>       return par;
>     }
>
> See the docs for FUNCTION_ARG for details.
>
I am getting the following error when I make a function call:
(call_insn 18 17 19 3 1.c:29 (set (parallel:DI [
            (expr_list:REG_UNUSED (reg:SI 7 d7)
                (const_int 0 [0x0]))
            (expr_list:REG_UNUSED (reg:SI 6 d6)
                (const_int 4 [0x4]))
        ])
        (call:SI (mem:SI (symbol_ref:SI ("dd1") [flags 0x41] ) [0 S4 A8])
            (const_int 8 [0x8]))) -1 (nil)
    (expr_list:REG_DEP_TRUE (use (reg:SI 7 d7))
        (expr_list:REG_DEP_TRUE (use (reg:SI 6 d6))
            (nil))))
How do I write a pattern for this? Another question: in little-endian mode, will the compiler know that the return value is actually stored the other way round, i.e. in big-endian word order, and generate the code for the rest of the program accordingly? Regards, Shafi
How to write shift and add pattern?
Hello all, I am trying to port a 32-bit architecture in GCC 4.4.0. My target supports 1-bit and 2-bit shift-and-add operations. I tried to write patterns for them, but GCC is not generating those instructions. These are the patterns I have written in the md file:
(define_insn "shift_add_"
  [(set (match_operand:SI 0 "register_operand" "")
        (plus:SI (match_operand:SI 3 "register_operand" "")
                 (ashift:SI (match_operand:SI 1 "register_operand" "")
                            (match_operand:SI 2 "immediate_operand" ""))))]
  ""
  "shadd1\\t%1, %0"
)
(define_insn "shift_add1_"
  [(set (match_operand:SI 0 "register_operand" "")
        (plus:SI (ashift:SI (match_operand:SI 1 "register_operand" "")
                            (match_operand:SI 2 "immediate_operand" ""))
                 (match_operand:SI 3 "register_operand" "")))]
  ""
  "shadd1\\t%1, %0"
)
(define_insn "shift_n_add_"
  [(set (match_operand:SI 1 "register_operand" "")
        (ashift:SI (match_dup 1)
                   (match_operand:SI 2 "immediate_operand" "")))
   (set (match_operand:SI 0 "register_operand" "")
        (plus:SI (match_dup 0)
                 (match_dup 1)))]
  ""
  "shadd2\\t%1, %0"
)
As you can see, I have tried several combinations. Since I was only looking for a pattern match, I did not bother to write them according to the target; I thought I would do that once I had a matching pattern. When I debugged, GCC was generating patterns with multiply, but those get discarded since the md file does not have such patterns. How can I make GCC generate the shift-and-add pattern? Is GCC generating patterns with multiply because of cost issues? I have not specified any cost details. Regards, Shafi
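The following C sketch only illustrates the arithmetic behind the observation that the compiler produced multiply forms: a left shift by 1 or 2 added to a register is the same value as a multiply by 2 or 4 added to that register. The semantics assumed for the shift-and-add instructions (`acc + (src << n)`) and both function names are assumptions; the message does not spell out the instruction's exact operands.

```c
#include <stdint.h>

/* Assumed semantics of a shift-and-add instruction: acc + (src << n),
   with n restricted to 1 or 2 as in the message. */
uint32_t shadd(uint32_t src, uint32_t acc, unsigned n)
{
    return acc + (src << n);
}

/* The same computation expressed with a multiply by a power of two. */
uint32_t mult_add(uint32_t src, uint32_t acc, unsigned n)
{
    return acc + src * (1u << n);
}
```

Because the two forms always agree modulo 2^32, a pattern written with a multiply by 2 or 4 would describe the same instruction as one written with a shift; whether that is the form the matcher needs here is something to confirm against the RTL dumps.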
Reloading is going wrong?
Hello all, I am doing a port for a 32-bit target in GCC 4.4.0. Of the addressing modes allowed by my target, the one with (base register + offset) is restricted in QImode: if the base register is not the Stack Pointer, this kind of address can appear only in a store instruction, not in a load instruction. To implement this I added constraints for all supported memory operations in QImode. The pattern is as follows:
(define_insn "movqi"
  [(set (match_operand:QI 0 "nonimmediate_operand" "=b,b,d,t,d, b,Ss0, Ss1, a,Se1, Sb2, b,Sd3, d,Se0")
        (match_operand:QI 1 "general_operand" "I, L,d,d,t, Ss0,b, b,Se1,a, b, Sd3,b, Se0,d"))]
where
d is data registers
a is address registers
b is data and address registers
Sb2 is the Rn + offset addressing mode
Sd3 is the SP + offset addressing mode
Se0 is (Rn), (Rn)+, (Rn)-, (Rn + Ri) and the post-modify register addressing mode
Se1 is Se0 excluding the post-modify register addressing mode
I believe there are enough combinations available for reload to try an alternate addressing mode if it encounters the restricted one. But I am still getting the following error:
main1.c:11: error: insn does not satisfy its constraints:
(insn 30 29 7 2 main1.c:9 (set (reg:QI 2 d2 [orig:61 .a+1 ] [61])
        (mem/s/j:QI (plus:SI (reg:SI 16 r0)
                (const_int 1 [0x1])) [0 .a+1 S1 A8])) 41 {movqi} (nil))
main1.c:11: internal compiler error: in reload_cse_simplify_operands, at postreload.c:396
So what am I doing wrong? Can't this scenario be solved by the reload pass? How can I generate instructions that respect the QImode restriction? Regards, Shafi
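The restriction itself can be stated as a small predicate. This is a plain C model with hypothetical names, not a GCC hook or constraint; it only encodes the rule given in the message (SP + offset usable for loads and stores, Rn + offset usable for stores only).

```c
#include <stdbool.h>

/* Direction of a QImode memory access. */
enum qi_access { QI_LOAD, QI_STORE };

/* Hypothetical model of the QImode rule for (base register + offset):
   - SP + offset may be used by loads and stores;
   - Rn + offset (any other base) may be used by stores only. */
bool qi_base_offset_ok(enum qi_access access, bool base_is_sp)
{
    if (base_is_sp)
        return true;
    return access == QI_STORE;
}
```

The failing insn in the message is the one case this model rejects: a QImode load whose address is r0 plus a constant.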
Supporting FP cmp lib routines
Hi all, I am doing a GCC port for a 32-bit target in GCC 4.4.0. The target uses hand-coded floating-point compare routines. Functions normally return their values in the general-purpose registers, but these FP compare routines return the result in the Status Register itself, so there is no need for a compare instruction after the function call. Is there a way to let GCC know that the result of an FP compare is stored in the Status Register, so that GCC directly generates a jump? How can I implement this? Regards, Shafi
How to split 40bit data types load/store?
Hello all, I am doing a port for a 32-bit target in GCC 4.4.0. I have to support a 40-bit data type (_Accum) in the port. The target has 40-bit registers, which are GPRs and work as 32-bit registers in other modes. The load and store for _Accum happen in two steps: the lower 32 bits in one instruction and the upper 8 bits in the next. I want to split the instruction after reload. I tried to use patterns (for load) like this:
(define_insn "fn_load_ext_sa"
  [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
                   UNSPEC_FN_EXT)
        (match_operand:SA 1 "memory_operand" ""))]
(define_insn "fn_load_sa"
  [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
                   UNSPEC_FN)
        (match_operand:SA 1 "memory_operand" ""))]
The above patterns work at -O0, but with optimization I get an ICE. It seems that GCC won't accept an unspec object as the destination operand. So how can I split the patterns for the load and store of these data types? Regards, Shafi
How to implement compare and branch instruction
Hello all, I am porting a 32-bit target in GCC 4.4.0. The target has distinct signed and unsigned compare instructions, and only one set of conditional branch instructions. Moreover, the operands cannot be immediate values if the comparison is unsigned. I have implemented this using a compare-and-branch instruction, which gets split after reload. The patterns I have written are as follows:
(define_expand "cmp"
  [(set (reg:CC CC_REGNUM)
        (compare (match_operand:INT 0 "register_operand" "")
                 (match_operand:INT 1 "nonmemory_operand" "")))]
  ""
  "
{
  compare_op0 = operands[0];
  compare_op1 = operands[1];
  DONE;
}"
)
(define_expand "b"
  [(set (reg:CC CC_REGNUM)
        (compare:CC (match_dup 1)
                    (match_dup 2)))
   (set (pc)
        (if_then_else (comp_op:CC (reg:CC CC_REGNUM) (const_int 0))
                      (label_ref (match_operand 0 "" ""))
                      (pc)))]
  ""
  "{
  operands[1] = compare_op0;
  operands[2] = compare_op1;
  if (CONSTANT_P (operands[2])
      && ( == LTU || == GTU || == LEU || == GEU))
    operands[2] = force_reg (GET_MODE (operands[1]), operands[2]);
  operands[3] = gen_rtx_fmt_ee (, CCmode, gen_rtx_REG (CCmode,CC_REGNUM), const0_rtx);
  emit_jump_insn (gen_compare_and_branch_insn (operands[0], operands[1], operands[2], operands[3]));
  DONE;
}"
)
(define_insn_and_split "compare_and_branch_insn"
  [(set (pc)
        (if_then_else (match_operator:CC 3 "comparison_operator"
                        [(match_operand 1 "register_operand" "d,d,a,a,d,t,k,t")
                         (match_operand 2 "nonmemory_operand" "J,L,J,L,d,t,t,k")])
                      (label_ref (match_operand 0 "" ""))
                      (pc)))]
  "!unsigned_immediate_compare_p (GET_CODE (operands[3]), operands[2])"
  "#"
  "reload_completed"
  [(set (reg:CC CC_REGNUM)
        (match_op_dup:CC 3 [(match_dup 1) (match_dup 2)]))
   (set (pc)
        (if_then_else (eq (reg:CC CC_REGNUM) (const_int 0))
                      (label_ref (match_dup 0))
                      (pc)))]
  "{
  if (expand_compare_insn (operands, 0))
    DONE;
}"
)
In the function expand_compare_insn I assert that operands[2] is not an immediate value if the comparison is unsigned, and I am getting an assertion failure there. The problem is that the reload pass replaces operands[2] with its equiv_constant, which breaks the pattern after reload.
Before reload pass:
(jump_insn 58 56 59 10 20070129.c:73 (set (pc)
        (if_then_else (leu:CC (reg:QI 84)
                (reg:QI 91))
            (label_ref 87)
            (pc))) 77 {compare_and_branch_insn} (expr_list:REG_DEAD (reg:QI 84)
        (expr_list:REG_BR_PROB (const_int 200 [0xc8])
            (nil))))
After reload pass:
(jump_insn 58 56 59 10 20070129.c:73 (set (pc)
        (if_then_else (leu:CC (reg:QI 17 r1 [84])
                (const_int 1 [0x1]))
            (label_ref 87)
            (pc))) 77 {compare_and_branch_insn} (expr_list:REG_BR_PROB (const_int 200 [0xc8])
        (nil)))
How can I overcome this error? Thanks for your help. Regards, Shafi
Segmentation fault when calling a library fun - GCC bug?
I am doing a port for a 32-bit target in GCC 4.4.0. I am getting a segmentation fault in the function assign_temp at the following line:
    if (DECL_P (type_or_decl))
After analyzing the issue I think this might be a bug, and I want to confirm whether that is the case. To reproduce it, I think the target should have the following properties:
a. Only two 32-bit registers are available as argument registers.
b. The second 64-bit value is pushed on the stack.
c. ACCUMULATE_OUTGOING_ARGS is set.
d. STRICT_ALIGNMENT is set.
e. PARM_BOUNDARY is 32.
When there is a library call for an operation that takes two 64-bit arguments, say division of two long long values (__divdi3), the following sequence happens:
    emit_library_call_value -> emit_library_call_value_1 -> emit_push_insn -> assign_temp
emit_push_insn is called because the second argument is pushed on the stack and ACCUMULATE_OUTGOING_ARGS is set. assign_temp is called because
    STRICT_ALIGNMENT && PARM_BOUNDARY < GET_MODE_ALIGNMENT (DImode)
is true. Can somebody please confirm whether this is due to some mistake in my port or a GCC bug? Thanks, Shafi
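For reference, the condition quoted in this report can be evaluated with the values listed above. This is a stand-alone C model with a hypothetical function name; the 64-bit alignment used for DImode in the note below is an assumption (the message only implies that it is larger than 32).

```c
#include <stdbool.h>

/* Model of: STRICT_ALIGNMENT && PARM_BOUNDARY < GET_MODE_ALIGNMENT (mode).
   All three inputs are passed in as plain values, in bits. */
bool needs_aligned_temp(bool strict_alignment,
                        unsigned parm_boundary,
                        unsigned mode_alignment)
{
    return strict_alignment && parm_boundary < mode_alignment;
}
```

With STRICT_ALIGNMENT set, PARM_BOUNDARY at 32 and an assumed DImode alignment of 64, the predicate is true, which is the case the report says leads into assign_temp; raising the parameter boundary to 64 or clearing strict alignment makes it false.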
Reload going wrong for addition.
Hello all, I am doing a port for a 32-bit target for GCC 4.4.0. I am getting the following error:
rd_er.c:19: error: insn does not satisfy its constraints:
(insn 5 35 34 2 rd_er.c:8 (set (reg:SI 16 r0)
        (plus:SI (reg:SI 16 r0)
            (reg:SI 2 d2))) 57 {addsi3} (expr_list:REG_EQUAL (plus:SI (reg/f:SI 49 sp)
            (const_int -65544 [0xfffefff8]))
        (nil)))
My target has 16 data registers and 16 address registers, all 32 bits wide. The target also has a dedicated stack pointer. No move operation is possible between SP and the data registers, and there is no provision for addition between data and address registers. R7 is used as the Frame Pointer.
Pattern for addition:
(define_insn "add3"
  [(set (match_operand:INT 0 "register_operand" "=d, t, k, a, a, t, k, t, d")
        (plus:INT (match_operand:INT 1 "register_operand" "0, 0, 0, t, k, 0, 0, 0, 0")
                  (match_operand:INT 2 "nonmemory_operand" "J, J, J, L, L, t, t, k, d")))]
The constraints used are:
;;d - Data registers [D0 - D15]
;;a - Address registers [R0 - R15]
;;t - Address and Index registers
;;k - Stack Pointer
;;J - Unsigned 5bit immediate
;;L - Signed 16bit immediate
Since there is no move operation between SP and the data registers, I have specified 12 as the register_move_cost between them. I also return the address register class from preferred_reload_class when the rtx is SP.
Before the IRA pass:
(insn 5 2 12 2 rd_er.c:8 (set (reg/v/f:SI 60 [ bufptr ])
        (reg/f:SI 23 r7)) 43 {*movsi_internal} (nil))
Input for the reload pass:
(insn 5 2 12 2 rd_er.c:8 (set (reg/v/f:SI 7 d7 [orig:60 bufptr ] [60])
        (plus:SI (reg/f:SI 49 sp)
            (const_int -65536 [0x]))) 57 {addsi3} (expr_list:REG_EQUAL (plus:SI (reg/f:SI 49 sp)
            (const_int -65536 [0x]))
        (nil)))
After IRA:
Reloads for insn # 5
Reload 0: reload_in (SI) = (reg/f:SI 49 sp)
        reload_out (SI) = (reg/v/f:SI 7 d7 [orig:60 bufptr ] [60])
        HIGH_OR_LOW, RELOAD_OTHER (opnum = 0)
        reload_in_reg: (reg/f:SI 49 sp)
        reload_out_reg: (reg/v/f:SI 7 d7 [orig:60 bufptr ] [60])
        reload_reg_rtx: (reg:SI 16 r0)
Reload 1: reload_in (SI) = (const_int -65544 [0xfffefff8])
        DALU_REGS, RELOAD_FOR_INPUT (opnum = 2)
        reload_in_reg: (const_int -65544 [0xfffefff8])
        reload_reg_rtx: (reg:SI 2 d2)
(insn 5 35 34 2 rd_er.c:8 (set (reg:SI 16 r0)
        (plus:SI (reg:SI 16 r0)
            (reg:SI 2 d2))) 57 {addsi3} (expr_list:REG_EQUAL (plus:SI (reg/f:SI 49 sp)
            (const_int -65544 [0xfffefff8]))
        (nil)))
The reload pass chooses the final alternative as the goal for reloading. Since the input instruction already has a data register as the destination, the constraint combination (t, 0, t) loses to (d, 0, d), because the latter requires the least amount of copying for constraint matching (or so the reload pass believes). There are cases where reload fixes the add pattern: when either the destination is an address register or no stack pointer is involved. Otherwise I get this ICE. I am not sure how to overcome this. I hope someone can suggest a solution. Regards, Shafi
P.S. Can I have a commutative operation for the constraint combination (t, 0, t), i.e. (t, %0, t)? If so, what will the output template be?
define_memory_constraint and REG_OK_STRICT
Hello all, I am doing a port for a 32-bit target in GCC 4.4.0. I have defined memory constraints in predicates.c like this:
(define_memory_constraint "Sr0"
  "Memory reference through base registers"
  (match_test "target_mem_constraint (\"r0\", op)"))
In the function target_mem_constraint I have
int
target_mem_constraint (const char *str, rtx op)
{
  char c0 = str[0];
  char c1 = str[1];
  rtx op0 = XEXP (op, 0);
  bool strict = reload_completed;
  if (!MEM_P (op))
    return 0;
  switch (c0)
    {
    case 'r':
      return (!STACK_REG_RTX_P (op0) && BASE_REG_RTX_P (op0, strict));
    ... ...
My question is: is my definition of strict correct, or should it be reload_in_progress || reload_completed? Regards, Shafi
Re: define_memory_constraint and REG_OK_STRICT
2009/9/30 Richard Henderson :
> On 09/29/2009 07:32 AM, Mohamed Shafi wrote:
>>
>> My question is my definition of strict correct?
>> or should it be reload_in_progress || reload_completed?
>
> I'm tempted to say it should be the later, but I'm not sure it really makes
> any difference since reload does not query the operand predicates; it only
> queries the operand constraints.
This is a memory_constraint. The memory constraint allows an address based on the definition of the bool variable strict.
>
> And even that said, neither the ARM or IA64 ports do anything with strict at
> all, which suggests that you may not have to either. It's possible that
> this works because after reload we verify an instruction with both
> predicates and constraints.
>
> Is this question in response to a particular problem, or just trying to
> avoid possible problems?
>
Both, I guess. My pattern for DI:
(define_insn "*mov_internal"
  [(set (match_operand:DI_DF 0 "nonimmediate_operand" "=d,t,t,d,t,d,Se0,d,Ss0,e,Ss1,b,Sr0,c,Se0,b,Sb1,b,Sb2,b,Sd2,b,Sd3,e")
        (match_operand:DI_DF 1 "general_operand" " i,i,t,t,d,d,d,Se0,e,Ss0,b,Ss1,c,Sr0,b,Se0,b,Sb1,b,Sb2,b,Sd2,e,Sd3"))]
post_inc and post_dec are allowed only by the constraint 'Se0'. The reload pass was not choosing this alternative for the following pattern:
(insn 103 102 53 4 ch_addr.c:11 (set (mem:DF (post_inc:SI (reg:SI 90 [ __ivtmp_22 ])) [0 S8 A64])
        (subreg:DF (reg:DI 116 [+4 ]) 0)) 40 {*movdf_internal} (expr_list:REG_DEAD (reg:DI 116 [+4 ])
        (expr_list:REG_INC (reg:SI 90 [ __ivtmp_22 ])
            (nil))))
because in the memory constraint function I have
int
target_mem_constraint (const char *str, rtx op)
{
  char c0 = str[0];
  char c1 = str[1];
  rtx op0 = XEXP (op, 0);
  bool strict = (reload_completed || reload_in_progress);
  if (!MEM_P (op))
    return 0;
  switch (c0)
    {
    case 'r':
      return (!STACK_REG_RTX_P (op0) && BASE_REG_RTX_P (op0, strict));
    case 'e':
      if (GET_CODE (op0) == POST_INC || GET_CODE (op0) == POST_DEC)
        return (!STACK_REG_RTX_P (XEXP (op0, 0))
                && BASE_REG_RTX_P (XEXP (op0, 0), strict));
    ... ...
So the alternative was getting rejected because of my definition of strict, which results in an ICE later. Since only a few testsuite cases failed, I have to guess that reload was fixing up other, similar cases and thus perhaps generating unoptimized code. So what should the definition be?
    bool strict = (reload_completed || reload_in_progress);
or
    bool strict = reload_completed ? true : false;
Regards, Shafi
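The effect of the strict flag can be modelled in a few lines of C. This is a sketch of the usual strict/non-strict distinction, not the port's BASE_REG_RTX_P: the register number ranges below (address registers 16 to 31, pseudos starting at 50) are assumptions chosen to be consistent with the dumps in these threads, and the function name is hypothetical.

```c
#include <stdbool.h>

/* Assumed numbering, for illustration only. */
#define MODEL_FIRST_ADDR_REG  16u   /* r0 appears as hard register 16 */
#define MODEL_LAST_ADDR_REG   31u
#define MODEL_FIRST_PSEUDO    50u   /* assumed; sp appears as 49 */

/* Non-strict checking accepts a pseudo register because it may still be
   allocated to a base register; strict checking accepts only hard
   registers that already are base registers. */
bool model_base_reg_ok(unsigned regno, bool strict)
{
    bool hard_base = regno >= MODEL_FIRST_ADDR_REG
                     && regno <= MODEL_LAST_ADDR_REG;

    if (hard_base)
        return true;
    return !strict && regno >= MODEL_FIRST_PSEUDO;
}
```

Under this model, pseudo register 90 from the insn above passes the non-strict check and fails the strict one, which is consistent with the alternative being rejected when strict is already true while reload is in progress.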
Re: define_memory_constraint and REG_OK_STRICT
2009/9/30 Richard Henderson :
> On 09/29/2009 09:46 PM, Mohamed Shafi wrote:
>>
>> bool strict = reload_completed ? true : false;
>
> What happens if you set "strict = false" here?
> That's what ARM does.
That particular case works, and yes, ARM does it that way, but there are other targets, such as s390, that use (reload_completed || reload_in_progress). That is why I had to ask whether my definition of strict is proper or not. I am not sure which one to use. Shafi
Re: Reload going wrong for addition.
2009/9/28 Richard Henderson :
> On 09/28/2009 07:25 AM, Mohamed Shafi wrote:
>>
>> Hope someone suggests me a solution.
>
> The solution is almost certainly something involving the
> TARGET_SECONDARY_RELOAD hook. You need to inform reload that it's going to
> need some scratch registers in order to perform the operation.
>
> It's been a long time since I had to fiddle with this sort of thing, so I
> forget all the details involved. Perhaps someone else has some additional
> advice.
>
OK, what I did was to remove the code from the preferred_reload_class function, so that it now simply returns class, i.e.
    #define PREFERRED_RELOAD_CLASS(class, x) class
In TARGET_SECONDARY_RELOAD I added code to request a scratch register for the move operation. Now things are working. So I guess I should ask why we have PREFERRED_RELOAD_CLASS when we can do the same with TARGET_SECONDARY_RELOAD? Shafi
Re: How to split 40bit data types load/store?
2009/9/14 Richard Henderson :
> On 09/14/2009 07:24 AM, Mohamed Shafi wrote:
>>
>> Hello all,
>>
>> I am doing a port for a 32bit target in GCC 4.4.0. I have to support a
>> 40bit data (_Accum) in the port. The target has 40bit registers which
>> is a GPR and works as 32bit reg in other modes. The load and store for
>> _Accum happens in two step. The lower 32bit in one instruction and the
>> upper 8bit in the next instruction. I want to split the instruction
>> after reload. I tired to have a pattern (for load) like this:
>>
>> (define_insn "fn_load_ext_sa"
>>   [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
>>                    UNSPEC_FN_EXT)
>>         (match_operand:SA 1 "memory_operand" ""))]
>>
>> (define_insn "fn_load_sa"
>>   [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
>>                    UNSPEC_FN)
>>         (match_operand:SA 1 "memory_operand" ""))]
>
> Unspec on the left-hand-side isn't something that's supposed to happen, and
> is more than likely the cause of your problems. Try moving the unspec to
> the right-hand-side like:
>
>   (set (reg:SI reg) (mem:SI addr))
>
>   (set (reg:SA reg)
>        (unspec:SA [(reg:SI reg) (mem:QI addr)]
>                   UNSPEC_ACCUM_INSERT))
>
> and
>
>   (set (mem:SI addr) (reg:SI reg))
>
>   (set (mem:QI addr)
>        (unspec:QI [(reg:SA reg)]
>                   UNSPEC_ACCUM_EXTRACT))
>
> Note that after reload it's perfectly acceptable for a hard register to
> appear with the different SI and SAmodes.
>
> It's probably not too hard to define this with zero_extract sequences
> instead of unspecs, but given that these only appear after reload, it may
> not be worth the effort.
>
I was able to implement this with unspecs. But now it seems that I need to split the pattern before reload as well, so I am thinking of removing this and doing the split before reload. The issue is that there is no register-indirect addressing mode for accessing the upper eight bits of the 40-bit register. The only addressing mode supported for accessing this section is (SP+offset). So what I thought was to allow this addressing mode and fix it up at reload time, in a secondary reload, with the help of a scratch register and a scratch memory. But it seems that in GCC it is not possible to have both a scratch memory and a scratch register for the same operation. Am I right? So what I did was to implement this at the define_expand stage itself. The idea is to generate the following sequence. For
    load (R0), D0
generate
    load (R0), D0               // 32bit mode, SAmode move
    load (R0+4), scratch_reg    // 32bit mode, SAmode
    store scratch_reg, (SP+off) // 32bit mode, SAmode
    load.ext (SP+off), D0.u8
and similarly for store. Here are the patterns that I used for this purpose:
(define_expand "movda"
  [(set (match_operand:DA 0 "nonimmediate_operand" "")
        (match_operand:DA 1 "nonimmediate_operand" ""))]
  ""
  "{
  if (MEM_P (operands[1])
      && REG_P (XEXP (operands[1], 0))
      && XEXP (operands[1], 0) != virtual_stack_vars_rtx)
    {
      rtx lo_half, hi_half;
      rtx scratch_mem, scratch_reg, subreg;
      gcc_assert (can_create_pseudo_p ());
      scratch_reg = gen_reg_rtx (SAmode);
      scratch_mem = assign_stack_temp (SAmode, GET_MODE_SIZE (SAmode), 0);
      subreg = gen_rtx_SUBREG (SAmode, operands[0], 0);
      lo_half = adjust_address (operands[1], SAmode, 0);
      hi_half = adjust_address (operands[1], SAmode, 4);
      emit_insn (gen_rtx_SET (SAmode, subreg, lo_half));
      emit_insn (gen_rtx_SET (SAmode, scratch_reg, hi_half));
      emit_insn (gen_rtx_SET (SAmode, scratch_mem, scratch_reg));
      emit_insn (gen_load_reg_ext (operands[0], scratch_mem));
      DONE;
    }
  /* and similarly for store operation */
}"
)
(define_insn "load_reg_ext"
  [(set (subreg:SA (zero_extract:DA (match_operand:DA 0 "register_operand" "=d")
                                    (const_int 8)
                                    (const_int 24)) 4)
        (match_operand:SA 1 "memory_operand" "Sd3"))]
(define_insn "store_reg_ext"
  [(set (match_operand:SA 0 "memory_operand" "=Sd3")
        (zero_extract:SA (match_operand:DA 1 "register_operand" "d")
                         (const_int 8)
                         (const_int 24)))]
(define_insn "*movsa_internal"
  [(set (match_operand:SA 0 "nonimmediate_operand" "=m,d,d")
        (match_operand:SA 1 "nonimmediate_operand" "d,m,d"))]
By default -fomit-frame-pointer is passed to the compiler. Without optimization the compiler generates the expected output, but with optimization that is not the case. It seems that the patterns I have written above are not proper. For a simple function like the following
_Accum foo (_Accum *a)
{
  _Accum b = *a;
  return b;
}
with optimization enabled the compiler generates only
    load (R0), D0               // 32bit mode, SAmode move
which is the first instruction of the expected four-instruction sequence. How can I write the patterns properly? Regards, Shafi
How to support 40bit GP register
Hi all, I am porting GCC 4.4.0 for a 32-bit target. The target has 40-bit data registers and 32-bit address registers that can be used as general-purpose registers. When the 40-bit registers are used for arithmetic or comparison operations, GCC generates code assuming that they are 32-bit registers. Whenever there is a move from an address register to a data register, sign extension is performed automatically by the target. Since the data registers are 40 bits wide, after some operations a sign/zero extension has to be performed for the result to be correct. Take the following test case for example:
typedef struct
{
  char b0; char b1; char b2; char b3; char b4; char b5;
} __attribute__ ((packed)) b_struct;
typedef struct
{
  short a;
  long b;
  short c;
  short d;
  b_struct e;
} __attribute__ ((packed)) a_struct;
int main(void)
{
  volatile a_struct *a;
  volatile a_struct b;
  a = &b;
  *a = (a_struct){1,2,3,4};
  a->e.b4 = 'c';
  if (a->b != 2)
    abort ();
  exit (0);
}
For accessing a->b GCC generates the following code:
    move.l (sp-16), d3
    lsrr.l #<16, d3
    move.l (sp-12),d2
    asll #<16,d2
    or d3,d2
    cmpeq.w #<2,d2
    jf _L2
Because data registers are 40 bits wide, for the 'asll' operation either the shift count should be 16+8 or there should be a sign extension from 32 bits to 40 bits after the 'or' operation. The target has an instruction to sign-extend from 32 bits to 40 bits. Similarly, there are other operations that require sign/zero extension. So is there any way to tell GCC that the data registers are 40 bits wide, and thereby have it generate the sign/zero extensions accordingly? Regards, Shafi
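To make the problem concrete, here is a small C model of a 40-bit data register held in a 64-bit integer. The helper names are hypothetical, and the model covers only the two operations discussed in the message: a left shift performed on all 40 bits and a sign extension from 32 to 40 bits.

```c
#include <stdint.h>

#define REG40_MASK 0xFFFFFFFFFFull   /* low 40 bits */

/* Left shift as a 40-bit register would perform it: bits shifted past
   bit 31 stay in b32..b39 instead of being discarded. */
uint64_t shl40(uint64_t reg, unsigned n)
{
    return (reg << n) & REG40_MASK;
}

/* Sign-extend the low 32 bits into b32..b39, replacing whatever the
   extension byte held before. */
uint64_t sext32_to_40(uint64_t reg)
{
    uint64_t low = reg & 0xFFFFFFFFull;

    if (low & 0x80000000ull)
        low |= 0xFF00000000ull;
    return low;
}
```

Shifting the value 0x12340000 left by 16 in this model leaves 0x34 in the extension byte, so a 40-bit comparison sees a nonzero register even though the 32-bit result is zero; applying the sign extension afterwards restores the value a 32-bit view would expect.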
Typo in internals
Hi, the internals documentation says:
-- Target Hook: bool TARGET_CAN_INLINE_P (tree caller, tree callee)
This target hook returns false if the caller function cannot inline callee, based on target specific information. By default, inlining is not allowed if the callee function has function specific target options and the caller does not use the same options.
But looking at the sources, I think this really should have been TARGET_OPTION_CAN_INLINE_P. Shafi.
Re: Supporting FP cmp lib routines
2009/9/14 Richard Henderson :
> Another thing to look at, since you have hand-written routines and may be
> able to specify that e.g. only a subset of the normal call clobbered
> registers are actually modified, is to leave the call as a "compare" insn.
> Something like
>
> (define_insn "*cmpsf"
>   [(set (reg:CC status-reg)
>         (compare:CC
>           (match_operand:SF 0 "register_operand" "R0")
>           (match_operand:SF 1 "register_operand" "R1")))
>    (clobber (reg:SI r2))
>    (clobber (reg:SI r3))]
>   ""
>   "call __compareSF"
>   [(set_attr "type" "call")])
>
> Where the R0 and R1 constraints resolve to the input registers for the
> routine. Depending on your ISA and ABI, you may not even need to split this
> pattern post-reload.
>
I have implemented the above solution and it works. I have to support the same for DF as well, but with DF I have a problem with the constraints. My target generates code for both big and little endian. The ABI specifies that when a 64-bit value is passed as an argument, it is passed in R6 and R7, with R6 containing the most significant long word and R7 containing the least significant long word, regardless of the endianness mode. How can I do this in the DF compare pattern? Regards, Shafi
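The ABI rule quoted in this message can be illustrated in plain C. This sketch assumes a 64-bit IEEE 754 `double`; the function name and the use of two output pointers to stand for R6 and R7 are illustrative assumptions, not part of the port.

```c
#include <stdint.h>
#include <string.h>

/* Split a double into the two 32-bit words the ABI describes:
   *r6 receives the most significant word, *r7 the least significant.
   The split is done on the integer value, so the result does not
   depend on the byte order of the machine running this code. */
void split_df(double value, uint32_t *r6, uint32_t *r7)
{
    uint64_t bits;

    memcpy(&bits, &value, sizeof bits);
    *r6 = (uint32_t)(bits >> 32);
    *r7 = (uint32_t)(bits & 0xFFFFFFFFu);
}
```

The point of the sketch is that the register assignment is defined by significance, not by memory order, which is why the same pair (R6 high, R7 low) has to be described to the compiler in both endianness modes.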
IRA is not looking into the predicates ?
Hi, I am doing a port for a 32-bit target in GCC 4.4.0. The target does not support symbolic addresses in QImode for load operations. To handle this, in the define_expand for movqi I reject symbolic addresses if they appear in the source operand, and I have also written a predicate for *movqi_internal to reject such cases. But I get the following ICE:
insn does not satisfy its constraints:
(insn 24 5 6 2 ice4.c:4 (set (reg:QI 17 r1)
        (mem/c/i:QI (symbol_ref:SI ("s") [flags 0x2] ) [0 s+0 S1 A32])) 0 {*movqi_internal} (nil))
From ice4.c.172r.ira:
(insn 24 5 6 2 ice4.c:4 (set (reg:QI 17 r1)
        (mem/c/i:QI (symbol_ref:SI ("s") [flags 0x2] ) [0 s+0 S1 A32])) 0 {*movqi_internal} (nil))
(insn 6 24 7 2 ice4.c:4 (set (reg:QI 16 r0 [62])
        (plus:QI (reg:QI 17 r1)
            (const_int -100 [0xff9c]))) 16 {addqi3} (nil))
From ice4.c.168r.asmcons:
(insn 5 2 6 2 ice4.c:4 (set (reg:SI 61 [ s ])
        (mem/c/i:SI (symbol_ref:SI ("s") [flags 0x2] ) [0 s+0 S4 A32])) 2 {*movsi_internal} (nil))
(insn 6 5 7 2 ice4.c:4 (set (reg:QI 62)
        (plus:QI (subreg:QI (reg:SI 61 [ s ]) 0)
            (const_int -100 [0xff9c]))) 16 {addqi3} (expr_list:REG_DEAD (reg:SI 61 [ s ])
        (nil)))
How can I prevent this ICE? Regards, Shafi
Re: How to support 40bit GP register
2009/10/22 Richard Henderson :
> On 10/21/2009 07:25 AM, Mohamed Shafi wrote:
>>
>> For accessing a->b GCC generates the following code:
>>
>> move.l (sp-16), d3
>> lsrr.l #<16, d3
>> move.l (sp-12),d2
>> asll #<16,d2
>> or d3,d2
>> cmpeq.w #<2,d2
>> jf _L2
>>
>> Because data registers are 40 bit for 'asll' operation the shift count
>> should be 16+8 or there should be sign extension from 32bit to 40 bits
>> after the 'or' operation. The target has instruction to sign extend
>> from 32bit to 40 bit.
>>
>> Similarly there are other operation that requires sign/zero extension.
>> So is there any way to tell GCC that the data registers are 40bit and
>> there by expect it to generate sign/zero extension accordingly ?
>
> Define a machine mode for your 40-bit type in cpu-modes.def. Depending on
> how your 40-bit type is stored in memory, you'll use either
>
> INT_MODE (RI, 5) // load-store uses exactly 5 bytes
> FRACTIONAL_INT_MODE (RI, 40, 8) // load-store uses 8 bytes
>
Richard, thanks for the reply. Load-store uses 32 bits. Sign extension happens automatically. So I have chosen INT_MODE (RI, 5), copied movsi and renamed it to movri. I have also specified that RImode needs only one register.
> Where I've arbitrarily chosen "RImode" as a mnemonic for Register Integral
> Mode. Now you define arithmetic operations, as needed, on
> RImode. You define the "extendsiri" pattern to be that sign-extend from
> 32-to-40-bit instruction. You define your comparison patterns on RImode,
> and not on SImode, since your comparison instruction works on the entire 40
> bits.
I have defined the extendsiri and cbranchri4 patterns. When I compile a program like
unsigned long xh = 1;
int main ()
{
  unsigned long yh = 0xull;
  unsigned long z = xh * yh;
  if (z != yh)
    abort ();
  return 0;
}
I get the following ICE:
internal compiler error: in immed_double_const, at emit-rtl.c:553
This happens when cse_insn () calls insert () -> gen_lowpart -> gen_lowpart_common -> simplify_gen_subreg -> simplify_immed_subreg. simplify_immed_subreg is called with the parameters (outermode=RImode, (const_int 65535), innermode=DImode, byte=0). cse_insn is called for the following insn:
(insn 10 9 11 3 bug7.c:14 (set (reg:RI 67)
        (const_int 65535 [0x])) 4 {movri} (nil))
How can I overcome this? Regards, Shafi
>
> You'll wind up with a selection of patterns in your machine description that
> have a sign-extension pattern built in, depending on the exact behaviour of
> your ISA. There are plenty of examples on x86_64, mips64, and Alpha (to
> name a few) that have similar properties with SI and DImodes. Examine the
> -fdump-rtl-combine-details dump for exemplars of the canonical forms that
> the combiner creates when it tries to merge sign-extension instructions into
> preceeding patterns.
>
Re: IRA is not looking into the predicates ?
2009/10/30 Jeff Law :
> On 10/30/09 07:13, Mohamed Shafi wrote:
>>
>> Hi,
>>
>> I am doing a port for a 32bit target in GCC 4.4.0. The target does not
>> have support for symbolic address in QImode for load operations.
>
> You'll need to make sure to reject such addresses for QImode in
> GO_IF_LEGITIMATE_ADDRESS.
>
>> In
>> order to do this what i have done is in define_expand for movqi
>> reject symbolic address if they come in source operands and i have
>> also written a predicate for *movqi_internal to reject such cases.
>
> OK. Nothing wrong with these steps. Though you really need to make sure
> GO_IF_LEGITIMATE_ADDRESS is defined correctly.
>
> IRA doesn't look at operand predicates or insn conditions. It assumes that
> any insns are valid assuming any pseudo registers appearing in the insn get
> suitable hard registers.
>
> Based on the dumps you provided it appears that reg61 does not get a hard
> register and reload is generating the problematical insn #24. This is a
> good indication that your GO_IF_LEGITIMATE_ADDRESS is incorrectly
> implemented.

In the GO_IF_LEGITIMATE_ADDRESS macro I am allowing this address because the target supports symbolic addresses in QImode for store operations, and the macro has no way to check whether the address is being used in a load or in a store. That is why, in the define_expand for movqi, I reject symbolic addresses when they appear in source operands, and why I have a predicate on *movqi_internal to reject such cases. But I am still getting the ICE. IIRC control does not reach TARGET_SECONDARY_RELOAD either. How can I overcome this?

Regards, Shafi
Re: IRA is not looking into the predicates ?
2009/10/30 Ian Lance Taylor :
> Mohamed Shafi writes:
>
>> From ice4.c.168r.asmcons
>>
>> (insn 5 2 6 2 ice4.c:4 (set (reg:SI 61 [ s ])
>>         (mem/c/i:SI (symbol_ref:SI ("s") [flags 0x2] <var_decl 0xb7bfd000 s>) [0 s+0 S4 A32])) 2 {*movsi_internal} (nil))
>>
>> (insn 6 5 7 2 ice4.c:4 (set (reg:QI 62)
>>         (plus:QI (subreg:QI (reg:SI 61 [ s ]) 0)
>>             (const_int -100 [0xff9c]))) 16 {addqi3}
>>      (expr_list:REG_DEAD (reg:SI 61 [ s ])
>>         (nil)))
>>
>> How can i prevent this ICE ?
>
> If asmcons is the first place that this appears, then I think it must
> be coming from some asm statement. So the first step would be to look
> at the asm statement and see if it can be rewritten using a different
> constraint.

No, this appears from RTL expand onwards.

Shafi
Re: How to write shift and add pattern?
2009/11/6 Richard Henderson :
> On 11/06/2009 05:29 AM, Mohamed Shafi wrote:
>>
>> The target that i am working on has 1 & 2 bit shift-add patterns.
>> GCC is not generating shift-add patterns when the shift count is 1. It
>> is currently generating add operations. What should be done to
>> generate shift-add pattern instead of add-add pattern?
>
> I'm not sure. You may have to resort to matching
>
> (set (match_operand 0 "register_operand" "")
>      (plus (plus (match_operand 1 "register_operand" "")
>                  (match_dup 1))
>            (match_operand 2 "register_operand" "")))
>
> But you should debug make_compound_operation first to
> figure out what's going on for your port, because it's
> working for x86_64:
>
> long foo(long a, long b) { return a*2 + b; }
>
> leaq (%rsi,%rdi,2), %rax # 8 *lea_2_rex64
> ret # 26 return_internal
>
> r~

I have fixed this. The culprit was the cost factor: I added the case in targetm.rtx_costs and now it works properly. But I am having issues with reload.

Regards, Shafi
Re: How to write shift and add pattern?
2009/11/6 Ian Lance Taylor :
> Mohamed Shafi writes:
>
>> It is generating with data registers. Here is the pattern that i have
>> written:
>>
>> (define_insn "*saddl"
>>   [(set (match_operand:SI 0 "register_operand" "=r,d")
>>         (plus:SI (mult:SI (match_operand:SI 1 "register_operand" "r,d")
>>                           (match_operand:SI 2 "const24_operand" "J,J"))
>>                  (match_operand:SI 3 "register_operand" "0,0")))]
>>
>> How can i do this. Will the constraint modifiers '?' or '!' help?
>> How can make GCC generate shift and add sequence when the shift count is 1?
>
> Does 'd' represent a data register? I assume that 'r' is a general
> register, as it always is. What is the constraint character for an
> address register? You don't seem to have an alternative here for
> address registers, so I'm not surprised that the compiler isn't
> picking it. No doubt I misunderstand something.

OK, the constraint for the address registers is 'a'. That was a typo in the pattern that I gave here. The proper pattern is

(define_insn "*saddl"
  [(set (match_operand:SI 0 "register_operand" "=a,d")
        (plus:SI (mult:SI (match_operand:SI 1 "register_operand" "a,d")
                          (match_operand:SI 2 "const24_operand" "J,J"))
                 (match_operand:SI 3 "register_operand" "0,0")))]

So how can I make GCC choose the address registers over the data registers when that is more profitable?

Regards, Shafi
Re: How to support 40bit GP register
2009/10/22 Richard Henderson : > On 10/21/2009 07:25 AM, Mohamed Shafi wrote: >> >> For accessing a->b GCC generates the following code: >> >> move.l (sp-16), d3 >> lsrr.l #<16, d3 >> move.l (sp-12),d2 >> asll #<16,d2 >> or d3,d2 >> cmpeq.w #<2,d2 >> jf _L2 >> >> Because data registers are 40 bit for 'asll' operation the shift count >> should be 16+8 or there should be sign extension from 32bit to 40 bits >> after the 'or' operation. The target has instruction to sign extend >> from 32bit to 40 bit. >> >> Similarly there are other operation that requires sign/zero extension. >> So is there any way to tell GCC that the data registers are 40bit and >> there by expect it to generate sign/zero extension accordingly ? > > Define a machine mode for your 40-bit type in cpu-modes.def. Depending on > how your 40-bit type is stored in memory, you'll use either > > INT_MODE (RI, 5) // load-store uses exactly 5 bytes > FRACTIONAL_INT_MODE (RI, 40, 8) // load-store uses 8 bytes > > Where I've arbitrarily chosen "RImode" as a mnemonic for Register Integral > Mode. Now you define arithmetic operations, as needed, on > RImode. You define the "extendsiri" pattern to be that sign-extend from > 32-to-40-bit instruction. You define your comparison patterns on RImode, > and not on SImode, since your comparison instruction works on the entire 40 > bits. > > You'll wind up with a selection of patterns in your machine description that > have a sign-extension pattern built in, depending on the exact behaviour of > your ISA. There are plenty of examples on x86_64, mips64, and Alpha (to > name a few) that have similar properties with SI and DImodes. Examine the > -fdump-rtl-combine-details dump for exemplars of the canonical forms that > the combiner creates when it tries to merge sign-extension instructions into > preceeding patterns. > Ok i have comparison patterns written in RImode. 
When you say that I will wind up with a selection of patterns, do you mean that I should have RImode patterns for the operations that work on the full 40 bits, and disable the corresponding SImode patterns? Or do I have to write nameless patterns in RImode for the arithmetic operations and look at the dumps to see how the combiner merges them, so that they can match the comparison operations? Regards, Shafi
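To make the 32-to-40-bit question above concrete, here is a minimal host-side C sketch of what a sign extension into a 40-bit register computes. It is only an illustration, not GCC code: the function names are hypothetical, and the 40-bit register is modelled as the low 40 bits of a 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `from_bits` bits of `value` and return the result
   truncated to `to_bits` bits.  Both widths are parameters of this
   sketch; nothing here is taken from the GCC sources.  */
static uint64_t sign_extend(uint64_t value, unsigned from_bits, unsigned to_bits)
{
    assert(from_bits >= 1 && from_bits <= to_bits && to_bits <= 63);

    uint64_t from_mask = (UINT64_C(1) << from_bits) - 1;
    uint64_t to_mask = (UINT64_C(1) << to_bits) - 1;
    uint64_t sign_bit = UINT64_C(1) << (from_bits - 1);
    uint64_t low = value & from_mask;

    /* XOR/subtract trick: flip the sign bit, then subtract it again, so a
       set sign bit propagates into all higher bits.  */
    uint64_t extended = (low ^ sign_bit) - sign_bit;
    return extended & to_mask;
}

/* What a (hypothetical) 32-to-40-bit extend instruction would produce.  */
static uint64_t extend_si_to_ri(uint32_t x)
{
    return sign_extend(x, 32, 40);
}
```

A negative 32-bit value such as 0xFFFFFFFE therefore occupies all 40 bits after extension, which is why a comparison that looks at the whole register gives a different answer when the extension is missing.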
How to split mulsi3 pattern
Hello all, I am doing a port for a 32bit target in GCC 4.4.0. In my target 32bit multiply instruction is carried out in two instructions. Dn = Da x Db is executed as Dn = (Da.L * Db.H + Da.H * Db.L) << 16 Dn = Dn + (Da.L * Db.L) Currently the pattern that i have for this is as follows: (define_insn "mulsi3" [(set (match_operand:SI 0 "register_operand" "=&d") (mult:SI (match_operand:SI 1 "register_operand" "%d") (match_operand:SI 2 "register_operand" "d")))] I would like to split this pattern into two (either after of before reload). Currently i am doing something like this: (define_insn_and_split "mulsi3" [(set (match_operand:SI 0 "register_operand" "=&d") (mult:SI (match_operand:SI 1 "register_operand" "%d") (match_operand:SI 2 "register_operand" "d")))] "" "#" "reload_completed" [(set (match_dup 0) (ashift:SI (plus:SI (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_LOW) (unspec:HI [(match_dup 1)] UNSPEC_REG_HIGH)) (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_HIGH) (unspec:HI [(match_dup 1)] UNSPEC_REG_LOW))) (const_int 16))) (set (match_dup 0) (plus:SI (match_dup 0) (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_LOW) (unspec:HI [(match_dup 1)] UNSPEC_REG_LOW] "" ) But in few testcases this is creating problems. So i would like to know better patterns to split mulsi3 pattern. Can someone help me out. Regards, Shafi
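The two-instruction multiply described above can be checked on a host with a few lines of C. This is only a sketch of the arithmetic, not of the machine description: the function name is hypothetical, and it assumes that the low halves (Da.L, Db.L) are treated as unsigned 16-bit values, which the original message does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the two-step 32-bit multiply:
     step 1:  Dn = (Da.L * Db.H + Da.H * Db.L) << 16
     step 2:  Dn = Dn + (Da.L * Db.L)
   All arithmetic is modulo 2^32.  */
static uint32_t mul32_two_step(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    /* Step 1: the cross products, shifted into the upper half.  */
    uint32_t dn = (a_lo * b_hi + a_hi * b_lo) << 16;

    /* Step 2: add the product of the two low halves.  */
    dn += a_lo * b_lo;
    return dn;
}

/* Compare the two-step result with the host's native 32-bit multiply.  */
static void check_mul32(uint32_t a, uint32_t b)
{
    assert(mul32_two_step(a, b) == (uint32_t)(a * b));
}
```

The Da.H * Db.H term never appears because it only contributes to bits 32 and above, which a 32-bit result discards.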
Re: How to split mulsi3 pattern
2009/11/10 Richard Henderson :
> On 11/10/2009 05:48 AM, Mohamed Shafi wrote:
>>
>> (define_insn "mulsi3"
>>   [(set (match_operand:SI 0 "register_operand" "=&d")
>>         (mult:SI (match_operand:SI 1 "register_operand" "%d")
>>                  (match_operand:SI 2 "register_operand" "d")))]
>
> Note that "%" is only useful if the constraints for the two operands are
> different (e.g. only one operand accepts an immediate input). When they're
> identical, you simply waste cpu cycles asking reload to try the operands in
> the other order.
>
>> [(set (match_dup 0)
>>       (ashift:SI
>>         (plus:SI (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_LOW)
>>                           (unspec:HI [(match_dup 1)] UNSPEC_REG_HIGH))
>>                  (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_HIGH)
>>                           (unspec:HI [(match_dup 1)] UNSPEC_REG_LOW)))
>>         (const_int 16)))
>>  (set (match_dup 0)
>>       (plus:SI (match_dup 0)
>>                (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_LOW)
>>                         (unspec:HI [(match_dup 1)] UNSPEC_REG_LOW))))]
>
> Well for one, your modes don't match. You actually want your unspecs and
> MULTs to be SImode.
>
> You could probably usefully model the second insn as
>
> (define_insn "mulsi3_part2"
>   [(set (match_operand:SI 0 "register_operand" "=d")
>         (plus:SI
>           (mult:SI (zero_extend:SI
>                      (match_operand:HI 1 "register_operand" "d"))
>                    (zero_extend:SI
>                      (match_operand:HI 2 "register_operand" "d")))
>           (match_operand:SI 3 "register_operand" "0")))]
>   ""
>   ...)

So I would need to change the mode of the register from SI to HI after reload. Is that allowed?

Regards, Shafi
How to support 40bit GP register - Take two
Hello all, I am porting GCC 4.4.0 for a 32bit target. The target has 40bit data registers and 32bit address register. Both can be used as general purpose registers. All load and store operations are 32bit. If 40bit data register is involved in load/sore the register gets sign extended. Whenever there is a move from address register to data register sign extension is automatically performed. Currently GCC generates code for 32bit register target. Since the data register is 40bit after/before some operations sign/zero extension has to be performed for the result to be proper. So at present for the port the results are not proper. I would need a solution to fix this. I had mailed about this previously. You can see about this here http://www.mail-archive.com/gcc@gcc.gnu.org/msg47224.html I tried implementing the suggestion given by Richard, but got into issues. The GCC frame work is written assuming that there are no modes with HOST_BITS_PER_WIDE_INT < GET_MODE_BITSIZE (mode) < 2 * HOST_BITS_PER_WIDE_INT. Moreover i am getting ICEs when there is an optimization/operation related to subreg. (GCC tries to split RImode values).RImode is 5byte and uses SImode load/store instructions. So GCC generates offsets/addresses that are not 32bit aligned. Currently i am hacking the complier all the way to get an executable (though i have not tested the output of the obtained executables) Even if i somehow manage to get proper output there is the issue of using 32bit registers in RImode instructions. RImode values is meant for 40bit register, i.e data register. That means i will not be able to use address registers(32bit registers) in RImode patterns even though the instructions accept them. This will definitely hamper efficiency. So i was wondering if anybody has any alternative solution that i can try. All i can think is to flag an insn for unsigned operation so that i will be able to insert sign/zero extension during say reorg pass. Can this be implemented? How feasible is this? 
Regards, Shafi
Re: How to support 40bit GP register - Take two
2009/12/18 Hans-Peter Nilsson :
> On Fri, 20 Nov 2009, Mohamed Shafi wrote:
>> I tried implementing the suggestion given by Richard, but got into
>> issues. The GCC frame work is written assuming that there are no modes
>> with HOST_BITS_PER_WIDE_INT < GET_MODE_BITSIZE (mode) < 2 *
>> HOST_BITS_PER_WIDE_INT.
>
> (Not seeing a reply regarding this issue, so here's mine, belated:)
>
> Perhaps a wart, but with a 64-bit HOST_BITS_PER_WIDE_INT, would
> that affect your port? It's not? Just set need_64bit_hwint=yes
> in config.gcc. And send a patch for the introductory comment in
> that file, unless your port already matches the "BITS_PER_WORD >
> 32 bits" condition.

Thanks, Hans, for your reply. I have already tried that. What you are suggesting is the first solution that I got from Richard Henderson, and I mentioned the issues I faced with it in my mail: the GCC framework is written assuming that there are no modes with HOST_BITS_PER_WIDE_INT < GET_MODE_BITSIZE (mode) < 2 * HOST_BITS_PER_WIDE_INT, so I had to hack in places to get things working. For my target BITS_PER_WORD == 32. The mode that I am using is RImode (5 bytes).

Regards, Shafi
How to implement patterns with more than 30 alternatives
Hi all, I am doing a port in GCC 4.4.0 for a 32 bit target. As a part of scheduling framework i have to write the move patterns with more clarity, so that i could control the scheduling with the help of attributes. Re-writting the pattern resulted in movsi pattern with 41 alternatives :( When i specify the attributes it seems that all the alternatives above 31 are allocated with the default value of the attribute. This is done in the generated file insn-attrtab.c. The following is one such piece of code: case 2: /* *movsi_internal */ extract_constrain_insn_cached (insn); if (((1 << which_alternative) & 0xf)) { return DELAY_SLOT_TYPE_CLOB_SR; } else if (((1 << which_alternative) & 0x30)) { return DELAY_SLOT_TYPE_RW_SP; } else if (which_alternative == 6) { return DELAY_SLOT_TYPE_CLOB_SR; } else if (((1 << which_alternative) & 0x1fff80)) { return DELAY_SLOT_TYPE_COMMON; } else if (((1 << which_alternative) & 0x1e0)) { return DELAY_SLOT_TYPE_RW_SP; } else if (which_alternative == 25) { return DELAY_SLOT_TYPE_READ_SR; } else if (which_alternative == 26) { return DELAY_SLOT_TYPE_READ_EMR; } else if (which_alternative == 27) { return DELAY_SLOT_TYPE_COMMON; } else if (which_alternative == 28) { return DELAY_SLOT_TYPE_WRITE_SR; } else { return DELAY_SLOT_TYPE_COMMON; } As you can see from the above code all the alternatives which are more that 31 will always get the default value of the attribute. This is because GCC assumes that the target has only 31 alternatives. Even changing the macro #define MAX_RECOG_ALTERNATIVES 30 in the file recog.h there is no change in this assumption. (Which i think should have affected the attribute calulation). I guess that if i make need_64bit_hwint=yes , then this problem should go away. I havent check this. But i dont want to do that, since this means that i will have to change all the dependencies that are affected by this change. Is there any other solution for my problem? Any help is appreciated. Regards, Shafi
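The generated code quoted above tests set membership with `(1 << which_alternative) & mask`, so the number of alternatives it can distinguish is bounded by the width of the mask. The following C sketch only illustrates that limit; the helper names are hypothetical and this is not how genattrtab is implemented. Whether a wider host integer actually removes the limit in GCC 4.4.0 is the poster's guess and is not verified here.

```c
#include <assert.h>
#include <stdint.h>

/* Bit N of `mask` stands for alternative N.  A 64-bit mask can describe
   alternatives 0..63.  */
static int alt_in_set64(int alt, uint64_t mask)
{
    assert(alt >= 0 && alt < 64);
    return (int)((mask >> alt) & 1u);
}

/* The same test with a 32-bit mask.  Alternatives from 32 upwards have no
   bit to live in, so they can never match and always end up in the final
   "else" (default) branch of a chain like the one quoted above.  */
static int alt_in_set32(int alt, uint32_t mask)
{
    if (alt < 0 || alt >= 32)
        return 0;
    return (int)((mask >> alt) & 1u);
}
```

With 41 alternatives, alternative 40 cannot be named by any 32-bit mask, which is consistent with the symptom described: everything past the mask width silently gets the default attribute value.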
Re: How to implement patterns with more than 30 alternatives
2009/12/22 Richard Earnshaw :
>
> On Mon, 2009-12-21 at 18:44 +, Paul Brook wrote:
>> > > > I am doing a port in GCC 4.4.0 for a 32 bit target. As a part of
>> > > > scheduling framework i have to write the move patterns with more
>> > > > clarity, so that i could control the scheduling with the help of
>> > > > attributes. Re-writting the pattern resulted in movsi pattern with 41
>> > > > alternatives :(
>> > >
>> > > Use rtl expressions instead of alternatives. e.g. arm.md:arith_shiftsi
>> >
>> > Or use the more modern iterators approach.
>>
>> Aren't iterators for generating multiple insns (e.g. movsi and movdi) from
>> the same pattern, whereas in this case we have a single insn that needs to
>> accept many different operand combinations?
>
> Yes, but that is often better, I suspect, than having too fancy a
> pattern that breaks the optimization simplifications that genrecog does.
>
> Note that the attributes that were requested could be made part of the
> iterator as well, using a mode_attribute.

I can't find a back end that does this. Can you show me an example?

Regards, Shafi
Question about peephole2 and addressing mode
Hello all, I am doing a port for a 32bit a target in GCC 4.4.0. The target supports (base + offset) addressing mode for QImode store instructions but not for QImode load instructions. GCC doesn't take the middle path. It either supports an addressing mode completely and doesn't support at all. I tried lot of hacks to support (base + offset) addressing mode only for QI mode store instructions. After a lot of fight i finally gave up and removed the QImode support for this addressing mode completely in GO_IF_ LEGITIMATE_ADDRESS macro. Now i am pursing an alternate solution. Have peephole2 patterns to implement QImode (base+offset) addressing mode for store instructions. How does it sound? So now i have written a peephole2 pattern like: (define_peephole2 [(parallel [(set (match_operand:SI 0 "register_operand" "") (plus:SI (match_operand:SI 1 "register_operand" "") (match_operand:SI 2 "const_int_operand" ""))) (clobber (reg:CCC CC_REGNUM)) (clobber (reg:CCO EMR_REGNUM))]) (set (mem:QI (match_dup 0)) (match_operand:QI 3 "register_operand" ""))] "REGNO_OK_FOR_BASE_P (REGNO (operands[1])) && constraint_satisfied_p (operands[2], CONSTRAINT_N)" [(set (mem:QI (plus:SI (match_dup 1) (match_dup 2))) (match_dup 3))] "") In the rtl dumps just before peephole2 pass i get (insn 213 211 215 39 20010408-1.c:71 (parallel [ (set (reg/f:SI 16 r0 [121]) (plus:SI (reg/v/f:SI 18 r2 [orig:93 p ] [93]) (const_int -1 [0x]))) (clobber (reg:CCC 50 sr)) (clobber (reg:CCO 54 emr)) ]) 18 {addsi3} (expr_list:REG_UNUSED (reg:CCO 54 emr) (expr_list:REG_UNUSED (reg:CCC 50 sr) (nil (insn 215 213 214 39 20010408-1.c:71 (set (mem/f/c/i:SI (plus:SI (reg/f:SI 23 r7) (const_int -32 [0xffe0])) [5 s+0 S4 A32]) (reg/v/f:SI 18 r2 [orig:93 p ] [93])) 2 {*movsi_internal} (expr_list:REG_DEAD (reg/v/f:SI 18 r2 [orig:93 p ] [93]) (nil))) (insn 214 215 284 39 20010408-1.c:71 (set (mem:QI (reg/f:SI 16 r0 [121]) [0 S1 A8]) (reg/v:QI 6 d6 [orig:92 ch ] [92])) 0 {*movqi_internal} (expr_list:REG_DEAD (reg/f:SI 16 r0 
[121]) (expr_list:REG_DEAD (reg/v:QI 6 d6 [orig:92 ch ] [92]) (nil This is not match by the peephole2 pattern. After debugging i see that the function 'peephole2_insns' matches only consecutive patterns. Is that true? Is there a way to over come this? Another issue. In another instance peephole2 matched but the generated pattern did not get recognized because GO_IF_ LEGITIMATE_ADDRESS was rejecting the addressing mode. Since peephole2 pass was run after reload i changed GO_IF_ LEGITIMATE_ADDRESS macro to allow the addressing mode after reload is completed. So now the check is something like this: case PLUS: { rtx offset = XEXP (x, 1); rtx base = XEXP (x, 0); if ( !(BASE_REG_RTX_P (base, strict) || STACK_REG_RTX_P (base))) return 0; /* For QImode the target does not suppport (base + offset) address in the load instructions. So we disable this addressing mode till reload is completed. */ if (!reload_completed && mode == QImode && BASE_REG_RTX_P (base, strict)) return 0; I haven't run the testsuite, but Is this ok to have like this? Please let me know your thoughts on this. Thanks for your time. Regards Shafi
Re: Question about peephole2 and addressing mode
2010/1/22 Richard Henderson :
> On 01/21/2010 06:22 AM, Mohamed Shafi wrote:
>>
>> Hello all,
>>
>> I am doing a port for a 32bit a target in GCC 4.4.0. The target
>> supports (base + offset) addressing mode for QImode store instructions
>> but not for QImode load instructions. GCC doesn't take the middle
>> path. It either supports an addressing mode completely and doesn't
>> support at all. I tried lot of hacks to support (base + offset)
>> addressing mode only for QI mode store instructions. After a lot of
>> fight i finally gave up and removed the QImode support for this
>> addressing mode completely in GO_IF_LEGITIMATE_ADDRESS macro. Now i
>> am pursing an alternate solution. Have peephole2 patterns to implement
>> QImode (base+offset) addressing mode for store instructions. How does
>> it sound?
>
> It doesn't sound totally implausible. But as you notice, peepholes only act
> on sequential instructions. In order to assist generating sequential
> instructions, you could try allowing base+offset in non-strict mode.
>
> I fear you'll likely have to use a combination of methods in order to get
> decent code here.

Can you point out the combination of methods you have in mind? And by the way, is it possible to allow the addressing mode completely and then break it down into register-indirect addressing after the reload pass?

Regards, Shafi
How to write legitimize_reload_address
Hi all, I am doing a port of a 32bit target in GCC 4.4.0. I have written the macro legitimize_reload_address which does something similar to what the target pa does. i.e For the PA, transform: memory(X + ) into: if ( & mask) >= 16 Y = ( & ~mask) + mask + 1 Round up. else Y = ( & ~mask) Round down. Z = X + Y memory (Z + ( - Y)); The input for the macro is (plus:SI (reg/f:SI 23 r7) (const_int 65536 [0x1])) and the target support only 15bit signed offset so in legitimize_reload_address i have mask = 0x3fff; offset = INTVAL (XEXP ((addr), 1)); /* Choose rounding direction. Round up if we are >= halfway. */ if ((offset & mask) >= ((mask + 1) / 2)) newoffset = (offset & ~mask) + mask + 1; else newoffset = offset & ~mask; /* Ensure that long displacements are aligned. */ newoffset &= ~(GET_MODE_SIZE (mode) - 1); if (newoffset) { temp = gen_rtx_PLUS (Pmode, XEXP (addr, 0), GEN_INT (newoffset)); addr = gen_rtx_PLUS (Pmode, temp, GEN_INT (offset - newoffset)); push_reload (XEXP (addr, 0), 0, &XEXP (addr, 0), 0, BASE_REG_CLASS, Pmode, VOIDmode, 0, 0, opnum, type); return addr; } The macro is defined like this: #define LEGITIMIZE_RELOAD_ADDRESS(X,MODE,OPNUM,TYPE,IND_L,WIN) \ do { \ rtx new_x = legitimize_reload_address (X, MODE, OPNUM, TYPE, IND_L); \ if (new_x) \ { \ X = new_x; \ goto WIN;\ } \ } while (0) I issue that i am facing is that if i return null_rtx without doing any processing the complier works propely. But if legitimize_reload_address gets executed and jumbs to the label WIN iget ICE. ice1.c:5: error: unrecognizable insn: (insn 45 44 20 2 ice1.c:5 (set (mem/c:SI (plus:SI (reg:SI 16 r0) (const_int 65536 [0x1])) [4 S4 A32]) (reg:SI 2 d2)) -1 (nil)) Is there something wrong with my legitimize_relaod_address? Thanks for your time. Regards, shafi
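The rounding step quoted above can be tried out in isolation. The sketch below reproduces only the offset arithmetic in plain C, with a hypothetical struct and function name; it does not model push_reload or anything else in reload. The mask 0x3fff and the 4-byte mode size in the usage note are taken from the message.

```c
#include <assert.h>

/* Result of splitting an out-of-range displacement: `high` is added to the
   base register, `low` stays in the memory address.  */
struct split_offset {
    long high;
    long low;
};

static struct split_offset split_reload_offset(long offset, long mask, long mode_size)
{
    struct split_offset r;
    long newoffset;

    assert(mask > 0 && (mask & (mask + 1)) == 0);                 /* 2^n - 1 */
    assert(mode_size > 0 && (mode_size & (mode_size - 1)) == 0);  /* 2^k */

    /* Choose rounding direction.  Round up if we are >= halfway.  */
    if ((offset & mask) >= (mask + 1) / 2)
        newoffset = (offset & ~mask) + mask + 1;
    else
        newoffset = offset & ~mask;

    /* Ensure that long displacements are aligned.  */
    newoffset &= ~(mode_size - 1);

    r.high = newoffset;
    r.low = offset - newoffset;
    return r;
}
```

For the input from the message, offset 65536 with mask 0x3fff, the low 14 bits are zero, so the code rounds down and the split is high = 65536, low = 0. An offset of 30000 rounds up and splits into 32768 and -2768.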
how to identify a part of a multi-word register
Hi, I am doing a port for a 32bit target in GCC 4.4.0. I need a way to identify that a register is part of a multiword register. I need to emit an instruction that works on LSW of the double word register on move instructions. Currently the target splits the DImode and DFmode moves after reloading. So i am able to generate the required instruction while doing the split. But it seems that sometimes the subreg pass splits the multiword register into SImode or SFmode register references before reg-alloc. Since it is not required to split these moves, I am not able to insert the required instruction for LSW. So I was wondering if it is possible to recognize a register as a part of a multiword register? In the rtl-dumps there are expressions like : (insn 255 254 256 2 pr28634.c:13 (set (mem/v/c/i:SI (plus:SI (reg/f:SI 49 sp) (const_int -16 [0xfff0])) [2 y+0 S4 A64]) (reg:SI 2 d2)) 2 {*movsi_internal} (nil)) (insn 256 255 257 2 pr28634.c:13 (set (mem/v/c/i:SI (plus:SI (reg/f:SI 49 sp) (const_int -12 [0xfff4])) [2 y+4 S4 A32]) (reg:SI 3 d3 [+4 ])) 2 {*movsi_internal} (nil)) which points out that d3 is part of a multiword register. Looking into the gcc sources I find that this is done with the help of REG_OFFSET macro. So can I use this macro to identify a register as a part of multiword register? Is there any other way to do this? Regards, Shafi
Variable Length Execution Set?
Hi all, Does GCC support architectures that have a Variable Length Execution Set (VLES)? Are there any developments happening in this direction? Regards, Shafi
Re: Variable Length Execution Set?
2009/5/27 Ian Lance Taylor :
> Mohamed Shafi writes:
>
>> Does GCC support architectures that has Variable Length Execution Set (VLES)?
>> Are there any developments happening in this direction?
>
> gcc supports many instruction sets whose instructions are not all the
> same size, including x86. In particular, gcc supports ia64, which uses
> bundling. If you mean something else, I think you need to give more
> details.

I know that GCC supports VLIW. VLES is similar to VLIW, except that a packet can contain a variable number of instructions, i.e. each packet must contain at least one instruction and at most 6 instructions.

Shafi
About feasibility of implementing an instruction
Hello all, I just want to know about the feasibility of implementing an instruction for a port in GCC 4.4. The target has 40-bit registers, and the normal load/store/move instructions can access only the low 32 bits of a register. In order to move data into the rest of the register (b32 to b39), the data first has to be stored in a 32-bit memory location. The position within that word matters: if the data is stored in bits 0-7 of the memory word, it can be moved to b32-b39 of an even register, and if it is stored in bits 16-23 of the memory word, it can be moved to b32-b39 of an odd register. I hope I have made myself clear. Will it be possible to implement this in the GCC back end so that this particular instruction is supported? Regards, Shafi
CALL_USED_REGISTERS vs CALL_REALLY_USED_REGISTERS
Hello all, The GCC 4.4.0 internals manual says: [Macro] CALL_REALLY_USED_REGISTERS Like CALL_USED_REGISTERS except this macro doesn't require that the entire set of FIXED_REGISTERS be included. (CALL_USED_REGISTERS must be a superset of FIXED_REGISTERS). This macro is optional. If not specified, it defaults to the value of CALL_USED_REGISTERS. But it doesn't say why one would need to use it. What is the need for the macro CALL_REALLY_USED_REGISTERS when compared to CALL_USED_REGISTERS? Regards, Shafi
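The superset rule quoted from the manual can be pictured with three small tables. The sketch below uses a hypothetical 8-register target in which registers 6 and 7 are fixed (say, a frame and a stack pointer) but are not actually clobbered by calls; the table contents and the helper are assumptions made for illustration, not taken from any real port.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REGS 8  /* hypothetical register file */

/* 1 = register is fixed (never allocated).  */
static const char fixed_regs[NUM_REGS]            = {0, 0, 0, 0, 0, 0, 1, 1};
/* Must include every fixed register, even ones a call leaves intact.  */
static const char call_used_regs[NUM_REGS]        = {1, 1, 1, 0, 0, 0, 1, 1};
/* Free to list only the registers a call really clobbers.  */
static const char call_really_used_regs[NUM_REGS] = {1, 1, 1, 0, 0, 0, 0, 0};

/* Return 1 if every register set in `inner` is also set in `outer`.  */
static int is_superset(const char *outer, const char *inner, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (inner[i] && !outer[i])
            return 0;
    return 1;
}

/* The documented constraint on CALL_USED_REGISTERS.  */
static void check_tables(void)
{
    assert(is_superset(call_used_regs, fixed_regs, NUM_REGS));
}
```

Read this way, the second macro appears to exist so that a port can state which registers calls actually clobber without being forced to mark preserved fixed registers as clobbered; that reading is an inference from the quoted text, not something the manual excerpt states.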
Help for target with BITS_PER_UNIT = 16
Hello all, I am trying to port GCC 4.5.1 for a processor that has the following addressing capability: The data memory address space of 64K bytes is represented by a total of 15 bits, with each address selecting a 16-bit element. When using the address register, the LSB of address reg (AD) points to a 16-bit field in data memory. If a data memory line is 128 bits there are 8, 16-bit elements per data memory line. We use little endian addressing, so if AD=0, bits [15:0] of data memory line address 0 would be selected. If AD=1, bits [31:16] of data memory line address 0 would be selected. If AD=9, bits [31:16] of data memory line address 1 would be selected. So if i have the following program short arr[5] = {11,12,13,14,15}; int foo () { short a = arr[0] + arr[3]; return a; } Assume that short is 16bits and short address is 2byte aligned.Then I expect the following code to get generated: mov a0,#arr // Load the address mov a1, a0 // Copy the address add a1, 1 // Increment the location by 1 so that the address points to arr[1] ld.16 g0, (a1) // Load the value 12 into g0 mov a1, a0 // Copy the address add a1, 3 // Increment the location by 3 so that the address points to arr[3] ld.16 g1, (a1) // Load the value 14 into g0 add g1, g1, g0 // Add 12 and 14 For the following code: short arr[5] = {11,12,13,14,15}; int foo () { short a,b; a = (short) (&arr[3] - &arr[1]); // a is 2 after this operation b = (short) ((char*)&arr[3] - (char*)&arr[1]); // b is 4 after this operation return a; } My question is should i set the macro BITS_PER_UNIT = 16 to get a code generated like this? From IRC chat i realize that BITS_PER_UNIT != 8 is seriously rotten. If that is the case how can i proceed to port this target? Regards, Shafi
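The two pointer differences in the second program depend on how many addressable units a short occupies. The host-side C sketch below (helper names are hypothetical) computes both distances for the same array so the relationship is explicit.

```c
#include <assert.h>
#include <stddef.h>

static short arr[5] = {11, 12, 13, 14, 15};

/* Distance in elements: what (&arr[i] - &arr[j]) yields.  */
static ptrdiff_t element_distance(size_t i, size_t j)
{
    assert(i < 5 && j < 5);
    return &arr[i] - &arr[j];
}

/* Distance in addressable units (chars): the same pointers viewed as
   char *, which is always element_distance * sizeof(short).  */
static ptrdiff_t unit_distance(size_t i, size_t j)
{
    assert(i < 5 && j < 5);
    return (char *)&arr[i] - (char *)&arr[j];
}
```

On a host with 8-bit chars and 16-bit shorts this gives 2 and 4, the values the message expects. If BITS_PER_UNIT were 16 and short were 16 bits, sizeof(short) would be 1, so both distances would be 2; the expected `b == 4` therefore corresponds to a model where char stays 8 bits.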
Need help in deciding the instruction set for a new target.
Hello all, I am trying to do a port on GCC 4.5. The target has a memory resolution of 32bits i.e. char is 32bits in the target (addr 0 selects 1st 32bit and addr 1 selects 2nd 32bit). It has only word (32bit) access. In terms of address resolution this target is similar to c4x which became obsolete in GCC 4.2. There are two ways to implement this port. One is to have BITS_PER_UNIT ==32, like c4x and other is to have a normal C like char == 8, short == 16, and int == 32. We are thinking about having BITS_PER_UNIT == 32. Yes I know the support for such a target is bit rotten in GCC. I am currently trying to removing it. In the mean time, we are in the process of finalizing the instructions. The current instruction set has support for 32bit immediate data only in move operations. i.e. move src1GP, #imm32 For all other operations like div, sub, add, compare, modulus, load, store the support is only for 16bit immediate. For all these instruction there is separate flavor for sign and zero extension. i.e. mod.s32 srcdstGP, #imm16 // 32%imm16 signed modulus mod.u32 srcdstGP, #imm16 // 32%imm16 unsigned modulus cmp.s32 src1GP, #imm16 // signed register to 16-bit immediate compare cmp.u32 src1GP, #imm16 // unsigned register to 16-bit immediate compare sub.s32 srcdstGP, #imm16 // signed 16-bit register to immediate subtract sub.u32 srcdstGP, #imm16 // unsigned 16-bit register to immediate subtract I want to know if it is good to have both sign and zero extension for 16bit immediate. Will it be of any use with a configuration where char == short == int == 32bit? Will I be able to support these kinds of instructions in a GCC port? Or will it good to have a separate sign and zero extension instruction, which the current instruction set doesn’t have. Do I need a separate sign and zero ext instructions along with the above instructions? It would be of great help if you could guide me in deciding these instructions. Regards, Shafi
Help with reloading FP + offset addressing mode
Hi, I am doing a port in GCC 4.5.1. For the port 1. there is only (reg + offset) addressing mode only when reg is SP. Other base registers are not allowed 2. FP cannot be used as a base register. (FP based addressing is done by copying it into a base register) In order to take advantage of FP elimination (this will create SP + offset addressing), what i did the following 1. Created a new register class (address registers + FP) and used this new class as the BASE_REG_CLASS 2. Defined HARD_REGNO_OK_FOR_BASE_P like the following : #define HARD_REGNO_OK_FOR_BASE_P(NUM) \ ((NUM) < FIRST_PSEUDO_REGISTER \ && (((reload_completed || reload_in_progress)? 0 : (NUM) == FP_REG) \ || REGNO_REG_CLASS(NUM) == ADD_REGS)) 3. In legitimate_address_p i have the followoing: if (REGNO (x) == FP_REG) { if (strict) return false; else return true; } else if (strict) return STRICT_REG_OK_FOR_BASE_P (REGNO (x)); else return NONSTRICT_REG_OK_FOR_BASE_P (REGNO (x)); But when FP doesn't get eliminated i will get address of the form (plus:QI (reg/f:QI 27 as15) (const_int 2)) which gets reloaded by replacing FP with address register, other than SP. I am guessing this happens because of modified BASE_REG_CLASS. I haven't confirmed this. So in order to over come this what i have done is, in legitimize_reload_address i have the following : if (GET_CODE (*x) == PLUS && REG_P (XEXP (*x, 0)) && REGNO (XEXP (*x, 0)) < FIRST_PSEUDO_REGISTER && GET_CODE (XEXP (*x, 1)) == CONST_INT && XEXP (*x, 0) == frame_pointer_rtx) { /* GCC will by default reload the FP into a BASE_CLASS_REG, which results in an invalid address. For us, the best thing to do is move the whole expression to a REG. */ push_reload (*x, NULL_RTX, x, NULL, SPAA_REGS, mode, VOIDmode,0, 0, opnum, (enum reload_type)type); return 1; } Does my logic makes sense? Is there any better way to implement this? 
With this implementation, for the following sequence:

    (insn 9 6 10 2 fun_calls.c:12 (set (reg/f:QI 42)
            (mem/f/c/i:QI (plus:QI (reg/f:QI 33 AP)
                    (const_int -2 [0xfffe])) [0 f+0 S1 A32])) 9 {movqi_op} (nil))

    (insn 10 9 11 2 fun_calls.c:12 (set (reg:QI 43)
            (const_int 60 [0x3c])) 7 {movqi_op} (nil))

I am getting the following output:

    (insn 45 6 47 2 fun_calls.c:12 (set (reg:QI 28 a0)
            (const_int 2 [0x2])) 9 {movqi_op} (nil))

    (insn 47 45 48 2 fun_calls.c:12 (set (reg:QI 28 a0)
            (reg/f:QI 27 as15)) 9 {movqi_op} (nil))

    (insn 48 47 49 2 fun_calls.c:12 (set (reg:QI 28 a0)
            (plus:QI (reg:QI 28 a0)
                (const_int 2 [0x2]))) 14 {addqi3} (expr_list:REG_EQUIV (plus:QI (reg/f:QI 27 as15)
                (const_int 2 [0x2]))
            (nil)))

    (insn 49 48 10 2 fun_calls.c:12 (set (reg/f:QI 0 g0 [42])
            (mem/f/c/i:QI (reg:QI 28 a0) [0 f+0 S1 A32])) 9 {movqi_op} (nil))

insn 45 is redundant. Is this generated because the legitimize_reload_address is wrong? Any hints as to why the redundant instruction gets generated?

Regards,
Shafi
Re: Help with reloading FP + offset addressing mode
On 29 October 2010 00:06, Joern Rennecke wrote:
> Quoting Mohamed Shafi:
>
>> Hi,
>>
>> I am doing a port in GCC 4.5.1. For the port
>>
>> 1. there is only (reg + offset) addressing mode only when reg is SP.
>> Other base registers are not allowed
>> 2. FP cannot be used as a base register. (FP based addressing is done
>> by copying it into a base register)
>> In order to take advantage of FP elimination (this will create SP +
>> offset addressing), what i did the following
>>
>> 1. Created a new register class (address registers + FP) and used this
>> new class as the BASE_REG_CLASS
>
> Stop right there. You need to distinguish between FRAME_POINTER_REGNUM
> and HARD_FRAME_POINTER_REGNUM.
>

From the description given in the internals, i am not able to understand why you suggested this. Could you please explain this?

Shafi
Re: Help with reloading FP + offset addressing mode
On 30 October 2010 05:45, Joern Rennecke wrote:
> Quoting Mohamed Shafi:
>
>> On 29 October 2010 00:06, Joern Rennecke wrote:
>>>
>>> Quoting Mohamed Shafi:
>>>
>>>> Hi,
>>>>
>>>> I am doing a port in GCC 4.5.1. For the port
>>>>
>>>> 1. there is only (reg + offset) addressing mode only when reg is SP.
>>>> Other base registers are not allowed
>>>> 2. FP cannot be used as a base register. (FP based addressing is done
>>>> by copying it into a base register)
>>>> In order to take advantage of FP elimination (this will create SP +
>>>> offset addressing), what i did the following
>>>>
>>>> 1. Created a new register class (address registers + FP) and used this
>>>> new class as the BASE_REG_CLASS
>>>
>>> Stop right there. You need to distinguish between FRAME_POINTER_REGNUM
>>> and HARD_FRAME_POINTER_REGNUM.
>>>
>>
>> From the description given in the internals, i am not able to
>> understand why you suggested this. Could you please explain this?
>
> In order to trigger reloading of the address, you have to have a register
> elimination, even if the stack pointer is not a suitable destination
> for the elimination. Also, if you want reload to do the work for you,
> you must not lie to it about the addressing capabilities of an actual hard
> register. Hence, you need separate hard and soft frame pointers.
>

Debugging sessions of the reload pass tell me that if the reload pass gets an address of the form (reg + off), it assumes one of the following:

1. the address is invalid because 'reg' is not a suitable base register
2. the offset is out of range
3. the address has an eliminable register as a base register.

Depending on what it finds, the reload pass reloads the address accordingly. So for my target, when the pass encounters an address of the form

    (plus:QI (reg/f:QI 33 ArgP) (const_int -2 [0xfffe]))

it eliminates the arg pointer to either the stack or the frame pointer and reloads it.
If the base register is FP, during reloading it just reloads the FP into a valid base register, but then the address becomes invalid. The reload pass cannot figure out that the addressing mode itself is invalid due to the wrong base register. Since SP is the only valid register among the base registers that can form the (reg + off) addressing mode, for the reload to work properly I will have to allow this addressing mode only when SP is the base register - even in non-strict mode. But then I will lose a lot of opportunities when elimination happens in favour of SP. Hence I allow the above form of address for all frame related pseudos.

So to respond to your comments, I agree that as far as possible the port has to be truthful to the reload pass about the addressing mode capabilities, but then I am not sure if distinguishing between FRAME_POINTER_REGNUM and HARD_FRAME_POINTER_REGNUM will help my cause. Do you agree? Or am I not understanding what your suggestion implies?

Shafi
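One way to picture the "be truthful to reload" advice is a standalone model of the address check in which soft (eliminable) registers are accepted only before reload, while hard registers are always described by what the hardware can encode. This is a sketch under that assumption, written in plain C with invented enum names; it is not the port's legitimate_address_p and does not use GCC's data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in register kinds for this sketch; not GCC's own enums.  */
enum base_kind {
    BASE_SP,        /* stack pointer: the only base that takes an offset */
    BASE_ADDR_REG,  /* address register: plain (reg) addressing only */
    BASE_HARD_FP,   /* hard frame pointer: never valid as a base */
    BASE_SOFT       /* soft frame/arg pointer: replaced by elimination */
};

/* Model of an address check.  HAS_OFFSET is true for (reg + offset).
   In non-strict mode soft registers are accepted because elimination
   will still rewrite them; in strict mode only addresses that the
   hardware can encode are accepted.  */
static bool model_legitimate_address_p(enum base_kind base, bool has_offset,
                                       bool strict)
{
    switch (base) {
    case BASE_SP:
        return true;
    case BASE_ADDR_REG:
        return !has_offset;
    case BASE_SOFT:
        return !strict;
    case BASE_HARD_FP:
        return false;
    }
    return false;
}
```

In this model nothing is ever claimed about the hard frame pointer that the hardware cannot do, so a (hard FP + offset) address is always seen as needing the whole sum reloaded into an address register.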
Opinion on a hardware feature for conditional instructions
Hi all,

I need an opinion on a design question. I am doing a port for a private target in GCC 4.5.1. We are also designing the hardware alongside the development of the build tools. Currently we don't have enough bits in the encoding to support conditional instructions the way ARM does, i.e. with the option to decide whether to update the status flags or not. So what is the next best thing to have?

1. Allow both conditional and non-conditional instructions to update the status flags
2. Allow only non-conditional instructions to update the status flags

Could you please let me know your thoughts on this and the reason for choosing it?

Regards,
Shafi
A question about combining constraints
Hi,

For a private target that I am porting in GCC 4.5, I have the following pattern in my md file for call value:

    (define_insn "call_value_op"
      [(set (match_operand 0 "register_operand" "=da")
            (call (mem:QI (match_operand:QI 1 "call_operand" "Wd"))
                  (match_operand:QI 2 "" "")))]
      ""
      "jsr\\t%1"
      [(set_attr "slottable" "has_slot")]
    )

All the constraints are one-letter constraints for my target. Here 'W' is for symbol_ref and all the others are register constraints. For the particular combination where operand 0 is 'a' and operand 1 is 'W', I got the following ICE:

    error: unable to generate reloads for:
    (call_insn 11 4 12 2 test.c:7 (set (reg:QI 12 as0)
            (call (mem:QI (symbol_ref:QI ("malloc") [flags 0x41] ) [0 S1 A32])
                (const_int 0 [0x0]))) 50 {call_value_op} (expr_list:REG_DEAD (reg:QI 0 g0)
            (expr_list:REG_EH_REGION (const_int 0 [0x0])
                (nil)))
        (expr_list:REG_DEP_TRUE (use (reg:QI 0 g0))
            (nil)))

I get this ICE because the constraints are not matched properly. The ICE goes away when I write the constraints as:

    "=ad", "Wd"

or

    "a,a,d,d," , "W,W,d,d"

So I have the following questions:

1. Why are the constraints not matched here?
2. When can I combine constraints?

Regards,
Shafi
Re: A question about combining constraints
On 12 November 2010 18:39, Joern Rennecke wrote:
> Quoting Mohamed Shafi:
>
>> So i have the following questions:
>>
>> 1. Why is that constraints are not matched here?
>
> Please read the node "Register Classes" in doc/tm.texi .
>

I am sorry, could you please highlight the relevant portion for me? In the pattern that I have given, the combination (a,W) satisfies the pattern. But it is not matched because I have given them as (da,Wd). I know that we can combine the constraints together.

Shafi
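One point made in the "Register Classes" documentation, as far as I recall it, is that a port should define a class for the union of two classes whenever an instruction accepts either of them, because the compiler works with the largest defined class contained in a union rather than with the union itself. The following plain C sketch models that idea with bit masks. The masks, class lists and function names are invented for illustration, and the selection rule is a simplification, not GCC's actual algorithm.

```c
#include <assert.h>
#include <stddef.h>

/* Count set bits in a small register mask.  */
static int count_regs(unsigned mask)
{
    int n = 0;
    for (; mask != 0; mask &= mask - 1)
        n++;
    return n;
}

/* Return the index of the largest class in CLASSES[0..N-1] whose
   registers all lie inside MASK_A | MASK_B.  Ties keep the earlier
   class.  Returns -1 if no class qualifies.  */
static int largest_class_in_union(const unsigned *classes, size_t n,
                                  unsigned mask_a, unsigned mask_b)
{
    unsigned both = mask_a | mask_b;
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if ((classes[i] & ~both) != 0)
            continue;   /* class has a register outside the union */
        if (best < 0 || count_regs(classes[i]) > count_regs(classes[best]))
            best = (int) i;
    }
    return best;
}

/* Hypothetical masks: bits 0-3 stand in for 'd' registers, bits 4-7
   for 'a' registers, bit 8 for everything else.  */
#define D_MASK   0x00fu
#define A_MASK   0x0f0u
#define DA_MASK  0x0ffu
#define ALL_MASK 0x1ffu

/* Class list without a combined class: empty, D, A, ALL.  */
static const unsigned without_union[] = { 0u, D_MASK, A_MASK, ALL_MASK };
/* Same list with a combined D-or-A class added.  */
static const unsigned with_union[] = { 0u, D_MASK, A_MASK, DA_MASK, ALL_MASK };
```

Without the combined class, the model collapses "d or a" to the D class alone (index 1), so a value already sitting in an 'a' register no longer appears to fit; with the combined class listed, the union is represented exactly (index 3). Whether this is what happens in the port above is an assumption, not something the thread confirms.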
Re: Help with reloading FP + offset addressing mode
On 30 October 2010 05:45, Joern Rennecke wrote:
> Quoting Mohamed Shafi:
>
>> On 29 October 2010 00:06, Joern Rennecke wrote:
>>>
>>> Quoting Mohamed Shafi:
>>>
>>>> Hi,
>>>>
>>>> I am doing a port in GCC 4.5.1. For the port
>>>>
>>>> 1. there is only (reg + offset) addressing mode only when reg is SP.
>>>> Other base registers are not allowed
>>>> 2. FP cannot be used as a base register. (FP based addressing is done
>>>> by copying it into a base register)
>>>> In order to take advantage of FP elimination (this will create SP +
>>>> offset addressing), what i did the following
>>>>
>>>> 1. Created a new register class (address registers + FP) and used this
>>>> new class as the BASE_REG_CLASS
>>>
>>> Stop right there. You need to distinguish between FRAME_POINTER_REGNUM
>>> and HARD_FRAME_POINTER_REGNUM.
>>>
>>
>> From the description given in the internals, i am not able to
>> understand why you suggested this. Could you please explain this?
>
> In order to trigger reloading of the address, you have to have a register
> elimination, even if the stack pointer is not a suitable destination
> for the elimination. Also, if you want reload to do the work for you,
> you must not lie to it about the addressing capabilities of an actual hard
> register. Hence, you need separate hard and soft frame pointers.
>
> If you have them, but conflate them when you describe what you are doing
> in your port, you are not only likely to confuse the listener/reader,
> but also your documentation, your code, and ultimately yourself.
>

Having a FRAME_POINTER_REGNUM and a HARD_FRAME_POINTER_REGNUM will trigger reloading of the address.
But for the following pattern

    (insn 3 2 4 2 test.c:120 (set (mem/c/i:QI (plus:QI (reg/f:QI 35 SFP)
                    (const_int 1 [0x1])) [0 c+0 S1 A32])
            (reg:QI 0 g0 [ c ])) 7 {movqi_op} (nil))

where SFP is FRAME_POINTER_REGNUM, an elimination will result in

    (insn 3 2 4 2 test.c:120 (set (mem/c/i:QI (plus:QI (reg/f:QI 27 as15)
                    (const_int 1 [0x1])) [0 c+0 S1 A32])
            (reg:QI 0 g0 [ c ])) 7 {movqi_op} (nil))

where as15 is the HARD_FRAME_POINTER_REGNUM. But remember this new address is not valid (as only SP is allowed in this addressing mode). When the above pattern is reloaded I get:

    (insn 28 27 4 2 test.c:120 (set (mem/c/i:QI (plus:QI (reg:QI 28 a0)
                    (const_int 1 [0x1])) [0 c+0 S1 A32])
            (reg:QI 3 g3)) -1 (nil))

I get an unrecognizable insn ICE, because this addressing mode is not valid. I believe this happens because when the reload pass gets an address of the form (reg + off), it assumes that the address is invalid due to one of the following:

1. 'reg' is not a suitable base register
2. the offset is out of range
3. the address has an eliminable register as a base register.

Is there any way to overcome this? Any help is appreciated.

Shafi
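The soft/hard frame pointer split described above is usually expressed as a table of (from, to) elimination pairs, in the spirit of GCC's ELIMINABLE_REGS. Below is a minimal standalone model in plain C of how such a table chooses a replacement register. Register numbers 27, 33 and 35 are taken from the dumps in this thread; the stack pointer number and all identifiers are assumptions made only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Register numbers: 27 (as15), 33 (arg pointer) and 35 (SFP) appear
   in the dumps; the stack pointer number is made up for the sketch.  */
enum {
    HARD_FP_REGNO = 27,
    ARG_POINTER_REGNO = 33,
    SOFT_FP_REGNO = 35,
    SP_REGNO = 26   /* hypothetical */
};

struct elimination { int from; int to; };

/* Candidate eliminations in order of preference.  */
static const struct elimination eliminations[] = {
    { ARG_POINTER_REGNO, SP_REGNO },
    { ARG_POINTER_REGNO, HARD_FP_REGNO },
    { SOFT_FP_REGNO,     SP_REGNO },
    { SOFT_FP_REGNO,     HARD_FP_REGNO },
};

/* Pick the replacement for soft register FROM.  Eliminating into SP
   is only possible when no frame pointer is needed.  Returns -1 if
   FROM is not an eliminable register.  */
static int eliminate_to(int from, bool frame_pointer_needed)
{
    size_t n = sizeof eliminations / sizeof eliminations[0];
    for (size_t i = 0; i < n; i++) {
        if (eliminations[i].from != from)
            continue;
        if (eliminations[i].to == SP_REGNO && frame_pointer_needed)
            continue;
        return eliminations[i].to;
    }
    return -1;
}
```

In this model the (SFP + 1) address above becomes (SP + offset) when no frame pointer is needed, and (as15 + 1) otherwise; the second case is the one that still has to be turned into a plain register address on this target.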
Help with reloading
Hi,

I am doing a port in GCC 4.5.1. The target supports storing immediate values into a memory location represented by a symbolic address. So in the move pattern I have given constraints to represent this.

    (define_insn "movqi_op"
      [(set (match_operand:QI 0 "nonimmediate_operand" "=!Q,!Q,d,d,d,d,d,d,d,Q,R,S")
            (match_operand:QI 1 "general_operand" "I,J,i,W,J,d,Q,R,S,d,d,d"))]
      ""
      "@
       st.s32\t%0, %1;
       st.u32\t%0, %1;
       set\t%0, %1;
       set.u32\t%0, %1;
       set.u32\t%0, %1;
       move\t%0, %1;
       ld%u0\t%0, %1;
       ld%u0\t%0, %1;
       ld%u0\t%0, %1;
       st%u0\t%0, %1;
       st%u0\t%0, %1;
       st%u0\t%0, %1;"
    )

where
Q represents a symbolic address,
R represents all addresses formed using SP,
S represents all addresses formed using address registers,
I, J, W, i represent various const_ints,
d represents general registers.

Whenever reload gets a pattern that stores a const_int to a memory location that is scheduled for reloading, the reload pass will match it with the Q constraints. So to avoid those I added the constraint modifier '!' to 'Q'. But even then there is one particular case that causes trouble. This happens when the reload pass gets a pattern where the destination is an illegal address and the source is a pseudo register (no register allocated) for which reg_equiv_constant[regno] != 0.

Before IRA pass:

    (insn 27 25 33 4 pr23848-3.c:12 (set (mem/s/j:QI (reg/f:PQI 69) [0 S1 A32])
            (reg:QI 93)) 7 {movqi_op} (expr_list:REG_DEAD (reg/f:PQI 69)
            (expr_list:REG_EQUAL (const_int 49 [0x31])
                (nil

Just before reloading phase:

    (insn 27 25 33 4 pr23848-3.c:12 (set (mem/s/j:QI (reg/f:PQI 12 as0 [69]) [0 S1 A32])
            (reg:QI 93)) 7 {movqi_op} (expr_list:REG_DEAD (reg/f:PQI 12 as0 [69])
            (expr_list:REG_EQUAL (const_int 0 [0x0])
                (nil

Since reg 93 is not allocated to any hard register, it is replaced with reg_equiv_constant[regno], and this combination wins the (Q, I) constraint pair. In that process 'losers' (a variable in the loop over alternatives) becomes 0, so the loop breaks out and returns. Due to this the compiler crashes with an "insn does not satisfy its constraints:" error.
Any pointers on fixing this?

Regards,
Shafi

P.S. When can we merge constraints? What are the criteria for deciding which constraints to merge?
Re: Help with reloading
On 20 December 2010 10:56, Jeff Law wrote:
> On 12/15/10 07:14, Mohamed Shafi wrote:
>>
>> Hi,
>>
>> I am doing a port in GCC 4.5.1.
>> The target supports storing immediate values into memory location
>> represented by a symbolic address. So in the move pattern i have given
>> constraints to represent this.
>
> Presumably the target does not support storing an immediate value into other
> MEMs? ie, the only store-immediate is to a symbolic memory operand, right?
>

Yes, you are right.

> I think this is a case where you're going to need a secondary reload to
> force the immediate into a register if the destination is a non-symbolic MEM
> or a pseudo without a hard reg and its equivalent address is non-symbolic.
>

I am not sure how I should be implementing this. Currently in the define_expand for move I have code to force the immediate value into a register if the destination is not a symbolic address. If I understand correctly, this is the only place where I can decide what to do with the source depending on the destination, right?

Moreover, for the pattern

    (insn 27 25 33 4 pr23848-3.c:12 (set (mem/s/j:QI (reg/f:PQI 12 as0 [69]) [0 S1 A32])
            (reg:QI 93)) 7 {movqi_op} (expr_list:REG_DEAD (reg/f:PQI 12 as0 [69])
            (expr_list:REG_EQUAL (const_int 0 [0x0])
                (nil

the src operand gets converted by

    /* This is equivalent to calling find_reloads_toplev.
       The code is duplicated for speed.
       When we find a pseudo always equivalent to a constant,
       we replace it by the constant.  We must be sure, however,
       that we don't try to replace it in the insn in which it
       is being set.  */
    int regno = REGNO (recog_data.operand[i]);
    if (reg_equiv_constant[regno] != 0
        && (set == 0 || &SET_DEST (set) != recog_data.operand_loc[i]))
      {
        /* Record the existing mode so that the check if constants are
           allowed will work when operand_mode isn't specified.  */
        if (operand_mode[i] == VOIDmode)
          operand_mode[i] = GET_MODE (recog_data.operand[i]);

        substed_operand[i] = recog_data.operand[i]
          = reg_equiv_constant[regno];
      }

and since the destination is already selected for reload

    /* If the address was already reloaded,
       we win as well.  */
    else if (MEM_P (operand)
             && address_reloaded[i] == 1)
      win = 1;

the reload phase never reaches secondary reload. So I do not understand your answer. Could you explain it briefly?

Regards,
Shafi
Re: Help with reloading
On 20 December 2010 19:30, Jeff Law wrote:
> On 12/20/10 01:47, Mohamed Shafi wrote:
>>
>>
>>> I think this is a case where you're going to need a secondary reload to
>>> force the immediate into a register if the destination is a non-symbolic
>>> MEM
>>> or a pseudo without a hard reg and its equivalent address is
>>> non-symbolic.
>>>
>> I am not sure how i should be implementing this.
>> Currently in define_expand for move i have code to force the
>> immediate value into a register if the destination is not a symbolic
>> address. If i understand correctly this is the only place where i can
>> decide what to do with the source depending on the destination. right?
>
> Just changing the movxx expander is not sufficient since for this case you
> do not know until reload time whether or not a particular insn needs an
> extra register to implement the move. That's the whole point of the
> secondary reload mechanism -- to allow you to allocate a scratch register
> during reloading to handle oddball cases like this.
>
>
> In your secondary reload code you'll need to check for the case where the
> destination is a MEM and the source is an unallocated pseudo with a constant
> equivalent and return a suitable register class for that case.
>

Jeff, thanks for the reply. I didn't know that you could do that in the TARGET_SECONDARY_RELOAD hook. Can you point me to some target that does this - figuring out what the destination is based on the source or vice versa? In my case only the address operand comes into the TARGET_SECONDARY_RELOAD hook during the reload pass. I am not sure how to find out the source for the pattern which has this particular address as the destination.

Sorry for the trouble.

Shafi
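The decision described in the quoted advice can be modelled on its own, independent of how the hook actually receives its operands. The sketch below is plain C with invented enums standing in for the rtx inspection a real TARGET_SECONDARY_RELOAD implementation would do; it only illustrates the rule "constant source into a non-symbolic MEM needs a scratch register", and it does not answer the open question of how to find the source from inside the hook.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in operand descriptions; real code would inspect rtx values.  */
enum dest_kind { DEST_REG, DEST_SYMBOLIC_MEM, DEST_OTHER_MEM };
enum src_kind  { SRC_HARD_REG, SRC_CONSTANT, SRC_PSEUDO_WITH_CONST_EQUIV };
enum class_model { MODEL_NO_REGS, MODEL_GENERAL_REGS };

/* Class of scratch register needed to perform the move, or
   MODEL_NO_REGS when the move can be done directly.  A pseudo that
   got no hard register but has a constant equivalent is treated
   like the constant it will be replaced with.  */
static enum class_model model_secondary_reload(enum dest_kind dest,
                                               enum src_kind src)
{
    bool src_is_constant = (src == SRC_CONSTANT
                            || src == SRC_PSEUDO_WITH_CONST_EQUIV);

    /* Store-immediate exists only for symbolic addresses, so any
       other memory destination needs the constant in a register.  */
    if (dest == DEST_OTHER_MEM && src_is_constant)
        return MODEL_GENERAL_REGS;
    return MODEL_NO_REGS;
}
```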
attempt to use poisoned "CONST_COSTS"
Hello everyone,

I am upgrading a cross compiler from 3.2 to 3.4.6. I had to change some of the target description macros in the target.h file, because otherwise I get "attempt to use poisoned" errors while building.

Presently what is written in target.h is:

1.
    #define CPP_PREDEFINES \
      "-Dtargetname -D__targetname__ -Amachine=targetname"

The corresponding macro in 3.4.6 is "TARGET_CPU_CPP_BUILTINS".

2.
    #define CONST_COSTS(RTX, CODE, OUTER_CODE) \
      case CONST_INT: \
        return target_const_costs (RTX, OUTER_CODE); \
      case CONST: \
        return 5; \
      case LABEL_REF: \
        return 1; \
      case SYMBOL_REF: \
        return ((TARGET_SMALL_MODEL) ? 2 : 3); \
      case CONST_DOUBLE: \
        return 10;

I don't know the corresponding macro in 3.4.6.

3.
    #define ADDRESS_COST(RTX) 1

The corresponding macro in 3.4.6 is "int TARGET_ADDRESS_COST (rtx address)".

4.
    #define RTX_COSTS(X, CODE, OUTER_CODE) \
      case MULT: \
        return COSTS_N_INSNS (2); \
      case DIV: \
      case UDIV: \
      case MOD: \
      case UMOD: \
        return COSTS_N_INSNS (30); \
      case FLOAT: \
      case FIX: \
        return COSTS_N_INSNS (100);

The corresponding macro in 3.4.6 is "bool TARGET_RTX_COSTS (rtx x, int code, int outer_code, int *total)".

5.
    #define ASM_GLOBALIZE_LABEL(STREAM,NAME) \
      do \
        { \
          fputs ("\t.globl ", STREAM); \
          assemble_name (STREAM, NAME); \
          fputs ("\n", STREAM); \
        } \
      while (0)

The corresponding macro in 3.4.6 is "void TARGET_ASM_GLOBALIZE_LABEL (FILE *stream, const char *name)".

Now to my problem: except for TARGET_CPU_CPP_BUILTINS, I don't know how to rewrite the existing macros for 3.4.6. Can anybody help me in this regard?

Thanks in advance.

Regards,
Shafi.
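As a rough illustration of the rewrite being asked about, the case labels from the old CONST_COSTS and RTX_COSTS macros can be folded into one function with the shape of the TARGET_RTX_COSTS hook quoted above: it stores the cost through a pointer and returns true when that cost is final. The sketch below is a standalone model in plain C, so the rtx codes, the COSTS_N_INSNS scaling factor and the small-model flag are stand-ins defined locally and labelled as such. The CONST_INT case is left out because it depends on the port's own target_const_costs function.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins so the sketch compiles on its own.  In GCC the codes and
   COSTS_N_INSNS come from the compiler's headers; the factor 4 used
   here is an assumption of this sketch.  */
enum rtx_code_model {
    M_CONST, M_LABEL_REF, M_SYMBOL_REF, M_CONST_DOUBLE,
    M_MULT, M_DIV, M_UDIV, M_MOD, M_UMOD, M_FLOAT, M_FIX, M_PLUS
};
#define MODEL_COSTS_N_INSNS(n) ((n) * 4)

static bool target_small_model = true;   /* plays TARGET_SMALL_MODEL */

/* Hook-shaped function: store the cost in *TOTAL and return true when
   the cost is final, or return false to leave the decision to the
   caller's default handling.  */
static bool model_rtx_costs(enum rtx_code_model code, int *total)
{
    switch (code) {
    /* Cases that used to live in CONST_COSTS.  */
    case M_CONST:        *total = 5;  return true;
    case M_LABEL_REF:    *total = 1;  return true;
    case M_SYMBOL_REF:   *total = target_small_model ? 2 : 3; return true;
    case M_CONST_DOUBLE: *total = 10; return true;

    /* Cases that used to live in RTX_COSTS.  */
    case M_MULT:
        *total = MODEL_COSTS_N_INSNS(2);
        return true;
    case M_DIV: case M_UDIV: case M_MOD: case M_UMOD:
        *total = MODEL_COSTS_N_INSNS(30);
        return true;
    case M_FLOAT: case M_FIX:
        *total = MODEL_COSTS_N_INSNS(100);
        return true;

    default:
        return false;
    }
}
```

In a real port the function would take the rtx and codes listed in the hook prototype and would normally be registered from the target's .c file by redefining TARGET_RTX_COSTS before the target structure is initialized; the details of that registration are not shown here.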