Abt definition for structure tree

2006-10-11 Thread Mohamed Shafi
Hello all,

Can anyone tell me where I can find the definition of tree?
A structure is typedef-ed to tree, but I can't find that structure.
I have been hunting for it for some time.
Can someone help me?

Thanks in advance.

Regards,
Shafi






abt compiler flags

2006-10-12 Thread Mohamed Shafi
Hello all,

During regression tests, if I want to disable some features like trampolines,
I can give -DNO_TRAMPOLINES as a compiler flag.

Do I have similar flags for profiling and PIC?

thanks in advance

Regards,
Shafi






Abt SIMD Emulation

2006-10-13 Thread Mohamed Shafi
Hello all,

For targets which don't have SIMD hardware support, like fr30, is SIMD stuff
emulated?

Are there some flags/macros in gcc to indicate that?

How is it done in other targets which don't have the hardware support?

Thanks in advance

Regards,
Shafi.






Re: Abt SIMD Emulation

2006-10-13 Thread Mohamed Shafi
First, thanks for the reply.

I want to know what can be done in the back end of a target to indicate that
SIMD stuff should be emulated all the way.

__attribute__ ((vector_size (NN))) is something that can be done in programs.

Are there any target macros or hooks available for that?
Will the target hook TARGET_VECTOR_MODE_SUPPORTED_P help me to indicate that?

I guess this is the right mailing list for my question.

Thanks in advance.

Regards,
Shafi.


----- Original Message -----
From: Ian Lance Taylor <[EMAIL PROTECTED]>
To: Mohamed Shafi <[EMAIL PROTECTED]>
Cc: gcc@gcc.gnu.org
Sent: Friday, October 13, 2006 8:01:11 PM
Subject: Re: Abt SIMD Emulation

Mohamed Shafi <[EMAIL PROTECTED]> writes:

This question is more appropriate for the gcc-help mailing list than
for the gcc mailing list.

> For targets which don't have SIMD hardware support, like fr30, is SIMD stuff
> emulated?

Yes, if you use __attribute__ ((vector_size (NN))) for a target which
does not support vector registers of that size, gcc will emulate the
vector handling.

> Is there some flags/macros in gcc to indicate that?

To indicate what?

> How is it done in other targets which don't have the hardware support?

In the obvious tedious way: as a loop over the elements.

Ian
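
Ian's description of emulation as a loop over the elements can be pictured
with a small C sketch. It assumes a compiler that accepts the GCC vector_size
attribute; the names v4si, add_vector and add_scalar are made up for
illustration, and the scalar function only approximates what a compiler
might do on a target without vector registers. It is not taken from GCC's
actual output.

```c
#include <stdint.h>

/* GCC vector extension: four 32-bit ints handled as one 16-byte value. */
typedef int32_t v4si __attribute__ ((vector_size (16)));

/* Written with the vector type. On a target without vector registers of
   this size, the compiler is expected to lower it to scalar operations. */
v4si add_vector (v4si a, v4si b)
{
  return a + b;
}

/* Hand-written equivalent: a loop over the elements. */
void add_scalar (const int32_t *a, const int32_t *b, int32_t *out)
{
  for (int i = 0; i < 4; i++)
    out[i] = a[i] + b[i];
}
```

Both functions compute the same element-wise sum; the first leaves the
choice between real vector instructions and a scalar fallback to the
compiler.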







Abt RTL expression

2006-10-16 Thread Mohamed Shafi
hello all,

Sorry for asking this kind of question. This might seem odd to most of you,
but I am new to GCC.
Can somebody tell me how to analyze the instruction pattern below?

(insn 8 6 9 1 (parallel [
            (set (reg/f:SI 32)
                (symbol_ref:SI ("t") ))
            (clobber (reg:CC 21 cc))
        ]) -1 (nil)
    (nil))

Will I be able to find this pattern in the .md files?
What does insn 8 6 9 1 mean?
What does reg/f mean?
For a variable declaration, why is it necessary to clobber CC?

Hope somebody will help me.

Thanks in advance.

Regards,
Shafi






Re: Abt RTL expression

2006-10-16 Thread Mohamed Shafi
Many thanks for the pointer.

>Probably. Look for a pattern with "movsi" in the name.

In the document
http://gcc.gnu.org/onlinedocs/gccint/Insns.html#Insns

it is said that "An integer that says which pattern in the machine description
matches this insn, or −1 if the matching has not yet been attempted. Such
matching is never attempted and this field remains −1 on an insn
whose pattern consists of a single use, clobber "

The integer in my pattern is -1. In fact, the integer for every insn of a
program (980602-2.c) like

struct {
  unsigned bit : 30;
} t;

int main()
{
  if (!(t.bit++))
    exit (0);
  else
    abort ();
}

is -1 for my target. Can you explain this?

Thanks in advance.

Regards,
Shafi


----- Original Message -----
From: Rask Ingemann Lambertsen <[EMAIL PROTECTED]>
To: Mohamed Shafi <[EMAIL PROTECTED]>
Cc: gcc@gcc.gnu.org
Sent: Monday, October 16, 2006 7:28:42 PM
Subject: Re: Abt RTL expression

On Mon, Oct 16, 2006 at 05:20:44AM -0700, Mohamed Shafi wrote:
> hello all,
> 
> Sorry i am asking this kind of question.This might be weird to most of you 
> but i am new to GCC.
> Can somebody tell me how to analyze the below instruction pattern
> 
> (insn 8 6 9 1 (parallel [
> (set (reg/f:SI 32)
> (symbol_ref:SI ("t") ))
> (clobber (reg:CC 21 cc))
> ]) -1 (nil)
> (nil))
> 
> Will i be able to find this pattern in .md files?

   Probably. Look for a pattern with "movsi" in the name.

> what does insn 8 6 9 1 mean?

   8 is the number of the insn, 6 is the number of the previous insn, 9 is
the number of the next insn and 1 is the number of the basic block to which
the insn belongs.

> reg/f ?

   This is actually documented (look for REG_POINTER):
http://gcc.gnu.org/onlinedocs/gccint/Flags.html#Flags

>  for varible declaration why is it needed to clobber CC?

   It depends on the target. Some targets modify the condition codes when
copying a value into a register while other targets don't. A third
possibility is that of the m68k, where storing a value in a data register
sets the condition codes while storing a value in an address register leaves
the condition codes unmodified. A fourth possibility is that of the PowerPC,
where this is optional on a per insn basis, but then you wouldn't normally
include the (clobber (reg:CC 21 cc)) version in the machine description.

-- 
Rask Ingemann Lambertsen.
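
As a rough illustration of the four integers Rask describes, the following C
sketch pulls them out of a dump line. The struct insn_header and the function
parse_insn_header are hypothetical names, not part of GCC, and the sketch
assumes the line has exactly the "(insn N N N N" layout shown in this thread.

```c
#include <stdio.h>

/* Hypothetical holder for the four integers that follow "insn" in a dump. */
struct insn_header {
  int uid;   /* number of this insn */
  int prev;  /* number of the previous insn */
  int next;  /* number of the next insn */
  int bb;    /* number of the basic block the insn belongs to */
};

/* Returns 1 if the line starts with "(insn a b c d", 0 otherwise. */
int parse_insn_header (const char *line, struct insn_header *h)
{
  return sscanf (line, " (insn %d %d %d %d",
                 &h->uid, &h->prev, &h->next, &h->bb) == 4;
}
```

For the line "(insn 8 6 9 1 (parallel [" this fills in uid 8, prev 6, next 9
and bb 1, matching the explanation above.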







Re: Abt RTL expression

2006-10-17 Thread Mohamed Shafi
> It is because matching has not yet been attempted.

OK, so what is the option to get hold of an RTL dump after all the matching
is done?

----- Original Message -----
From: Rask Ingemann Lambertsen <[EMAIL PROTECTED]>
To: Mohamed Shafi <[EMAIL PROTECTED]>
Cc: gcc@gcc.gnu.org; Revital1 Eres <[EMAIL PROTECTED]>
Sent: Tuesday, October 17, 2006 1:21:30 PM
Subject: Re: Abt RTL expression

On Mon, Oct 16, 2006 at 09:32:58PM -0700, Mohamed Shafi wrote:
> 
> In th document 
> http://gcc.gnu.org/onlinedocs/gccint/Insns.html#Insns
> 
> it is said that "An integer that says which pattern in the machine 
> description matches
> this insn, or -1 if the matching has not yet been attempted.Such matching is 
> never attempted and this field remains -1 on an insn
> whose pattern consists of a single use, clobber "
> 
> The integer in my pattern is -1. In fact integer for all the insn for a 
> program (980602-2.c) like 

   It is quite normal for the first 6-7 dump files. It is because matching
has not yet been attempted.

-- 
Rask Ingemann Lambertsen







Abt code generation

2006-10-19 Thread Mohamed Shafi
Hello,

For the code (20020611-1.c)

int p;int k;unsigned int n;
void x ()
{

unsigned int h;//line 1
 h = n <= 30; //line 2 
// printf("%u\n",h);

  if (h)
p = 1;
  else
p = 0;

  if (h)
k = 1;
  else
k = 0;
}
unsigned int n = 30;

main ()
{
  x ();
  if (p != 1 || k != 1)
abort ();
  exit (0);
}


Looking at the RTL dump generated with -dump-rtl-rnreg at optimization level
-Os, my target generates no code for lines 1 and 2. It generates code starting
with a check of the CC value for if (h). For other optimization levels, it
generates proper code.

Again, if a printf statement is added below line 2 (commented out in the
example above), then even at -Os it generates proper code.

1. Can anyone suggest the probable areas to look at for this kind of behaviour?

2. Which part of optimization level -Os deals with removing redundant code,
and is there a way to disable it?


Thanks in advance.

Regards,
Shafi.






Abt gcsec-1.c testcase

2006-10-30 Thread Mohamed Shafi

Hello all,

Can anybody tell me the purpose of the testcase
testsuite\gcc.dg\special\gcsec-1.c in the gcc testsuite?
Is it something related to garbage collection?

What exactly does this testcase test?


Thanks in advance.
Regards ,
Shafi.


Abt an RTL expression

2006-10-31 Thread Mohamed Shafi

Hello all,

Can anyone tell me what the below expression means ?


(insn 38 37 40 4 (parallel [
   (asm_operands/v ("") ("") 0 [  //line 2
   (reg:SI 32 [ s5.1 ])  //line 3
   ]
[
   (asm_input:SI ("r"))  //line 6
   ] ("test55.c") 42)  //line 7
   (clobber (mem:BLK (scratch) [0 A8]))  //line 8
   ]) -1 (nil)
   (nil))

In line 2, what is the 0 for?

What does line 3 mean? What is its purpose?

In line 7, test55.c is the file name. Why is it needed, and what is 42?
In line 8, what does [0 A8] mean?

Thanks in advance,

Regards,
shafi


Abt long long support

2006-11-05 Thread Mohamed Shafi

Hello all,

Looking at the .md file of a backend, is there a way to know whether a
target supports long long?

Should I look for patterns with machine mode DI?
Is there some other way?

Thanks in advance for the help.

Regards,
Shafi


Re: Abt long long support

2006-11-06 Thread Mohamed Shafi

Thanks for the reply

My target (non gcc/private one) fails for long long testcases, and
there are cases (with long long) which get through, but not with the
right output. When I replace long long with long the testcases run
fine, even those giving wrong output.
The target is not able to compile properly even simple statements like

long long a = 10;

So when I looked into the .md file I saw no patterns with DI machine
mode, used for long long (am I right?), except

define_insn "adddi3"  and   define_insn "subdi3"

The .md file says that this is to prevent gcc from synthesising it,
though I didn't understand what that means.

That's when I started to doubt whether the backend provides support for
long long. But if what Rask is saying is true, which it has to be, I guess,
since you guys are saying it, then the middle end should take care of
synthesizing long long.
The 32-bit target has this defined in the .h file
LONG_TYPE_SIZE  32
LONG_LONG_TYPE_SIZE 64

Is there anything else that I should provide in the back end to make
sure that the rest of gcc
is synthesizing long long properly?

Any thoughts?



On 11/7/06, Rask Ingemann Lambertsen <[EMAIL PROTECTED]> wrote:

On Mon, Nov 06, 2006 at 10:52:00AM +0530, Mohamed Shafi wrote:
> Hello all,
>
> Looking at a .md file of a backend it there a way to know whether a
> target supports long long
> Should i look for patterns with machine mode DI?

   No. For example, 8-bit, 16-bit and 32-bit targets should normally not
define patterns such as anddi3, iordi3 and xordi3. It is possible that a
target could have no patterns with mode DI but still support long long,
although probably with significant slowdown. E.g. the middle end can
synthesize adddi3 and subdi3 from SImode operations, but I think most
targets can easily improve 10x in terms of speed and size on that code.

   Watch out for targets where units are larger than 8 bits. An example is
the c4x where a unit is 32-bits and HImode is 64-bits.

> Is there some other way?

   This depends a lot on exactly what you mean when you say support, but
grep for LONG_TYPE_SIZE and LONG_LONG_TYPE_SIZE in the .h file and compare
the two.

--
Rask Ingemann Lambertsen
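
Rask's remark that adddi3 and subdi3 can be synthesized from SImode
operations can be pictured in C roughly as follows. This is only a sketch of
the idea (add or subtract the low words, then propagate the carry or borrow
into the high words); struct di, add_di and sub_di are hypothetical names,
and the code is not a description of what the middle end actually emits.

```c
#include <stdint.h>

/* Hypothetical two-word representation of a 64-bit value on a 32-bit
   target: lo is the low word, hi is the high word. */
struct di { uint32_t lo; uint32_t hi; };

/* 64-bit add built from two 32-bit adds plus a carry. */
struct di add_di (struct di a, struct di b)
{
  struct di r;
  r.lo = a.lo + b.lo;
  uint32_t carry = r.lo < a.lo;  /* unsigned wrap-around means a carry out */
  r.hi = a.hi + b.hi + carry;
  return r;
}

/* 64-bit subtract built from two 32-bit subtracts plus a borrow. */
struct di sub_di (struct di a, struct di b)
{
  struct di r;
  uint32_t borrow = a.lo < b.lo;  /* low word would go below zero */
  r.lo = a.lo - b.lo;
  r.hi = a.hi - b.hi - borrow;
  return r;
}
```

Negating a value, as in the llabs-style test discussed later in this thread,
is then just sub_di applied to zero and the value.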



Re: Abt long long support

2006-11-09 Thread Mohamed Shafi

On 11/7/06, Mike Stump <[EMAIL PROTECTED]> wrote:

On Nov 6, 2006, at 9:30 PM, Mohamed Shafi wrote:
> My target (non gcc/private one) fails for long long testcases

Does it work flawlessly otherwise, if not, fix all those problems
first.  After those are all fixed, then you can see if  it then just
works.  In particular, you will want to ensure that 32 bit things
work fine, first.


Well, the test cases fail only under one condition:
when main calls a function, like llabs, to find the absolute value of
a negative number and the function performs the action with

return (arg<0 ? -arg : arg );

The program works fine if I
1. pass a positive value,
2. use the -fomit-frame-pointer flag while compiling (with a negative value), or
3. use another variable in the function body to return the value, i.e.

long long foo(long long x){

 long long k;
 k=(x<0 ? -x : x);
 return k;
}

When I diff the RTL dumps for programs passing a negative value with and
without the frame pointer, I find changes starting from file.greg. That's
when the frame pointer issue kicks in.


This is a small test case which produces the bug

#include <stdio.h>

long long fun(long long k)
{
 return ( k>0 ? k : -k);
}

int main()
{
   long long i= -1;

 if(fun(i) == 1)
   printf("\nsuccess \n");

   else
   printf("\nfailure \n");

}


Here is the relevant RTL dump for the function fun from the .greg file

; Hard regs used:  0 1 2 3 12 13 14 21

(note 2 0 9 NOTE_INSN_DELETED)

;; Start of basic block 0, registers live: 0 [d0] 1 [d1] 14 [a6] 15 [a7] 22 [vAP]
(note 9 2 4 0 [bb 0] NOTE_INSN_BASIC_BLOCK)

(insn 4 9 5 0 (parallel [
   (set (reg/f:SI 13 a5 [31])
   (plus:SI (reg/f:SI 14 a6)
   (const_int -8 [0xfff8])))
   (clobber (reg:CC 21 cc))
   ]) 29 {addsi3} (nil)
   (nil))

(insn 5 4 6 0 (set (mem/c/i:SI (reg/f:SI 13 a5 [31]) [0 k+0 S4 A32])
   (reg:SI 0 d0 [ k ])) 16 {movsi_store} (nil)
   (nil))

(insn 6 5 7 0 (set (mem/c/i:SI (plus:SI (reg/f:SI 13 a5 [31])
   (const_int 4 [0x4])) [0 k+4 S4 A32])
   (reg:SI 1 d1 [orig:0 k+4 ] [0])) 16 {movsi_store} (nil)
   (nil))

(note 7 6 13 0 NOTE_INSN_FUNCTION_BEG)

(insn 13 7 14 0 (parallel [
   (set (reg/f:SI 13 a5 [33])
   (plus:SI (reg/f:SI 14 a6)
   (const_int -8 [0xfff8])))
   (clobber (reg:CC 21 cc))
   ]) 29 {addsi3} (nil)
   (nil))

(insn 14 13 63 0 (set (reg:SI 0 d0)
   (mem/c/i:SI (reg/f:SI 13 a5 [33]) [0 k+0 S4 A32])) 15 {movsi_load} (nil)
   (nil))

(insn 63 14 64 0 (set (reg:SI 12 a4)
   (const_int -16 [0xfff0])) 17 {movsi_short_const} (nil)
   (nil))

(insn 64 63 65 0 (parallel [
   (set (reg:SI 12 a4)
   (plus:SI (reg:SI 12 a4)
   (reg/f:SI 14 a6)))
   (clobber (reg:CC 21 cc))
   ]) 29 {addsi3} (nil)
   (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6)
   (const_int -16 [0xfff0]))
   (nil)))

(insn 65 64 15 0 (set (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S8 A32])
   (reg:SI 0 d0)) 16 {movsi_store} (nil)
   (nil))

(insn 15 65 68 0 (set (reg:SI 0 d0)
   (mem/c/i:SI (plus:SI (reg/f:SI 13 a5 [33])
   (const_int 4 [0x4])) [0 k+4 S4 A32])) 15 {movsi_load} (nil)
   (nil))

(insn 68 15 69 0 (set (reg:SI 12 a4)
   (const_int -12 [0xfff4])) 17 {movsi_short_const} (nil)
   (nil))

(insn 69 68 70 0 (parallel [
   (set (reg:SI 12 a4)
   (plus:SI (reg:SI 12 a4)
   (reg/f:SI 14 a6)))
   (clobber (reg:CC 21 cc))
   ]) 29 {addsi3} (nil)
   (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6)
   (const_int -12 [0xfff4]))
   (nil)))

(insn 70 69 73 0 (set (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S8 A32])
   (reg:SI 0 d0)) 16 {movsi_store} (nil)
   (nil))

(insn 73 70 74 0 (set (reg:SI 12 a4)
   (const_int -16 [0xfff0])) 17 {movsi_short_const} (nil)
   (nil))

(insn 74 73 75 0 (parallel [
   (set (reg:SI 12 a4)
   (plus:SI (reg:SI 12 a4)
   (reg/f:SI 14 a6)))
   (clobber (reg:CC 21 cc))
   ]) 29 {addsi3} (nil)
   (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6)
   (const_int -16 [0xfff0]))
   (nil)))

(insn 75 74 17 0 (set (reg:SI 12 a4)
   (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S8 A32])) 15 {movsi_load} (nil)
   (nil))

(insn 17 75 18 0 (set (reg:CC 21 cc)
   (compare:CC (reg:SI 12 a4)
   (const_int 0 [0x0]))) 67 {*cmpsi_internal0} (nil)
   (nil))

(jump_insn 18 17 50 0 (set (pc)
   (if_then_else (gt:CC (reg:CC 21 cc)
   (const_int 0 [0x0]))
   (label_ref 32)
   (pc))) 41 {*branch_true} (nil)
   (nil))
;; End of basic block 0, registers live: 14 [a6] 15 [a7] 22 [vAP] 28

;; Start of basic block 2, registers live: 14 [a6] 15 [a7] 22 [vAP] 28
(note 50 18 78 2 [bb 2] NOTE_INSN_BASIC_BLOCK)

(insn 78 50 79 2 (set (reg:SI 13 a5)
   (const_int -16 [0xf

Re: Abt long long support

2006-11-09 Thread Mohamed Shafi

Thanks for the input and the questions




Did you examine:

   long long l, k;
   l = -k;

for correctness by itself?  Was it valid or invalid?


Yes this is working.



[ read ahead for spoilers, I'd rather you pull this information out
of the dump and present it to us... ]

A quick glance at the rtl shows that insn 95 tries to use [a4+4] but
insn 94 clobbered a4 already, also d3 is used by insn 93, but there
isn't a set for it.


Looks like you have found the problem. But I need to look more into it.



The way the instructions are numbered suggests that the code went
wrong before this point.  You have to read and understand all the


The instructions are numbered randomly and not in increasing order ...
But looking at the diff of working and non-working code I thought it
was not an issue.
Is this natural? Those are ids for insns, and each insn has a unique id.
Is it wrong for the insn ids to be in jumbled order?

And one more thing. In the dumps I noticed that before registers are
used in DI mode they are all clobbered first, like

(insn 30 54 28 6 (clobber (reg:DI 34)) -1 (nil)
   (nil))

What is the use of these insns? Why do we need to clobber these
registers before the use? After some pass they are no longer seen in the
dump.


Regards,
Shafi.


Re: Abt long long support

2006-11-10 Thread Mohamed Shafi

On 11/10/06, Mike Stump <[EMAIL PROTECTED]> wrote:

On Nov 9, 2006, at 6:39 AM, Mohamed Shafi wrote:
> When i diff the rtl dumps for programs passing negative value with and
> without frame pointer i find  changes from file.greg .

A quick glance at the rtl shows that insn 95 tries to use [a4+4] but
insn 94 clobbered a4 already, also d3 is used by insn 93, but there
isn't a set for it.



The following part of the  rtl dump of greg pass is the one which is
giving the wrong output.


(insn 90 29 91 6 (set (reg:SI 12 a4)
   (const_int -16 [0xfff0])) 17 {movsi_short_const} (nil)
   (nil))

(insn 91 90 94 6 (parallel [
   (set (reg:SI 12 a4)
   (plus:SI (reg:SI 12 a4)
   (reg/f:SI 14 a6)))
   (clobber (reg:CC 21 cc))
   ]) 29 {addsi3} (nil)
   (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6)
   (const_int -16 [0xfff0]))
   (nil)))

(insn 94 91 95 6 (set (reg:SI 12 a4)
   (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S4 A32])) 15 {movsi_load} (nil)
   (nil))

(insn 95 94 31 6 (set (reg:SI 13 a5 [orig:12+4 ] [12])
   (mem/c:SI (plus:SI (reg:SI 12 a4)
   (const_int 4 [0x4])) [0 D.1863+4 S4 A32])) 15 {movsi_load} (nil)
   (nil))

(insn 31 95 87 6 (parallel [
   (set (reg:DI 2 d2)
   (minus:DI (reg:DI 0 d0 [34])
   (reg:DI 12 a4)))
   (clobber (reg:CC 21 cc))
   ]) 33 {subdi3} (nil)
   (nil))

The setting of register d3 is actually done in insn 31: (set (reg:DI 2 d2)
Since this is in DI mode, it uses both d2 and d3. Similarly, d0
and a4 are accessed in DI mode, so d1 and a5 are also being used in this
insn. Hence the negation is proper.

Just as Mike pointed out, insn 95 tries to use [a4+4] but insn 94
has already clobbered a4.
The compiler should actually generate insns similar to insn 91 and 92
in between insn 94 and 95, but not using a4, or after saving a4. This
is not happening. Insns 90 to 94 are emitted only from the greg pass
onwards.

When I inserted the necessary assembly instructions corresponding to
movsi_short_const and addsi3 between insns 91 and 92 in the assembly
file, the program worked fine.

There is spill code for insn 31 at the beginning of the .greg
file, but I can't understand any of it.

Spilling for insn 31.
Using reg 2 for reload 2
Using reg 12 for reload 3
Using reg 13 for reload 0
Using reg 13 for reload 1

The same program works for the gcc 3.2 and gcc 3.4.6 ports of the same
private target.

I am not sure whether this is because of the reload pass or global
register allocation.

1. What could be the reason for this behavior?
2. How can this type of behavior be overcome?

Regards,
Shafi


Re: Abt long long support

2006-11-12 Thread Mohamed Shafi

First, thanks very much for your thoughts.



If those two instructions appear for the first time in the .greg dump
file, then they have been created by reload.


Yes they appear for the first time in .greg dump file.



> 1. What could be the reason for this behavior?

I'm really shooting in the dark here, but my guess is that you have a
define_expand for movdi that is not reload safe.  You can do this
operation correctly, you just have to reverse the instructions: load
a5 from (a4 + 4) before you load a4 from (a4).  See, e.g.,
mips_split_64bit_move in mips.c and note the use of
reg_overlap_mentioned_p.
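
The ordering problem described above can be sketched in plain C with a toy
register file. Everything here (regs, mem, load_di, word-indexed memory) is
hypothetical and only models the idea of checking for overlap before choosing
the order of the two word loads; it is not how mips_split_64bit_move is
written.

```c
#include <stdint.h>

/* Toy register file and word-indexed memory; all names are hypothetical. */
enum { NREGS = 16, MEMWORDS = 64 };
static uint32_t regs[NREGS];
static uint32_t mem[MEMWORDS];

/* Load the two words at mem[regs[base]] and mem[regs[base] + 1] into the
   registers dest_lo and dest_hi. If dest_lo is also the base register,
   the high word is loaded first so that the address is still intact when
   it is needed for the second load. */
void load_di (int dest_lo, int dest_hi, int base)
{
  if (dest_lo == base)
    {
      regs[dest_hi] = mem[regs[base] + 1];
      regs[dest_lo] = mem[regs[base]];
    }
  else
    {
      regs[dest_lo] = mem[regs[base]];
      regs[dest_hi] = mem[regs[base] + 1];
    }
}
```

With a4 as both the base and the low destination (register 12 in the dumps
of this thread), the first branch corresponds to loading a5 from (a4 + 4)
before loading a4 from (a4).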


I have already mentioned earlier in this conversation that adddi3 and
subdi3 are the only DI mode patterns in the .md file. Then Rask
pointed out that the middle end will synthesize other patterns for DI
mode by looking at similar SI mode patterns in the backend.

As this is the case, am I to assume that the synthesized movdi pattern
is not safe for reload? Should I tweak the movsi pattern to correct
this issue, or should I write an explicit movdi pattern?

With this in mind, how come this worked fine in the gcc 3.4.6 port of the
target? Has the behavior of reload changed very much in gcc 4.1.1?

Regards,
Shafi


Re: Abt long long support

2006-11-13 Thread Mohamed Shafi

I'm really shooting in the dark here, but my guess is that you have a
define_expand for movdi that is not reload safe.  You can do this
operation correctly, you just have to reverse the instructions: load
a5 from (a4 + 4) before you load a4 from (a4).  See, e.g.,
mips_split_64bit_move in mips.c and note the use of
reg_overlap_mentioned_p.



Sir, the following is the part of the .lreg dump file which is
changed in the .greg file.

(insn 29 28 31 6 (set (subreg:SI (reg:DI 34) 4)
   (const_int 0 [0x0])) 17 {movsi_short_const} (nil)
   (nil))

(insn 31 29 32 6 (parallel [
   (set (reg:DI 28 [ D.1863 ])
   (minus:DI (reg:DI 34)
   (reg:DI 28 [ D.1863 ])))
   (clobber (reg:CC 21 cc))
   ]) 33 {subdi3} (nil)
   (expr_list:REG_UNUSED (reg:CC 21 cc)
   (expr_list:REG_DEAD (reg:DI 34)
   (expr_list:REG_UNUSED (reg:CC 21 cc)
   (nil)

In the greg pass some instructions are inserted between insns 29 and
31. These instructions are inserted by reload. In the .greg file there is
spill code for insn 31, which is given below

Reloads for insn # 31
Reload 0: reload_in (SI) = (plus:SI (reg/f:SI 14 a6)
   (const_int -16 [0xfff0]))
   ADDR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine
   reload_in_reg: (plus:SI (reg/f:SI 14 a6)
   (const_int -16 [0xfff0]))
   reload_reg_rtx: (reg:SI 13 a5)
Reload 1: reload_in (SI) = (plus:SI (reg/f:SI 14 a6)
   (const_int -16 [0xfff0]))
   ADDR_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 2), can't combine
   reload_in_reg: (plus:SI (reg/f:SI 14 a6)
   (const_int -16 [0xfff0]))
   reload_reg_rtx: (reg:SI 12 a4)
Reload 2: reload_out (DI) = (mem/c:DI (plus:SI (reg/f:SI 14 a6)
   (const_int -16 [0xfff0])) [0 D.1863+0 S8 A32])
   GENERAL_REGS, RELOAD_OTHER (opnum = 0)
   reload_out_reg: (reg:DI 28 [ D.1863 ])
   reload_reg_rtx: (reg:DI 2 d2)
Reload 3: reload_in (DI) = (mem/c:DI (plus:SI (reg/f:SI 14 a6)
   (const_int -16 [0xfff0])) [0 D.1863+0 S8 A32])
   GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 2), can't combine
   reload_in_reg: (reg:DI 28 [ D.1863 ])
   reload_reg_rtx: (reg:DI 12 a4)

I didn't understand what this means.

The following pattern is from .greg file

(insn 29 28 90 6 (set (reg:SI 1 d1 [orig:34+4 ] [34])
   (const_int 0 [0x0])) 17 {movsi_short_const} (nil)
   (nil))

(insn 90 29 91 6 (set (reg:SI 12 a4)
   (const_int -16 [0xfff0])) 17 {movsi_short_const} (nil)
   (nil))

(insn 91 90 94 6 (parallel [
   (set (reg:SI 12 a4)
   (plus:SI (reg:SI 12 a4)
   (reg/f:SI 14 a6)))
   (clobber (reg:CC 21 cc))
   ]) 29 {addsi3} (nil)
   (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6)
   (const_int -16 [0xfff0]))
   (nil)))

(insn 94 91 95 6 (set (reg:SI 12 a4)
   (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S4 A32])) 15 {movsi_load} (nil)
   (nil))

(insn 95 94 31 6 (set (reg:SI 13 a5 [orig:12+4 ] [12])
   (mem/c:SI (plus:SI (reg:SI 12 a4)
   (const_int 4 [0x4])) [0 D.1863+4 S4 A32])) 15 {movsi_load} (nil)
   (nil))

(insn 31 95 87 6 (parallel [
   (set (reg:DI 2 d2)
   (minus:DI (reg:DI 0 d0 [34])
   (reg:DI 12 a4)))
   (clobber (reg:CC 21 cc))
   ]) 33 {subdi3} (nil)
   (nil))


As you can see, insns 90, 91, 94 and 95 are inserted in this pass, and
the code goes wrong in insns 94/95.

Why are these insns inserted in between?
With only the subdi3 and adddi3 patterns available in the .md file, and no
other define_split, define_insn or define_expand for DI mode, how
can I control the instructions generated by reload?

Regards,
Shafi


Re: poisened macro definitions

2006-12-06 Thread Mohamed Shafi

From: Markus Franke <[EMAIL PROTECTED]>
To: gcc@gcc.gnu.org
Date: Tue, 05 Dec 2006 21:37:30 +0100
Subject: poisened macro definitions
Dear GCC Developers,

I want to port an existing backend (based on version gcc-2.7.2.3) to the
most recent release (gcc-4.1.1). During the compilation process I get
several messages about some poisoned macro definitions. The macros which
cause problems are listed below:

---snip---

--snap---

I read something about poisoned macros and that they shouldn't be used
anymore. But in fact I was not able to find any documentation about
these macros. When were they declared as poisoned, and especially why?
What should be done instead of using these macros? Just commenting out
everything can't be a solution. I was also looking in the GCC Internals
manual without any success.



Most of the target macros in older versions of gcc have been converted
into target hooks.
The macros which have been converted are now poisoned.
So you will have to replace the macros with the corresponding target hooks.
Some macros will be merged into one target hook, while others will each be
replaced with a target hook. You will have to look into the internals of
4.1.1 to find them.

The following messages should help you out:
http://gcc.gnu.org/ml/gcc/2006-08/msg00451.html

http://gcc.gnu.org/ml/gcc-help/2006-08/msg00213.html

Hope this helps.

Regards,
Shafi


Defining cmd line symbolic literals

2006-12-17 Thread Mohamed Shafi

Hello all,

I am building a GCC compiler. I have some ifdef checks in the compiler
source code. If I define a symbolic literal on the command line while
compiling a sample program, I want the set of statements guarded by the
ifdef checks to be executed.

e.g.
GCC Source:
#ifdef SHAFI_DEBUG
printf("\n Shafi Debugging!!\n");
#endif

compiling 1.c:
gcc -DSHAFI_DEBUG 1.c

Is there any way to do this ?

Thanks in advance

Regards,
Shafi


Arithmetic conversions between two different data types

2007-01-31 Thread Mohamed Shafi

Hello all,

In arithmetic expressions we need a conversion when the operands are
of different data types.
In gcc 4.1.1, where does this process start?

Is this in c-typeck.c, particularly in the function c_common_type?


Thanks in advance,

Regards,
Shafi.


Strange behavior for scan-tree-dump testing

2007-02-25 Thread Mohamed Shafi

Hello all,

I added a few testcases to the existing testsuite in gcc 4.1.1 for a
private target.
After running the testsuite I found that all my test cases with
scan-tree-dump testing failed in one particular situation.
The values are scanned from the gimple tree dump, and it fails for cases like

b4 = 6.3e+1
c1 = 1.345286102294921875e+0

but it was not failing for other values in the same tree dump which
has values like

some_identifier = x.xe-1
some_identifier = x.xe0

The failures occur only when the tree dump values are positive and
represented in the above format. I checked the tree dumps manually and
found that all the values are proper and the scan lines in the
test cases are also proper. This is the way I used them:

/* { dg-final { scan-tree-dump "b4 = 6.3e+1" "gimple" } } */

Why does this happen? For positive values, should I be writing it in
some other way?

One other question: I am getting "test for excess errors" failures
for some cases which produce a lot of warnings but are otherwise proper.

Can anyone help me?

Thanks in advance.

Regards,
Shafi.
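
One plausible explanation, assuming the scan-tree-dump pattern is treated as
a regular expression rather than a literal string, is that the "+" in
"6.3e+1" acts as a repetition operator, while the "-" in "x.xe-1" is an
ordinary character. The following C sketch illustrates this with POSIX
extended regular expressions from regex.h. The testsuite itself uses a
different regexp engine, so this is an analogy rather than a reproduction,
and regex_matches is a hypothetical helper name.

```c
#include <stddef.h>
#include <regex.h>

/* Returns 1 if the POSIX extended regular expression `pattern` matches
   anywhere in `text`, 0 otherwise (including when it fails to compile). */
int regex_matches (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  int found = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}
```

With this helper, the pattern "b4 = 6.3e+1" does not match the text
"b4 = 6.3e+1", because "e+" asks for one or more "e" characters followed
directly by "1". Escaping the metacharacters, as in the C string
"b4 = 6\\.3e\\+1", does match, which suggests that escaping the "+" in the
dg-final directive may be worth trying.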


Providing default option to GAS through gcc driver

2007-02-28 Thread Mohamed Shafi

Hello all,

I would like to know if there is any way to make the gcc driver
program pass a default option to the assembler if no option/switch
is given, say
-march=arch1
if no -march option is provided by the user.

The macro TARGET_OPTION_TRANSLATE_TABLE does something like this, but
with it one can only override an option; it cannot provide one if
none is given.

Is there any way to do this?

Regards,
Shafi.


What to do when constraint doesn't match

2007-03-14 Thread Mohamed Shafi

Hello all,

Looking at the internals, I couldn't find an answer to my problem.

I have a define_expand with the pattern name mov and a
define_insn mov_store.
The predicate in the define_expand is general_operand, so that all
operands are matched.
In the define_insn, however, I have a predicate which allows only two
classes of registers, say 'a' and 'b', but the constraint for the
define_insn allows only registers of class 'b'.

I also have a pattern for a register move from 'a' to 'b'; call it
mova2b. So if the constraint of the mov_store define_insn
isn't satisfied, why is the compiler not trying to satisfy the
constraint by generating a mova2b pattern? Is there something
that I am missing here?

Regards,
Shafi


peephole patterns are not matching

2007-04-12 Thread Mohamed Shafi

hello everyone,

I have the following 2 patterns, which are consecutive (from the shorten
RTL dump file).

(insn 69 34 70 (set (reg:SQ 0 d0)
   (reg:SQ 18 f2)) 79 {movsq} (nil)
   (nil))

(insn 70 69 35 (set (reg:SQ 16 f0 [orig:38 D.3693 ] [38])
   (reg:SQ 0 d0)) 79 {movsq} (nil)
   (nil))


For the above pattern I wrote a peephole like this

(define_peephole
 [(set (match_operand:SF 0 "data_reg" "=d")
  (match_operand:SF 1 "float_reg"  "f"))
   (set (match_operand:SF 2 "float_reg" "=f")
  (match_operand:SF 3 "data_reg"  "d"))]
 "REGNO(operands[0]) == REGNO(operands[3])"
 "movf\\t%1, %3"
)

I even wrote a define_peephole2 which is similar to the above.
But the above patterns are not matched at all, even though I can find
these patterns in the RTL dumps.

What could be the reason for this behavior?

Regards,
Shafi


Re: peephole patterns are not matching

2007-04-12 Thread Mohamed Shafi

On 4/12/07, Andreas Schwab <[EMAIL PROTECTED]> wrote:

"Mohamed Shafi" <[EMAIL PROTECTED]> writes:

> hello everyone,
>
> I have the following 2 patterns which are consecutive. (from shorten
> rtl dump file)
>
> (insn 69 34 70 (set (reg:SQ 0 d0)
>(reg:SQ 18 f2)) 79 {movsq} (nil)
>(nil))
>
> (insn 70 69 35 (set (reg:SQ 16 f0 [orig:38 D.3693 ] [38])
>(reg:SQ 0 d0)) 79 {movsq} (nil)
>(nil))
>
>
> For the above pattern i wrote a peephole like this
>
> (define_peephole
>  [(set (match_operand:SF 0 "data_reg" "=d")
> (match_operand:SF 1 "float_reg"  "f"))
>(set (match_operand:SF 2 "float_reg" "=f")
> (match_operand:SF 3 "data_reg"  "d"))]

The patterns match mode SF, but the insns have mode SQ.


Sorry, actually the patterns are like this

(insn 69 34 70 (set (reg:SF 0 d0)
   (reg:SF 18 f2)) 79 {movsf} (nil)
   (nil))

(insn 70 69 35 (set (reg:SF 16 f0 [orig:38 D.3693 ] [38])
   (reg:SF 0 d0)) 79 {movsf} (nil)
   (nil))

and the peephole is the same as above.



Andreas.

--
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



How to control the offset for stack operation?

2007-04-16 Thread Mohamed Shafi

hello all,

Depending on the machine mode, the compiler automatically generates
the offset required for the stack operation, i.e. for a machine whose
word size is 32, for char type the offset is 1, for int type the
offset is 2 and so on.

Is there a way to control this? I mean, say for long long the offset
is 4 if long long is mapped to TI mode, and I want to generate the
offset such that it is 2.

Is there a way to do this in gcc?

Regards,
Shafi


Re: How to control the offset for stack operation?

2007-04-16 Thread Mohamed Shafi

On 4/16/07, J.C. Pizarro <[EMAIL PROTECTED]> wrote:

2007/4/16, Mohamed Shafi <[EMAIL PROTECTED]>:
> hello all,
>
> Depending on the machine mode the compiler will generate automatically
> the offset required for the stack operation i.e for a machine with
> word size is 32, for char type the offset is 1, for int type the
> offset is 2 and so on..
>
> Is there a way to control this ? i mean say for long long the offset
> is 4 if long long is mapped to TI mode and i want the generate the
> offset such that it is 2.
>
> Is there a way to do this in gcc ?
>
> Regards,
> Shafi
>

For a x86 machine, the stack's offset always is multiple of 4 bytes.

long long is NOT 4 bytes, is 8 bytes!


I was not talking about the size of long long but about the offset,
i.e. 4x32, required for the stack operation.
I want GCC to generate the code such that the offset is 2 (64 bits)
and not 4 (128 bits).

Is there a way to do this?



Sincerely J.C. Pizarro :)



Re: How to control the offset for stack operation?

2007-04-16 Thread Mohamed Shafi

On 4/16/07, J.C. Pizarro <[EMAIL PROTECTED]> wrote:

2007/4/16, Mohamed Shafi <[EMAIL PROTECTED]> wrote:
> > > Depending on the machine mode the compiler will generate automatically
> > > the offset required for the stack operation i.e for a machine with
> > > word size is 32, for char type the offset is 1, for int type the
> > > offset is 2 and so on..
>
>I was not talking about the size of long long but the offset i.e
> 4x32, required for stack operation.
> I want gcc to generate the code such that the offset is 2 (64
> bytes)and not 4 (128 bytes)
>

Offset in bytes? Offset in 32-bit words?
Please define offset; this is confusing.


Offset in 32-bit words.
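
To pin down the units being discussed, here is a small sketch, assuming a 32-bit word and an 8-bit byte (the helper names are made up for illustration and are not GCC functions). Under those assumptions an offset of 2 words is 64 bits, or 8 bytes, and an offset of 4 words is 128 bits, or 16 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions for illustration only: 32-bit words, 8-bit bytes.  */
enum { BITS_PER_WORD_ASSUMED = 32, BITS_PER_BYTE_ASSUMED = 8 };

/* Convert a stack offset counted in 32-bit words to bits.  */
static size_t word_offset_to_bits (size_t words)
{
  /* Guard against overflow of the multiplication below.  */
  assert (words <= (size_t) -1 / BITS_PER_WORD_ASSUMED);
  return words * BITS_PER_WORD_ASSUMED;
}

/* Convert a stack offset counted in 32-bit words to bytes.  */
static size_t word_offset_to_bytes (size_t words)
{
  return word_offset_to_bits (words) / BITS_PER_BYTE_ASSUMED;
}
```

Calling word_offset_to_bits (2) and word_offset_to_bytes (2) gives the two views of the same offset, which is where the bits/bytes mix-up in this thread comes from.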



J.C. Pizarro



ICE in get_constraint_for_component_ref

2011-02-09 Thread Mohamed Shafi
Hi all,

I am trying to port a private target in GCC 4.5.1. Following are the
properties of the target

#define BITS_PER_UNIT         32
#define BITS_PER_WORD         32
#define UNITS_PER_WORD        1


#define CHAR_TYPE_SIZE        32
#define SHORT_TYPE_SIZE       32
#define INT_TYPE_SIZE         32
#define LONG_TYPE_SIZE        32
#define LONG_LONG_TYPE_SIZE   32



I am getting an ICE
internal compiler error: in get_constraint_for_component_ref, at
tree-ssa-structalias.c:3031

For the following testcase:

struct fb_cmap {
 int start;
 int len;
 int *green;
};

extern struct fb_cmap fb_cmap;

void directcolor_update_cmap(void)
{
  fb_cmap.green[0] = 34;
}

The following is the output of debug_tree for the argument that is passed
to the function get_constraint_for_component_ref:


unit size 
align 32 symtab 0 alias set -1 canonical type
0x2b6a4554a498 precision 32 min  max 
pointer_to_this >
unsigned PQI size  unit size

align 32 symtab 0 alias set -1 canonical type 0x2b6a45559930>

arg 0 
unit size 
align 32 symtab 0 alias set -1 canonical type
0x2b6a45602888 fields  context

chain >
used public external common BLK file pr28675.c line 7 col 23
size  unit size 
align 32
chain 
public static QI file pr28675.c line 9 col 6 align 32
initial  result 
(mem:QI (symbol_ref:PQI ("directcolor_update_cmap") [flags
0x3] ) [0 S1
A32])
struct-function 0x2b6a455453f0>>
arg 1 
unsigned PQI file pr28675.c line 4 col 7 size  unit size 
align 32 offset_align 32
offset 
bit offset  context
>
pr28675.c:11:10>

I was wondering if this ICE is due to the fact that this is a 32-bit
char target? Can somebody give me pointers on how to debug this issue?

Regards,
Shafi


Re: ICE in get_constraint_for_component_ref

2011-02-10 Thread Mohamed Shafi
On 10 February 2011 15:57, Richard Guenther  wrote:
> On Thu, Feb 10, 2011 at 6:23 AM, Mohamed Shafi  wrote:
>> Hi all,
>>
>> I am trying to port a private target in GCC 4.5.1. Following are the
>> properties of the target
>>
>> #define BITS_PER_UNIT           32
>> #define BITS_PER_WORD        32
>> #define UNITS_PER_WORD       1
>>
>>
>> #define CHAR_TYPE_SIZE        32
>> #define SHORT_TYPE_SIZE       32
>> #define INT_TYPE_SIZE         32
>> #define LONG_TYPE_SIZE        32
>> #define LONG_LONG_TYPE_SIZE   32
>>
>>
>>
>> I am getting an ICE
>> internal compiler error: in get_constraint_for_component_ref, at
>> tree-ssa-structalias.c:3031
>>
>> For the following testcase:
>>
>> struct fb_cmap {
>>  int start;
>>  int len;
>>  int *green;
>> };
>>
>> extern struct fb_cmap fb_cmap;
>>
>> void directcolor_update_cmap(void)
>> {
>>  fb_cmap.green[0] = 34;
>> }
>>
>> The following is the output of debug_tree of the argument thats given
>> for the function get_constraint_for_component_ref
>>
>> >    type >        type >            size 
>>            unit size 
>>            align 32 symtab 0 alias set -1 canonical type
>> 0x2b6a4554a498 precision 32 min > -2147483648> max 
>>            pointer_to_this >
>>        unsigned PQI size  unit size
>> 
>>        align 32 symtab 0 alias set -1 canonical type 0x2b6a45559930>
>>
>>    arg 0 >        type >            size 
>>            unit size 
>>            align 32 symtab 0 alias set -1 canonical type
>> 0x2b6a45602888 fields  context
>> 
>>            chain >
>>        used public external common BLK file pr28675.c line 7 col 23
>> size  unit size > 0x2b6a455fc488 3>
>>        align 32
>>        chain > type 
>>            public static QI file pr28675.c line 9 col 6 align 32
>> initial  result > D.1200>
>>            (mem:QI (symbol_ref:PQI ("directcolor_update_cmap") [flags
>> 0x3] ) [0 S1
>> A32])
>>            struct-function 0x2b6a455453f0>>
>>    arg 1 
>>        unsigned PQI file pr28675.c line 4 col 7 size > 0x2b6a4553c460 32> unit size 
>>        align 32 offset_align 32
>>        offset 
>>        bit offset  context
>> >
>>    pr28675.c:11:10>
>>
>> I was wondering if this ICE is due to the fact that this is a 32bit
>> char target ? Can somebody help me with pointers to debug this issue?
>
> Try fixing the * 8 in bitpos_of_field to use BITS_PER_UNIT.
>

That did the trick. Looking at the code, I assume that this is the proper
fix and hence should be committed to the trunk and the 4.5 branch.  Will that
be done?
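
For anyone following the thread, here is a minimal standalone sketch of why a hard-coded 8 goes wrong on this target. It is a model, not the GCC source: the function names are hypothetical, and it assumes the field `green` sits at an offset of 2 storage units with a bit offset of 0, which is what the three 32-bit fields of the test case suggest.

```c
#include <assert.h>

/* Standalone model, not GCC source: the bit position of a field is its
   offset in storage units times the unit width, plus a bit offset.  */
static long field_bitpos (long unit_offset, long bit_offset,
                          long bits_per_unit)
{
  assert (unit_offset >= 0 && bit_offset >= 0 && bits_per_unit > 0);
  return unit_offset * bits_per_unit + bit_offset;
}

/* Same computation with the unit width hard-coded to 8, which is only
   right when a storage unit is an 8-bit byte.  */
static long field_bitpos_assuming_8 (long unit_offset, long bit_offset)
{
  return field_bitpos (unit_offset, bit_offset, 8);
}
```

With a 32-bit unit, field_bitpos (2, 0, 32) places the field at bit 64, while the hard-coded variant places it at bit 16, so any later consistency check on field positions and sizes can fail.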

Shafi


Re: ICE in get_constraint_for_component_ref

2011-02-10 Thread Mohamed Shafi
On 10 February 2011 17:16, Richard Guenther  wrote:
> On Thu, Feb 10, 2011 at 12:42 PM, Mohamed Shafi  wrote:
>> On 10 February 2011 15:57, Richard Guenther  
>> wrote:
>>> On Thu, Feb 10, 2011 at 6:23 AM, Mohamed Shafi  wrote:
>>>> Hi all,
>>>>
>>>> I am trying to port a private target in GCC 4.5.1. Following are the
>>>> properties of the target
>>>>
>>>> #define BITS_PER_UNIT           32
>>>> #define BITS_PER_WORD        32
>>>> #define UNITS_PER_WORD       1
>>>>
>>>>
>>>> #define CHAR_TYPE_SIZE        32
>>>> #define SHORT_TYPE_SIZE       32
>>>> #define INT_TYPE_SIZE         32
>>>> #define LONG_TYPE_SIZE        32
>>>> #define LONG_LONG_TYPE_SIZE   32
>>>>
>>>>
>>>>
>>>> I am getting an ICE
>>>> internal compiler error: in get_constraint_for_component_ref, at
>>>> tree-ssa-structalias.c:3031
>>>>
>>>> For the following testcase:
>>>>
>>>> struct fb_cmap {
>>>>  int start;
>>>>  int len;
>>>>  int *green;
>>>> };
>>>>
>>>> extern struct fb_cmap fb_cmap;
>>>>
>>>> void directcolor_update_cmap(void)
>>>> {
>>>>  fb_cmap.green[0] = 34;
>>>> }
>>>>
>>>> The following is the output of debug_tree of the argument thats given
>>>> for the function get_constraint_for_component_ref
>>>>
>>>> >>>    type >>>        type >>>            size 
>>>>            unit size 
>>>>            align 32 symtab 0 alias set -1 canonical type
>>>> 0x2b6a4554a498 precision 32 min >>> -2147483648> max 
>>>>            pointer_to_this >
>>>>        unsigned PQI size  unit size
>>>> 
>>>>        align 32 symtab 0 alias set -1 canonical type 0x2b6a45559930>
>>>>
>>>>    arg 0 >>>        type >>>            size 
>>>>            unit size 
>>>>            align 32 symtab 0 alias set -1 canonical type
>>>> 0x2b6a45602888 fields  context
>>>> 
>>>>            chain >
>>>>        used public external common BLK file pr28675.c line 7 col 23
>>>> size  unit size >>> 0x2b6a455fc488 3>
>>>>        align 32
>>>>        chain >>> type 
>>>>            public static QI file pr28675.c line 9 col 6 align 32
>>>> initial  result >>> D.1200>
>>>>            (mem:QI (symbol_ref:PQI ("directcolor_update_cmap") [flags
>>>> 0x3] ) [0 S1
>>>> A32])
>>>>            struct-function 0x2b6a455453f0>>
>>>>    arg 1 >>> 0x2b6a45559930>
>>>>        unsigned PQI file pr28675.c line 4 col 7 size >>> 0x2b6a4553c460 32> unit size 
>>>>        align 32 offset_align 32
>>>>        offset 
>>>>        bit offset  context
>>>> >
>>>>    pr28675.c:11:10>
>>>>
>>>> I was wondering if this ICE is due to the fact that this is a 32bit
>>>> char target ? Can somebody help me with pointers to debug this issue?
>>>
>>> Try fixing the * 8 in bitpos_of_field to use BITS_PER_UNIT.
>>>
>>
>> That did the trick. Looking at the code i assume that this is proper
>> and hence should be committed in the trunk and 4.5 branch.  Will that
>> be done?
>
> I'll include it in one of my next bootstraps/tests and commit it.
>
Thanks Richard :)

Shafi


Reloading an auto-increment addresses

2011-02-11 Thread Mohamed Shafi
Hello all,

I am porting GCC 4.5.1 for a private target. For one particular test the
reload pass is being asked to reload the following instruction:

(insn 45 175 46 11 pr20601-1.c:90 (set (reg/f:PQI 3 g3 [70])
(mem/f:PQI (pre_inc:PQI (reg/f:PQI 1 g1 [orig:55 prephitmp.16
] [55])) [2 S1 A32])) 9 {movpqi_op} (expr_list:REG_INC (reg/f:PQI 1 g1
[orig:55 prephitmp.16 ] [55])
(nil)))

The address in this instruction is invalid: the base address should
always be held in an address register. This instruction gets reloaded
in the following manner:

(insn 175 43 202 11 pr20601-1.c:90 (set (reg/f:PQI 1 g1 [orig:55
prephitmp.16 ] [55])
(reg/f:PQI 12 as0 [orig:49 e.4 ] [49])) 9 {movpqi_op} (nil))

(insn 202 175 203 11 pr20601-1.c:90 (set (reg/f:PQI 1 g1 [orig:55
prephitmp.16 ] [55])
(plus:PQI (reg/f:PQI 1 g1 [orig:55 prephitmp.16 ] [55])
(const_int 1 [0x1]))) 14 {addpqi3} (nil))

(insn 203 202 45 11 pr20601-1.c:90 (set (reg:PQI 28 a0)
(reg/f:PQI 1 g1 [orig:55 prephitmp.16 ] [55])) 9 {movpqi_op} (nil))

(insn 45 203 46 11 pr20601-1.c:90 (set (reg/f:PQI 3 g3 [70])
(mem/f:PQI (reg:PQI 28 a0) [2 S1 A32])) 9 {movpqi_op} (nil))

The issue with this reload is that there is no move operation between
GP registers and address registers, so insn 203 is invalid. I am
catching these cases with secondary reloads, but auto-increment
addressing modes are not handled there. If I try to handle them in
TARGET_SECONDARY_RELOAD, I get an assert failure from
reload1.c:emit_input_reload_insns() due to the following code:

  /* Auto-increment addresses must be reloaded in a special way.  */
  if (rl->out && ! rl->out_reg)
    {
      /* We are not going to bother supporting the case where a
         incremented register can't be copied directly from
         OLDEQUIV since this seems highly unlikely.  */
      gcc_assert (rl->secondary_in_reload < 0);

How can I overcome this failure?  Can someone suggest a solution?

Thanks for the help.

Regards,
Shafi


Re: Reloading an auto-increment addresses

2011-02-11 Thread Mohamed Shafi
On 11 February 2011 15:28, Paulo J. Matos  wrote:
>
>
> On 11/02/11 09:46, Mohamed Shafi wrote:
>>
>> How can i overcome this failure?  Can some one suggest a solution?
>>
>
>
> Have you defined TARGET_LEGITIMATE_ADDRESS_P and also BASE_REG_CLASS
> correctly for your target?
>
>

Yes, I have. The register allocator is allocating the wrong registers for
the base registers. This is probably because address registers cannot
be saved and restored directly; a secondary reload is required. There
is also the restriction that there is no move operation between the
address registers, which also needs a secondary reload (I know it's
weird). I am trying to figure out why the register allocator is not
assigning a base register. But even then, reload could be asked to
reload an auto-increment address.

Shafi


How to generate loop counter with a different mode ?

2011-05-16 Thread Mohamed Shafi
Hi all,

I am trying to add support for hardware loops for a 32-bit target. In
this target QImode is 32 bits. The loop counter used in the hardware loop
construct lives in a 17-bit address register, which is represented using
PQImode. Since the mode for the doloop pattern is determined after loop
discovery, it need not always be PQImode. So what I did was to convert
the mode of the counter variable to PQImode and then emit a new
pattern with PQImode, along with the other bells and whistles required by
the target loop construct. I am able to generate the assembly files
with the proper loop initialization instructions and all. But the
issue is that the loop counter is set to 0 in the body of the loop.

In define_expand (in doloop_end and doloop_begin) I am converting to
PQImode using the following construct:

operands[0] = convert_to_mode (PQImode, operands[0], 0);

So the above construct will result in an rtl pattern like:

(insn 33 17 34 4 loop.c:52 (set (reg:PQI 50)
(truncate:PQI (reg:QI 49))) -1 (nil))

But GCC will extract the loop counter from the define_expand generated
doloop pattern, which is in PQImode.

(insn 33 17 34 4 loop.c:52 (set (reg:PQI 50)
(truncate:PQI (reg:QI 49))) -1 (nil))

(jump_insn 34 33 20 4 loop.c:52 (parallel [
(set (pc)
(if_then_else (ne (reg:PQI 50)
(const_int 1 [0x1]))
(label_ref:PQI 30)
(pc)))
(set (reg:PQI 50)
(plus:PQI (reg:PQI 50)
(const_int -1 [0x])))
(unspec [
(const_int 0 [0x0])
] 3)
(clobber (scratch:PQI))
]) 62 {doloop_end_pqi} (expr_list:REG_BR_PROB (const_int 9100 [0x238c])
(nil))
 -> 30)


This is the counter value that gets used for doloop_begin. Hence the
original loop counter (reg:QI 49) never gets initialized. Because of
this, the 'if-conversion' pass modifies the statement to:

(insn 33 38 34 4 loop.c:52 (set (reg:PQI 50)
(const_int 0 [0x0])) 9 {movpqi_op} (nil))

This results in the loop counter being set to 0 in the body of the loop.
Can someone suggest a solution to get out of this?

Regards,
Shafi


Issue with delay slot scheduling?

2011-09-06 Thread Mohamed Shafi
Hi,

I am doing a private port in GCC 4.5.1. For my target I see some
strange behavior in delay slot scheduling. For my target the
instructions in the delay slots get executed irrespective of whether
the branch is taken or not. I have generated the following code after
commenting out the call to 'relax_delay_slots' in the function
'dbr_schedule'.

RTL:

(insn 97 42 51 del1.c:19 (sequence [
    (jump_insn 61 42 38 del1.c:19 (set (pc)
    (if_then_else (ne (reg:CCF 34 CC)
    (const_int 0 [0x0]))
    (label_ref:PQI 86)
    (pc))) 56 {conditional_branch}
(expr_list:REG_BR_PRED (const_int 5 [0x5])
    (expr_list:REG_DEAD (reg:CCF 34 CC)
    (expr_list:REG_BR_PROB (const_int 5000 [0x1388])
    (nil))))
 -> 86)
    (insn 38 61 43 (set (mem/s/j:QI (reg/f:PQI 28 a0 [orig:62
D.1955 ] [62]) [0 bytes S1 A32])
    (reg:QI 1 g1 [orig:65 D.1938 ] [65])) 7 {movqi_op} (nil))
    (insn 43 38 51 (set (reg:QI 1 g1 [75])
    (ior:QI (reg:QI 1 g1 [orig:65 D.1938 ] [65])
    (reg:QI 3 g3 [77]))) 31 {iorqi3}
(expr_list:REG_EQUAL (ior:QI (reg:QI 1 g1 [orig:65 D.1938 ] [65])
    (const_int 128 [0x80]))
    (nil)))
    ]) -1 (nil))

(code_label 51 97 52 1 "" [2 uses])

(note 52 51 73 [bb 4] NOTE_INSN_BASIC_BLOCK)

(jump_insn 73 52 72 (return) 72 {return_rts} (expr_list:REG_BR_PRED
(const_int 12 [0xc])
    (nil)))

(barrier 72 73 86)

(code_label 86 72 41 5 "" [1 uses])

(note 41 86 45 [bb 5] NOTE_INSN_BASIC_BLOCK)

(insn 45 41 44 del1.c:20 (set (reg:QI 2 g2 [orig:68 ivtmp.7 ] [68])
    (plus:QI (reg:QI 2 g2 [orig:68 ivtmp.7 ] [68])
    (const_int 1 [0x1]))) 13 {addqi3} (nil))

(insn 44 45 101 del1.c:20 (set (mem/s/j:QI (reg/f:PQI 28 a0 [orig:62
D.1955 ] [62]) [0 bytes S1 A32])
    (reg:QI 1 g1 [75])) 7 {movqi_op} (expr_list:REG_DEAD
(reg/f:PQI 28 a0 [orig:62 D.1955 ] [62])
    (expr_list:REG_DEAD (reg:QI 1 g1 [75])
    (nil))))

(code_label 101 44 79 7 "" [1 uses])


Corresponding code:

jmp.ne  .L5;
st  [a0], g1; (INSN 38)
or  g1, g1, g3;  (INSN 43)
.L1:
rts;
nop;
nop;
.L5:
add   g2, g2, 1;   (INSN 45)
st  [a0], g1;(INSN 44)  -> deleted
.L7:



You can see that INSN 44 and INSN 38 are identical. In
'relax_delay_slots', while processing INSN 97, the second call to
'try_merge_delay_insns' deletes INSN 44, which produces an
unexpected result.

  /* If we own the thread opposite the way this insn branches, see if we
     can merge its delay slots with following insns.  */
  if (INSN_FROM_TARGET_P (XVECEXP (pat, 0, 1))
      && own_thread_p (NEXT_INSN (insn), 0, 1))
    try_merge_delay_insns (insn, next);
  else if (! INSN_FROM_TARGET_P (XVECEXP (pat, 0, 1))
           && own_thread_p (target_label, target_label, 0))
    try_merge_delay_insns (insn, next_active_insn (target_label));

Deleting INSN 44 would have been proper if the 2nd delay slot insn
had not modified g1. But look at the comments for the function
'try_merge_delay_insns':

/* Try merging insns starting at THREAD which match exactly the insns in
   INSN's delay list.

   If all insns were matched and the insn was previously annulling, the
   annul bit will be cleared.

   For each insn that is merged, if the branch is or will be non-annulling,
   we delete the merged insn.  */

I think the REGOUT dependency on g1 between instructions 38 and 43 in the
delay slots is not being considered by 'try_merge_delay_insns'.

Is this a bug?
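
To make the effect concrete, here is a toy model of the taken-branch path in plain C. It is only a sketch of the instruction sequence from the listing in this mail, with hypothetical function names; it is not reorg.c code. It shows that once INSN 43 has changed g1, the store at .L5 writes a different value than the store in the delay slot, so the two stores are not interchangeable.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the taken-branch path from the listing above.  The
   delay-slot insns always execute; KEEP_INSN_44 says whether the store
   at .L5 is still present.  Returns the final value stored at [a0].  */
static unsigned run_taken_path (unsigned g1, unsigned g3, bool keep_insn_44)
{
  unsigned mem_a0;

  mem_a0 = g1;          /* INSN 38: st [a0], g1 (first delay slot)    */
  g1 = g1 | g3;         /* INSN 43: or g1, g1, g3 (second delay slot) */
  /* INSN 45 (add g2, g2, 1) does not touch g1 or [a0]; omitted.  */
  if (keep_insn_44)
    mem_a0 = g1;        /* INSN 44: st [a0], g1 with the updated g1   */
  return mem_a0;
}

/* The two outcomes differ whenever g3 sets a bit that g1 lacks.  */
static void check_example (void)
{
  assert (run_taken_path (0x01, 0x80, true) == 0x81);
  assert (run_taken_path (0x01, 0x80, false) == 0x01);
}
```

If g3 sets no new bits in g1, both variants agree, which may be why a problem like this can go unnoticed for a long time.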

Regards,
Shafi


Re: Issue with delay slot scheduling?

2011-09-06 Thread Mohamed Shafi
On 6 September 2011 20:50, Jeff Law  wrote:
>
> On 09/06/11 08:46, Mohamed Shafi wrote:
>> Hi,
>>
>> I am doing a private port in GCC 4.5.1. For the my target i see some
>> strange behavior in delay slot scheduling. For my target the
>> instruction in the delay slots gets executed irrespective of whether
>> the branch is taken or not. I have generated the following code
>> after commenting out the call to 'relax_delay_slots' in the function
>> 'dbr_schedule'.
> [ ... ]
> It looks like you have found a bug.  While reorg.c is supposed to work
> with targets that have multiple delay slots, it's not something that has
> been extensively tested.
>
>>>
>> I think REGOUT dependency of g1 between instructions 38 and 43 in
>> the delay slot is not being considered by 'try_merge_delay_insns'.
> You're probably correct.
>
> Jeff

How do I raise a bug report, mine being a private target?

Regards,
Shafi


Reloading going wrong. Bug in GCC?

2011-09-14 Thread Mohamed Shafi
Hi,

I am working on a 32-bit private target which has the following restrictions:

1. store/load can happen only through a general purpose register (GP_REGS)
2. base register should be an address register (AD_REGS)
3. moves between GP_REGS and AD_REGS can happen only through PT_REGS
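
The third restriction can be sketched as a small lookup, shown here as standalone C. The enum and function names are hypothetical and this is not the signature of any GCC hook; it also assumes that every pair of classes other than GP/AD has a direct move.

```c
#include <assert.h>

/* Hypothetical register classes mirroring the three restrictions.  */
enum reg_class_model { MODEL_NONE, MODEL_GP, MODEL_AD, MODEL_PT };

/* Return the class a value must pass through when copied from FROM to
   TO, or MODEL_NONE when a direct move is assumed to exist.  */
static enum reg_class_model
intermediate_class (enum reg_class_model from, enum reg_class_model to)
{
  assert (from != MODEL_NONE && to != MODEL_NONE);
  if ((from == MODEL_GP && to == MODEL_AD)
      || (from == MODEL_AD && to == MODEL_GP))
    return MODEL_PT;            /* restriction 3 */
  return MODEL_NONE;
}
```

A spilled base register therefore needs a chain of three moves on the way in (memory to GP, GP to PT, PT to AD) and the mirror image on the way out.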

In a PRE_MODIFY instruction, when both the base register and the output
register get spilled, the reloading goes wrong.

Before IRA pass
~~~
(insn 259 336 317 2 ../rld_bug.c:94 (set (reg:QI 234 [+1 ])
(mem/s/j/c:QI (pre_modify:PQI (reg/f:PQI 233)
(plus:PQI (reg/f:PQI 233)
(const_int 1 [0x1]))) [0+1 S1 A32])) 7 {movqi_op}
(expr_list:REG_INC (reg/f:PQI 233)
(nil)))

after IRA pass
~~~
Reloads for insn # 259
Reload 0: GP_REGS, RELOAD_FOR_OPADDR_ADDR (opnum = 1), can't combine,
secondary_reload_p
reload_reg_rtx: (reg:PQI 11 g11)
Reload 1: PT_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't
combine, secondary_reload_p
reload_reg_rtx: (reg:PQI 12 as0)
secondary_in_reload = 0
Reload 2: GP_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't
combine, secondary_reload_p
reload_reg_rtx: (reg:PQI 11 g11)
Reload 3: PT_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't
combine, secondary_reload_p
reload_reg_rtx: (reg:PQI 13 as1)
secondary_out_reload = 2

Reload 4: reload_in (PQI) = (reg/f:PQI 233)
reload_out (PQI) = (reg/f:PQI 233)
AD_REGS, RELOAD_OTHER (opnum = 1)
reload_in_reg: (reg/f:PQI 233)
reload_out_reg: (reg/f:PQI 233)
reload_reg_rtx: (reg:PQI 31 a3)
secondary_in_reload = 1, secondary_out_reload = 3

Reload 5: reload_out (QI) = (reg:QI 234 [+1 ])
GP_REGS, RELOAD_FOR_OUTPUT (opnum = 0)
reload_out_reg: (reg:QI 234 [+1 ])
reload_reg_rtx: (reg:QI 11 g11)


(insn 744 336 745 2 ../rld_bug.c:94 (set (reg:PQI 11 g11)
(mem/c:PQI (plus:PQI (reg/f:PQI 32 sp)
(const_int -24 [0xffe8])) [99 %sfp+8 S1
A32])) 9 {movpqi_op} (nil))

(insn 745 744 746 2 ../rld_bug.c:94 (set (reg:PQI 12 as0)
(reg:PQI 11 g11)) 9 {movpqi_op} (nil))

(insn 746 745 259 2 ../rld_bug.c:94 (set (reg:PQI 31 a3)
(reg:PQI 12 as0)) 9 {movpqi_op} (nil))

(insn 259 746 747 2 ../rld_bug.c:94 (set (reg:QI 11 g11)
(mem/s/j/c:QI (pre_modify:PQI (reg:PQI 31 a3)
(plus:PQI (reg:PQI 31 a3)
(const_int 1 [0x1]))) [0+1 S1 A32])) 7 {movqi_op}
(expr_list:REG_INC (reg:PQI 31 a3)
(nil)))

(insn 747 259 748 2 ../rld_bug.c:94 (set (reg:PQI 13 as1)
(reg:PQI 31 a3)) 9 {movpqi_op} (nil))

(insn 748 747 749 2 ../rld_bug.c:94 (set (reg:PQI 11 g11)
(reg:PQI 13 as1)) 9 {movpqi_op} (nil))

(insn 749 748 750 2 ../rld_bug.c:94 (set (mem/c:PQI (plus:PQI (reg/f:PQI 32 sp)
(const_int -24 [0xffe8])) [99 %sfp+8 S1 A32])
(reg:PQI 11 g11)) 9 {movpqi_op} (nil))

(insn 750 749 751 2 ../rld_bug.c:94 (set (mem/c:QI (plus:PQI (reg/f:PQI 32 sp)
(const_int -29 [0xffe3])) [99 %sfp+3 S1 A32])
(reg:QI 11 g11)) 7 {movqi_op} (nil))


After the IRA pass, for insn 259 the modified address is first stored
into its spill location and then the modified value is stored. As you
can see from the instructions, the same register (g11) is used for
Reload 5 and Reload 2, so the modified value gets corrupted and the
modified address is stored instead of the modified value (insn 749 and
insn 750). I am not able to figure out where this is going wrong in
the reload phase. I suspect that this is a GCC issue.

Can someone give me some pointers to resolve this issue?

Regards,
Shafi


Function argument passing

2009-07-13 Thread Mohamed Shafi
Hello all,

I am doing a port for a private target in GCC 4.4.0. It generates code
for both little & big endian.

The ABI for the target is as follows:

1. All arguments passed on the stack are passed using their alignment constraints.
Solution: For this to happen no argument promotion should be done.

2. Functions with a variable number of arguments pass the last fixed
argument and all subsequent variable arguments on the stack. Such
arguments of fewer than 4 bytes are located on the stack as if the
argument had been promoted to 32 bits.

Solution:
For TARGET_STRICT_ARGUMENT_NAMING the internals manual says the following:

This hook controls how the named argument to FUNCTION_ARG is set for
varargs and stdarg functions. If this hook returns true, the named
argument is always true for named arguments, and false for unnamed
arguments. If it returns false, but
TARGET_PRETEND_OUTGOING_VARARGS_NAMED returns true, then all arguments
are treated as named. Otherwise, all named arguments except the last
are treated as named.

So I made both TARGET_STRICT_ARGUMENT_NAMING and
TARGET_PRETEND_OUTGOING_VARARGS_NAMED return false. Is this correct?
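
The three cases in the quoted documentation can be written down as a small decision function. This is a standalone sketch of the documented rule with made-up parameter names, not code from GCC.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the documented rule: INDEX is the zero-based position of the
   argument, N_NAMED the number of named (fixed) parameters.  */
static bool treated_as_named (bool strict_naming, bool pretend_named,
                              int index, int n_named)
{
  assert (index >= 0 && n_named >= 0);
  if (strict_naming)
    return index < n_named;       /* named: true, unnamed: false */
  if (pretend_named)
    return true;                  /* every argument counts as named */
  return index < n_named - 1;     /* all named except the last */
}
```

For a function such as f (int a, int b, ...), having both hooks return false makes only `a` count as named, which lines up with an ABI that puts the last fixed argument and all variable arguments on the stack.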

How can I make the varargs arguments be promoted to 32 bits when the
normal arguments don't require promotion, as mentioned in point (1)?

3. A function returning a structure or union receives in D0 the
address of the returned structure or union. The caller allocates space
for the returned object.
Solution: Used TARGET_FUNCTION_VALUE and returned D0 reg_rtx for
structure and unions.

4. A long long return value is returned in R6 and R7, R6 containing
the most significant long word and R7 containing the least
significant long word, regardless of the endianness mode.
Solution: Used TARGET_RETURN_IN_MSB to return true when the mode is
little endian.

5. If the first argument is a long long, it is passed in R6 and R7,
R6 containing the most significant long word and R7 containing the
least significant long word, regardless of the endianness mode.
For the return value, I have done as mentioned in (4), but I am not sure
how to control the argument passing so that R6 contains the msw and R7
contains the lsw, regardless of the endianness mode.
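
What points (4) and (5) require of the register contents can be stated as a tiny standalone sketch. The struct and function names are hypothetical; this only illustrates the intended result and says nothing about which GCC hook should produce it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair standing in for registers R6 and R7.  */
struct reg_pair { uint32_t r6; uint32_t r7; };

/* Split a 64-bit value so that R6 gets the most significant word and
   R7 the least significant word.  Shifts operate on the value, not on
   its memory layout, so the result is the same on either endianness.  */
static struct reg_pair split_long_long (uint64_t value)
{
  struct reg_pair p;
  p.r6 = (uint32_t) (value >> 32);
  p.r7 = (uint32_t) (value & 0xffffffffu);
  return p;
}

/* Example: the upper half lands in R6, the lower half in R7.  */
static void check_split (void)
{
  struct reg_pair p = split_long_long (0x1122334455667788ull);
  assert (p.r6 == 0x11223344u);
  assert (p.r7 == 0x55667788u);
}
```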


Regards,
Shafi


Output sections

2009-07-18 Thread Mohamed Shafi
Hello all,

Is it possible to emit an assembler directive at the end of each section,
say something like section_end?
Is there any support for doing something like this in the back-end files,
or do I need to make changes in the GCC sources?
If so, does anyone know in which function it should happen?

Regards,
Shafi


current_function_outgoing_args_size

2009-07-18 Thread Mohamed Shafi
Hello all,

The change logs say that current_function_outgoing_args_size is no
longer available, but they don't say what it was replaced with. Looking at
the other targets, I find that it has been replaced with a field in a
structure crtl. Where is this defined/declared?

I am working with GCC 4.4.0. I checked the mainline internals manual. Even
there, the references to these deleted variables have not been replaced.
Could somebody please take care of this?

Regards,
Shafi


Re: current_function_outgoing_args_size

2009-07-19 Thread Mohamed Shafi
2009/7/18 Ian Lance Taylor :
> Mohamed Shafi  writes:
>
>> The change logs says that current_function_outgoing_args_size is no
>> more available. But it doesnt say with what it is replaced. Looking at
>> the other targets i find that its replaced with some field in a
>> structure crtl. Where is this defined/declared.
>
> crtl is declared in function.h.
>
>> I am working in GCC 4.4.0. I checked with the mainline internals. Even
>> there the references of these deleted variables are not replaced.
>> Could somebody please take care of this.
>
And also references to "regs_ever_live".

Regards,
Shafi


Re: Output sections

2009-07-31 Thread Mohamed Shafi
2009/7/18 Dave Korn :
> Mohamed Shafi wrote:
>> Hello all,
>>
>> Is it possible to emit a assembler directive at the end of each sections?
>> Say like section_end
>> Is there any support for doing something like this in the back-end files?
>> Or should i need to the make changes in the gcc sources?
>> Is so do does anyone know in which function it should happen?
>
>  There isn't really such a concept as 'end of a section' until you get to
> final-link time and get all the contributions from different .o files to a
> given section.  During assembler output GCC treats sections as random access,
> switching freely from one to another and back; it doesn't have any concept of
> starting/stopping/opening/closing a section but just jumps into any one it
> likes completely ad-hoc.
>
>  Assuming you're happy with adding something to the end of each section in
> each generated .s file, you could use the TARGET_ASM_FILE_END hook to output
> directives that re-enter each used section and then output your new directive.
>  You may find it hard to know which sections have been used or not in a given
> file - you can define TARGET_ASM_NAMED_SECTION and make a note of which
> sections get invoked there, but I'm not sure if that gets called for all
> sections e.g. init/fini, you may have to try it and see.
>

I am looking to add something to the end of each section in the
generated .s file. Using TARGET_ASM_NAMED_SECTION I will be able to
keep track of the sections that are being emitted. But from the
TARGET_ASM_FILE_END hook, how can I re-enter each section? Are the
sections stored in some global variable?
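
One way to read the suggestion quoted above is sketched below as standalone C. This is a model only: the struct, the function names and the buffer-based output are assumptions for illustration, and none of it is actual GCC hook code. The idea is to remember each section name the first time it is used, then at the end of the file emit a section directive followed by the new directive for every remembered name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_SECTIONS 16

/* Hypothetical bookkeeping: names of sections seen so far.  */
struct section_log {
  const char *names[MAX_SECTIONS];
  int count;
};

/* Record NAME once; called wherever a section switch is emitted.  */
static void note_section (struct section_log *log, const char *name)
{
  for (int i = 0; i < log->count; i++)
    if (strcmp (log->names[i], name) == 0)
      return;
  assert (log->count < MAX_SECTIONS);
  log->names[log->count++] = name;
}

/* At end of file, re-enter each recorded section and close it.  Writes
   into BUF and stops early if the buffer fills up.  */
static size_t emit_section_ends (const struct section_log *log,
                                 char *buf, size_t size)
{
  size_t used = 0;
  if (size > 0)
    buf[0] = '\0';
  for (int i = 0; i < log->count && used < size; i++)
    {
      int n = snprintf (buf + used, size - used,
                        "\t.section %s\n\tsection_end\n", log->names[i]);
      if (n < 0)
        break;
      used += (size_t) n;
    }
  return used;
}
```

Because the assembler concatenates all contributions to a section, a directive emitted this way ends up after everything else in that section within the file.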

Shafi


Re: Output sections

2009-07-31 Thread Mohamed Shafi
2009/8/1 Dave Korn :
> Mohamed Shafi wrote:
>> I am looking for adding something to the end of each section in the
>> generated .s file. Using TARGET_ASM_NAMED_SECTION i will be able to
>> keep track of the sections that are being emitted. But from
>> TARGET_ASM_FILE_END hook how can i re-enter into each section. Are the
>> sections stored in some global variable?
>
>  I'm not sure I understand the question.  You "enter a section" simply by
> emitting the correct .section directive into the asm output.  You re-enter it 
> by
> the same method.
>
>    cheers,
>      DaveK
>

OK, then I don't understand your solution.

>> you could use the TARGET_ASM_FILE_END hook to output
>> directives that re-enter each used section and then output your new directive.

If I want to do the following in the assembly output

section .code
.
.
..
section_end


are you saying that if I emit a section directive the compiler will
switch to the previously emitted section, and then I have to somehow
seek to the end of that section and emit my 'section_end' directive?

Shafi


Re: Output sections

2009-08-01 Thread Mohamed Shafi
2009/8/1 Dave Korn :
> Mohamed Shafi wrote:
>> 2009/8/1 Dave Korn :
>>> Mohamed Shafi wrote:
>>>> I am looking for adding something to the end of each section in the
>>>> generated .s file. Using TARGET_ASM_NAMED_SECTION i will be able to
>>>> keep track of the sections that are being emitted. But from
>>>> TARGET_ASM_FILE_END hook how can i re-enter into each section. Are the
>>>> sections stored in some global variable?
>>>  I'm not sure I understand the question.  You "enter a section" simply by
>>> emitting the correct .section directive into the asm output.  You re-enter 
>>> it by
>>> the same method.
>
>> Ok, Then i don't understand your solution.
>
>  Ah, it looks like I didn't quite understand your problem.
>
>>>> you could use the TARGET_ASM_FILE_END hook to output
>>>> directives that re-enter each used section and then output your new 
>>>> directive.
>>
>> if i want to do the following in the assembly output
>>
>> section .code
>> .
>> .
>> ..
>> section_end
>
>  I thought you just wanted to have
>
>   .section .code
>   section_end
>   .section .data
>   section_end
>
> ... etc. for all used sections, at the very end of the file; after all, all 
> the
> contributions to a section get concatenated in the assembler.  Now you seem to
> be saying that you want to have multiple section_end directives throughout the
> file, every time the current section changes.
>
>> you are saying that if i emit a section directive the compiler will
>> switch to the previously emitted section and then i have to somehow
>> seek to the end of that section and emit my 'section_end' directive?
>
>  I think you may need to re-read the assembler manual about sections, you are 
> a
> little confused about the concepts.  The compiler doesn't really "switch"
> anything; the compiler emits ".section" directives, in response to which the
> *assembler* switches to emit code in the chosen section.  The compiler doesn't
> keep track of sections; it just randomly emits directives for whichever one it
> wants the assembly output to go into at any given time, according to whether
> it's generating the assembly for a function or a variable or other data 
> object.
>

OK. Will TARGET_ASM_NAMED_SECTION get invoked for the normal sections like
text, data, bss? I tried to include this hook in my code, but the
execution never reaches this hook for those sections.

Shafi


How to set the alignment

2009-08-03 Thread Mohamed Shafi
Hello all,

I am doing a private port in GCC 4.4.0. For my target the following
are the alignment requirements:

int - 4 bytes
short - 2 bytes
char - 1 byte
pointer - 4 bytes
stack pointer - 4 bytes

I am not able to implement the alignment for short.

The following are the macros that I used for this:

#define PARM_BOUNDARY 8
#define STACK_BOUNDARY 64

I have also defined STACK_SLOT_ALIGNMENT, but this is not affecting the output.

What should I be doing to get the required alignment?

Regards,
Shafi


Re: How to set the alignment

2009-08-03 Thread Mohamed Shafi
2009/8/3 Jim Wilson :
> On 08/03/2009 02:14 AM, Mohamed Shafi wrote:
>>
>> short - 2 bytes
>> i am not able to implement the alignment for short.
>> The following is are the macros that i used for this
>> #define PARM_BOUNDARY 8
>> #define STACK_BOUNDARY 64
>
> You haven't explained what the actual problem is.  Is there a problem with
> global variables?  Is the variable initialized or uninitialized? If it is
> uninitialized, is it common?  If this a local variable?  Is this a function
> argument or parameter?  Is this a named or unnamed (stdarg) argument or
> parameter?  Etc.  It always helps to include a testcase.
>
> You should also mention what gcc is currently emitting, why it is wrong, and
> what the output should be instead.
>
> All this talk about stack and parm boundary suggests that it might be an
> issue with function arguments, in which case you will probably have to
> describe the calling conventions a bit so we can understand what you want.
>
  This is the test case that i tried

short funs (int a, int b, char w,short e,short r)
{
  return e+r;
}

The target is 32-bit. The first two parameters are passed in registers
and the rest on the stack. For the parameters that are passed on the stack the
alignment is that of the data type. The stack pointer is 8-byte
aligned. char is 1 byte, int is 4 bytes and short is 2 bytes. The code
that is getting generated is given below (-O0 -fomit-frame-pointer):

funs:
add  16,sp
mov  d0,(sp-16)
mov  d1,(sp-12)
movh  (sp-19),d0
movh  d0,(sp-8)
movh  (sp-21),d0
movh  d0,(sp-6)
movh  (sp-8),d1
movh  (sp-6),d0
add d1,d0,d0
sub  16,sp
ret


From the above code you can see that some of the half-word accesses are
not aligned on a 2-byte boundary.

So where am I going wrong?
I hope this info is enough.

Regards,
Shafi


Re: How to set the alignment

2009-08-05 Thread Mohamed Shafi
2009/8/5 Jim Wilson :
> On Tue, 2009-08-04 at 11:09 +0530, Mohamed Shafi wrote:
>> >> i am not able to implement the alignment for short.
>> >> The following is are the macros that i used for this
>> >> #define PARM_BOUNDARY 8
>> >> #define STACK_BOUNDARY 64
>> The target is 32bit . The first two parameters are passed in registers
>> and the rest in stack. For the parameters that are passed in stack the
>> alignment is that of the data type. The stack pointer is 8 byte
>> aligned. char is 1 byte, int is 4 byte and short is 2 byte. The code
>> that is getting generated is give below (-O0 -fomit-frame-pointer)
>
> Er, wait.  You set PARM_BOUNDARY to 8.  This means all arguments will be
> padded to at most an 8-bit boundary, which means that yes, a short after
> a char will have only 1 byte alignment.  If you want all arguments to
> have 2-byte alignment, then you need to set PARM_BOUNDARY to 16.  But
> you probably want a value of 32 here so that 4-byte ints get 4-byte
> alignment.  This will allocate a minimum 4-byte stack slot for every
> argument.  I don't know the calling convention, so I don't know exactly
> how you want arguments arranged on the stack.
>
> If you are pushing arguments, then you can lie in the PUSH_ROUNDING
> macro.  You could say for instance that one byte pushes always push 2
> bytes.  This ensures that the stack always has 2-byte alignment while
> pushing arguments.  If your push instruction doesn't actually do this,
> then you need to modify the pushqi pattern to emit two pushes or use a
> HImode push to get the right behaviour.
>
> Try looking at the code in store_one_arg in calls.c, and emit_push_insn
> in expr.c.
>
What I did was to define the FUNCTION_ARG_BOUNDARY macro to return the
alignment as per the requirement, i.e. 8 bits for char, 16 bits for
short and 32 bits for int, and kept PARM_BOUNDARY at 8. Now the compiler is
emitting the alignment properly.

Is this OK?

Regards,
Shafi
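
The difference between a flat 8-bit PARM_BOUNDARY and a per-argument
boundary can be reproduced with a small model outside GCC. This is only a
sketch: layout_args and round_up_to_boundary are hypothetical names, not
GCC functions, and the model ignores everything except size and boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Round OFFSET (in bytes) up to the next multiple of BOUNDARY_BITS / 8.
   BOUNDARY_BITS must be a non-zero multiple of 8.  */
static size_t
round_up_to_boundary (size_t offset, unsigned boundary_bits)
{
  assert (boundary_bits != 0 && boundary_bits % 8 == 0);
  size_t align = boundary_bits / 8;
  return (offset + align - 1) / align * align;
}

/* Hypothetical model, not a GCC function: assign a byte offset to each of
   N stack arguments.  SIZES[i] is the argument size in bytes and
   BOUNDARIES[i] its required boundary in bits.  Returns the total size.  */
static size_t
layout_args (const size_t *sizes, const unsigned *boundaries, size_t n,
             size_t *offsets)
{
  size_t offset = 0;
  for (size_t i = 0; i < n; i++)
    {
      offset = round_up_to_boundary (offset, boundaries[i]);
      offsets[i] = offset;
      offset += sizes[i];
    }
  return offset;
}
```

For the three stack arguments of funs (char, short, short), a boundary of 8
for every argument places them at offsets 0, 1 and 3, so both shorts are
misaligned. Boundaries of 8, 16 and 16 place them at offsets 0, 2 and 4.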


Restrictive addressing mode

2009-08-10 Thread Mohamed Shafi
Hello all,

I am trying to port a 32-bit target in GCC 4.4.0.
Of the addressing modes that are allowed by my target, the one with
(base register + offset) is restrictive in QImode.
The restriction is that if the base register is not the Stack Pointer then
this kind of address cannot appear in a load instruction but only in a
store instruction. So how can I implement this? Should I do a
define_expand for movqi and force the address into a register when I get
this addressing mode?

Please let me know your thoughts on this.

Regards,
shafi
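
The restriction itself is easy to state as a predicate. The sketch below is
a standalone model, not GCC code: struct qi_address and both functions are
hypothetical names, and the "legitimize" step only mimics what forcing the
address into a register would achieve.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of a QImode memory address.  */
struct qi_address
{
  bool base_is_sp;   /* base register is the stack pointer */
  bool has_offset;   /* address is (base + offset) rather than (base) */
};

/* Model of the restriction: (base + offset) with a base other than SP is
   valid for stores but not for loads.  */
static bool
qi_address_valid_p (struct qi_address addr, bool is_load)
{
  if (!addr.has_offset || addr.base_is_sp)
    return true;
  return !is_load;
}

/* What an expander might do for an invalid load address: compute
   base + offset into a register first, leaving a plain (reg) address.  */
static struct qi_address
qi_legitimize_load_address (struct qi_address addr)
{
  assert (!qi_address_valid_p (addr, true));
  addr.has_offset = false;   /* offset has been folded into the base */
  return addr;
}
```

In a real port the same decision would have to be visible to the
constraints as well, since addresses can change during register allocation.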


Re: About feasibility of implementing an instruction

2009-08-14 Thread Mohamed Shafi
2009/7/3 Ian Lance Taylor :
> Mohamed Shafi  writes:
>
>> I just want to know about the feasibility of implementing an
>> instruction for a port in gcc 4.4
>> The target has 40 bit register where the normal load/store/move
>> instructions will be able to access the 32 bits of the register. In
>> order to move data into the rest of the register [b32 to b39] the data
>> has to be stored into a 32bit memory location. The data should be
>> stored in such a way that if it is stored for 0-7 in memory the data
>> can be moved to b32-b39 of a even register and if the data in the
>> memory is stored in 16-23 of the memory word then it can be moved to
>> b32-b39 of a odd register. Hope i make myself clear.
>>
>> Will it be possible to implement this in the gcc back-end so that the
>> particular instruction is supported?
>
> In general, the gcc backend can do anything, so, yes, this can be
> supported.  It sounds like this is not a general purpose register, so I
> would probably do it using a builtin function.  If you need to treat it
> as a general purpose register (i.e., the register is managed by the
> register allocator) then you will need a secondary reload to handle
> this.
>

This is a general purpose register. All the 40 bits are used only for
fixed-point data types. When the register is used for a fixed-point data
type, all the operations except initialization are done through
built-in functions. For initialization the immediate value has to move
through memory, i.e. there is no immediate load when the data is
40-bit. So I am planning to control this using the LEGITIMATE_CONSTANT_P
macro. But then I have a question. If all the operations are through
intrinsics, will there be a need for spilling the variables used in
the built-in functions? If so, then depending on whether the register that
gets spilled is even or odd, [b32 to b39] of the register gets stored in
memory to [b0 to b7] or [b16 to b23] respectively. Will I be able to
keep track of the spilling so that I can reload into the proper
register?

Hope i am clear.

Regards
Shafi
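
The even/odd placement of the extension byte can be modelled in a few
lines. This is a sketch under the layout described above (bits 0..7 for an
even register, bits 16..23 for an odd one); the function names are
hypothetical and nothing here is GCC code.

```c
#include <assert.h>
#include <stdint.h>

/* Bit position inside the 32-bit memory word at which the extension byte
   (register bits b32..b39) lives: bits 0..7 for an even register, bits
   16..23 for an odd one.  */
static unsigned
ext_shift_for_reg (unsigned regno)
{
  return (regno % 2 == 0) ? 0u : 16u;
}

/* Build the memory word that holds EXT for register REGNO.  */
static uint32_t
ext_to_memory_word (unsigned regno, uint8_t ext)
{
  return (uint32_t) ext << ext_shift_for_reg (regno);
}

/* Recover the extension byte for register REGNO from WORD.  */
static uint8_t
ext_from_memory_word (unsigned regno, uint32_t word)
{
  return (uint8_t) (word >> ext_shift_for_reg (regno));
}

/* Reload the byte spilled from SPILL_REGNO into RELOAD_REGNO.  The slot
   layout was fixed by the parity of the spilled register, so the two
   registers must have the same parity.  */
static uint8_t
reload_ext (unsigned spill_regno, unsigned reload_regno, uint32_t word)
{
  assert (spill_regno % 2 == reload_regno % 2);
  return ext_from_memory_word (reload_regno, word);
}
```

The model shows why the parity matters: a word written for an even register
and read back as if for an odd register yields the wrong byte.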


DI mode and endianess

2009-08-19 Thread Mohamed Shafi
Hi,

I am trying to port a 32-bit target in GCC 4.4.0. My target supports
big and little endian. This is selected using a target switch. So I
have defined the macro

#define WORDS_BIG_ENDIAN (TARGET_BIG_ENDIAN)

Currently I have written patterns only for SImode moves, so GCC will
synthesize DImode patterns for me. The problem is that GCC is
generating the same code for both big and little endian, i.e. for the
following code

extern long long h;
extern long long j;
extern long long k;
int temp()
{
  k = j+h;
  return 0;
}

the compiler is generating the following code.

section .text local
ALIGN   16
GLOBAL  _temp
_temp:
mov  _h,d4
mov  _h+4,d5
mov  _j,d2
mov  _j+4,d3
add  d4,d2
adc  d5,d3
mov  d2,_k
mov  d3,_k+4
ret
SIZE  _temp,*-_temp


irrespective of the endianness.
What could I be missing here? Should I add anything specific for this
in the back-end?

Regards,
Shafi
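
What one would expect to change with the word order can be written down as
a small model. This is a sketch, not GCC code; low_word_offset and add_di
are hypothetical names, and the model only covers which memory word gets
the plain add and which gets the add-with-carry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte offset of the least significant 32-bit word of a 64-bit object in
   memory, for a target whose word order follows WORDS_BIG_ENDIAN.  */
static unsigned
low_word_offset (bool words_big_endian)
{
  return words_big_endian ? 4u : 0u;
}

/* Hypothetical model of a 64-bit add synthesized from 32-bit operations.
   Index 0 is the word at offset 0 and index 1 the word at offset 4.
   The plain add must be applied to the low words and the add-with-carry
   to the high words, wherever those sit in memory.  */
static void
add_di (const uint32_t a[2], const uint32_t b[2], uint32_t sum[2],
        bool words_big_endian)
{
  unsigned lo = low_word_offset (words_big_endian) / 4;
  unsigned hi = 1 - lo;
  assert (lo <= 1);

  sum[lo] = a[lo] + b[lo];            /* add */
  uint32_t carry = sum[lo] < a[lo];   /* carry out of the low word */
  sum[hi] = a[hi] + b[hi] + carry;    /* adc */
}
```

In the listing above the add is applied to _h and _j (offset 0) and the adc
to _h+4 and _j+4, which is the little-endian arrangement in this model. For
big-endian word order the roles of the two offsets would be swapped.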


Re: Function argument passing

2009-08-23 Thread Mohamed Shafi
2009/7/16 Richard Henderson :
> On 07/13/2009 07:35 AM, Mohamed Shafi wrote:
>>
>> So i made both TARGET_STRICT_ARGUMENT_NAMING and
>> PRETEND_OUTGOING_VARARGS_NAMED to return false. Is this correct?
>
> Yes.
>
>> How to make the varargs argument to be promoted to 32bits when the
>> normal argument don't require promotion as mentioned in point (1) ?
>
> There is no way at present.  You'll have to extend the promote_function_args
> hook to accept a "bool named" parameter.
>
>> 4. A long long return value is returned in R6 and R7, R6 containing
>> the most significant  long word and R7 containing the least
>> significant long word, regardless of the endianess mode.
>> Solution: Used TARGET_RETURN_IN_MSB to return true when the mode is
>> little endian
>
> I don't believe this is correct.  RETURN_IN_MSB is supposed to be handling
> the case where the data to be returned is smaller than the register in which
> it is returned -- e.g. a 3 byte structure returned in a 32-bit register.  I
> believe you should be using...
>
>> 5. If the first argument is a long long , it is passed in R6 and R7,
>> R6 containing the most significant long word and R7 containing the
>> least significant long word, regardless of the endianess mode.
>> For return value, i have done as mentioned in (4) but I am not sure
>> how to control the argument passing so that R6 contains the msw and R7
>> contains lsw, regardless of the endianess mode.
>
> For both return values and arguments, we support a PARALLEL which allows the
> target to indicate where each piece of the value is located.  It's also true
> that the generated rtl is more complicated, so you'd want to avoid this
> solution in big-endian mode, when it isn't needed.
>
> So here you would do
>
> if (WORDS_BIG_ENDIAN)
>  return gen_rtx_REG (DImode, 6);
> else
>  {
>    rtx r6, r7, par;
>
>    r7 = gen_rtx_REG (SImode, 7);
>    r7 = gen_rtx_EXPR_LIST (SImode, r7, GEN_INT (0));
>    r6 = gen_rtx_REG (SImode, 6);
>    r6 = gen_rtx_EXPR_LIST (SImode, r6, GEN_INT (4));
>    par = gen_rtx_PARALLEL (DImode, gen_rtvec (2, r7, r6));
>    return par;
>  }
>
> See the docs for FUNCTION_ARG for details.
>

I am getting the following error when I make a function call:

(call_insn 18 17 19 3 1.c:29 (set (parallel:DI [
(expr_list:REG_UNUSED (reg:SI 7 d7)
(const_int 0 [0x0]))
(expr_list:REG_UNUSED (reg:SI 6 d6)
(const_int 4 [0x4]))
])
(call:SI (mem:SI (symbol_ref:SI ("dd1") [flags 0x41]
) [0 S4 A8])
(const_int 8 [0x8]))) -1 (nil)
(expr_list:REG_DEP_TRUE (use (reg:SI 7 d7))
(expr_list:REG_DEP_TRUE (use (reg:SI 6 d6))
(nil))))

How do I write a pattern for this?
Another question: in little-endian mode, will the compiler know that the
return value is actually stored the other way round, i.e. in
big-endian format, and generate the code accordingly for the rest of
the program?

Regards,
Shafi
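
The information carried by the PARALLEL can be pictured as a list of
(register, byte offset) pairs. The sketch below is a standalone model with
hypothetical names (struct value_piece, assemble_le); it is not how GCC
represents or consumes the RTL, only an illustration of what the offsets
say about which register holds which half.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one element of a PARALLEL: a hard register
   number and the byte offset of the piece of the value it carries.  */
struct value_piece
{
  unsigned regno;
  unsigned byte_offset;
};

/* Reassemble a 64-bit value from its pieces.  REGS is indexed by register
   number; byte offsets are memory offsets on a little-endian target, so
   the piece at offset 0 is the least significant word.  */
static uint64_t
assemble_le (const struct value_piece *pieces, size_t n, const uint32_t *regs)
{
  uint64_t value = 0;
  for (size_t i = 0; i < n; i++)
    {
      assert (pieces[i].byte_offset % 4 == 0 && pieces[i].byte_offset < 8);
      value |= (uint64_t) regs[pieces[i].regno] << (8 * pieces[i].byte_offset);
    }
  return value;
}
```

With the pieces (r7, offset 0) and (r6, offset 4) from the quoted code, r6
supplies the most significant word and r7 the least significant word in
little-endian mode, which is the convention described in point (4).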


How to write shift and add pattern?

2009-08-28 Thread Mohamed Shafi
Hello all,

I am trying to port a 32-bit arch in GCC 4.4.0. My target has support
for 1-bit and 2-bit shift-and-add operations. I tried to write patterns for
these, but GCC is not generating them. The following are the patterns
that I have written in the md file:

(define_insn "shift_add_"
 [(set (match_operand:SI 0 "register_operand" "")
   (plus:SI (match_operand:SI 3 "register_operand" "")
 (ashift:SI (match_operand:SI 1 "register_operand" "")
 (match_operand:SI 2 "immediate_operand" ""))))]
 ""
 "shadd1\\t%1, %0"
)

(define_insn "shift_add1_"
 [(set (match_operand:SI 0 "register_operand" "")
   (plus:SI (ashift:SI (match_operand:SI 1 "register_operand" "")
 (match_operand:SI 2 "immediate_operand" ""))
 (match_operand:SI 3 "register_operand" "")))]
 ""
 "shadd1\\t%1, %0"
)

(define_insn "shift_n_add_"
 [(set (match_operand:SI 1 "register_operand" "")
   (ashift:SI (match_dup 1)
   (match_operand:SI 2 "immediate_operand" "")))
  (set (match_operand:SI 0 "register_operand" "")
   (plus:SI (match_dup 0)
 (match_dup 1)))]
 ""
 "shadd2\\t%1, %0"
)


As you can see I have tried several combinations. Since I was only looking
for a pattern match I didn't bother to write them according to the target;
I thought I would do that after I get a matching pattern. When I debugged,
GCC was generating patterns with multiply, but those get discarded
since the md file doesn't have such patterns. How can I make GCC generate
the shift-and-add pattern? Is GCC generating patterns with multiply due to
cost issues? I haven't specified any cost details.

Regards,
Shafi
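
One possible explanation, offered as an assumption and not verified against
this port, is that inside a plus the shift by a constant is represented as
a multiplication by the corresponding power of two, in which case a pattern
written with (mult ... (const_int 2)) or (const_int 4) may be the form that
needs to match. The two forms compute the same value, as this small
standalone check illustrates (shift_add and mult_add are hypothetical
names):

```c
#include <assert.h>
#include <stdint.h>

/* y + (x << n), the form written in the patterns above.  */
static uint32_t
shift_add (uint32_t x, uint32_t y, unsigned n)
{
  assert (n < 32);
  return y + (x << n);
}

/* y + x * 2^n, the multiply form of the same computation.  */
static uint32_t
mult_add (uint32_t x, uint32_t y, unsigned n)
{
  assert (n < 32);
  return y + x * ((uint32_t) 1 << n);
}
```

For the 1-bit and 2-bit cases the multipliers are 2 and 4, so a target
pattern could restrict the constant operand to exactly those values.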


Reloading is going wrong?

2009-09-03 Thread Mohamed Shafi
Hello all,

I am doing a port for a 32-bit target in GCC 4.4.0. Of the addressing
modes that are allowed by my target, the one with (base register +
offset) is restrictive in QImode. The restriction is that if the base
register is not the Stack Pointer then this kind of address cannot appear
in a load instruction but only in a store instruction.

To implement this I added constraints for all supported memory
operations in QImode. So the pattern is as follows:

(define_insn "movqi"
  [(set (match_operand:QI 0 "nonimmediate_operand"
 "=b,b,d,t,d, b,Ss0, Ss1, a,Se1, Sb2,  b,Sd3,  d,Se0")
(match_operand:QI 1 "general_operand"
  "I,  L,d,d,t, Ss0,b,  b,Se1,a,  b, Sd3,b,  Se0,d"))]

where
d is data registers
a is address registers
b is data and address registers
Sb2 is Rn + offset addressing mode
Sd3 is SP + offset addressing mode

Se0 - (Rn), (Rn)+, (Rn)-, (Rn + Ri) and Post modify register addressing mode
Se1 - Se0 excluding Post modify register addressing mode

I believe that there are enough combinations available for reload
to try an alternate addressing mode if it encounters the restrictive
addressing mode. But I am still getting the following error:

main1.c:11: error: insn does not satisfy its constraints:
(insn 30 29 7 2 main1.c:9 (set (reg:QI 2 d2 [orig:61 .a+1 ] [61])
(mem/s/j:QI (plus:SI (reg:SI 16 r0)
(const_int 1 [0x1])) [0 .a+1 S1 A8])) 41
{movqi} (nil))
main1.c:11: internal compiler error: in reload_cse_simplify_operands,
at postreload.c:396


So what am I doing wrong?
Can't this scenario be solved by the reload pass?
How can I generate instructions with the QImode restriction?

Regards,
Shafi


Supporting FP cmp lib routines

2009-09-14 Thread Mohamed Shafi
Hi all,

I am doing a GCC port for a 32-bit target in GCC 4.4.0. The target uses
hand-coded floating point compare routines. Generally a function
returns its value in the general purpose registers, but these FP compare
routines return the result in the Status Register itself. So there is
no need to have a compare instruction after the function call for an FP
compare. Is there a way to let GCC know that the result of the FP compare
is stored in the Status Register so that GCC directly generates a
jump operation? How can I implement this?

Regards,
Shafi


How to split 40bit data types load/store?

2009-09-14 Thread Mohamed Shafi
Hello all,

I am doing a port for a 32-bit target in GCC 4.4.0. I have to support a
40-bit data type (_Accum) in the port. The target has 40-bit registers which
are GPRs and work as 32-bit registers in other modes. The load and store for
_Accum happen in two steps: the lower 32 bits in one instruction and the
upper 8 bits in the next instruction. I want to split the instruction
after reload. I tried to have a pattern (for load) like this:

(define_insn "fn_load_ext_sa"
 [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
UNSPEC_FN_EXT)
   (match_operand:SA 1 "memory_operand" ""))]

(define_insn "fn_load_sa"
 [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
UNSPEC_FN)
   (match_operand:SA 1 "memory_operand" ""))]


The above patterns work at -O0. But with optimizations I am getting an
ICE. It seems that GCC won't accept an unspec object as the destination
operand. So how can I split the patterns for the load and store of these
data types?

Regards,
Shafi


How to implement compare and branch instruction

2009-09-24 Thread Mohamed Shafi
Hello all,

I am porting a 32-bit target in GCC 4.4.0.
The target has distinct signed and unsigned compare instructions,
and only one set of conditional branch instructions. Moreover the
operands cannot be immediate values if the comparison is unsigned. I
have implemented this using a compare-and-branch instruction. This gets
split after reload. The patterns that I have written are as follows:

(define_expand "cmp<mode>"
 [(set (reg:CC CC_REGNUM)
   (compare (match_operand:INT 0 "register_operand" "")
(match_operand:INT 1 "nonmemory_operand" "")))]
 ""
 "
  {
   compare_op0 = operands[0];
   compare_op1 = operands[1];
   DONE;
  }
 "
)


(define_expand "b<code>"
 [(set (reg:CC CC_REGNUM)
   (compare:CC (match_dup 1)
   (match_dup 2)))
  (set (pc)
   (if_then_else (comp_op:CC (reg:CC CC_REGNUM)(const_int 0))
 (label_ref (match_operand 0 "" ""))
 (pc)))]
  ""
  "{
operands[1] = compare_op0;
operands[2] = compare_op1;

if (CONSTANT_P (operands[2])
&& (<CODE> == LTU || <CODE> == GTU || <CODE> == LEU || <CODE> == GEU))
  operands[2] = force_reg (GET_MODE (operands[1]), operands[2]);

operands[3] = gen_rtx_fmt_ee (<CODE>, CCmode,
  gen_rtx_REG (CCmode,CC_REGNUM), const0_rtx);
emit_jump_insn (gen_compare_and_branch_insn (operands[0], operands[1],
 operands[2], operands[3]));
DONE;
  }"
)

(define_insn_and_split "compare_and_branch_insn"
 [(set (pc)
   (if_then_else (match_operator:CC 3 "comparison_operator"
   [(match_operand 1 "register_operand"
"d,d,a,a,d,t,k,t")
(match_operand 2 "nonmemory_operand"
"J,L,J,L,d,t,t,k")])
 (label_ref (match_operand 0 "" ""))
 (pc)))]
 "!unsigned_immediate_compare_p (GET_CODE (operands[3]), operands[2])"
 "#"
 "reload_completed"
 [(set (reg:CC CC_REGNUM)
   (match_op_dup:CC 3 [(match_dup 1) (match_dup 2)]))
  (set (pc)
   (if_then_else (eq (reg:CC CC_REGNUM) (const_int 0))
 (label_ref (match_dup 0))
 (pc)))]
  "{
if (expand_compare_insn (operands, 0))
  DONE;
  }"
)

In the function "expand_compare_insn" I am asserting that operands[2]
is not an immediate value if the comparison is unsigned. I am getting an
assertion failure in this function. The problem is that the reload pass
replaces operands[2] with its equiv_constant. This breaks the
pattern after the reload pass.

Before reload pass

(jump_insn 58 56 59 10 20070129.c:73 (set (pc)
(if_then_else (leu:CC (reg:QI 84)
(reg:QI 91))
(label_ref 87)
(pc))) 77 {compare_and_branch_insn} (expr_list:REG_DEAD (reg:QI 84)
(expr_list:REG_BR_PROB (const_int 200 [0xc8])
(nil))))

After reload pass:

(jump_insn 58 56 59 10 20070129.c:73 (set (pc)
(if_then_else (leu:CC (reg:QI 17 r1 [84])
(const_int 1 [0x1]))
(label_ref 87)
(pc))) 77 {compare_and_branch_insn} (expr_list:REG_BR_PROB
(const_int 200 [0xc8])
(nil)))


How can I overcome this error?
Thanks for your help.

Regards,
Shafi
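
One way to look at the failure: the insn condition is only checked when the
insn is recognized, while the constraints J and L still offer reload an
immediate alternative for operand 2. The sketch below models that idea
outside GCC. All names (enum cmp_code, struct alternative and the
functions) are hypothetical, and it is an assumption that the fix lies in
giving unsigned comparisons register-only alternatives.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of comparison codes, named after the RTL codes.  */
enum cmp_code
{
  CMP_EQ, CMP_NE, CMP_LT, CMP_GT, CMP_LE, CMP_GE,
  CMP_LTU, CMP_GTU, CMP_LEU, CMP_GEU
};

/* True for the four unsigned ordering comparisons.  */
static bool
unsigned_cmp_p (enum cmp_code code)
{
  return code == CMP_LTU || code == CMP_GTU
         || code == CMP_LEU || code == CMP_GEU;
}

/* Model of one constraint alternative for operand 2: it either accepts
   only registers or also accepts immediates (like J and L above).  */
struct alternative
{
  bool allows_immediate;
};

/* An alternative is usable for CODE only if it cannot let an immediate
   through on an unsigned comparison.  */
static bool
alternative_usable_p (enum cmp_code code, struct alternative alt)
{
  return !(unsigned_cmp_p (code) && alt.allows_immediate);
}

/* Count the usable alternatives for CODE among ALTS[0..N-1].  */
static unsigned
count_usable (enum cmp_code code, const struct alternative *alts, unsigned n)
{
  unsigned count = 0;
  assert (alts != 0 || n == 0);
  for (unsigned i = 0; i < n; i++)
    if (alternative_usable_p (code, alts[i]))
      count++;
  return count;
}
```

Applied to the eight alternatives "J,L,J,L,d,t,t,k", only the last four
remain usable for an unsigned comparison in this model, which suggests a
separate pattern (or separate alternatives) for the unsigned codes.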


Segmentation fault when calling a library fun - GCC bug?

2009-09-25 Thread Mohamed Shafi
I am doing a port for a 32-bit target in GCC 4.4.0.
I am getting a segmentation fault in the function assign_temp at the
following line:

if (DECL_P (type_or_decl))

After analyzing the issue I find that this might be a bug. I just want
to confirm whether that is the case or not.
In order to reproduce it I think the target should have the following properties:
a. Only 2 32bit registers available as argument registers.
b. Second 64bit value will be pushed in stack
c. ACCUMULATE_OUTGOING_ARGS is set
d. STRICT_ALIGNMENT is set
e. PARM_BOUNDARY is 32

When there is a library call for an operation that takes two 64-bit
arguments, say division of two long long values (__divdi3), the
following sequence happens:
emit_library_call_value -> emit_library_call_value_1 ->
emit_push_insn -> assign_temp

emit_push_insn is called because the second argument is pushed on the
stack and ACCUMULATE_OUTGOING_ARGS is set.
assign_temp is called because STRICT_ALIGNMENT && PARM_BOUNDARY <
GET_MODE_ALIGNMENT (DImode) is true.


Can somebody please confirm whether this is due to some mistake in my
port or a GCC bug?

Thanks,
Shafi


Reload going wrong for addition.

2009-09-28 Thread Mohamed Shafi
Hello all,

I am doing a port for a 32-bit target in GCC 4.4.0. I am getting the
following error:

rd_er.c:19: error: insn does not satisfy its constraints:
(insn 5 35 34 2 rd_er.c:8 (set (reg:SI 16 r0)
(plus:SI (reg:SI 16 r0)
(reg:SI 2 d2))) 57 {addsi3} (expr_list:REG_EQUAL (plus:SI
(reg/f:SI 49 sp)
(const_int -65544 [0xfffefff8]))
(nil)))


My target has 16 data registers and 16 address registers. All are
32bit registers.
The target also has a dedicated stack pointer.
There is no move operation possible between SP and data regs.
There is no provision for addition between data and address registers.
R7 is used as Frame Pointer.


Pattern for addition
---

(define_insn "add<mode>3"
  [(set (match_operand:INT 0 "register_operand"
 "=d, t, k, a, a, t, k, t, d")
(plus:INT(match_operand:INT 1 "register_operand"
   "0, 0, 0, t, k, 0, 0, 0, 0")
 (match_operand:INT 2 "nonmemory_operand"
   "J, J, J, L, L, t, t, k, d")))]

The constraints used are -
;;d  -   Data registers [D0 - D15]
;;a  -   Address registers [R0 - R15]
;;t   -   Address and Index registers
;;k   -  Stack Pointer
;;J   -   Unsigned 5bit immediate
;;L   -   Signed 16bit immediate

Since there is no move operation between SP and data regs I have
specified 12 as the register_move_cost between them. I also return the
address register class as the reload class in preferred_reload_class when
the rtx is SP.

Before IRA pass
---

(insn 5 2 12 2 rd_er.c:8 (set (reg/v/f:SI 60 [ bufptr ])
(reg/f:SI 23 r7)) 43 {*movsi_internal} (nil))


Input for reload pass
-

(insn 5 2 12 2 rd_er.c:8 (set (reg/v/f:SI 7 d7 [orig:60 bufptr ] [60])
(plus:SI (reg/f:SI 49 sp)
(const_int -65536 [0xffff0000]))) 57 {addsi3}
(expr_list:REG_EQUAL (plus:SI (reg/f:SI 49 sp)
(const_int -65536 [0xffff0000]))
(nil)))


After IRA
---

Reloads for insn # 5
Reload 0: reload_in (SI) = (reg/f:SI 49 sp)
reload_out (SI) = (reg/v/f:SI 7 d7 [orig:60 bufptr ] [60])
HIGH_OR_LOW, RELOAD_OTHER (opnum = 0)
reload_in_reg: (reg/f:SI 49 sp)
reload_out_reg: (reg/v/f:SI 7 d7 [orig:60 bufptr ] [60])
reload_reg_rtx: (reg:SI 16 r0)
Reload 1: reload_in (SI) = (const_int -65544 [0xfffefff8])
DALU_REGS, RELOAD_FOR_INPUT (opnum = 2)
reload_in_reg: (const_int -65544 [0xfffefff8])
reload_reg_rtx: (reg:SI 2 d2)

(insn 5 35 34 2 rd_er.c:8 (set (reg:SI 16 r0)
(plus:SI (reg:SI 16 r0)
(reg:SI 2 d2))) 57 {addsi3} (expr_list:REG_EQUAL (plus:SI
(reg/f:SI 49 sp)
(const_int -65544 [0xfffefff8]))
(nil)))

The reload pass chooses the final alternative as the goal for reloading.
Since the input instruction already has a data register as the
destination, the constraint combination (t, 0, t) loses to (d, 0, d),
since the latter requires the least amount of copying for
constraint matching (or so the reload pass believes). There are cases
when reload fixes the add pattern, namely when either the
destination is an address register or there is no stack pointer involved.
But otherwise I am getting this ICE. I am not sure how to overcome
this.

I hope someone can suggest a solution.

Regards,
Shafi

P.S. Can I have a commutative operation for the constraint combination
(t, 0, t), i.e. (t, %0, t)? If so, what will the output template be?


define_memory_constraint and REG_OK_STRICT

2009-09-29 Thread Mohamed Shafi
Hello all,

I am doing a port for a 32bit target in GCC 4.4.0.
I have defined memory_constraints in predicates.c like this

(define_memory_constraint "Sr0"
   "Memory reference through base registers"
   (match_test "target_mem_constraint (\"r0\", op)"))

In the function target_mem_constraint i have

int
target_mem_constraint (const char *str, rtx op)
{
  char c0 = str[0];
  char c1 = str[1];
  rtx op0 = XEXP (op, 0);
  bool strict =  reload_completed;

  if (!MEM_P (op))
    return 0;

  switch (c0)
    {
    case 'r':
      return (!STACK_REG_RTX_P (op0)
              && BASE_REG_RTX_P (op0, strict));
...
...

My question is: is my definition of strict correct?
Or should it be reload_in_progress || reload_completed?

Regards,
Shafi


Re: define_memory_constraint and REG_OK_STRICT

2009-09-29 Thread Mohamed Shafi
2009/9/30 Richard Henderson :
> On 09/29/2009 07:32 AM, Mohamed Shafi wrote:
>>
>> My question is my definition of strict correct?
>> or should it be reload_in_progress || reload_completed?
>
> I'm tempted to say it should be the later, but I'm not sure it really makes
> any difference since reload does not query the operand predicates; it only
> queries the operand constraints.

This is a memory constraint. The memory constraint allows an
address based on the value of the bool variable strict.
>
> And even that said, neither the ARM or IA64 ports do anything with strict at
> all, which suggests that you may not have to either.  It's possible that
> this works because after reload we verify an instruction with both
> predicates and constraints.
>
> Is this question in response to a particular problem, or just trying to
> avoid possible problems?
>

Both, I guess.

My pattern for DI:

(define_insn "*mov<mode>_internal"
 [(set (match_operand:DI_DF 0 "nonimmediate_operand"
"=d,t,t,d,t,d,Se0,d,Ss0,e,Ss1,b,Sr0,c,Se0,b,Sb1,b,Sb2,b,Sd2,b,Sd3,e")
   (match_operand:DI_DF 1 "general_operand"
" i,i,t,t,d,d,d,Se0,e,Ss0,b,Ss1,c,Sr0,b,Se0,b,Sb1,b,Sb2,b,Sd2,e,Sd3"))]

post_inc and post_dec are allowed only by the constraint 'Se0'.
The reload pass was not choosing this alternative for the following pattern:

(insn 103 102 53 4 ch_addr.c:11 (set (mem:DF (post_inc:SI (reg:SI 90 [
__ivtmp_22 ])) [0 S8 A64])
(subreg:DF (reg:DI 116 [+4 ]) 0)) 40 {*movdf_internal}
(expr_list:REG_DEAD (reg:DI 116 [+4 ])
(expr_list:REG_INC (reg:SI 90 [ __ivtmp_22 ])
(nil))))

because in the mem_constraint function i have

int
target_mem_constraint (const char *str, rtx op)
{
 char c0 = str[0];
 char c1 = str[1];
 rtx op0 = XEXP (op, 0);
 bool strict =  (reload_completed || reload_in_progress);

 if (!MEM_P (op))
   return 0;

 switch (c0)
   {
   case 'r':
 return (!STACK_REG_RTX_P (op0)
 && BASE_REG_RTX_P (op0, strict));

   case 'e':
 if (GET_CODE (op0) == POST_INC
 || GET_CODE (op0) == POST_DEC)
return (!STACK_REG_RTX_P (XEXP (op0, 0))
&& BASE_REG_RTX_P (XEXP (op0, 0), strict));
...
...

So the alternative was getting rejected due to my definition of strict,
which results in an ICE later. But since there were only a few failures in
the testsuite, I have to guess that reload was fixing up other cases
similar to this and thus maybe generating unoptimized code. So what
should the definition be?

 bool strict =  (reload_completed || reload_in_progress);

or

 bool strict =  reload_completed ? true : false;


Regards,
Shafi
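
The effect of the strict flag on a base register check can be modelled
outside GCC. This is only a sketch: the register numbering (address
registers 16..31, pseudos from 50 upwards) is assumed for illustration and
base_reg_ok is a hypothetical stand-in for BASE_REG_RTX_P.

```c
#include <assert.h>
#include <stdbool.h>

/* Register numbering assumed for illustration only.  */
enum { FIRST_ADDR_REG = 16, LAST_ADDR_REG = 31, FIRST_PSEUDO = 50 };

/* Hypothetical model of a base register check.  A hard address register
   is always acceptable.  A pseudo may still be allocated to an address
   register, so the non-strict check accepts it; the strict check does
   not.  */
static bool
base_reg_ok (unsigned regno, bool strict)
{
  if (regno >= FIRST_ADDR_REG && regno <= LAST_ADDR_REG)
    return true;
  return !strict && regno >= FIRST_PSEUDO;
}

/* Self-check: the strict test never accepts more than the non-strict
   one.  */
static void
check_strict_is_subset (unsigned regno)
{
  assert (!base_reg_ok (regno, true) || base_reg_ok (regno, false));
}
```

In this model a post_inc of pseudo 90, as in insn 103 above, passes the
non-strict check but fails the strict one, which matches the rejection seen
when strict also covers reload_in_progress.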


Re: define_memory_constraint and REG_OK_STRICT

2009-10-02 Thread Mohamed Shafi
2009/9/30 Richard Henderson :
> On 09/29/2009 09:46 PM, Mohamed Shafi wrote:
>>
>>  bool strict =  reload_completed ? true : false;
>
> What happens if you set "strict = false" here?
> That's what ARM does.

  That particular case works, and yes, ARM does it that way, but there
are other targets, like s390, that use (reload_completed ||
reload_in_progress). That's why I had to ask whether my definition of
strict is proper or not. I am not sure which one to use.

Shafi


Re: Reload going wrong for addition.

2009-10-02 Thread Mohamed Shafi
2009/9/28 Richard Henderson :
> On 09/28/2009 07:25 AM, Mohamed Shafi wrote:
>>
>> Hope someone suggests me a solution.
>
> The solution is almost certainly something involving the
> TARGET_SECONDARY_RELOAD hook.  You need to inform reload that it's going to
> need some scratch registers in order to perform the operation.
>
> It's been a long time since I had to fiddle with this sort of thing, so I
> forget all the details involved.  Perhaps someone else has some additional
> advice.
>

OK, what I did was to remove the code from the preferred_reload_class
function, so that now it returns class, i.e.

#define PREFERRED_RELOAD_CLASS(class, x) class

And in TARGET_SECONDARY_RELOAD I added the code to use a scratch
register to do the move operation. Now things are working. So I guess
I should ask why we have PREFERRED_RELOAD_CLASS when we can do the same
with TARGET_SECONDARY_RELOAD?

Shafi


Re: How to split 40bit data types load/store?

2009-10-05 Thread Mohamed Shafi
2009/9/14 Richard Henderson :
> On 09/14/2009 07:24 AM, Mohamed Shafi wrote:
>>
>> Hello all,
>>
>> I am doing a port for a 32bit target in GCC 4.4.0. I have to support a
>> 40bit data (_Accum) in the port. The target has 40bit registers which
>> is a GPR and works as 32bit reg in other modes. The load and store for
>> _Accum happens in two step. The lower 32bit in one instruction and the
>> upper 8bit in the next instruction. I want to split the instruction
>> after reload. I tired to have a pattern (for load) like this:
>>
>> (define_insn "fn_load_ext_sa"
>>  [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
>>                    UNSPEC_FN_EXT)
>>        (match_operand:SA 1 "memory_operand" ""))]
>>
>> (define_insn "fn_load_sa"
>>  [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
>>                     UNSPEC_FN)
>>        (match_operand:SA 1 "memory_operand" ""))]
>
> Unspec on the left-hand-side isn't something that's supposed to happen, and
> is more than likely the cause of your problems.  Try moving the unspec to
> the right-hand-side like:
>
>  (set (reg:SI reg) (mem:SI addr))
>
>  (set (reg:SA reg)
>       (unspec:SA [(reg:SI reg) (mem:QI addr)]
>                  UNSPEC_ACCUM_INSERT))
>
> and
>
>  (set (mem:SI addr) (reg:SI reg))
>
>  (set (mem:QI addr)
>       (unspec:QI [(reg:SA reg)]
>                  UNSPEC_ACCUM_EXTRACT))
>
> Note that after reload it's perfectly acceptable for a hard register to
> appear with the different SI and SAmodes.
>
> It's probably not too hard to define this with zero_extract sequences
> instead of unspecs, but given that these only appear after reload, it may
> not be worth the effort.
>

   I was able to implement this with unspecs. But now it seems that I
need to split the pattern before reload as well. So I am thinking of
removing this and doing a split before reload. The issue is that there
is no support for the register indirect addressing mode when accessing
the upper eight bits of the 40-bit register. The only addressing mode
supported for accessing this section is (SP+offset). So what I thought
was to allow this addressing mode and, at the time of secondary reload,
fix it up with the help of a scratch register and a
scratch memory. But it seems that in GCC it is not possible to have
both scratch memory and a scratch register for the same operation. Am
I right?
So what I did was to implement this at the define_expand stage itself.
The idea is to generate the following sequence:

for load (R0), D0 generate

load (R0), D0// 32bit mode , SAmode move
load (R0+4), scratch_reg  // 32bit mode, SAmode
store scratch_reg, (SP+off)   //32bit mode, SAmode
load.ext (SP+off), D0.u8

and similarly for store.
 Here are the patterns that i used for this purpose:

(define_expand "movda"
 [(set (match_operand:DA 0 "nonimmediate_operand" "")
   (match_operand:DA 1 "nonimmediate_operand" ""))]
 ""
 "{
  if (MEM_P (operands[1]) && REG_P (XEXP (operands[1], 0))
      && XEXP (operands[1], 0) != virtual_stack_vars_rtx)
    {
      rtx lo_half, hi_half;
      rtx scratch_mem, scratch_reg, subreg;

      gcc_assert (can_create_pseudo_p ());
      scratch_reg = gen_reg_rtx (SAmode);
      scratch_mem = assign_stack_temp (SAmode, GET_MODE_SIZE (SAmode), 0);
      subreg = gen_rtx_SUBREG (SAmode, operands[0], 0);

      lo_half = adjust_address (operands[1], SAmode, 0);
      hi_half = adjust_address (operands[1], SAmode, 4);
      emit_insn (gen_rtx_SET (SAmode, subreg, lo_half));
      emit_insn (gen_rtx_SET (SAmode, scratch_reg, hi_half));
      emit_insn (gen_rtx_SET (SAmode, scratch_mem, scratch_reg));
      emit_insn (gen_load_reg_ext (operands[0], scratch_mem));
      DONE;
    }
   /* and similarly for store operation */
 }"
)

(define_insn "load_reg_ext"
 [(set (subreg:SA (zero_extract:DA (match_operand:DA 0 "register_operand" "=d")
(const_int 8)
(const_int 24)) 4)
   (match_operand:SA 1 "memory_operand" "Sd3"))]

(define_insn "store_reg_ext"
 [(set (match_operand:SA 0 "memory_operand" "=Sd3")
   (zero_extract:SA (match_operand:DA 1 "register_operand" "d")
(const_int 8)
(const_int 24)))]

(define_insn "*movsa_internal"
 [(set (match_operand:SA 0 "nonimmediate_operand" "=m,d,d")
 (match_operand:SA 1 "nonimmediate_operand" "d,m,d"))]


By default -fomit-frame-pointer will be passed to the compiler. Without
optimization the compiler generates the expected output. But with
optimization that is not the case. It seems that the patterns that I
have written above are not proper. For a simple function like the
following

_Accum foo (_Accum *a)
{
   _Accum b = *a;
   return b;
}

with optimization enabled the compiler generates only

load (R0), D0// 32bit mode , SAmode move

the 1st instruction in the expected 4 instruction sequence.
How can I write the patterns properly?

Regards
Shafi
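
The two-part structure of a 40-bit value can be shown with a small
standalone model. This is a sketch with hypothetical names (accum_split,
accum_join) and says nothing about how the RTL should be written; it only
shows that the low 32 bits alone do not reproduce the value.

```c
#include <assert.h>
#include <stdint.h>

#define ACCUM_MASK UINT64_C(0xFFFFFFFFFF)   /* low 40 bits */

/* Split a 40-bit register value into the two parts moved separately:
   the low 32 bits and the upper 8 bits (b32..b39).  */
static void
accum_split (uint64_t reg40, uint32_t *low32, uint8_t *ext8)
{
  assert ((reg40 & ~ACCUM_MASK) == 0);
  *low32 = (uint32_t) reg40;
  *ext8 = (uint8_t) (reg40 >> 32);
}

/* Rebuild the 40-bit value; both parts are needed.  */
static uint64_t
accum_join (uint32_t low32, uint8_t ext8)
{
  return ((uint64_t) ext8 << 32) | low32;
}
```

A sequence that keeps only the first load corresponds to calling accum_join
with a zero extension byte, which loses the upper eight bits whenever they
are non-zero.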


How to support 40bit GP register

2009-10-21 Thread Mohamed Shafi
Hi all,

I am porting GCC 4.4.0 for a 32-bit target. The target has 40-bit data
registers and 32-bit address registers that can be used as general
purpose registers. When 40-bit registers are used for arithmetic
or comparison operations, GCC generates code assuming that
it is a 32-bit register. Whenever there is a move from an address register
to a data register, sign extension is automatically performed by the
target. Since the data register is 40-bit, after some operations
sign/zero extension has to be performed for the result to be correct.
Take the following test case for example:

typedef struct
{
  char b0;
  char b1;
  char b2;
  char b3;
  char b4;
  char b5;
} __attribute__ ((packed)) b_struct;


typedef struct
{
  short a;
  long b;
  short c;
  short d;
  b_struct e;
} __attribute__ ((packed)) a_struct;


int
main(void)
{
  volatile a_struct *a;
  volatile a_struct b;

  a = &b;
  *a = (a_struct){1,2,3,4};
  a->e.b4 = 'c';

  if (a->b != 2)
abort ();

  exit (0);
}

For accessing a->b GCC generates the following code:

   move.l  (sp-16), d3
   lsrr.l  #<16, d3
   move.l  (sp-12),d2
   asll    #<16,d2
   or      d3,d2
   cmpeq.w #<2,d2
   jf  _L2

Because the data registers are 40 bits wide, either the shift count for
the 'asll' operation should be 16+8, or there should be a sign extension
from 32 bits to 40 bits after the 'or' operation. The target has an
instruction to sign extend from 32 bits to 40 bits.

Similarly, there are other operations that require sign/zero extension.
So is there any way to tell GCC that the data registers are 40 bits wide,
and thereby have it generate sign/zero extension accordingly?
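
To make the problem concrete, here is a minimal C sketch that models a
40-bit data register in a uint64_t. The helper names (shl40,
sext32_to_40, cmpeq40) are hypothetical. They only approximate the
behaviour described for 'asll', the 32-to-40-bit sign extend, and a
full-width compare, and are not taken from the target's documentation.

```c
#include <assert.h>
#include <stdint.h>

/* All 40 bits of the modelled data register. */
#define REG40_MASK ((UINT64_C(1) << 40) - 1)

/* Assumed model of a left shift on a 40-bit register: bits shifted
   beyond bit 39 are lost, but bits 32..39 are kept.  */
static uint64_t shl40(uint64_t reg, unsigned count)
{
    assert(count < 40);
    return (reg << count) & REG40_MASK;
}

/* Assumed model of the 32-to-40-bit sign extend: bit 31 is copied
   into bits 32..39.  */
static uint64_t sext32_to_40(uint64_t reg)
{
    uint64_t low = reg & UINT64_C(0xffffffff);
    if (low & UINT64_C(0x80000000))
        low |= UINT64_C(0xff00000000);
    return low;
}

/* Assumed model of a compare that looks at all 40 bits.  */
static int cmpeq40(uint64_t a, uint64_t b)
{
    return (a & REG40_MASK) == (b & REG40_MASK);
}
```

With 0x00010002 in the register, a 16-bit left shift leaves bit 32 set,
so a 40-bit compare against 0x00020000 fails. After the sign extension
from bit 31 the upper byte is cleared and the compare succeeds, which
is the result 32-bit code would expect.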

Regards,
Shafi


Typo in internals

2009-10-23 Thread Mohamed Shafi
Hi,

The internals documentation says:

— Target Hook: bool TARGET_CAN_INLINE_P (tree caller, tree callee)

This target hook returns false if the caller function cannot
inline callee, based on target specific information. By default,
inlining is not allowed if the callee function has function specific
target options and the caller does not use the same options.


But looking in the sources I think this really should have been
TARGET_OPTION_CAN_INLINE_P.


Shafi.


Re: Supporting FP cmp lib routines

2009-10-23 Thread Mohamed Shafi
2009/9/14 Richard Henderson :
> Another thing to look at, since you have hand-written routines and may be
> able to specify that e.g. only a subset of the normal call clobbered
> registers are actually modified, is to leave the call as a "compare" insn.
>  Something like
>
> (define_insn "*cmpsf"
>  [(set (reg:CC status-reg)
>        (compare:CC
>          (match_operand:SF 0 "register_operand" "R0")
>          (match_operand:SF 1 "register_operand" "R1")))
>   (clobber (reg:SI r2))
>   (clobber (reg:SI r3))]
>  ""
>  "call __compareSF"
>  [(set_attr "type" "call")])
>
> Where the R0 and R1 constraints resolve to the input registers for the
> routine.  Depending on your ISA and ABI, you may not even need to split this
> pattern post-reload.
>

I have implemented the above solution and it works. I have to support
the same for DF as well, but with DF I have a problem with the
constraints. My target generates code for both big and little endian.
The ABI specifies that when a 64-bit value is passed as an argument,
it is passed in R6 and R7, with R6 containing the most significant long
word and R7 containing the least significant long word, regardless of
the endianness mode. How can I do this in the DF compare pattern?

Regards,
Shafi


IRA is not looking into the predicates ?

2009-10-30 Thread Mohamed Shafi
Hi,

I am doing a port for a 32-bit target in GCC 4.4.0. The target does not
support symbolic addresses in QImode for load operations. To handle
this, in the define_expand for movqi I reject symbolic addresses if they
appear in source operands, and I have also written a predicate for
*movqi_internal to reject such cases. But I get the following ICE:

 insn does not satisfy its constraints:
(insn 24 5 6 2 ice4.c:4 (set (reg:QI 17 r1)
(mem/c/i:QI (symbol_ref:SI ("s") [flags 0x2] ) [0 s+0 S1 A32])) 0 {*movqi_internal} (nil))


From ice4.c.172r.ira

(insn 24 5 6 2 ice4.c:4 (set (reg:QI 17 r1)
(mem/c/i:QI (symbol_ref:SI ("s") [flags 0x2] ) [0 s+0 S1 A32])) 0 {*movqi_internal} (nil))

(insn 6 24 7 2 ice4.c:4 (set (reg:QI 16 r0 [62])
(plus:QI (reg:QI 17 r1)
(const_int -100 [0xff9c]))) 16 {addqi3} (nil))

From ice4.c.168r.asmcons

(insn 5 2 6 2 ice4.c:4 (set (reg:SI 61 [ s ])
(mem/c/i:SI (symbol_ref:SI ("s") [flags 0x2] ) [0 s+0 S4 A32])) 2 {*movsi_internal} (nil))

(insn 6 5 7 2 ice4.c:4 (set (reg:QI 62)
(plus:QI (subreg:QI (reg:SI 61 [ s ]) 0)
(const_int -100 [0xff9c]))) 16 {addqi3}
(expr_list:REG_DEAD (reg:SI 61 [ s ])
(nil)))

How can I prevent this ICE?

Regards,
Shafi


Re: How to support 40bit GP register

2009-11-04 Thread Mohamed Shafi
2009/10/22 Richard Henderson :
> On 10/21/2009 07:25 AM, Mohamed Shafi wrote:
>>
>> For accessing a->b GCC generates the following code:
>>
>>        move.l  (sp-16), d3
>>        lsrr.l  #<16, d3
>>        move.l  (sp-12),d2
>>        asll    #<16,d2
>>        or      d3,d2
>>        cmpeq.w #<2,d2
>>        jf      _L2
>>
>> Because data registers are 40 bit for 'asll' operation the shift count
>> should be 16+8 or there should be sign extension from 32bit to 40 bits
>> after the 'or' operation. The target has instruction to sign extend
>> from 32bit to 40 bit.
>>
>> Similarly there are other operation that requires sign/zero extension.
>> So is there any way to tell GCC that the data registers are 40bit and
>> there by expect it to generate sign/zero extension accordingly ?
>
> Define a machine mode for your 40-bit type in cpu-modes.def.  Depending on
> how your 40-bit type is stored in memory, you'll use either
>
>  INT_MODE (RI, 5)                // load-store uses exactly 5 bytes
>  FRACTIONAL_INT_MODE (RI, 40, 8) // load-store uses 8 bytes
>
Richard thanks for the reply.

Load-store uses 32 bits. Sign extension happens automatically. So I
have chosen INT_MODE (RI, 5), copied movsi and renamed it to
movri. I have also specified that RImode needs only one register.

> Where I've arbitrarily chosen "RImode" as a mnemonic for Register Integral
> Mode.  Now you define arithmetic operations, as needed, on
> RImode.  You define the "extendsiri" pattern to be that sign-extend from
> 32-to-40-bit instruction.  You define your comparison patterns on RImode,
> and not on SImode, since your comparison instruction works on the entire 40
> bits.

I have defined extendsiri and cbranchri4 patterns. When I compile a
program like

unsigned long xh = 1;
int main ()
{
unsigned long yh = 0xull;
unsigned long z = xh * yh;

 if (z != yh)
   abort ();

return 0;
}

I get the following ICE

internal compiler error: in immed_double_const, at emit-rtl.c:553

This happens when cse_insn () calls insert () -> gen_lowpart ->
gen_lowpart_common -> simplify_gen_subreg -> simplify_immed_subreg.
simplify_immed_subreg is called with the parameters (outermode=RImode,
(const_int 65535), innermode=DImode, byte=0).

cse_insn is called for the following insn

(insn 10 9 11 3 bug7.c:14 (set (reg:RI 67)
(const_int 65535 [0xffff])) 4 {movri} (nil))


How can I overcome this?

Regards,
Shafi

>
> You'll wind up with a selection of patterns in your machine description that
> have a sign-extension pattern built in, depending on the exact behaviour of
> your ISA.  There are plenty of examples on x86_64, mips64, and Alpha (to
> name a few) that have similar properties with SI and DImodes.  Examine the
> -fdump-rtl-combine-details dump for exemplars of the canonical forms that
> the combiner creates when it tries to merge sign-extension instructions into
> preceeding patterns.
>


Re: IRA is not looking into the predicates ?

2009-11-04 Thread Mohamed Shafi
2009/10/30 Jeff Law :
> On 10/30/09 07:13, Mohamed Shafi wrote:
>>
>> Hi,
>>
>> I am doing a port for a 32bit target in GCC 4.4.0. The target does not
>> have support for symbolic address in QImode for load operations.
>
> You'll need to make sure to reject such addresses for QImode in
> GO_IF_LEGITIMATE_ADDRESS.
>
>
>>  In
>> order to do this what i have done is in define_expand for moveqi
>> reject symbolic address it they come in source operands and i have
>> also written a predicate for *moveqi_internal to reject such cases.
>>
>
> OK.  Nothing wrong with these steps.  Though you really need to make sure
> GO_IF_LEGITIMATE_ADDRESS is defined correctly.
>
> IRA doesn't look at operand predicates or insn conditions.  It assumes that
> any insns are valid assuming any pseudo registers appearing in the insn get
> suitable hard registers.
>
> Based on the dumps you provided it appears that reg61 does not get a hard
> register and reload is generating the problematical insn #24.  This is a
> good indication that your GO_IF_LEGITIMATE_ADDRESS is incorrectly
> implemented.
>
   In the GO_IF_LEGITIMATE_ADDRESS macro I am allowing this
address because the target supports symbolic addresses in QImode for
store operations, and in GO_IF_LEGITIMATE_ADDRESS there is no way to
check whether the address is used in a load or a store. That's why,
in the define_expand for movqi, I reject symbolic addresses if they
appear in source operands, and have a predicate for *movqi_internal to
reject such cases. But I am still getting the ICE. IIRC control does
not reach TARGET_SECONDARY_RELOAD either. How can I overcome this?

Regards,
Shafi


Re: IRA is not looking into the predicates ?

2009-11-04 Thread Mohamed Shafi
2009/10/30 Ian Lance Taylor :
> Mohamed Shafi  writes:
>
>>>From ice4.c.168r.asmcons
>>
>> (insn 5 2 6 2 ice4.c:4 (set (reg:SI 61 [ s ])
>>         (mem/c/i:SI (symbol_ref:SI ("s") [flags 0x2] > 0xb7bfd000 s>) [0 s+0 S4 A32])) 2 {*movsi_internal} (nil))
>>
>> (insn 6 5 7 2 ice4.c:4 (set (reg:QI 62)
>>         (plus:QI (subreg:QI (reg:SI 61 [ s ]) 0)
>>             (const_int -100 [0xff9c]))) 16 {addqi3}
>> (expr_list:REG_DEAD (reg:SI 61 [ s ])
>>         (nil)))
>>
>> How can i prevent this ICE ?
>
> If asmcons is the first place that this appears, then I think it must
> be coming from some asm statement.  So the first step would be to look
> at the asm statement and see if it can be rewritten using a different
> constraint.
>
   No, this appears from RTL expand onwards.

Shafi


Re: How to write shift and add pattern?

2009-11-09 Thread Mohamed Shafi
2009/11/6 Richard Henderson :
> On 11/06/2009 05:29 AM, Mohamed Shafi wrote:
>>
>>     The target that i am working on has 1&  2 bit shift-add patterns.
>> GCC is not generating shift-add patterns when the shift count is 1. It
>> is currently generating add operations. What should be done to
>> generate shift-add pattern instead of add-add pattern?
>
> I'm not sure.  You may have to resort to matching
>
>  (set (match_operand 0 "register_operand" "")
>       (plus (plus (match_operand 1 "register_operand" "")
>                   (match_dup 1))
>             (match_operand 2 "register_operand" "")))
>
> But you should debug make_compound_operation first to
> figure out what's going on for your port, because it's
> working for x86_64:
>
>        long foo(long a, long b) { return a*2 + b; }
>
>        leaq    (%rsi,%rdi,2), %rax     # 8     *lea_2_rex64
>        ret                             # 26    return_internal
>
>
> r~
>

   I have fixed this. The culprit was the cost factor. I added the
case in targetm.rtx_costs and now it works properly. But I am having
issues with reload.

Regards,
Shafi


Re: How to write shift and add pattern?

2009-11-09 Thread Mohamed Shafi
2009/11/6 Ian Lance Taylor :
> Mohamed Shafi  writes:
>
>> It is generating with data registers. Here is the pattern that i have
>> written:
>>
>>
>> (define_insn "*saddl"
>>   [(set (match_operand:SI 0 "register_operand" "=r,d")
>>       (plus:SI (mult:SI (match_operand:SI 1 "register_operand" "r,d")
>>                         (match_operand:SI 2 "const24_operand" "J,J"))
>>                (match_operand:SI 3 "register_operand" "0,0")))]
>>
>> How can i do this. Will the constraint modifiers '?' or '!' help?
>> How can make GCC generate shift and add sequence when the shift count is 1?
>
> Does 'd' represent a data register?  I assume that 'r' is a general
> register, as it always is.  What is the constraint character for an
> address register?  You don't seem to have an alternative here for
> address registers, so I'm not surprised that the compiler isn't
> picking it.  No doubt I misunderstand something.
>
   OK, the constraint for address registers is 'a'. That's a typo in the
pattern that I gave here. The proper pattern is

 (define_insn "*saddl"
   [(set (match_operand:SI 0 "register_operand" "=a,d")
   (plus:SI (mult:SI (match_operand:SI 1 "register_operand" "a,d")
 (match_operand:SI 2 "const24_operand" "J,J"))
(match_operand:SI 3 "register_operand" "0,0")))]

So how can I choose the address registers over data registers if that
is more profitable?

Regards,
Shafi


Re: How to support 40bit GP register

2009-11-09 Thread Mohamed Shafi
2009/10/22 Richard Henderson :
> On 10/21/2009 07:25 AM, Mohamed Shafi wrote:
>>
>> For accessing a->b GCC generates the following code:
>>
>>        move.l  (sp-16), d3
>>        lsrr.l  #<16, d3
>>        move.l  (sp-12),d2
>>        asll    #<16,d2
>>        or      d3,d2
>>        cmpeq.w #<2,d2
>>        jf      _L2
>>
>> Because data registers are 40 bit for 'asll' operation the shift count
>> should be 16+8 or there should be sign extension from 32bit to 40 bits
>> after the 'or' operation. The target has instruction to sign extend
>> from 32bit to 40 bit.
>>
>> Similarly there are other operation that requires sign/zero extension.
>> So is there any way to tell GCC that the data registers are 40bit and
>> there by expect it to generate sign/zero extension accordingly ?
>
> Define a machine mode for your 40-bit type in cpu-modes.def.  Depending on
> how your 40-bit type is stored in memory, you'll use either
>
>  INT_MODE (RI, 5)                // load-store uses exactly 5 bytes
>  FRACTIONAL_INT_MODE (RI, 40, 8) // load-store uses 8 bytes
>
> Where I've arbitrarily chosen "RImode" as a mnemonic for Register Integral
> Mode.  Now you define arithmetic operations, as needed, on
> RImode.  You define the "extendsiri" pattern to be that sign-extend from
> 32-to-40-bit instruction.  You define your comparison patterns on RImode,
> and not on SImode, since your comparison instruction works on the entire 40
> bits.
>
> You'll wind up with a selection of patterns in your machine description that
> have a sign-extension pattern built in, depending on the exact behaviour of
> your ISA.  There are plenty of examples on x86_64, mips64, and Alpha (to
> name a few) that have similar properties with SI and DImodes.  Examine the
> -fdump-rtl-combine-details dump for exemplars of the canonical forms that
> the combiner creates when it tries to merge sign-extension instructions into
> preceeding patterns.
>
  OK, I have comparison patterns written in RImode. When you say that I
will wind up with a selection of patterns, do you mean that I
should have patterns in RImode for operations that operate on the full
40 bits, and disable the corresponding SImode patterns? Or is it that I
have to write nameless patterns in RImode for arithmetic operations
and look at the dumps to see how the combiner will merge the patterns
so that it can match the comparison operations?

Regards,
Shafi


How to split mulsi3 pattern

2009-11-10 Thread Mohamed Shafi
Hello all,

I am doing a port for a 32-bit target in GCC 4.4.0. On my target a 32-bit
multiply is carried out in two instructions.

Dn = Da x Db is executed as

Dn = (Da.L * Db.H + Da.H * Db.L) << 16
Dn = Dn + (Da.L * Db.L)
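
The two-step sequence above can be checked with a short C sketch. It
assumes that .L and .H denote the low and high 16-bit halves of a
register and that all arithmetic wraps modulo 2^32. The function name
mul32_two_step is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Compute a 32-bit product in the two steps described above.
   Assumption: .L/.H are the low/high 16-bit halves, and every
   operation wraps modulo 2^32 (as uint32_t arithmetic does).  */
static uint32_t mul32_two_step(uint32_t da, uint32_t db)
{
    uint32_t da_l = da & 0xffffu, da_h = da >> 16;
    uint32_t db_l = db & 0xffffu, db_h = db >> 16;

    /* Step 1: Dn = (Da.L * Db.H + Da.H * Db.L) << 16  */
    uint32_t dn = (da_l * db_h + da_h * db_l) << 16;

    /* Step 2: Dn = Dn + (Da.L * Db.L)  */
    dn += da_l * db_l;

    /* The Da.H * Db.H term only affects bits 32 and above, so it
       never reaches the 32-bit result.  */
    assert(dn == da * db);
    return dn;
}
```

Note that the second step needs the full 32-bit product of two 16-bit
halves, which is why the intermediate multiplies cannot be modelled as
16-bit (HImode) results.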

Currently the pattern that i have for this is as follows:

(define_insn "mulsi3"
 [(set (match_operand:SI 0 "register_operand"   "=&d")
   (mult:SI (match_operand:SI 1 "register_operand"  "%d")
   (match_operand:SI 2 "register_operand" "d")))]

I would like to split this pattern into two (either before or after
reload). Currently I am doing something like this:

(define_insn_and_split "mulsi3"
 [(set (match_operand:SI 0 "register_operand"   "=&d")
   (mult:SI (match_operand:SI 1 "register_operand"  "%d")
(match_operand:SI 2 "register_operand" "d")))]
 ""
 "#"
 "reload_completed"
 [(set (match_dup 0)
   (ashift:SI
(plus:SI (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_LOW)
  (unspec:HI [(match_dup 1)] UNSPEC_REG_HIGH))
 (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_HIGH)
  (unspec:HI [(match_dup 1)] UNSPEC_REG_LOW)))
(const_int 16)))
  (set (match_dup 0)
   (plus:SI (match_dup 0)
(mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_LOW)
 (unspec:HI [(match_dup 1)] UNSPEC_REG_LOW))))]
  ""
)

But in a few test cases this is creating problems. So I would like to
know of better ways to split the mulsi3 pattern.
Can someone help me out?

Regards,
Shafi


Re: How to split mulsi3 pattern

2009-11-12 Thread Mohamed Shafi
2009/11/10 Richard Henderson :
> On 11/10/2009 05:48 AM, Mohamed Shafi wrote:
>>
>> (define_insn "mulsi3"
>>  [(set (match_operand:SI 0 "register_operand"           "=&d")
>>       (mult:SI (match_operand:SI 1 "register_operand"  "%d")
>>               (match_operand:SI 2 "register_operand" "d")))]
>
> Note that "%" is only useful if the constraints for the two operands are
> different (e.g. only one operand accepts an immediate input).  When they're
> identical, you simply waste cpu cycles asking reload to try the operands in
> the other order.
>
>>  [(set (match_dup 0)
>>        (ashift:SI
>>         (plus:SI (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_LOW)
>>                           (unspec:HI [(match_dup 1)] UNSPEC_REG_HIGH))
>>                  (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_HIGH)
>>                           (unspec:HI [(match_dup 1)] UNSPEC_REG_LOW)))
>>         (const_int 16)))
>>   (set (match_dup 0)
>>        (plus:SI (match_dup 0)
>>                 (mult:HI (unspec:HI [(match_dup 2)] UNSPEC_REG_LOW)
>>                          (unspec:HI [(match_dup 1)] UNSPEC_REG_LOW))))]
>
> Well for one, your modes don't match.  You actually want your unspecs and
> MULTs to be SImode.
>
> You could probably usefully model the second insn as
>
> (define_insn "mulsi3_part2"
>  [(set (match_operand:SI 0 "register_operand" "=d")
>        (plus:SI
>          (mult:SI (zero_extend:SI
>                     (match_operand:HI 1 "register_operand" "d"))
>                   (zero_extend:SI
>                     (match_operand:HI 2 "register_operand" "d")))
>          (match_operand:SI 3 "register_operand" "0")))]
>  ""
>  ...)

So I need to change the mode of the register from SI to HI after
reload. Is that allowed?

Regards,
Shafi


How to support 40bit GP register - Take two

2009-11-19 Thread Mohamed Shafi
Hello all,

I am porting GCC 4.4.0 to a 32-bit target. The target has 40-bit data
registers and 32-bit address registers. Both can be used as general
purpose registers. All load and store operations are 32-bit. If a 40-bit
data register is involved in a load/store, the register gets sign
extended. Whenever there is a move from an address register to a data
register, sign extension is performed automatically. Currently GCC
generates code as for a target with 32-bit registers. Since the data
registers are 40 bits wide, sign/zero extension has to be performed
before or after some operations for the result to be correct. So at
present the results for the port are not correct. I need a solution to
fix this.

I had mailed about this previously. You can see about this here
http://www.mail-archive.com/gcc@gcc.gnu.org/msg47224.html

I tried implementing the suggestion given by Richard, but ran into
issues. The GCC framework is written assuming that there are no modes
with HOST_BITS_PER_WIDE_INT < GET_MODE_BITSIZE (mode) < 2 *
HOST_BITS_PER_WIDE_INT. Moreover, I am getting ICEs when there is an
optimization/operation related to subregs (GCC tries to split RImode
values). RImode is 5 bytes and uses SImode load/store instructions, so
GCC generates offsets/addresses that are not 32-bit aligned. Currently
I am hacking the compiler all the way to get an executable (though I
have not tested the output of the resulting executables). Even if I
somehow manage to get proper output, there is the issue of using 32-bit
registers in RImode instructions. RImode values are meant for 40-bit
registers, i.e. data registers. That means I will not be able to use
address registers (32-bit registers) in RImode patterns even though the
instructions accept them. This will definitely hamper efficiency.

So I was wondering if anybody has an alternative solution that I can
try. All I can think of is to flag an insn for unsigned operation so that
I will be able to insert sign/zero extension during, say, the reorg pass.
Can this be implemented? How feasible is this?

Regards,
Shafi


Re: How to support 40bit GP register - Take two

2009-12-17 Thread Mohamed Shafi
2009/12/18 Hans-Peter Nilsson :
> On Fri, 20 Nov 2009, Mohamed Shafi wrote:
>> I tried implementing the suggestion given by Richard, but got into
>> issues. The GCC frame work is written assuming that there are no modes
>> with HOST_BITS_PER_WIDE_INT < GET_MODE_BITSIZE (mode) < 2 *
>> HOST_BITS_PER_WIDE_INT.
>
> (Not seeing a reply regarding this issue, so here's mine, belated:)
>
> Perhaps a wart, but with a 64-bit HOST_BITS_PER_WIDE_INT, would
> that affect your port?  It's not?  Just set need_64bit_hwint=yes
> in config.gcc.  And send a patch for the introductory comment in
> that file, unless your port already matches the "BITS_PER_WORD >
> 32 bits" condition.
>
   Thanks Hans for your reply.
   I have already tried that. What you are suggesting is the first
solution that I got from Richard Henderson. I have mentioned the
issues I faced with this in my mail. The GCC framework is written
assuming that there are no modes with HOST_BITS_PER_WIDE_INT <
GET_MODE_BITSIZE (mode) < 2 * HOST_BITS_PER_WIDE_INT. So I had to hack
in places to get things working. For my target BITS_PER_WORD ==
32. The mode that I am using is RImode (5 bytes).

Regards,
Shafi


How to implement pattens with more that 30 alternatives

2009-12-21 Thread Mohamed Shafi
Hi all,

I am doing a port in GCC 4.4.0 for a 32-bit target. As a part of the
scheduling framework I have to write the move patterns with more
clarity, so that I can control the scheduling with the help of
attributes. Rewriting the pattern resulted in a movsi pattern with 41
alternatives :(  When I specify the attributes, it seems that all the
alternatives above 31 are given the default value of the
attribute. This is done in the generated file insn-attrtab.c. The
following is one such piece of code:


case 2:  /* *movsi_internal */
  extract_constrain_insn_cached (insn);
  if (((1 << which_alternative) & 0xf))
{
  return DELAY_SLOT_TYPE_CLOB_SR;
}
  else if (((1 << which_alternative) & 0x30))
{
  return DELAY_SLOT_TYPE_RW_SP;
}
  else if (which_alternative == 6)
{
  return DELAY_SLOT_TYPE_CLOB_SR;
}
  else if (((1 << which_alternative) & 0x1fff80))
{
  return DELAY_SLOT_TYPE_COMMON;
}
  else if (((1 << which_alternative) & 0x1e0))
{
  return DELAY_SLOT_TYPE_RW_SP;
}
  else if (which_alternative == 25)
{
  return DELAY_SLOT_TYPE_READ_SR;
}
  else if (which_alternative == 26)
{
  return DELAY_SLOT_TYPE_READ_EMR;
}
  else if (which_alternative == 27)
{
  return DELAY_SLOT_TYPE_COMMON;
}
  else if (which_alternative == 28)
{
  return DELAY_SLOT_TYPE_WRITE_SR;
}
  else
{
  return DELAY_SLOT_TYPE_COMMON;
}


As you can see from the above code, all the alternatives above 31
will always get the default value of the attribute. This is
because GCC assumes that a pattern has at most 31 alternatives. Even
after changing the macro

#define MAX_RECOG_ALTERNATIVES 30

in the file recog.h there is no change in this assumption (which I
think should have affected the attribute calculation). I guess that if
I set need_64bit_hwint=yes, this problem should go away. I haven't
checked this. But I don't want to do that, since it means that I
will have to change all the dependencies that are affected by this
change. Is there any other solution for my problem?
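
The limitation can be illustrated with a small C sketch of the
bit-mask membership test used in the generated code. The helper names
in_mask32 and in_mask64 are hypothetical, and the sketch only shows why
a 32-bit mask cannot describe alternative 40. It does not claim to
reproduce what genattrtab emits, nor that a wider host integer changes
its output.

```c
#include <assert.h>
#include <stdint.h>

/* Membership test in the style of ((1 << which_alternative) & MASK),
   using a 32-bit mask.  Only alternatives 0..31 are representable;
   shifting a 32-bit value by 32 or more is undefined in C.  */
static int in_mask32(unsigned alt, uint32_t mask)
{
    assert(alt < 32);
    return ((UINT32_C(1) << alt) & mask) != 0;
}

/* The same test with a 64-bit mask, which has room for
   alternatives 0..63.  */
static int in_mask64(unsigned alt, uint64_t mask)
{
    assert(alt < 64);
    return ((UINT64_C(1) << alt) & mask) != 0;
}
```

For example, in_mask32 (3, 0xf) is true, as in the first branch of the
generated code, while an alternative such as 40 can only be tested
with the 64-bit variant.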

Any help is appreciated.

Regards,
Shafi


Re: How to implement pattens with more that 30 alternatives

2009-12-22 Thread Mohamed Shafi
2009/12/22 Richard Earnshaw :
>
> On Mon, 2009-12-21 at 18:44 +, Paul Brook wrote:
>> > > > I am doing a port in GCC 4.4.0 for a 32 bit target. As a part of
>> > > > scheduling framework i have to write the move patterns with more
>> > > > clarity, so that i could control the scheduling with the help of
>> > > > attributes. Re-writting the pattern resulted in movsi pattern with 41
>> > > > alternatives :(
>> > >
>> > > Use rtl expressions instead of alternatives. e.g. arm.md:arith_shiftsi
>> >
>> > Or use the more modern iterators approach.
>>
>> Aren't iterators for generating multiple insns (e.g. movsi and movdi) from 
>> the
>> same pattern, whereas in this case we have a single insn  that needs to 
>> accept
>> many different operand combinartions?
>
> Yes, but that is often better, I suspect, than having too fancy a
> pattern that breaks the optimization simplifications that genrecog does.
>
> Note that the attributes that were requested could be made part of the
> iterator as well, using a mode_attribute.
>
  I can't find a back-end that does this. Can you show me an example?

Regards,
Shafi


Question about peephole2 and addressing mode

2010-01-21 Thread Mohamed Shafi
Hello all,

I am doing a port for a 32-bit target in GCC 4.4.0. The target
supports the (base + offset) addressing mode for QImode store
instructions but not for QImode load instructions. GCC doesn't take a
middle path: it either supports an addressing mode completely or doesn't
support it at all. I tried a lot of hacks to support the (base + offset)
addressing mode only for QImode store instructions. After a lot of
fighting I finally gave up and removed the QImode support for this
addressing mode completely in the GO_IF_LEGITIMATE_ADDRESS macro. Now I
am pursuing an alternative solution: have peephole2 patterns implement
the QImode (base + offset) addressing mode for store instructions. How
does that sound?

So now i have written a peephole2 pattern like:

(define_peephole2
 [(parallel
   [(set (match_operand:SI 0 "register_operand" "")
 (plus:SI (match_operand:SI 1 "register_operand" "")
  (match_operand:SI 2 "const_int_operand" "")))
(clobber (reg:CCC CC_REGNUM))
(clobber (reg:CCO EMR_REGNUM))])
  (set (mem:QI (match_dup 0))
   (match_operand:QI 3 "register_operand" ""))]
 "REGNO_OK_FOR_BASE_P (REGNO (operands[1]))
  && constraint_satisfied_p (operands[2], CONSTRAINT_N)"
 [(set (mem:QI (plus:SI (match_dup 1) (match_dup 2)))
   (match_dup 3))]
 "")


In the rtl dumps just before peephole2 pass i get

(insn 213 211 215 39 20010408-1.c:71 (parallel [
(set (reg/f:SI 16 r0 [121])
(plus:SI (reg/v/f:SI 18 r2 [orig:93 p ] [93])
(const_int -1 [0x])))
(clobber (reg:CCC 50 sr))
(clobber (reg:CCO 54 emr))
]) 18 {addsi3} (expr_list:REG_UNUSED (reg:CCO 54 emr)
(expr_list:REG_UNUSED (reg:CCC 50 sr)
(nil))))

(insn 215 213 214 39 20010408-1.c:71 (set (mem/f/c/i:SI (plus:SI
(reg/f:SI 23 r7)
(const_int -32 [0xffe0])) [5 s+0 S4 A32])
(reg/v/f:SI 18 r2 [orig:93 p ] [93])) 2 {*movsi_internal}
(expr_list:REG_DEAD (reg/v/f:SI 18 r2 [orig:93 p ] [93])
(nil)))

(insn 214 215 284 39 20010408-1.c:71 (set (mem:QI (reg/f:SI 16 r0
[121]) [0 S1 A8])
(reg/v:QI 6 d6 [orig:92 ch ] [92])) 0 {*movqi_internal}
(expr_list:REG_DEAD (reg/f:SI 16 r0 [121])
(expr_list:REG_DEAD (reg/v:QI 6 d6 [orig:92 ch ] [92])
(nil))))


This is not matched by the peephole2 pattern. After debugging I see that
the function 'peephole2_insns' matches only consecutive insns. Is
that true? Is there a way to overcome this?

Another issue: in another instance peephole2 matched, but the generated
pattern was not recognized because GO_IF_LEGITIMATE_ADDRESS was
rejecting the addressing mode. Since the peephole2 pass runs after
reload, I changed the GO_IF_LEGITIMATE_ADDRESS macro to allow the
addressing mode after reload is completed. So now the check is
something like this:

    case PLUS:
      {
        rtx offset = XEXP (x, 1);
        rtx base = XEXP (x, 0);

        if (!(BASE_REG_RTX_P (base, strict) || STACK_REG_RTX_P (base)))
          return 0;

        /* For QImode the target does not support (base + offset)
           addresses in load instructions.  So we disable this
           addressing mode until reload is completed.  */
        if (!reload_completed && mode == QImode
            && BASE_REG_RTX_P (base, strict))
          return 0;

I haven't run the testsuite, but is it OK to do it like this?
Please let me know your thoughts on this.

Thanks for your time.

Regards
Shafi


Re: Question about peephole2 and addressing mode

2010-01-22 Thread Mohamed Shafi
2010/1/22 Richard Henderson :
> On 01/21/2010 06:22 AM, Mohamed Shafi wrote:
>>
>> Hello all,
>>
>> I am doing a port for a 32bit a target in GCC 4.4.0. The target
>> supports (base + offset) addressing mode for QImode store instructions
>> but not for QImode load instructions. GCC doesn't take the middle
>> path. It either supports an addressing mode completely and doesn't
>> support at all. I tried lot of hacks to support  (base + offset)
>> addressing mode only for QI mode store instructions. After a lot of
>> fight i finally gave up and removed the QImode support for this
>> addressing mode completely in GO_IF_ LEGITIMATE_ADDRESS macro. Now i
>> am pursing an alternate solution. Have peephole2 patterns to implement
>> QImode (base+offset) addressing mode for store instructions. How does
>> it sound?
>
> It doesn't sound totally implausible.  But as you notice, peepholes only act
> on sequential instructions.  In order to assist generating sequential
> instructions, you could try allowing base+offset in non-strict mode.
>
> I fear you'll likely have to use a combination of methods in order to get
> decent code here.

 Can you point out the combination of methods you have in mind? And by
the way, is it possible to allow the addressing mode completely and
then break it down into register indirect addressing after the reload
pass?

Regards,
Shafi


How to write legitimize_reload_address

2010-01-22 Thread Mohamed Shafi
Hi all,

I am doing a port of a 32-bit target in GCC 4.4.0. I have written the
macro legitimize_reload_address, which does something similar to what
the PA target does, i.e.

   For the PA, transform:

        memory(X + <large int>)

   into:

        if (<large int> & mask) >= 16
          Y = (<large int> & ~mask) + mask + 1  Round up.
        else
          Y = (<large int> & ~mask)             Round down.
        Z = X + Y
        memory (Z + (<large int> - Y));


The input for the macro is

(plus:SI (reg/f:SI 23 r7)
(const_int 65536 [0x10000]))

and the target supports only a 15-bit signed offset, so in
legitimize_reload_address I have

 mask = 0x3fff;
 offset = INTVAL (XEXP ((addr), 1));

  /* Choose rounding direction.  Round up if we are >= halfway.  */
  if ((offset & mask) >= ((mask + 1) / 2))
newoffset = (offset & ~mask) + mask + 1;
  else
newoffset = offset & ~mask;

  /* Ensure that long displacements are aligned.  */
  newoffset &= ~(GET_MODE_SIZE (mode) - 1);

  if (newoffset)
{
  temp = gen_rtx_PLUS (Pmode, XEXP (addr, 0),
   GEN_INT (newoffset));
  addr = gen_rtx_PLUS (Pmode, temp, GEN_INT (offset - newoffset));
  push_reload (XEXP (addr, 0), 0, &XEXP (addr, 0), 0,
   BASE_REG_CLASS, Pmode, VOIDmode, 0, 0,
   opnum, type);
  return addr;
}
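
The offset arithmetic in the excerpt above can be tried in isolation
with the following C sketch. The function split_offset is a
hypothetical stand-in that mirrors only the rounding and alignment
steps. It does not touch RTL or call push_reload, and it assumes
non-negative offsets and a power-of-two mode size.

```c
#include <assert.h>

/* Split OFFSET into a high part (returned) and a small remainder
   (stored in *REM), following the rounding logic of the excerpt.
   MASK is the low-bit mask (0x3fff in the excerpt) and MODE_SIZE
   stands in for GET_MODE_SIZE (mode).  */
static long split_offset(long offset, long mask, long mode_size, long *rem)
{
    long newoffset;

    /* The alignment step below only makes sense for powers of two.  */
    assert(mode_size > 0 && (mode_size & (mode_size - 1)) == 0);

    /* Choose rounding direction.  Round up if we are >= halfway.  */
    if ((offset & mask) >= ((mask + 1) / 2))
        newoffset = (offset & ~mask) + mask + 1;
    else
        newoffset = offset & ~mask;

    /* Ensure that long displacements are aligned.  */
    newoffset &= ~(mode_size - 1);

    *rem = offset - newoffset;
    return newoffset;
}
```

For the input offset 65536 with mask 0x3fff and a 4-byte mode, the
high part is 65536 and the remainder is 0, so the whole constant ends
up in the part that is handed to reload. Whether that is related to
the ICE reported below is not established here, but it may be a
useful value to trace in a debugger.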


The macro is defined like this:

#define LEGITIMIZE_RELOAD_ADDRESS(X,MODE,OPNUM,TYPE,IND_L,WIN) \
do {   \
  rtx new_x = legitimize_reload_address (X, MODE, OPNUM, TYPE, IND_L); \
  if (new_x)   \
{  \
  X = new_x;   \
  goto WIN;\
}  \
} while (0)

The issue that I am facing is that if I return null_rtx without doing
any processing, the compiler works properly. But if
legitimize_reload_address gets executed and jumps to the label WIN,
I get an ICE.

ice1.c:5: error: unrecognizable insn:
(insn 45 44 20 2 ice1.c:5 (set (mem/c:SI (plus:SI (reg:SI 16 r0)
(const_int 65536 [0x10000])) [4 S4 A32])
(reg:SI 2 d2)) -1 (nil))


Is there something wrong with my legitimize_reload_address?

Thanks for your time.
Regards,
shafi


how to identify a part of a multi-word register

2010-02-10 Thread Mohamed Shafi
Hi,

I am doing a port for a 32bit target in GCC 4.4.0. I need a way to
identify that a register is part of a multiword register. I need to
emit an instruction that works on LSW of the double word register on
move instructions. Currently the target splits the DImode and DFmode
moves after reloading. So i am able to generate the required
instruction while doing the split. But it seems that sometimes the
subreg pass splits the multiword register into SImode or SFmode
register references before reg-alloc. Since it is not required to
split these moves, I am not able to insert the required instruction
for LSW.  So I was wondering if it is possible to recognize a register
as a part of a multiword register? In the rtl-dumps there are
expressions like :

(insn 255 254 256 2 pr28634.c:13 (set (mem/v/c/i:SI (plus:SI (reg/f:SI 49 sp)
(const_int -16 [0xfff0])) [2 y+0 S4 A64])
(reg:SI 2 d2)) 2 {*movsi_internal} (nil))

(insn 256 255 257 2 pr28634.c:13 (set (mem/v/c/i:SI (plus:SI (reg/f:SI 49 sp)
(const_int -12 [0xfff4])) [2 y+4 S4 A32])
(reg:SI 3 d3 [+4 ])) 2 {*movsi_internal} (nil))

which points out that d3 is part of a multiword register. Looking into
the gcc sources I find that this is done with the help of REG_OFFSET
macro. So can I use this macro to identify a register as a part of
multiword register? Is there any other way to do this?

Regards,
Shafi


Variable Length Execution Set?

2009-05-27 Thread Mohamed Shafi
Hi all,

Does GCC support architectures that have a Variable Length Execution Set (VLES)?
Is there any development happening in this direction?

Regards,
Shafi


Re: Variable Length Execution Set?

2009-05-27 Thread Mohamed Shafi
2009/5/27 Ian Lance Taylor :
> Mohamed Shafi  writes:
>
>> Does GCC support architectures that has Variable Length Execution Set (VLES)?
>> Are there any developments happening in this direction?
>
> gcc supports many instruction sets whose instructions are not all the
> same size, including x86.  In particular, gcc supports ia64, which uses
> bundling.  If you mean something else, I think you need to give more
> details.

I know that GCC supports VLIW. VLES is similar to VLIW, except that
a packet can hold a variable number of instructions, i.e. each packet
must contain at least one instruction and at most 6 instructions.

Shafi


About feasibility of implementing an instruction

2009-07-01 Thread Mohamed Shafi
Hello all,

I just want to know about the feasibility of implementing an
instruction for a port in GCC 4.4.
The target has 40-bit registers, where the normal load/store/move
instructions can access only the low 32 bits of a register. In
order to move data into the rest of the register [b32 to b39], the data
first has to be stored into a 32-bit memory location. The data must be
placed so that if it is stored in bits 0-7 of the memory word it
can be moved to b32-b39 of an even register, and if it is stored in
bits 16-23 of the memory word it can be moved to
b32-b39 of an odd register. I hope I have made myself clear.
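
The data movement described above can be modelled in plain C. This is only a
sketch of the semantics as stated in this mail, not target code: the function
name load_ext_byte and the choice of holding a 40-bit register in a uint64_t
are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 40-bit register kept in the low bits of a uint64_t. */
#define REG40_MASK UINT64_C(0xFFFFFFFFFF)
#define LOW32_MASK UINT64_C(0x00FFFFFFFF)

/* Fill b32..b39 of register number `regno` from a 32-bit memory word.
   Even registers take bits 0-7 of the word, odd registers take bits 16-23.
   The low 32 bits of the register are left unchanged.  */
static uint64_t load_ext_byte(uint64_t reg, unsigned regno, uint32_t mem_word)
{
    unsigned shift = (regno & 1u) ? 16u : 0u;
    uint64_t ext = (uint64_t)((mem_word >> shift) & 0xFFu);

    assert((reg & ~REG40_MASK) == 0);   /* value must fit in 40 bits */
    return (reg & LOW32_MASK) | (ext << 32);
}
```

With a memory word of 0x00AB00CD, an even register receives 0xCD in b32-b39
and an odd register receives 0xAB.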

Will it be possible to implement this in the gcc back-end so that the
particular instruction is supported?


Regards,
Shafi


CALL_USED_REGISTERS vs CALL_REALLY_USED_REGISTERS

2009-07-09 Thread Mohamed Shafi
Hello all,

The GCC 4.4.0 internals manual says:
[Macro] CALL_REALLY_USED_REGISTERS
Like CALL_USED_REGISTERS except this macro doesn’t require that the
entire set of FIXED_REGISTERS be included. (CALL_USED_REGISTERS must
be a superset of FIXED_REGISTERS). This macro is optional. If not
specified, it defaults to the value of CALL_USED_REGISTERS.

But it doesn't say why one needs to use this.
What is the need for the macro CALL_REALLY_USED_REGISTERS when
compared to CALL_USED_REGISTERS?
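
The superset requirement in the quoted text can be illustrated with a small
standalone check. Everything here is hypothetical: an 8-register target in
which registers 6 and 7 are fixed, and the assumption (for illustration only)
that those two fixed registers are not actually clobbered by calls. The arrays
only mimic the 0/1 initializer style of the real macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_REGS 8   /* hypothetical register count */

/* 1 = register belongs to the set, in the style of the macro initializers. */
static const char fixed_regs[NUM_REGS]            = {0, 0, 0, 0, 0, 0, 1, 1};
static const char call_used_regs[NUM_REGS]        = {1, 1, 1, 0, 0, 0, 1, 1};
static const char call_really_used_regs[NUM_REGS] = {1, 1, 1, 0, 0, 0, 0, 0};

/* Return true if every register in `inner` is also in `outer`. */
static bool is_superset(const char *outer, const char *inner, size_t n)
{
    assert(outer != NULL && inner != NULL);
    for (size_t i = 0; i < n; i++)
        if (inner[i] && !outer[i])
            return false;
    return true;
}
```

In this sketch call_used_regs has to list registers 6 and 7 only because they
are fixed, while call_really_used_regs is free to leave them out.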

regards,
Shafi


Help for target with BITS_PER_UNIT = 16

2010-08-16 Thread Mohamed Shafi
Hello all,

I am trying to port GCC 4.5.1 for a processor that has the following
addressing capability:

The data memory address space of 64K bytes is represented by a total
of 15 bits, with each address selecting a 16-bit element. When using
the address register, the LSB of address reg (AD) points to a 16-bit
field in data memory. If a data memory line is 128 bits there are 8,
16-bit elements per data memory line. We use little endian addressing,
so
if AD=0, bits [15:0] of data memory line address 0 would be selected.
If AD=1, bits [31:16] of data memory line address 0 would be selected.
If AD=9, bits [31:16] of data memory line address 1 would be selected.
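
The mapping from AD to a memory line and a bit range can be written out as a
small C sketch. The struct and function names are made up for illustration;
the constants (16-bit elements, 128-bit lines, 15-bit addresses) are the ones
given above.

```c
#include <assert.h>

#define ELEM_BITS      16
#define LINE_BITS      128
#define ELEMS_PER_LINE (LINE_BITS / ELEM_BITS)   /* 8 elements per line */

/* Hypothetical description of where one 16-bit element lives. */
struct elem_loc {
    unsigned line;     /* data memory line address */
    unsigned lo_bit;   /* lowest bit of the element within the line */
    unsigned hi_bit;   /* highest bit of the element within the line */
};

/* Little-endian element addressing: AD counts 16-bit elements. */
static struct elem_loc locate(unsigned ad)
{
    struct elem_loc loc;

    assert(ad < (1u << 15));                     /* 15-bit address space */
    loc.line   = ad / ELEMS_PER_LINE;
    loc.lo_bit = (ad % ELEMS_PER_LINE) * ELEM_BITS;
    loc.hi_bit = loc.lo_bit + ELEM_BITS - 1;
    return loc;
}
```

locate(9) gives line 1 and bits [31:16], matching the third case listed above.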

So if i have the following program

short arr[5] = {11,12,13,14,15};

int foo ()
{
    short a = arr[1] + arr[3];
    return a;
}

Assume that short is 16 bits and that a short address is 2-byte aligned. Then I
expect the following code to be generated:

mov a0,#arr       // Load the address

mov a1, a0        // Copy the address
add a1, 1         // Increment the address by 1 so that it points to arr[1]
ld.16 g0, (a1)    // Load the value 12 into g0

mov a1, a0        // Copy the address
add a1, 3         // Increment the address by 3 so that it points to arr[3]
ld.16 g1, (a1)    // Load the value 14 into g1

add g1, g1, g0    // Add 12 and 14

Similarly, for the following code I expect the results shown in the comments:

short arr[5] = {11,12,13,14,15};

int foo ()
{
short a,b;

a = (short) (&arr[3] - &arr[1]); // a is 2 after this operation
b = (short) ((char*)&arr[3] - (char*)&arr[1]);  // b is 4 after this operation

return a;
}

My question is: should I set the macro BITS_PER_UNIT = 16 to get code
generated like this? From IRC chat I gather that support for BITS_PER_UNIT != 8
is seriously bit-rotten. If that is the case, how can I proceed to port
this target?

Regards,
Shafi


Need help in deciding the instruction set for a new target.

2010-08-23 Thread Mohamed Shafi
Hello all,

I am trying to do a port on GCC 4.5. The target has a memory
resolution of 32bits i.e. char is 32bits in the target (addr 0 selects
1st 32bit and addr 1 selects 2nd 32bit). It has only word (32bit)
access.

In terms of address resolution this target is similar to c4x which
became obsolete in GCC 4.2. There are two ways to implement this port.
One is to have BITS_PER_UNIT ==32, like c4x and other is to have a
normal C like char == 8, short == 16, and int == 32. We are thinking
about having BITS_PER_UNIT == 32. Yes, I know the support for such a
target is bit-rotten in GCC. I am currently trying to fix that.

In the mean time, we are in the process of finalizing the
instructions. The current instruction set has support for 32bit
immediate data only in move operations. i.e.

move src1GP, #imm32

For all other operations like div, sub, add, compare, modulus, load and
store, the support is only for 16-bit immediates. For all these
instructions there are separate flavors for sign and zero extension, i.e.

mod.s32 srcdstGP, #imm16 // 32%imm16   signed modulus
mod.u32 srcdstGP, #imm16 // 32%imm16 unsigned modulus

cmp.s32 src1GP, #imm16 // signed register to 16-bit immediate compare
cmp.u32 src1GP, #imm16 // unsigned register to 16-bit immediate compare

sub.s32 srcdstGP, #imm16 // signed register to 16-bit immediate subtract
sub.u32 srcdstGP, #imm16 // unsigned register to 16-bit immediate subtract
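
What distinguishes the .s32 and .u32 flavors is how the 16-bit immediate is
widened to 32 bits before the operation. A minimal C sketch of the two
widenings follows; the helper names sext16 and zext16 are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend a 16-bit immediate to 32 bits (the .s32 flavor).
   Flipping bit 15 and subtracting 0x8000 avoids relying on
   implementation-defined narrowing conversions.  */
static int32_t sext16(uint16_t imm)
{
    return (int32_t)(imm ^ 0x8000u) - 0x8000;
}

/* Zero-extend a 16-bit immediate to 32 bits (the .u32 flavor). */
static uint32_t zext16(uint16_t imm)
{
    return (uint32_t)imm;
}
```

The same encoded immediate 0xFFFF therefore means -1 to a .s32 instruction and
65535 to a .u32 instruction, while immediates up to 0x7FFF mean the same thing
to both.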


I want to know if it is good to have both sign and zero extension for
16-bit immediates.
Will it be of any use with a configuration where char == short == int == 32 bits?
Will I be able to support these kinds of instructions in a GCC port?
Or would it be better to have separate sign- and zero-extension
instructions, which the current instruction set doesn’t have?
Do I need separate sign- and zero-extension instructions along with the
above instructions?

It would be of great help if you could guide me in deciding these instructions.

Regards,
Shafi


Help with reloading FP + offset addressing mode

2010-10-28 Thread Mohamed Shafi
Hi,

I am doing a port in GCC 4.5.1. For the port:

1. (reg + offset) addressing mode is available only when reg is SP.
Other base registers are not allowed.
2. FP cannot be used as a base register. (FP-based addressing is done
by copying it into a base register.)

In order to take advantage of FP elimination (this will create SP +
offset addressing), I did the following:

1. Created a new register class (address registers + FP) and used this
new class as the BASE_REG_CLASS
2. Defined HARD_REGNO_OK_FOR_BASE_P like the following:

#define HARD_REGNO_OK_FOR_BASE_P(NUM) \
    ((NUM) < FIRST_PSEUDO_REGISTER \
 && (((reload_completed || reload_in_progress)? 0 : (NUM) == FP_REG) \
 || REGNO_REG_CLASS(NUM) == ADD_REGS))

3. In legitimate_address_p I have the following:

  if (REGNO (x) == FP_REG)
    {
      if (strict)
        return false;
      else
        return true;
    }
  else if (strict)
    return STRICT_REG_OK_FOR_BASE_P (REGNO (x));
  else
    return NONSTRICT_REG_OK_FOR_BASE_P (REGNO (x));

But when FP doesn't get eliminated, I get an address of the form

(plus:QI (reg/f:QI 27 as15) (const_int 2))

which gets reloaded by replacing FP with an address register other than
SP. I am guessing this happens because of the modified BASE_REG_CLASS; I
haven't confirmed this. So in order to overcome this, in
legitimize_reload_address I have the following:

  if (GET_CODE (*x) == PLUS
      && REG_P (XEXP (*x, 0))
      && REGNO (XEXP (*x, 0)) < FIRST_PSEUDO_REGISTER
      && GET_CODE (XEXP (*x, 1)) == CONST_INT
      && XEXP (*x, 0) == frame_pointer_rtx)
    {
      /* GCC will by default reload the FP into a BASE_CLASS_REG,
         which results in an invalid address.  For us, the best
         thing to do is move the whole expression to a REG.  */
      push_reload (*x, NULL_RTX, x, NULL, SPAA_REGS,
                   mode, VOIDmode, 0, 0, opnum, (enum reload_type) type);
      return 1;
    }

Does my logic make sense? Is there a better way to implement this?

With this implementation for the following sequence :

(insn 9 6 10 2 fun_calls.c:12 (set (reg/f:QI 42)
    (mem/f/c/i:QI (plus:QI (reg/f:QI 33 AP)
    (const_int -2 [0xfffe])) [0 f+0 S1 A32]))
9 {movqi_op} (nil))

(insn 10 9 11 2 fun_calls.c:12 (set (reg:QI 43)
    (const_int 60 [0x3c])) 7 {movqi_op} (nil))

I am getting the following output:

(insn 45 6 47 2 fun_calls.c:12 (set (reg:QI 28 a0)
    (const_int 2 [0x2])) 9 {movqi_op} (nil))

(insn 47 45 48 2 fun_calls.c:12 (set (reg:QI 28 a0)
    (reg/f:QI 27 as15)) 9 {movqi_op} (nil))

(insn 48 47 49 2 fun_calls.c:12 (set (reg:QI 28 a0)
    (plus:QI (reg:QI 28 a0)
    (const_int 2 [0x2]))) 14 {addqi3} (expr_list:REG_EQUIV
(plus:QI (reg/f:QI 27 as15)
    (const_int 2 [0x2]))
    (nil)))

(insn 49 48 10 2 fun_calls.c:12 (set (reg/f:QI 0 g0 [42])
    (mem/f/c/i:QI (reg:QI 28 a0) [0 f+0 S1 A32])) 9 {movqi_op} (nil))

insn 45 is redundant. Is this generated because the
legitimize_reload_address is wrong?

Any hints as to why the redundant instruction gets generated?

Regards,
Shafi


Re: Help with reloading FP + offset addressing mode

2010-10-28 Thread Mohamed Shafi
On 29 October 2010 00:06, Joern Rennecke  wrote:
> Quoting Mohamed Shafi :
>
>> Hi,
>>
>> I am doing a port in GCC 4.5.1. For the port
>>
>> 1. there is only (reg + offset) addressing mode only when reg is SP.
>> Other base registers are not allowed
>> 2. FP cannot be used as a base register. (FP based addressing is done
>> by copying it into a base register)
>> In order to take advantage of FP elimination (this will create SP +
>> offset addressing), what i did the following
>>
>> 1. Created a new register class (address registers + FP) and used this
>> new class as the BASE_REG_CLASS
>
> Stop right there.  You need to distinguish between FRAME_POINTER_REGNUM
> and HARD_FRAME_POINTER_REGNUM.
>

From the description given in the internals, i am not able to
understand why you suggested this. Could you please explain this?

Shafi


Re: Help with reloading FP + offset addressing mode

2010-11-02 Thread Mohamed Shafi
On 30 October 2010 05:45, Joern Rennecke  wrote:
> Quoting Mohamed Shafi :
>
>> On 29 October 2010 00:06, Joern Rennecke 
>> wrote:
>>>
>>> Quoting Mohamed Shafi :
>>>
>>>> Hi,
>>>>
>>>> I am doing a port in GCC 4.5.1. For the port
>>>>
>>>> 1. there is only (reg + offset) addressing mode only when reg is SP.
>>>> Other base registers are not allowed
>>>> 2. FP cannot be used as a base register. (FP based addressing is done
>>>> by copying it into a base register)
>>>> In order to take advantage of FP elimination (this will create SP +
>>>> offset addressing), what i did the following
>>>>
>>>> 1. Created a new register class (address registers + FP) and used this
>>>> new class as the BASE_REG_CLASS
>>>
>>> Stop right there.  You need to distinguish between FRAME_POINTER_REGNUM
>>> and HARD_FRAME_POINTER_REGNUM.
>>>
>>
>> From the description given in the internals, i am not able to
>> understand why you suggested this. Could you please explain this?
>
> In order to trigger reloading of the address, you have to have a register
> elimination, even if the stack pointer is not a suitable destinatination
> for the elimination.  Also, if you want to reload do the work for you,
> you must not lie to it about the addressing capabilities of an actual hard
> register.  Hence, you need separate hard and soft frame pointers.
>

Debugging sessions of the reload pass tell me that if the reload pass
gets an address of the form (reg + off), it assumes one of the
following:

1. the address is invalid because 'reg' is not a suitable base register
2. the offset is out of range
3. the address has an eliminatable register as a base register.

Depending on what it finds, reload_pass reloads the address
accordingly. So for my target when the pass encounters the address of
the form:

(plus:QI (reg/f:QI 33 ArgP) (const_int -2 [0xfffe]))

it eliminates the arg pointer to either the stack or the frame pointer and
reloads it. If the base register is FP, during reloading it just
reloads the FP into a valid base register, but then the address
becomes invalid. The reload pass cannot figure out that the addressing
mode itself is invalid because of the wrong base register. Since SP is the
only valid register among the base registers that can form (reg + off)
addressing mode, for reload to work properly I would have to allow
this addressing mode only when SP is the base register, even in
non-strict mode. But then I would lose a lot of opportunities when
elimination happens in favour of SP. Hence I allow the above form of
address for all frame-related pseudos.

So to respond to your comments, i agree that as far as possible the
port has to be truthful to reload pass about the addressing mode
capabilities, but then i am not sure if distinguishing between
FRAME_POINTER_REGNUM and HARD_FRAME_POINTER_REGNUM will help my cause.

Do you agree? Or am i not understanding what your suggestion implies?

Shafi


Opinion on a hardware feature for conditional instructions

2010-11-09 Thread Mohamed Shafi
Hi all,

I need an opinion on a design question. I am doing a port for a private
target in GCC 4.5.1. We are also in the process of designing the
hardware along with the development of the build tools. Currently we
don't have enough bits in the encoding to support conditional
instructions the way ARM does, i.e. where you have the option to decide
whether to update the status flags or not. So what is the next best
thing to have?

1. Allow both conditional and non-conditional instructions to update
the status flags
2. Allow only non-conditional instructions to update the status flags

Could you please let me know your thoughts on this and the reason for
choosing it?

Regards,
Shafi


A question about combining constraints

2010-11-12 Thread Mohamed Shafi
Hi,

For a private target that i am porting in GCC 4.5 I have the following
pattern in my md file for call value:


(define_insn "call_value_op"
  [(set (match_operand 0 "register_operand" "=da")
(call (mem:QI (match_operand:QI 1 "call_operand" "Wd"))
  (match_operand:QI 2 "" "")))]
  ""
  "jsr\\t%1"
  [(set_attr "slottable" "has_slot")]
)

All the constraints are one-letter constraints for my target. Here 'W'
is for symbol_ref and all others are register constraints. For the
particular combination where operand 0 is 'a' and operand 1 is 'W', I
got the following ICE:

error: unable to generate reloads for:
(call_insn 11 4 12 2 test.c:7 (set (reg:QI 12 as0)
(call (mem:QI (symbol_ref:QI ("malloc") [flags 0x41]
) [0 S1 A32])
(const_int 0 [0x0]))) 50 {call_value_op}
(expr_list:REG_DEAD (reg:QI 0 g0)
(expr_list:REG_EH_REGION (const_int 0 [0x0])
(nil)))
(expr_list:REG_DEP_TRUE (use (reg:QI 0 g0))
(nil)))

I get this ICE because the constraints are not matched properly. The ICE
goes away when I write the constraints as:

"=ad", "Wd"

or

"a,a,d,d," , "W,W,d,d"

So i have the following questions:

1. Why are the constraints not matched here?
2. When can I combine the constraints?

Regards,
Shafi


Re: A question about combining constraints

2010-11-12 Thread Mohamed Shafi
On 12 November 2010 18:39, Joern Rennecke  wrote:
> Quoting Mohamed Shafi :
>
>> So i have the following questions:
>>
>> 1. Why is that constraints are not matched here?
>
> Please read the node "Register Classes" in doc/tm.texi .
>

I am sorry, could you please highlight the relevant portion for me?
In the pattern that I have given, the combination (a,W) satisfies the
pattern. But it is not matched because I have written them as (da,Wd). I
know that we can combine the constraints together.

Shafi


Re: Help with reloading FP + offset addressing mode

2010-11-24 Thread Mohamed Shafi
On 30 October 2010 05:45, Joern Rennecke  wrote:
> Quoting Mohamed Shafi :
>
>> On 29 October 2010 00:06, Joern Rennecke 
>> wrote:
>>>
>>> Quoting Mohamed Shafi :
>>>
>>>> Hi,
>>>>
>>>> I am doing a port in GCC 4.5.1. For the port
>>>>
>>>> 1. there is only (reg + offset) addressing mode only when reg is SP.
>>>> Other base registers are not allowed
>>>> 2. FP cannot be used as a base register. (FP based addressing is done
>>>> by copying it into a base register)
>>>> In order to take advantage of FP elimination (this will create SP +
>>>> offset addressing), what i did the following
>>>>
>>>> 1. Created a new register class (address registers + FP) and used this
>>>> new class as the BASE_REG_CLASS
>>>
>>> Stop right there.  You need to distinguish between FRAME_POINTER_REGNUM
>>> and HARD_FRAME_POINTER_REGNUM.
>>>
>>
>> From the description given in the internals, i am not able to
>> understand why you suggested this. Could you please explain this?
>
> In order to trigger reloading of the address, you have to have a register
> elimination, even if the stack pointer is not a suitable destinatination
> for the elimination.  Also, if you want to reload do the work for you,
> you must not lie to it about the addressing capabilities of an actual hard
> register.  Hence, you need separate hard and soft frame pointers.
>
> If you have them, but conflate them when you describe what you are doing
> in your port, you are not only likely to confuse the listener/reader,
> but also your documentation, your code, and ultimately yourself.
>

Having a FRAME_POINTER_REGNUM and a HARD_FRAME_POINTER_REGNUM will
trigger reloading of the address. But for the following pattern

(insn 3 2 4 2 test.c:120 (set (mem/c/i:QI (plus:QI (reg/f:QI 35 SFP)
 (const_int 1 [0x1])) [0 c+0 S1 A32])
(reg:QI 0 g0 [ c ])) 7 {movqi_op} (nil))

where SFP is FRAME_POINTER_REGNUM, an elimination will result in

(insn 3 2 4 2 test.c:120 (set (mem/c/i:QI (plus:QI (reg/f:QI 27 as15)
 (const_int 1 [0x1])) [0 c+0 S1 A32])
(reg:QI 0 g0 [ c ])) 7 {movqi_op} (nil))

where as15 is the HARD_FRAME_POINTER_REGNUM. But remember this new
address is not valid (as only SP is allowed in this addressing mode).
When the above pattern is reloaded i get:

(insn 28 27 4 2 test.c:120 (set (mem/c/i:QI (plus:QI (reg:QI 28 a0)
 (const_int 1 [0x1])) [0 c+0 S1 A32])
  (reg:QI 3 g3)) -1 (nil))

I get an unrecognizable insn ICE, because this addressing mode is not
valid. I believe this happens because when the reload pass gets an
address of the form (reg + off), it assumes that the address is
invalid due to one of the following:

1. 'reg' is not a suitable base register
2. the offset is out of range
3. the address has an eliminatable register as a base register.

Is there any way to overcome this?

Any help is appreciated.

Shafi


Help with reloading

2010-12-15 Thread Mohamed Shafi
Hi,

I am doing a port in GCC 4.5.1.
The target supports storing immediate values into memory location
represented by a symbolic address. So in the move pattern i have given
constraints to represent this.

(define_insn "movqi_op"
  [(set (match_operand:QI 0 "nonimmediate_operand" "=!Q,!Q,d,d,d,d,d,d,d,Q,R,S")
(match_operand:QI 1 "general_operand"   "I,J,i,W,J,d,Q,R,S,d,d,d"))]
  ""
  "@
  st.s32\t%0, %1;
  st.u32\t%0, %1;
  set\t%0, %1;
  set.u32\t%0, %1;
  set.u32\t%0, %1;
  move\t%0, %1;
  ld%u0\t%0, %1;
  ld%u0\t%0, %1;
  ld%u0\t%0, %1;
  st%u0\t%0, %1;
  st%u0\t%0, %1;
  st%u0\t%0, %1;"
 )

where
Q represents symbolic address,
R represents all address formed using SP
S represents all address formed using address registers
I, J,W,i represents various const_ints
d represents general registers.


Whenever reload gets a pattern that stores a const_int to a memory
location that is scheduled for reloading, the reload pass will match it
with the Q constraints. To avoid that I added the constraint modifier
'!' to 'Q'. But even then there is one particular case that causes trouble.
This happens when the reload pass gets a pattern where the destination is
an illegal address and the source is a pseudo register (no hard register
allocated) for which reg_equiv_constant[regno] != 0.
Before IRA pass:

(insn 27 25 33 4 pr23848-3.c:12 (set (mem/s/j:QI (reg/f:PQI 69) [0 S1 A32])
(reg:QI 93)) 7 {movqi_op} (expr_list:REG_DEAD (reg/f:PQI 69)
(expr_list:REG_EQUAL (const_int 49 [0x31])
(nil

Just before reloading phase:

(insn 27 25 33 4 pr23848-3.c:12 (set (mem/s/j:QI (reg/f:PQI 12 as0
[69]) [0 S1 A32])
(reg:QI 93)) 7 {movqi_op} (expr_list:REG_DEAD (reg/f:PQI 12 as0 [69])
(expr_list:REG_EQUAL (const_int 0 [0x0])
(nil

Since reg 93 is not allocated a hard register, it is replaced with
reg_equiv_constant[regno], and this combination wins the (Q, I)
constraint pair. In that process 'losers' (a variable in the loop over
alternatives) becomes 0, so the loop breaks out and returns. Due to this
the compiler crashes with an "insn does not satisfy its constraints:" error.
Any pointers on fixing this?

Regards,
Shafi

P.S. When can we merge constraints? What are the criteria for deciding
which constraints to merge?


Re: Help with reloading

2010-12-20 Thread Mohamed Shafi
On 20 December 2010 10:56, Jeff Law  wrote:
> On 12/15/10 07:14, Mohamed Shafi wrote:
>>
>> Hi,
>>
>> I am doing a port in GCC 4.5.1.
>> The target supports storing immediate values into memory location
>> represented by a symbolic address. So in the move pattern i have given
>> constraints to represent this.
>
> Presumably the target does not support storing an immediate value into other
> MEMs?  ie, the only store-immediate is to a symbolic memory operand, right?
>

yes you are right.

> I think this is a case where you're going to need a secondary reload to
> force the immediate into a register if the destination is a non-symbolic MEM
> or a pseudo without a hard reg and its equivalent address is non-symbolic.
>
I am not sure how I should implement this.
Currently, in the define_expand for move, I have code to force the
immediate value into a register if the destination is not a symbolic
address. If I understand correctly, this is the only place where I can
decide what to do with the source depending on the destination. Right?

Moreover for the pattern

(insn 27 25 33 4 pr23848-3.c:12 (set (mem/s/j:QI (reg/f:PQI 12 as0
[69]) [0 S1 A32])
   (reg:QI 93)) 7 {movqi_op} (expr_list:REG_DEAD (reg/f:PQI 12 as0 [69])
   (expr_list:REG_EQUAL (const_int 0 [0x0])
   (nil

the src operand gets converted by

  /* This is equivalent to calling find_reloads_toplev.
 The code is duplicated for speed.
 When we find a pseudo always equivalent to a constant,
 we replace it by the constant.  We must be sure, however,
 that we don't try to replace it in the insn in which it
 is being set.  */
  int regno = REGNO (recog_data.operand[i]);
  if (reg_equiv_constant[regno] != 0
  && (set == 0 || &SET_DEST (set) != recog_data.operand_loc[i]))
{
  /* Record the existing mode so that the check if constants are
 allowed will work when operand_mode isn't specified.  */

  if (operand_mode[i] == VOIDmode)
operand_mode[i] = GET_MODE (recog_data.operand[i]);

  substed_operand[i] = recog_data.operand[i]
= reg_equiv_constant[regno];
}

and since the destination is already selected for reload

/* If the address was already reloaded,
   we win as well.  */
else if (MEM_P (operand)
 && address_reloaded[i] == 1)
  win = 1;

the reload phase never reaches secondary reload.
So I do not understand your answer. Could you explain it briefly?

Regards,
Shafi


Re: Help with reloading

2010-12-20 Thread Mohamed Shafi
On 20 December 2010 19:30, Jeff Law  wrote:
> On 12/20/10 01:47, Mohamed Shafi wrote:
>>
>>
>>> I think this is a case where you're going to need a secondary reload to
>>> force the immediate into a register if the destination is a non-symbolic
>>> MEM
>>> or a pseudo without a hard reg and its equivalent address is
>>> non-symbolic.
>>>
>>     I am not sure how i should be implementing this.
>>     Currently in define_expand for move i have code to force the
>> immediate value into a register if the destination is not a symbolic
>> address. If i understand correctly this is the only place where i can
>> decide what to do with the source depending on the destination. right?
>
> Just changing the movxx expander is not sufficient since for this case you
> do not know until reload time whether or not a particular insn needs an
> extra register to implement the move.   That's the whole point of the
> secondary reload mechanism -- to allow you to allocate a scratch register
> during reloading to handle oddball cases like this.
>
>
> In your secondary reload code you'll need to check for the case where the
> destination is a MEM and the source is an unallocated pseudo with a constant
> equivalent and return a suitable register class for that case.
>
   Jeff, thanks for the reply.
   I didn't know that you could do that in TARGET_SECONDARY_RELOAD
hook. Can you point me to some target that does this - figuring out
what the destination is based on the source or vice versa. In my case
only the address operand comes into TARGET_SECONDARY_RELOAD hook
during the reload pass. I am not sure how to find out the source for
the pattern which has this particular address as the destination.

Sorry for the trouble.

Shafi


attempt to use poisoned "CONST_COSTS"

2006-08-24 Thread Mohamed Shafi
Hello everyone,

I am upgrading a cross compiler from 3.2 to 3.4.6.
I had to change some of the target description macros
in the target.h file, because otherwise I get "attempt to
use poisoned" errors while building.

Presently what is written in target.h is:

1.  #define CPP_PREDEFINES \
"-Dtargetname -D__targetname__ -Amachine=targetname"

corresponding macro in 3.4.6 is
"TARGET_CPU_CPP_BUILTINS"


2.  #define CONST_COSTS(RTX, CODE, OUTER_CODE)          \
    case CONST_INT:                                     \
      return target_const_costs (RTX, OUTER_CODE);      \
    case CONST:                                         \
      return 5;                                         \
    case LABEL_REF:                                     \
      return 1;                                         \
    case SYMBOL_REF:                                    \
      return ((TARGET_SMALL_MODEL)? 2: 3);              \
    case CONST_DOUBLE:                                  \
      return 10;

i dont know the corresponding macro in 3.4.6

3.  #define ADDRESS_COST(RTX)   1


corresponding macro in 3.4.6 is "int
TARGET_ADDRESS_COST (rtx address)"


4.  #define RTX_COSTS(X, CODE, OUTER_CODE)              \
    case MULT:                                          \
      return COSTS_N_INSNS (2);                         \
    case DIV:                                           \
    case UDIV:                                          \
    case MOD:                                           \
    case UMOD:                                          \
      return COSTS_N_INSNS (30);                        \
    case FLOAT:                                         \
    case FIX:                                           \
      return COSTS_N_INSNS (100);


corresponding macro in 3.4.6 is "bool
TARGET_RTX_COSTS (rtx x, int code, int outer_code, int
*total)"


  5.#define ASM_GLOBALIZE_LABEL(STREAM,NAME)\
do  \
  { \
fputs ("\t.globl ", STREAM);\
assemble_name (STREAM, NAME);   \
fputs ("\n", STREAM);   \
  } \
while (0)


corresponding macro in 3.4.6 is "void
TARGET_ASM_GLOBALIZE_LABEL (FILE *stream, const char
*name)"



  Now to my problem:

  Except for TARGET_CPU_CPP_BUILTINS, I don't know how
to rewrite the existing macros for 3.4.6.
  
  Can anybody help me in this regard?
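
  One possible shape for the conversion, as a standalone sketch rather than
real port code: the case bodies of CONST_COSTS and RTX_COSTS move into a
single function with the TARGET_RTX_COSTS signature quoted above, writing the
cost to *total instead of returning it. The enum, the SKETCH_ macro and the
small_model flag below are stand-ins so that the sketch compiles outside the
GCC tree, the rtx argument is dropped, and the CONST_INT case is left out
because it needs the port's own target_const_costs.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for GCC's rtx codes and COSTS_N_INSNS; in a real port these
   come from GCC's own headers.  */
enum sketch_code { SK_CONST, SK_LABEL_REF, SK_SYMBOL_REF, SK_CONST_DOUBLE,
                   SK_MULT, SK_DIV, SK_UDIV, SK_MOD, SK_UMOD,
                   SK_FLOAT, SK_FIX, SK_PLUS };
#define SKETCH_COSTS_N_INSNS(N) ((N) * 4)

static int sketch_small_model = 1;  /* stand-in for TARGET_SMALL_MODEL */

/* Returns true when *total is the final cost (what "return <cost>" meant
   in the old macros), false to leave the code to the default handling.  */
static bool
sketch_rtx_costs (int code, int outer_code, int *total)
{
  (void) outer_code;            /* only the omitted CONST_INT case uses it */
  assert (total != 0);

  switch (code)
    {
    case SK_CONST:        *total = 5;  return true;
    case SK_LABEL_REF:    *total = 1;  return true;
    case SK_SYMBOL_REF:   *total = sketch_small_model ? 2 : 3; return true;
    case SK_CONST_DOUBLE: *total = 10; return true;
    case SK_MULT:
      *total = SKETCH_COSTS_N_INSNS (2);
      return true;
    case SK_DIV: case SK_UDIV: case SK_MOD: case SK_UMOD:
      *total = SKETCH_COSTS_N_INSNS (30);
      return true;
    case SK_FLOAT: case SK_FIX:
      *total = SKETCH_COSTS_N_INSNS (100);
      return true;
    default:
      return false;
    }
}
```

  In an actual 3.4 port the function would take the rtx as its first argument
and be installed by defining TARGET_RTX_COSTS to its name in the port's .c
file; the other hooks listed above would presumably be converted the same way.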
  
  
  Thanks in advance



Regards,
Shafi.


