How to make 'long int' type be a PDImode?

2010-03-07 Thread Frank Isamov

Hi,

I'd like to make a backend which would have 48 bits for 'long' type.
(32 for int and 64 for long long).

I have tried to define:
#define LONG_TYPE_SIZE  48

and one of:
INT_MODE (PDI, 6);
PARTIAL_INT_MODE (DI);

Unfortunately, when I compile a program, I see that the backend
still uses SImode for 'long'.

I could not find an example of a target doing something similar.
Could you please advise, or point me to an implementation I can refer to?

Thank you,
Frank

Re: How to make 'long int' type be a PDImode?

2010-03-08 Thread Frank Isamov

On Mon, Mar 8, 2010 at 8:29 AM, Joern Rennecke wrote:
> Quoting Frank Isamov:
>
>> Hi,
>>
>> I'd like to make a backend which would have 48 bits for 'long' type.
>> (32 for int and 64 for long long).
>>
>> I have tried to define:
>> #define LONG_TYPE_SIZE  48
>
> That's not a partial integer mode; PDImode would have the same size as
> DImode,
> just not all bits would be significant.
>
>> and one of:
>> INT_MODE (PDI, 6);
>
> And that wouldn't be PDImode, more like THImode (three-halves integer mode,
> going by the precedent of TQFmode - three-quarter float mode - of the 1750a
> port in GCC prior to version 3.1)
>
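
To illustrate the distinction Joern draws above, here is a minimal C
sketch. It is not GCC code: the names pdi_storage_t, PDI_PRECISION,
PDI_MASK and pdi_add are made up for illustration, and the 48-bit
precision is an assumption taken from the target described in this
thread. A partial-integer value occupies the same 8 bytes as DImode,
but only its low 48 bits are significant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: storage is as wide as DImode (8 bytes). */
typedef uint64_t pdi_storage_t;

/* Assumed number of significant bits for this target. */
#define PDI_PRECISION 48
#define PDI_MASK ((UINT64_C(1) << PDI_PRECISION) - 1)

/* Add two values that already fit in 48 bits, wrapping at 2^48
   the way 48-bit hardware arithmetic would. */
pdi_storage_t pdi_add(pdi_storage_t a, pdi_storage_t b)
{
    assert((a & ~PDI_MASK) == 0);
    assert((b & ~PDI_MASK) == 0);
    return (a + b) & PDI_MASK;
}
```

By contrast, a mode declared with a size of 6 bytes would have 6 bytes
of storage as well, which is why it would not be a partial DImode.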

I am sorry, I still can conclude how to make the backend use a certain
mode for the 'long' type.
My architecture implements 32- and 48-bit registers and instructions
operating on these registers. That is why my intention is to use 'int'
for 32-bit and 'long' for 48-bit operations. My attempts lead to the
backend ignoring the 48-bit path in spite of the LONG_TYPE_SIZE
definition, and 'long' is still processed as a 32-bit value. I think I
am missing something, so I am asking for help. Any piece of information
would be useful.

Thank you,
Frank

Re: How to make 'long int' type be a PDImode?

2010-03-08 Thread Frank Isamov

On Mon, Mar 8, 2010 at 4:27 PM, Frank Isamov wrote:
> On Mon, Mar 8, 2010 at 8:29 AM, Joern Rennecke wrote:
>> Quoting Frank Isamov:
>>
>>> Hi,
>>>
>>> I'd like to make a backend which would have 48 bits for 'long' type.
>>> (32 for int and 64 for long long).
>>>
>>> I have tried to define:
>>> #define LONG_TYPE_SIZE  48
>>
>> That's not a partial integer mode; PDImode would have the same size as
>> DImode,
>> just not all bits would be significant.
>>
>>> and one of:
>>> INT_MODE (PDI, 6);
>>
>> And that wouldn't be PDImode, more like THImode (three-halves integer mode,
>> going by the precedent of TQFmode - three-quarter float mode - of the 1750a
>> port in GCC prior to version 3.1)
>>
>
>
> I am sorry, I still can conclude how to make the backend use a certain
> mode for the 'long' type.
> My architecture implements 32- and 48-bit registers and instructions
> operating on these registers. That is why my intention is to use 'int'
> for 32-bit and 'long' for 48-bit operations. My attempts lead to the
> backend ignoring the 48-bit path in spite of the LONG_TYPE_SIZE
> definition, and 'long' is still processed as a 32-bit value. I think I
> am missing something, so I am asking for help. Any piece of information
> would be useful.
>
> Thank you,
> Frank
>

Correction: "I still can conclude " should be read as "I still cannot conclude"

Advancing SP on a call

2010-03-10 Thread Frank Isamov

We have a problem with passing arguments in memory.

The caller puts the arguments in memory relative to the sp:

add sp, 4         // allocate space for the argument; the stack grows up
store r1, (sp-4)  // store the argument on the stack
call xxx          // call the function

In xxx, the resulting code looks like:

load (sp-4), r1   // load the argument from the stack

The problem is that the 'call' instruction pushes the return address
onto the stack and increments the sp by 4, so when the callee tries to
access the memory it does not get to the correct location.

How can I tell GCC that the callee should load from the original
offset + 4?
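
The offset arithmetic can be checked with a small toy model in C. This
is only a sketch under stated assumptions: the struct machine type and
the helper names are hypothetical, and the 'call' is assumed to store
the return address at sp and then advance sp by one word. It shows that
on entry to the callee the argument sits at sp-8 rather than sp-4.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { WORD = 4 };

/* Toy machine state: byte-addressed stack memory that grows upward. */
struct machine {
    uint8_t mem[64];
    uint32_t sp;
};

void store_word(struct machine *m, uint32_t addr, uint32_t value)
{
    assert(addr + WORD <= sizeof m->mem);
    memcpy(&m->mem[addr], &value, WORD);
}

uint32_t load_word(const struct machine *m, uint32_t addr)
{
    uint32_t value;
    assert(addr + WORD <= sizeof m->mem);
    memcpy(&value, &m->mem[addr], WORD);
    return value;
}

/* Caller side: allocate one slot, store the argument, then "call". */
void caller_sequence(struct machine *m, uint32_t arg, uint32_t ret_addr)
{
    m->sp += WORD;                      /* add sp, 4 */
    store_word(m, m->sp - WORD, arg);   /* store r1, (sp-4) */
    store_word(m, m->sp, ret_addr);     /* call: push the return address */
    m->sp += WORD;                      /* call: sp advances by 4 */
}

/* Callee side: the argument is now two words below sp. */
uint32_t callee_load_arg(const struct machine *m)
{
    return load_word(m, m->sp - 2 * WORD);
}
```

In this model, loading from sp-4 in the callee returns the pushed
return address, which matches the symptom described above.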

Thanks.

Coloring problem - Pass 0 for finding allocno costs

2010-03-18 Thread Frank Isamov

Hi,

In my backend, I have a problem with the pass which determines the
best register class for a virtual register (Pass 0 for finding allocno
costs).

In all insns in this example both R_REGS and D_REGS register classes
are applicable (but all registers in an insn should be from the same
register class).

This is asmcons output:

;; Function mul (mul)
(note 1 0 5 NOTE_INSN_DELETED)

(note 5 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)

(note 2 5 3 2 NOTE_INSN_DELETED)

(insn 3 2 4 2 a.c:2 (set (reg/v:SI 97 [ b ])
        (reg:SI 49 r1 [ b ])) 43 {movsi_regs} (expr_list:REG_DEAD (reg:SI 49 r1 [ b ])
        (nil)))

(note 4 3 7 2 NOTE_INSN_FUNCTION_BEG)

(insn 7 4 9 2 a.c:5 (set (reg/v:SI 94 [ __a_11 ])
        (plus:SI (reg:SI 48 r0 [ a ])
            (reg/v:SI 97 [ b ]))) 0 {*addsi3_1} (expr_list:REG_DEAD (reg:SI 48 r0 [ a ])
        (nil)))

(insn 9 7 10 2 a.c:5 (parallel [
            (set (reg:SI 100)
                (ashift:SI (reg/v:SI 97 [ b ])
                    (const_int 3 [0x3])))
            (clobber (reg:CC 88 cc))
        ]) 36 {ashlsi3} (expr_list:REG_UNUSED (reg:CC 88 cc)
        (expr_list:REG_EQUAL (ashift:SI (reg/v:SI 97 [ b ])
                (const_int 3 [0x3]))
            (nil))))

(insn 10 9 11 2 a.c:5 (set (reg:SI 101)
        (plus:SI (reg:SI 100)
            (reg/v:SI 97 [ b ]))) 0 {*addsi3_1} (expr_list:REG_DEAD (reg:SI 100)
        (expr_list:REG_DEAD (reg/v:SI 97 [ b ])
            (expr_list:REG_EQUAL (mult:SI (reg/v:SI 97 [ b ])
                    (const_int 9 [0x9]))
                (nil)))))

(insn 11 10 12 2 a.c:5 (set (reg:SI 102)
        (plus:SI (reg:SI 101)
            (reg/v:SI 94 [ __a_11 ]))) 0 {*addsi3_1} (expr_list:REG_DEAD (reg:SI 101)
        (expr_list:REG_DEAD (reg/v:SI 94 [ __a_11 ])
            (nil))))

(note 12 11 17 2 NOTE_INSN_DELETED)

(insn 17 12 23 2 a.c:7 (parallel [
            (set (reg/i:SI 48 r0)
                (ashift:SI (reg:SI 102)
                    (const_int 10 [0xa])))
            (clobber (reg:CC 88 cc))
        ]) 36 {ashlsi3} (expr_list:REG_UNUSED (reg:CC 88 cc)
        (expr_list:REG_DEAD (reg:SI 102)
            (nil))))

(insn 23 17 0 2 a.c:7 (use (reg/i:SI 48 r0)) -1 (nil))

The problem I see is that for registers 100 and 101 I get best register
class D instead of R - actually they get the same cost and D is chosen
(maybe because it is first).

But they should not get the same cost, since choosing D_REGS causes two
additional copies.

To my understanding, the algorithm checks every insn, and since
register 100 appears only in insns in which all registers are still
virtual, any register class fits without additional cost. But later,
when coloring it with a D register, we need copies from R to D and back
to R.
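
The effect described above can be reproduced with a toy cost model in
C. This is only a sketch, not IRA's actual algorithm: the cost value of
2 for a cross-class copy, the function names, and the "first class with
minimal cost wins" tie-break are all assumptions made for illustration.

```c
#include <assert.h>

/* D is listed before R, so ties go to D_REGS in this model. */
enum reg_class { D_REGS, R_REGS, N_CLASSES };

/* Hypothetical cost of a register move: 2 for a cross-class copy. */
int move_cost(enum reg_class from, enum reg_class to)
{
    return from == to ? 0 : 2;
}

/* Cost seen when looking only at insns whose operands are all
   pseudos: either class satisfies the constraints, so nothing
   is charged. */
int insn_only_cost(enum reg_class cls)
{
    (void) cls;
    return 0;
}

/* Cost that also counts the copies from the incoming argument
   register and to the return register, both fixed in R_REGS. */
int cost_with_copies(enum reg_class cls)
{
    return insn_only_cost(cls)
           + move_cost(R_REGS, cls)
           + move_cost(cls, R_REGS);
}

/* Pick the first class with minimal cost. */
enum reg_class best_class(int (*cost)(enum reg_class))
{
    enum reg_class best = D_REGS;
    assert(cost != 0);
    for (int c = D_REGS + 1; c < N_CLASSES; c++)
        if (cost((enum reg_class) c) < cost(best))
            best = (enum reg_class) c;
    return best;
}
```

With insn_only_cost both classes tie and D_REGS wins; once the two
copies are charged, R_REGS becomes strictly cheaper.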

Can someone please help with this?

Is there any reading material about this part of the IRA?

Thanks, Frank.

Re: Coloring problem - Pass 0 for finding allocno costs

2010-03-18 Thread Frank Isamov

-- Forwarded message --
From: Frank Isamov 
Date: Thu, Mar 18, 2010 at 4:28 PM
Subject: Re: Coloring problem - Pass 0 for finding allocno costs
To: Ian Bolton 


On Thu, Mar 18, 2010 at 3:51 PM, Ian Bolton wrote:
>> The problem I see is that for registers 100,101 I get best register
>> class D instead of R - actually they get the same cost and D is chosen
>> (maybe because it is first).
>
> Hi Frank.
>
> Do D and R overlap?  It would be useful to know which regs are in
> which class, before trying to understand what is going on.
>
> Can you paste an example of your define_insn from your MD file to show
> how operands from D or R are both valid?  I ask this because it is
> possible to express that D is more expensive than R with operand
> constraints.
>
> For general IRA info, you might like to look over my long thread on
> here called "Understanding IRA".
>
> Cheers,
> Ian
>

Hi Ian,

Thank you very much for your prompt reply.
D and R do not overlap. Please see fragments of the .md and .h files below:

From the md:

(define_register_constraint "a" "R_REGS" "")
(define_register_constraint "d" "D_REGS" "")

(define_predicate "a_operand"
  (match_operand 0 "register_operand")
{
  unsigned int regno;
  if (GET_CODE (op) == SUBREG)
    op = SUBREG_REG (op);
  regno = REGNO (op);
  return (regno >= FIRST_PSEUDO_REGISTER
          || REGNO_REG_CLASS (regno) == R_REGS);
}
)

(define_predicate "d_operand"
  (match_operand 0 "register_operand")
{
  unsigned int regno;
  if (GET_CODE (op) == SUBREG)
    op = SUBREG_REG (op);
  regno = REGNO (op);
  return (regno >= FIRST_PSEUDO_REGISTER
          || REGNO_REG_CLASS (regno) == D_REGS);
}
)

(define_predicate "a_d_operand"
 (ior (match_operand 0 "a_operand")
      (match_operand 0 "d_operand")))

(define_predicate "i_a_d_operand"
 (ior (match_operand 0 "immediate_operand")
      (match_operand 0 "a_d_operand")))

(define_insn "mov_regs"
  [(set (match_operand:SISFM 0 "a_d_operand"   "=a, a, a, d, d, d")
        (match_operand:SISFM 1 "i_a_d_operand" "a, i, d, a, i, d"))]
  ""
  "move\t%1, %0"
)

(define_insn "*addsi3_1"
  [(set (match_operand:SI 0 "a_d_operand" "=a, a,  a,d,d")
        (plus:SI (match_operand:SI 1 "a_d_operand" "%a, a,  a,d,d")
                 (match_operand:SI 2 "nonmemory_operand" "U05,S16,a,d,U05")))]
  ""
  "adda\t%2, %1, %0"
)

;;  Arithmetic Left and Right Shift Instructions
(define_insn "3"
  [(set (match_operand:SCIM 0 "register_operand" "=a,d,d")
        (sh_oprnd:SCIM (match_operand:SCIM 1 "register_operand" "a,d,d")
                       (match_operand:SI 2 "nonmemory_operand" "U05,d,U05")))
   (clobber (reg:CC CC_REGNUM))]
  ""
  "\t%2, %1, %0"
)

From the h file:

#define REG_CLASS_CONTENTS                      \
 {                                              \
   {0x, 0x, 0x}, /* NO_REGS */                  \
   {0x, 0x, 0x}, /* D_REGS */                   \
   {0x, 0x, 0x}, /* R_REGS */                   \

The ABI requires the use of R registers for arguments and the return
value. Other than that, all of these instructions are more or less
symmetrical in the sense of using D or R. So the optimal choice for
this example would be to use R. If a D register is chosen, it involves
an additional copy from R to D and back to R.

Thank you, Frank

Combine or peephole?

2010-04-19 Thread Frank Isamov

Hi,

My architecture supports instructions with two parallel side effects.
For example, addition and subtraction can be done in parallel:

(define_insn "assi6"
  [(parallel [
     (set (match_operand:SI 0 "register_operand" "=r")
          (minus:SI (match_operand:SI 1 "register_operand" "r")
                    (match_operand:SI 2 "register_operand" "r")))
     (set (match_operand:SI 3 "register_operand" "=r")
          (plus:SI (match_operand:SI 4 "register_operand" "r")
                   (match_operand:SI 5 "register_operand" "r")))
  ])]
  ""
  "as\t%5, %4, %3, %2, %1, %0 %!"
)

This instruction is not chosen at ‘combine’ time even if the ‘plus’ and
‘minus’ instructions are located one after the other. That is probably
because there is no data dependency between them.

In attempt to resolve the problem, I am providing a peephole optimization:

(define_peephole2
  [(set (match_operand:SI 0 "register_operand")
        (minus:SI (match_operand:SI 1 "register_operand")
                  (match_operand:SI 2 "register_operand")))
   (set (match_operand:SI 3 "register_operand")
        (plus:SI (match_operand:SI 4 "register_operand")
                 (match_operand:SI 5 "register_operand")))]
  ""
  [(parallel [
     (set (match_dup 0)
          (minus:SI (match_dup 1)
                    (match_dup 2)))
     (set (match_dup 3)
          (plus:SI (match_dup 4)
                   (match_dup 5)))
  ])]
  ""
)

This works for some cases, but I wanted to ask the experts whether this
is the way to go. Repeating the same pattern for a peephole might not be
the best way to resolve the problem.

Have you observed something similar? What would be the best way to
resolve such a situation?
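
One point the peephole above leaves implicit: a parallel computes all
of its inputs before any output is written, so two consecutive sets can
only be fused when the second one does not read the register the first
one writes. A minimal C sketch of that check follows. It is not GCC
code: struct set_insn and can_pair are hypothetical names, and plain
register numbers stand in for RTL operands.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one "dest = src1 op src2" insn,
   using plain register numbers. */
struct set_insn {
    int dest;
    int src1;
    int src2;
};

/* True when two consecutive sets may be fused into one parallel
   without changing the result. */
bool can_pair(const struct set_insn *first, const struct set_insn *second)
{
    assert(first && second);
    /* The second insn must not read the value the first produces. */
    if (second->src1 == first->dest || second->src2 == first->dest)
        return false;
    /* Two writes to the same register in one parallel are ambiguous. */
    if (second->dest == first->dest)
        return false;
    /* The second insn overwriting a source of the first is fine:
       the first has already read the old value in both orders. */
    return true;
}
```

In a real peephole2, a check of this kind would go in the condition
string, which is empty in the pattern above.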

Thank you,
Frank

Re: Combine or peephole?

2010-04-20 Thread Frank Isamov

On Mon, Apr 19, 2010 at 5:54 PM, Jeff Law wrote:
>
> combine requires a data dependency, so for this situation, combine isn't
> going to help.  The easy solution is to create a peephole.  You can also
> create a machine dependent reorg pass to detect more of these opportunities.
> Jeff
>

Hi Jeff, et al,

Thank you for your reply. Two more questions:

1. Is it possible to add a machine dependent reorg pass at the backend
level without changing the standard infrastructure? If so, can you
please point me to such an example? If not, might the new plugin
architecture help here?
2. A peephole for such a case just repeats the instruction definition
pattern. As all the information is already available for such a
peephole, wouldn’t it be useful to implement the pass as part of the
standard infrastructure?

Thank you,
Frank

Re: Combine or peephole?

2010-04-21 Thread Frank Isamov

Hi Ian,

On Wed, Apr 21, 2010 at 5:42 PM, Ian Lance Taylor wrote:
> Frank Isamov writes:
>
>>  2. A peephole for such case just repeats instruction definition
>>  pattern. As all information already available for such peephole,
>>  wouldn’t it be useful to implement the pass to be a part of the
>>  standard infrastructure?
>
> See define_peephole and define_peephole2.  If that doesn't answer your
> question, can you rephrase it?
>
> Ian
>

I think I understood the points from Jeff’s reply and I am going to
look at the PA implementation now. For the completeness of this email
thread, I’ll try to rephrase the initial question:

Instructions which manipulate data in parallel and have no data
dependency automatically require a peephole2 definition and/or a machine
dependent reorg pass. (Please see an example at the bottom of this
email.) The peephole2 pattern, in this case, just repeats the
instruction’s RTL pattern.
As such instructions can appear in SIMD architectures, I just thought
that it would be profitable to have this pass be part of the
common infrastructure.

(define_insn "assi6"
  [(parallel [
     (set (match_operand:SI 0 "register_operand" "=r")
          (minus:SI (match_operand:SI 1 "register_operand" "r")
                    (match_operand:SI 2 "register_operand" "r")))
     (set (match_operand:SI 3 "register_operand" "=r")
          (plus:SI (match_operand:SI 4 "register_operand" "r")
                   (match_operand:SI 5 "register_operand" "r")))
  ])]
  ""
  "as\t%5, %4, %3, %2, %1, %0 %!"
)

(define_peephole2
  [(set (match_operand:SI 0 "register_operand")
        (minus:SI (match_operand:SI 1 "register_operand")
                  (match_operand:SI 2 "register_operand")))
   (set (match_operand:SI 3 "register_operand")
        (plus:SI (match_operand:SI 4 "register_operand")
                 (match_operand:SI 5 "register_operand")))]
  ""
  [(parallel [
     (set (match_dup 0)
          (minus:SI (match_dup 1)
                    (match_dup 2)))
     (set (match_dup 3)
          (plus:SI (match_dup 4)
                   (match_dup 5)))
  ])]
  ""
)
