On 02/11/2011 07:33 AM, Bernd Schmidt wrote:
Suppose I have two insns, one reserving (A|B|C), and the other reserving
A. I'm observing that when the first one is scheduled in an otherwise
empty state, it reserves the A unit and blocks the second one from being
scheduled in the same cycle. This is
On 02/11/2011 07:43 PM, Frédéric RISS wrote:
> On Friday, 11 February 2011 at 13:33 +0100, Bernd Schmidt wrote:
>> Suppose I have two insns, one reserving (A|B|C), and the other reserving
>> A. I'm observing that when the first one is scheduled in an otherwise
>> empty state, it reserves the A un
On Friday, 11 February 2011 at 13:33 +0100, Bernd Schmidt wrote:
> Suppose I have two insns, one reserving (A|B|C), and the other reserving
> A. I'm observing that when the first one is scheduled in an otherwise
> empty state, it reserves the A unit and blocks the second one from being
> schedule
Hi,
As far as I can tell, the scheduler does not currently support what you need.
I faced a similar problem and solved it by
implementing the TARGET_SCHED_DFA_NEW_CYCLE hook. Inside the function that
implements this hook I choose/set the insn reservation that makes possib
On 02/11/2011 02:13 PM, Alexander Monakov wrote:
> Could you please clarify a bit: would the modified behavior match what your
> target CPU does? The current behavior matches CPUs without lookahead in
> instruction dispatch: the first insn goes to the first matching execution
> unit (A), the secon
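The first-match behaviour described in this thread can be illustrated with a small simulation. This is only a sketch, not GCC's DFA scheduler: the function names `greedy_issue` and `lookahead_issue`, and the representation of an insn as a name plus an ordered list of alternative units, are assumptions made for illustration.

```python
def greedy_issue(insns, units):
    """Give each insn the first free unit among its alternatives, in program order.

    Stops at the first insn that cannot be placed (it waits for the next cycle).
    """
    free = set(units)
    issued = []
    for name, alternatives in insns:
        chosen = next((u for u in alternatives if u in free), None)
        if chosen is None:
            break
        free.remove(chosen)
        issued.append((name, chosen))
    return issued


def lookahead_issue(insns, units):
    """Search all unit choices and keep the assignment that issues the most insns."""
    best = []

    def search(index, free, issued):
        nonlocal best
        if len(issued) > len(best):
            best = list(issued)
        if index == len(insns):
            return
        name, alternatives = insns[index]
        for unit in alternatives:
            if unit in free:
                issued.append((name, unit))
                search(index + 1, free - {unit}, issued)
                issued.pop()

    search(0, frozenset(units), [])
    return best


# Hypothetical insns mirroring the question: one reserves (A|B|C), one reserves A.
example = [("first", ["A", "B", "C"]), ("second", ["A"])]
print(greedy_issue(example, ["A", "B", "C"]))
print(lookahead_issue(example, ["A", "B", "C"]))
```

With the greedy policy the first insn takes A and the second cannot issue in the same cycle; with the exhaustive search the first insn is moved to B so both fit. Whether the second policy matches a given CPU is exactly the question raised in the thread.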
On Fri, 11 Feb 2011, Bernd Schmidt wrote:
> Suppose I have two insns, one reserving (A|B|C), and the other reserving
> A. I'm observing that when the first one is scheduled in an otherwise
> empty state, it reserves the A unit and blocks the second one from being
> scheduled in the same cycle. T
On Mon, Jun 14, 2010 at 03:14:37PM +0200, Michael Matz wrote:
> Doing the change in GNU as has the advantage that all insn lengths are
> available without any work, i.e. it will handle e.g. inline asm; and that
> relaxation also is implemented just fine (it exists already in order to
> decide wh
Hi,
On Sun, 13 Jun 2010, H.J. Lu wrote:
> > We shouldn't turn GNU x86 assembler into an optimizing assembler. Next
> > people may ask assembler to remove redundant instructions, ...
Well, but currently nobody is asking for such thing, right?
> > Right now, when something goes wrong, people don
On Sat, Jun 12, 2010 at 8:15 AM, H.J. Lu wrote:
> On Fri, Jun 11, 2010 at 3:42 PM, Quentin Neill
> wrote:
>> On Thu, Jun 10, 2010 at 5:23 PM, H.J. Lu wrote:
>>> [snip]
>>> x86 assembler isn't an optimizing assembler. -mtune only does
>>> instruction selection. What you are proposing sounds like
On Jun 13, 2010, at 7:35 AM, Joern Rennecke wrote:
> And even if you have a suitable text for the assembler, linking the compiler
> with the assembler requires merging two complex build systems and
> resolving symbol name clash issues.
Not trying to be inflammatory, but if you guys are really se
Andi Kleen writes:
> [...]
> Yes but you can't easily pass data back, like accurate instruction lengths.
Wouldn't it be too late by then? Or are you imagining having the
compiler pass trial data to the assembler to create a feedback loop?
- FChE
Quoting Andi Kleen :
On Sun, Jun 13, 2010 at 07:14:03AM -0400, Joern Rennecke wrote:
Quoting Andi Kleen :
I admit I haven't looked into gas code, but naively it can't
be all that difficult to e.g. run gas as a thread and
pass the text input through some shared memory buffer?
If you are gener
On Sun, Jun 13, 2010 at 07:14:03AM -0400, Joern Rennecke wrote:
> Quoting Andi Kleen :
>> I admit I haven't looked into gas code, but naively it can't
>> be all that difficult to e.g. run gas as a thread and
>> pass the text input through some shared memory buffer?
>
> If you are generating text an
Quoting Andi Kleen :
I admit I haven't looked into gas code, but naively it can't
be all that difficult to e.g. run gas as a thread and
pass the text input through some shared memory buffer?
If you are generating text anyway, there should be little difference to
the existing -pipe option - at l
> It would help compilation time a little bit, but generating the
> assembly code and running the entire assembler is a fairly small
> percentage of the overall compilation time--e.g., 3%. It's worth
> doing a fair amount of work to speed up compilation by 3%, but linking
> the assembler into gcc
Andi Kleen writes:
> But if you need more why can't you just link the whole assembler
> into gcc? That would hopefully speed up compilation too
> (e.g. over time the text generation of instructions could
> be bypassed)
It would help compilation time a little bit, but generating the
assembly code
On Fri, Jun 11, 2010 at 3:42 PM, Quentin Neill
wrote:
> On Thu, Jun 10, 2010 at 5:23 PM, H.J. Lu wrote:
>> [snip]
>> x86 assembler isn't an optimizing assembler. -mtune only does
>> instruction selection. What you are proposing sounds like an optimizing
>> assembler to me. Are we going to suppor
Quentin Neill writes:
>
> Another option would be to expose some subset of the assembler
> functionality as a plugin to GCC (similar to how gold is used) to
> extract the instruction sizes. Any comments on that approach?
AFAIK gcc already does keep track of instruction lengths
(e.g. for LOOP),
On Thu, Jun 10, 2010 at 5:23 PM, H.J. Lu wrote:
> [snip]
> x86 assembler isn't an optimizing assembler. -mtune only does
> instruction selection. What you are proposing sounds like an optimizing
> assembler to me. Are we going to support scheduling, macro, ...?
> --
> H.J.
Just to clarify, we ar
On Fri, Jun 11, 2010 at 02:09:33PM -0500, Quentin Neill wrote:
> Currently GCC doesn't compute the current encoding offset (doesn't
> know mnemonic/opcode lengths),
That's not true, gcc for i?86/x86_64 actually calculates the length and for
most of the commonly used insns correctly, I've spent so
On Fri, Jun 11, 2010 at 12:09 PM, Quentin Neill
wrote:
> On Fri, Jun 11, 2010 at 10:58 AM, Daniel Jacobowitz
> wrote:
>> On Thu, Jun 10, 2010 at 09:48:24PM -0500, Quentin Neill wrote:
> [snip]
>>> Does this qualify as a form of what you are suggesting? Because this
>>> is exactly what is being p
On Fri, Jun 11, 2010 at 10:58 AM, Daniel Jacobowitz
wrote:
> On Thu, Jun 10, 2010 at 09:48:24PM -0500, Quentin Neill wrote:
[snip]
>> Does this qualify as a form of what you are suggesting? Because this
>> is exactly what is being proposed:
>>
>> .balign 8 # start window
>> i
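For readers following the alignment discussion, the arithmetic behind a directive such as `.balign 8` can be sketched as follows. The helper names and the example instruction lengths are hypothetical, and the excerpt does not show the exact window rules being proposed, so this only shows padding to a boundary and detecting instructions that straddle one.

```python
def padding_to_boundary(offset, alignment):
    """Number of padding bytes needed to move `offset` up to the next multiple of `alignment`."""
    return (-offset) % alignment


def straddling_insns(lengths, window=8):
    """Return indices of instructions whose bytes cross a `window`-byte boundary.

    `lengths` are the encoded sizes in bytes, laid out back to back from offset 0.
    """
    straddlers = []
    offset = 0
    for index, length in enumerate(lengths):
        first_byte = offset
        last_byte = offset + length - 1
        if first_byte // window != last_byte // window:
            straddlers.append(index)
        offset += length
    return straddlers


# Hypothetical instruction lengths in bytes.
print(padding_to_boundary(13, 8))
print(straddling_insns([3, 4, 2, 5]))
```

Either calculation depends on knowing the exact encoded length of every instruction, which is the point of contention between doing this in the compiler and doing it in the assembler.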
On Thu, Jun 10, 2010 at 09:48:24PM -0500, Quentin Neill wrote:
> > On the other hand, I'm not going to argue that it's a lot of work.
Missing "not" !
> When you say "put assertions in the assembler output" I understood it
> to mean "in the assembly source code output by the compiler", not "the
>
On Thu, Jun 10, 2010 at 5:40 PM, Daniel Jacobowitz
wrote:
> On Thu, Jun 10, 2010 at 02:03:03PM -0600, Jeff Law wrote:
>> That adds quite a bit of complication to the compiler though --
>> getting the instruction lengths right (and thus proper packing &
>> alignment) can be extremely difficult. I
On Thu, Jun 10, 2010 at 02:03:03PM -0600, Jeff Law wrote:
> That adds quite a bit of complication to the compiler though --
> getting the instruction lengths right (and thus proper packing &
> alignment) can be extremely difficult. I did some experiments with
> this on a target with *fixed* instru
On Thu, Jun 10, 2010 at 3:03 PM, Jeff Law wrote:
> On 06/10/10 13:52, H.J. Lu wrote:
>> On Thu, Jun 10, 2010 at 11:05 AM, Quentin Neill
>> wrote:
>>> Cross-posting Reza's call for feedback to the binutils list since it
>>> is relevant - see the last few paragraphs regarding how to
>>> "solve th
On Thu, Jun 10, 2010 at 3:09 PM, Quentin Neill
wrote:
> On Thu, Jun 10, 2010 at 4:08 PM, H.J. Lu wrote:
>> On Thu, Jun 10, 2010 at 1:59 PM, Quentin Neill
>> wrote:
>>> On Thu, Jun 10, 2010 at 3:03 PM, Jeff Law wrote:
On 06/10/10 13:52, H.J. Lu wrote:
> On Thu, Jun 10, 2010 at 11:05 AM,
On 06/10/10 13:52, H.J. Lu wrote:
On Thu, Jun 10, 2010 at 11:05 AM, Quentin Neill
wrote:
Cross-posting Reza's call for feedback to the binutils list since it
is relevant -
see the last few paragraphs regarding how to "solve the alignment problem".
Original thread: http://gcc.gnu.org/ml/gc
On Thu, Jun 10, 2010 at 4:08 PM, H.J. Lu wrote:
> On Thu, Jun 10, 2010 at 1:59 PM, Quentin Neill
> wrote:
>> On Thu, Jun 10, 2010 at 3:03 PM, Jeff Law wrote:
>>> On 06/10/10 13:52, H.J. Lu wrote:
On Thu, Jun 10, 2010 at 11:05 AM, Quentin Neill
wrote:
> Cross-posting Reza's call f
Quoting Jeff Law :
That adds quite a bit of complication to the compiler though -- getting
the instruction lengths right (and thus proper packing & alignment) can
be extremely difficult. I did some experiments with this on a target
with *fixed* instruction lengths a while back and even though t
On Thu, Jun 10, 2010 at 11:05 AM, Quentin Neill
wrote:
> Cross-posting Reza's call for feedback to the binutils list since it
> is relevant -
> see the last few paragraphs regarding how to "solve the alignment problem".
>
> Original thread: http://gcc.gnu.org/ml/gcc/2010-06/threads.html#00402
>
>
On Thu, Jun 10, 2010 at 1:59 PM, Quentin Neill
wrote:
> On Thu, Jun 10, 2010 at 3:03 PM, Jeff Law wrote:
>> On 06/10/10 13:52, H.J. Lu wrote:
>>> On Thu, Jun 10, 2010 at 11:05 AM, Quentin Neill
>>> wrote:
Cross-posting Reza's call for feedback to the binutils list since it
is relevant
Cross-posting Reza's call for feedback to the binutils list since it
is relevant -
see the last few paragraphs regarding how to "solve the alignment problem".
Original thread: http://gcc.gnu.org/ml/gcc/2010-06/threads.html#00402
Not sure if followups should occur on one list or both.
--
Quentin N
On Thu, May 6, 2010 at 11:47 AM, roy rosen wrote:
> Hi all.
>
> I work on a VLIW architecture.
> The sched2 pass adds a TImode to insns which should start a new issue group.
> But, after this pass, other passes change the insns, so the sched2
> work that was done is not correct anymore (the groups
Alex Turjan wrote:
--- On Thu, 5/7/09, Maxim Kuvyrkov wrote:
From: Maxim Kuvyrkov
Subject: Re: scheduling question
To: atur...@yahoo.com
Cc: "Vladimir Makarov" , gcc@gcc.gnu.org
Date: Thursday, May 7, 2009, 1:01 PM
Alex Turjan wrote:
Hi,
During scheduling I'm confronted with the fa
Alex Turjan wrote:
Hi,
During scheduling I'm confronted with the fact that an instruction is moved
from the ready list to queued with cost 2, while according to my
expectations the insn should have been moved to queued with cost 1.
Hi, Alex. I could look at this, if you have a test and
--- On Thu, 5/7/09, Maxim Kuvyrkov wrote:
> From: Maxim Kuvyrkov
> Subject: Re: scheduling question
> To: atur...@yahoo.com
> Cc: "Vladimir Makarov" , gcc@gcc.gnu.org
> Date: Thursday, May 7, 2009, 1:01 PM
> Alex Turjan wrote:
> > Hi,
> > During schedu
Alex Turjan wrote:
Hi,
During scheduling I'm confronted with the fact that an instruction is moved
from the ready list to queued with cost 2, while according to my
expectations the insn should have been moved to queued with cost 1.
Has anybody experienced a similar problem?
From what you desc
2007/10/11, Jim Wilson <[EMAIL PROTECTED]>:
Thanks for your helpful hints! And I am sorry for such a late reply.
I figured out this problem yesterday :-).
> Do we know for sure that the scheduler is failing here? Have you looked
> at -da RTL dumps to verify which pass is performing the inco
吴曦 wrote:
Well... Is there anything I missed or forgot to do?
Someone needs to step through code in a debugger and try to figure out
what is going wrong.
I made an initial attempt at that. I hacked gcc a little to try to
force a failure with a contrived testcase. The first thing I notice
"rws_access_reg should be handling this correctly. It uses
HARD_REGNO_NREGS to get the number of regs referred to by a reg rtl.
So it should return 64 in this case, and then it will iterate over all
64-bit PR regs when checking for a dependency."
I have found HARD_REGNO_NREGS in ia64.h
#define HA
袁立威 wrote:
So, my question becomes clear:
How can I solve this problem by making GCC aware of the data dependencies
between mov X = pr (or mov pr = X, -1) and other uses of a specific
predicate register (e.g. p6, p7)?
We already have support for these move instructions. See the
movdi_internal p
Hello,
> Ira Rosen/Haifa/IBM wrote on 06/02/2007 11:49:17:
>
> > Dorit Nuzman/Haifa/IBM wrote on 05/02/2007 21:13:40:
> >
> > > Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 17:59:00:
> > >
> ...
> > >
> > > That's going to change once this project goes in: "(3.2) Straight-
> > > line
> Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 18:16:05:
>
> > On Mon, 5 Feb 2007, Jan Hubicka wrote:
> ...
> > > Did you run some benchmarks?
> >
> > Not yet - I'm looking at the C++ SPEC 2006 benchmarks at the moment
> > and using vectorization there seems to do a lot of collateral d
On Tue, 6 Feb 2007, Dorit Nuzman wrote:
> Ira Rosen/Haifa/IBM wrote on 06/02/2007 11:49:17:
>
> > Dorit Nuzman/Haifa/IBM wrote on 05/02/2007 21:13:40:
> >
> > > Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 17:59:00:
> > >
> ...
> > >
> > > That's going to change once this project goes
Ira Rosen/Haifa/IBM wrote on 06/02/2007 11:49:17:
> Dorit Nuzman/Haifa/IBM wrote on 05/02/2007 21:13:40:
>
> > Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 17:59:00:
> >
...
> >
> > That's going to change once this project goes in: "(3.2) Straight-
> > line code vectorization" from htt
Dorit Nuzman/Haifa/IBM wrote on 05/02/2007 21:13:40:
> Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 17:59:00:
>
> > On Mon, 5 Feb 2007, Paolo Bonzini wrote:
> >
> > >
> > > > As we also only vectorize innermost loops I believe doing a
> > > > complete unrolling pass early will help i
Richard Guenther <[EMAIL PROTECTED]> wrote on 06.02.2007 11:19:15:
> On Tue, 6 Feb 2007, Dorit Nuzman wrote:
> > After sleeping on it, it actually makes a lot of sense to me to schedule
> > complete loop unrolling before vectorization - I think it would either
> > simplify loops (sometimes creatin
On Tue, 6 Feb 2007, Dorit Nuzman wrote:
> > Hi Richard,
> >
> >
> ...
> > However...,
> >
> > I have seen cases in which complete unrolling before vectorization enabled
> > constant propagation, which in turn enabled significant simplification of
> > the code, thereby, in fact making a previousl
> Hi Richard,
>
>
...
> However...,
>
> I have seen cases in which complete unrolling before vectorization enabled
> constant propagation, which in turn enabled significant simplification of
> the code, thereby, in fact making a previously unvectorizable loop (at
> least on some targets, due to the
Hello,
> >As we also only vectorize innermost loops I believe doing a
> >complete unrolling pass early will help in general (I pushed
> >for this some time ago).
> >
> >Thoughts?
>
> It might also hurt, though, since we don't have a basic block
> vectorizer. IIUC the vectorizer is able to turn
Hi Richard,
Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 17:27:03:
...
>
> ...
>
> and we are later not able to do constant propagation to the
> second loop which we can do if we first unroll such small loops.
>
> As we also only vectorize innermost loops
by the way, we are working o
Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 17:59:00:
> On Mon, 5 Feb 2007, Paolo Bonzini wrote:
>
> >
> > > As we also only vectorize innermost loops I believe doing a
> > > complete unrolling pass early will help in general (I pushed
> > > for this some time ago).
> > >
> > > Though
Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 18:16:05:
> On Mon, 5 Feb 2007, Jan Hubicka wrote:
...
> > Did you run some benchmarks?
>
> Not yet - I'm looking at the C++ SPEC 2006 benchmarks at the moment
> and using vectorization there seems to do a lot of collateral damage
> (maybe n
On Mon, 5 Feb 2007, Jan Hubicka wrote:
> >
> > Hi,
> >
> > currently with -ftree-vectorize we generate for
> >
> > for (i=0; i<3; ++i)
> > # SFT.4346_507 = VDEF
> > # SFT.4347_508 = VDEF
> > # SFT.4348_509 = VDEF
> > d[i] = 0.0;
>
> Also Tomas' patch is supposed to catch this sp
>
> Hi,
>
> currently with -ftree-vectorize we generate for
>
> for (i=0; i<3; ++i)
> # SFT.4346_507 = VDEF
> # SFT.4347_508 = VDEF
> # SFT.4348_509 = VDEF
> d[i] = 0.0;
Also Tomas' patch is supposed to catch this special case and convert it
into memset that should be subsequentl
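As a rough picture of what complete unrolling does to the quoted three-iteration loop, the sketch below expands a loop body once per iteration. The function `unroll_completely` and the string template are hypothetical and operate on text only; GCC performs this transformation on its own intermediate representation.

```python
def unroll_completely(trip_count, body_template):
    """Expand a counted loop into straight-line statements.

    `body_template` is the loop body with `{i}` standing for the induction variable;
    each copy gets the constant value of `i` for that iteration substituted in.
    """
    return [body_template.format(i=i) for i in range(trip_count)]


# The loop from the quoted message: for (i=0; i<3; ++i) d[i] = 0.0;
for statement in unroll_completely(3, "d[{i}] = 0.0;"):
    print(statement)
```

Once the index is a constant in every copy, later passes can treat each store separately, which is why the thread weighs early unrolling against leaving the loop intact for the vectorizer or for conversion to memset.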
On Mon, 5 Feb 2007, Paolo Bonzini wrote:
>
> > As we also only vectorize innermost loops I believe doing a
> > complete unrolling pass early will help in general (I pushed
> > for this some time ago).
> >
> > Thoughts?
>
> It might also hurt, though, since we don't have a basic block vectorizer
As we also only vectorize innermost loops I believe doing a
complete unrolling pass early will help in general (I pushed
for this some time ago).
Thoughts?
It might also hurt, though, since we don't have a basic block
vectorizer. IIUC the vectorizer is able to turn
for (i = 0; i < 4; i+
> Please does anyone know the answer to the following questions?
>
> 1. The operating system (OS) schedules tasks, but gnat allow us to set
> schedule policies such as Round Robin, then how does gnat tell the OS to
> start doing Round Robin scheduling?
>
> 2. If someone wants to write a new sch
"Tabony, Charles" <[EMAIL PROTECTED]> writes:
> What does it mean by "unit none"?
First I'll note that you shouldn't see this when using the DFA
scheduler (define_insn_reservation, etc.). You should only see it
when using the old pipeline description (define_function_unit, etc.).
The old pipelin