viewvc: python: RuntimeError: maximum recursion limit exceeded

2011-08-15 Thread Georg-Johann Lay

Hi, I've been getting the following error in viewvc for several days now:

http://gcc.gnu.org/viewcvs/trunk/gcc/dse.c?view=markup

An Exception Has Occurred
Python Traceback

Traceback (most recent call last):
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 4463, in main
    request.run_viewvc()
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 394, in run_viewvc
    self.view_func(self)
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 1845, in view_markup
    markup_or_annotate(request, 0)
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 1775, in markup_or_annotate
    path[-1], mime_type, encoding)
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 1656, in markup_stream_pygments
    encoding='utf-8'), ps)
  File "/usr/lib/python2.3/site-packages/pygments/__init__.py", line 86, in highlight
    return format(lex(code, lexer), formatter, outfile)
  File "/usr/lib/python2.3/site-packages/pygments/__init__.py", line 69, in format
    formatter.format(tokens, outfile)
  File "/usr/lib/python2.3/site-packages/pygments/formatter.py", line 92, in format
    return self.format_unencoded(tokensource, outfile)
  File "/usr/lib/python2.3/site-packages/pygments/formatters/html.py", line 752, in format_unencoded
    for t, piece in source:
  File "/usr/lib/python2.3/site-packages/pygments/formatters/html.py", line 652, in _format_lines
    for ttype, value in tokensource:
  File "/usr/lib/python2.3/site-packages/pygments/lexer.py", line 167, in streamer
    for i, t, v in self.get_tokens_unprocessed(text):
  File "/usr/lib/python2.3/site-packages/pygments/lexers/compiled.py", line 161, in get_tokens_unprocessed
    for index, token, value in \
  File "/usr/lib/python2.3/site-packages/pygments/lexer.py", line 502, in get_tokens_unprocessed
    m = rexmatch(text, pos)
RuntimeError: maximum recursion limit exceeded


Re: A case that PRE optimization hurts performance

2011-08-15 Thread Václav Zeman

On Tue, 2 Aug 2011 10:37:03 +0800, Jiangning Liu wrote:

> Hi,
>
> For the following simple test case, PRE optimization hoists computation
> (s!=1) into the default branch of the switch statement, and finally causes
> very poor code generation. This problem occurs in both X86 and ARM, and I
> believe it is also a problem for other targets.
>
> [...]
>
> Do you have any idea about this?

File a bug report in GCC Bugzilla.

--
VZ



ggc_alloc_rtvec_sized allocates spaces more than necessary?

2011-08-15 Thread 王亮
Hi,

Current implementation of ggc_alloc_rtvec_sized is

#define ggc_alloc_rtvec_sized(NELT) \
(ggc_alloc_zone_vec_rtvec_def (sizeof (rtx),\
   sizeof (struct rtvec_def) + ((NELT) - 1), \
   &rtl_zone))

The size it allocates is

  (sizeof (struct rtvec_def) + ((NELT) - 1)) * sizeof (rtx)   // (1)

Originally, the allocated size was

  sizeof (struct rtvec_def) + ((NELT) - 1) * sizeof (rtx)   // (2)

So the current implementation allocates more space than before.

I replaced the second parameter of ggc_alloc_zone_vec_rtvec_def with

  (sizeof (struct rtvec_def) + sizeof (rtx) - 1) / sizeof (rtx) + ((NELT) - 1)   // (3)

It bootstraps on x86 successfully.  So I guess the extra space is
not used.  Did I miss something?
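
To make the three formulas concrete, here is a small standalone sketch. The struct is only a stand-in for GCC's struct rtvec_def (an int count followed by a one-element rtx array, which is an assumption about its layout), and the function names are made up for illustration.

```c
#include <stddef.h>

/* Stand-ins for the GCC types; the real definitions live in rtl.h.  */
typedef struct rtx_def *rtx;
struct rtvec_def { int num_elem; rtx elem[1]; };

/* (1): what the current macro requests.  The whole second argument,
   header size included, gets multiplied by sizeof (rtx).  */
static size_t size_current (size_t nelt)
{
  return (sizeof (struct rtvec_def) + (nelt - 1)) * sizeof (rtx);
}

/* (2): the original size, header plus the extra elements.  */
static size_t size_original (size_t nelt)
{
  return sizeof (struct rtvec_def) + (nelt - 1) * sizeof (rtx);
}

/* (3): the header rounded up to a whole number of rtx-sized slots,
   plus the extra elements, all scaled by sizeof (rtx).  */
static size_t size_proposed (size_t nelt)
{
  size_t header_slots
    = (sizeof (struct rtvec_def) + sizeof (rtx) - 1) / sizeof (rtx);
  return (header_slots + (nelt - 1)) * sizeof (rtx);
}
```

Assuming a typical LP64 host (16-byte struct, 8-byte pointers) and NELT = 4, (1) comes to 152 bytes while (2) and (3) both come to 40.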

Thanks,
Liang.


Re: ggc_alloc_rtvec_sized allocates spaces more than necessary?

2011-08-15 Thread Richard Guenther
On Mon, Aug 15, 2011 at 2:16 PM, 王亮  wrote:
> Hi,
>
> Current implementation of ggc_alloc_rtvec_sized is
>
> #define ggc_alloc_rtvec_sized(NELT)                                     \
>    (ggc_alloc_zone_vec_rtvec_def (sizeof (rtx),                        \
>                                   sizeof (struct rtvec_def) + ((NELT) - 1), \
>                                   &rtl_zone))
>
> The size it allocates is
>
>  (sizeof (struct rtvec_def) + ((NELT) - 1)) * sizeof (rtx)

This looks indeed bogus.

>     // (1)
>
> Originally, the allocated size is
>
>  sizeof (struct rtvec_def) + ((NELT) - 1) * sizeof (rtx)
>    // (2)

This one is correct.

Laurynas?

> So current implementation allocates more spaces than before.
>
> I replace the second parameter of ggc_alloc_zone_vec_rtvec_def with
>
>  (sizeof (struct rtvec_def) + sizeof (rtx) - 1) / sizeof (rtx) +
> ((NELT) - 1)   // (3)
>
> It bootstraps on x86 successfully.  So I guess the extra spaces are
> not used.  Did I miss something?
>
> Thanks,
> Liang.
>


Re: New mirror

2011-08-15 Thread Sergey Kutserey
Hi again! Can you please reply - do you ever need this mirror?
Thank you.

On Mon, Aug 8, 2011 at 1:43 PM, Sergey Kutserey  wrote:
> Hi there! We just raised a new mirror in US, Missouri, Saint Louis.
> It has 100Mb/s connection and synced twice a day from main site gcc.gnu.org
> URL of mirror is: gcc.petsads.us
> My email s.kutse...@gmail.com
> My name is Sergey Kutserey
>
> Hopefully you can add this mirror into public mirror list for GCC project.
> Thank you.
>


[PATCH] Remove "bogus" g++.dg/init/copy7.C testcase

2011-08-15 Thread Richard Guenther

The g++.dg/init/copy7.C testcase checks whether the C++ frontend
guards memcpy it emits via a conditional verifying that src != dst
because calling memcpy with overlapping source / destination is
not supported.

The testcase is misguided though (and the C++ frontend was, until
recently) - the middle-end itself will replace aggregate copies
with memcpy libcalls if it suits - without such conditional.
As PR39480 shows (the bug that prompted "fixing" the C++ frontend),
the "error" was diagnosed by valgrind, not any real memcpy implementation.

The argument still holds that no reasonable memcpy implementation
will reject the src == dest case.  Arguing about explicit cache
write-allocation is moot, as you'd still have to handle the
case of memcpy (&a, &a+1, 1) correctly - and thus any reasonable
implementation would handle the src == dest case explicitly if
that is necessary.

Thus, the following simply removes the now FAILing testcase on
the basis that it never was PASSing really (as my modified
C testcases in PR50079 show).  If we ever encounter a platform
that fails for memcpy (&a, &a, ...) and we decide it's not the
platform that is broken we have to invent a fix in the middle-end
and (conditionally) guard any libcall block moves.
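
For illustration only, this is roughly what a guarded block move looks like when written by hand in C. The helper name and the struct layout (modelled on the testcase) are hypothetical; it is not the code either the frontend or the middle-end emits.

```c
#include <string.h>

struct A { double d[10]; };
struct B { struct A base; char bc; };

/* Hypothetical helper: copy *src to *dst, skipping the libcall when
   both pointers name the same object, so memcpy never sees
   identical arguments.  */
static void guarded_copy (struct B *dst, const struct B *src)
{
  if (dst != src)
    memcpy (dst, src, sizeof *dst);
}
```

The cost of such a guard is one compare and branch per block move, which is why it would only be worth emitting conditionally.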

Comments?  Ok to commit?

Thanks,
Richard.

2011-08-15  Richard Guenther  

PR middle-end/50079
* g++.dg/init/copy7.C: Remove testcase.

Index: gcc/testsuite/g++.dg/init/copy7.C
===
--- gcc/testsuite/g++.dg/init/copy7.C   (revision 177759)
+++ gcc/testsuite/g++.dg/init/copy7.C   (working copy)
@@ -1,39 +0,0 @@
-// PR c++/39480
-// It isn't always safe to call memcpy with identical arguments.
-// { dg-do run }
-
-extern "C" void abort();
-extern "C" void *
-memcpy(void *dest, void *src, __SIZE_TYPE__ n)
-{
-  if (dest == src)
-abort();
-  else
-{
-  __SIZE_TYPE__ i;
-  for (i = 0; i < n; i++)
-((char *)dest)[i] = ((const char*)src)[i];
-}
-}
-
-struct A
-{
-  double d[10];
-};
-
-struct B: public A
-{
-  char bc;
-};
-
-B b;
-
-void f(B *a1, B* a2)
-{
-  *a1 = *a2;
-}
-
-int main()
-{
-  f(&b,&b);
-}


Re: i370 port

2011-08-15 Thread Ulrich Weigand
Paul Edwards wrote:

> I was surprised that an instruction that is marked as s_operand
> was getting a seemingly non-s_operand given to it, so I added an
> "S" constraint:

That's right.  It is not good to have a constraint that accepts
more than the predicate, since reload will at this point only
consider the constraint.  Adding a more restricted constraint
should be the proper fix for this problem.

> That then gave an actual compiler error instead of generating bad
> code, which is a step forward:
> 
> pdos.c: In function `pdosLoadExe':
> pdos.c:2703: error: unable to generate reloads for:

You'll need to mark your new constraint as EXTRA_MEMORY_CONSTRAINT
so that reload knows what to do when an argument doesn't match.
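
A minimal sketch of what that could look like in the port's target header. The letter 'S' is the constraint from this thread; that it is the only extra memory constraint in the port is an assumption.

```c
/* Sketch for the target header (assumed to be i370.h).
   Marking 'S' as a memory constraint tells reload that an operand
   which fails to match can be fixed up as a memory reference, for
   example by reloading its address into a base register.  */
#define EXTRA_MEMORY_CONSTRAINT(C, STR) ((C) == 'S')
```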

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: i370 port

2011-08-15 Thread Paul Edwards

> You'll need to mark your new constraint as EXTRA_MEMORY_CONSTRAINT
> so that reload knows what to do when an argument doesn't match.


Thanks! That certainly produced an effect.

Unfortunately it's not quite right, seemingly not loading R9 properly:

LR    9,13
AR    9,13
MVC   0(10,9),0(2)

And it had a knock-on effect too, producing bad code elsewhere:

<  SLR   2,2
<  SLR   3,3
<  ST    2,128(13)
<  ST    3,4+128(13)
<  ST    2,136(13)
<  ST    3,4+136(13)
<  ST    2,144(13)
<  ST    3,4+144(13)
---
>  MVC   128(8,13),=F'0'
>  MVC   136(8,13),=F'0'
>  MVC   144(8,13),=F'0'


But I guess that is another can of worms to investigate.

BFN.  Paul.



Re: [PATCH] Remove "bogus" g++.dg/init/copy7.C testcase

2011-08-15 Thread Mike Stump
On Aug 15, 2011, at 5:42 AM, Richard Guenther wrote:
> The argument still holds that no reasonable memcpy implementation
> will reject the src == dest case.

Hum...  Sounds like if that's the case that we should document it in the manual 
as something we expect (requirement) of the memcpy implementation.  I'll let a 
frontend or optimization person review this.


Re: ggc_alloc_rtvec_sized allocates spaces more than necessary?

2011-08-15 Thread Andreas Schwab
王亮  writes:

> Hi,
>
> Current implementation of ggc_alloc_rtvec_sized is
>
> #define ggc_alloc_rtvec_sized(NELT) \
> (ggc_alloc_zone_vec_rtvec_def (sizeof (rtx),\
>sizeof (struct rtvec_def) + ((NELT) - 1), \
>&rtl_zone))
>
> The size it allocates is
>
>   (sizeof (struct rtvec_def) + ((NELT) - 1)) * sizeof (rtx)
>  // (1)
>
> Originally, the allocated size is
>
>   sizeof (struct rtvec_def) + ((NELT) - 1) * sizeof (rtx)
> // (2)
>
> So current implementation allocates more spaces than before.
>
> I replace the second parameter of ggc_alloc_zone_vec_rtvec_def with
>
>   (sizeof (struct rtvec_def) + sizeof (rtx) - 1) / sizeof (rtx) +
> ((NELT) - 1)   // (3)

I think it was meant to be this:

#define ggc_alloc_rtvec_sized(NELT) \
  ggc_alloc_zone_rtvec_def (sizeof (struct rtvec_def)   \
+ ((NELT) - 1) * sizeof (rtx),  \
&rtl_zone)

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: i370 port

2011-08-15 Thread Ulrich Weigand
Paul Edwards wrote:

> Unfortunately it's not quite right, seemingly not loading R9 properly:
> 
> LR    9,13
> AR    9,13
> MVC   0(10,9),0(2)

That's weird.  What does the reload dump (.greg) say?
 
> And it had a knock-on effect too, producing bad code elsewhere:
> 
> <  SLR   2,2
> <  SLR   3,3
> <  ST    2,128(13)
> <  ST    3,4+128(13)
> <  ST    2,136(13)
> <  ST    3,4+136(13)
> <  ST    2,144(13)
> <  ST    3,4+144(13)
> ---
> >  MVC   128(8,13),=F'0'
> >  MVC   136(8,13),=F'0'
> >  MVC   144(8,13),=F'0'
> 
> But I guess that is another can of worms to investigate.

It seems the literal is not marked as being doubleword.  That might
be related to the fact that const_int's do not carry a mode, so you
cannot just look at the literal's mode to determine the required
size, but have to take the full instruction into account ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Announcing the Port of Intel(r) Cilk (TM) Plus into GCC

2011-08-15 Thread Iyer, Balaji V
Hello Everyone,
   This letter describes the recently created GCC branch called "cilkplus" that 
ports the Intel(R) Cilk(TM) Plus language extensions to the C and C++ 
front-ends of gcc-4.7. We are looking for collaborators and advice as we 
proceed - both on this open-source gcc project, and on the open language 
specification. The compiler and its associated runtime are available at:  
http://gcc.gnu.org/svn/gcc/branches/cilkplus . The URL for the patch is 
available at: http://software.intel.com/file/38093 .

   Intel Cilk Plus is a set of C and C++ constructs for task-parallel and 
data-parallel programming for improving performance on multicore and vector 
processors. This language extension includes the following features:

1.  Three keywords provide a simple yet powerful model for parallel 
programming: _Cilk_spawn, _Cilk_sync and _Cilk_for. Reducers provide an easy, 
lock-free way to deal with shared data.

2.  Simple array notations including elemental functions allow programmers 
to easily use data-parallelism.

3.  Pragmas communicate SIMD information to the vectorizer to help ensure 
that loops are vectorized correctly.

   The implementation of Intel Cilk Plus language extensions to gcc requires 
the above patches to the C and C++ front-ends, plus a copy of the Intel Cilk 
Plus runtime library (Cilk Plus RTL).  Both of these have been checked into the 
new gcc branch.  The Cilk Plus RTL is being maintained by using an upstream, 
BSD-licensed version available at http://www.cilkplus.org .  Changes to the 
Cilk Plus RTL are welcome and must be contributed to the upstream version via 
this web site.  A contribution process is in place for receiving such changes; 
see http://www.cilkplus.org for details.

   In this release, the C and C++ compiler parsers now accept the three 
keywords _Cilk_spawn, _Cilk_sync, and _Cilk_for, as well as a pragma to adjust 
the grainsize of _Cilk_for. The list below provides a brief explanation of each 
of the keywords. For more details, see the "Intel(r) Cilk(tm) Plus Language 
Specification" at http://www.cilkplus.org .

1.  _Cilk_spawn - Annotates a function-call and indicates that execution 
may (but is not required to) continue without waiting for the function to 
return. The syntax is:
[   = ] _Cilk_spawn  ( 
(optional)) ;

2.  _Cilk_sync - Indicates that all the statements in the current Cilk 
block must finish executing before any statements after the _Cilk_sync begin 
executing. The syntax is:
_Cilk_sync ;

3.  _Cilk_for - Is a variant of a for-statement where any or all iterations 
may (but are not required to) execute in parallel. You can optionally precede 
_Cilk_for with a grainsize-pragma to specify the number of serial iterations 
desired for each chunk of the parallel loop. If there is no grainsize pragma or 
if the grainsize evaluates to '0', then the runtime will pick a grainsize using 
its own internal heuristics. The syntax is:
[ #pragma cilk grainsize =  ] _Cilk_for ( ; 
 ; )
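
As a rough illustration of the spawn and sync keywords, here is the usual naive Fibonacci sketch. The fallback macros give the serial version of the program when the compiler has no Cilk Plus support; testing __cilk to detect that support is an assumption, so adjust the check for your compiler.

```c
/* Without Cilk Plus, define the keywords away so the code compiles
   as an ordinary serial program.  The __cilk check is an assumption
   about how support is advertised.  */
#ifndef __cilk
# define _Cilk_spawn
# define _Cilk_sync
#endif

/* Naive Fibonacci: the spawned call may run in parallel with the
   second recursive call.  */
static int fib (int n)
{
  if (n < 2)
    return n;
  int x = _Cilk_spawn fib (n - 1);
  int y = fib (n - 2);
  _Cilk_sync;   /* wait for the spawned call before reading x */
  return x + y;
}
```

Because execution "may (but is not required to)" continue past the spawn, the serial fallback computes the same result as the parallel version.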
 

   The parser will accept these keywords and insert the appropriate functions 
to interact with the runtime library. Along with these keywords, you can use 
#pragma simd directives to communicate loop information to the vectorizer so it 
can generate better vectorized code. The five #pragma simd directives are: 
vectorlength, private, linear, reduction, and assert. The list below summarizes 
the five directives. For a detailed explanation please refer to the "Intel(r) 
Cilk(tm) Plus Language Specification" at http://www.cilkplus.org .

1)  #pragma simd vectorlength (n1, n2 ...): Specify a choice vector width 
that the back-end may use to vectorize the loop.

2)  #pragma simd private (var1, var2, ...): Specify a set of variables for 
which each loop iteration is independent of every other iteration.

3)  #pragma simd linear (var1:stride1, var2:stride2, ...): Specify a set of 
variables that increase monotonically in each iteration of the loop.

4)  #pragma simd reduction (operator: var1, var2...): Specify a set of 
variables whose value is computed by vector reduction using the specified 
operator.

5)  #pragma simd assert: Directs the compiler to halt if the vectorizer is 
unable to vectorize the loop.
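
A small sketch of the reduction directive applied to a summation loop. The function name is hypothetical, and guarding the pragma with __cilk is an assumption about how Cilk Plus support is advertised; without it the loop is plain serial C.

```c
#include <stddef.h>

/* Sum the elements of an array.  The pragma follows the reduction
   form listed above: operator '+', reduction variable 'sum'.  */
static double sum_array (const double *a, size_t n)
{
  double sum = 0.0;
  size_t i;
#ifdef __cilk
#pragma simd reduction (+: sum)
#endif
  for (i = 0; i < n; i++)
    sum += a[i];
  return sum;
}
```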

   The current implementation of the runtime library has been tested on x86 
(both 32 and 64 bit) architectures. In theory, the runtime library should not 
be too difficult for you to port to other architectures. However, be aware that 
access to shared variables currently assumes sequential consistency, so 
architectures that use a different memory model may require you to insert 
additional memory barriers.

   These language extensions provide a simple, well-structured, and powerful 
model for parallel programming. Intel hopes that you will find these extensions 
to be a useful and significant enhancement to the GCC C and C++ compiler. In 
this initial release, the array notations and elemental functions present in 
the full Int

Re: Announcing the Port of Intel(r) Cilk (TM) Plus into GCC

2011-08-15 Thread H.J. Lu
Hi,

I checked this into cilkplus branch.  Jason, can you also mirror
branches/cilkplus in GCC git mirror?

Thanks.

H.J.


Re: A question about sched_analyze_insn in sched-deps.c

2011-08-15 Thread Ayal Zaks
>AFAIK SMS will not do speculative memory access.

Right, SMS does no speculative memory access. Though that might not be
a bad idea...
Ayal.


2011/8/11 Revital Eres 
>
> Hello,
>
> >> I would appreciate an explanation of the following piece of code in
> >> the sched_analyze_insn function (sched-deps.c): when handling a jump
> >> instruction, dependence edges are created between the jump instruction
> >> and memory writes and volatile reads, and I'm not quite sure why.
> >
> > Jump instructions can be conditional.  Note the check for whether the
> > next instruction is a barrier.
>
> Thanks for the answer. I'm asking that in the context of SMS --- I'm
> not sure if this dependence is needed when SMS is applied. AFAIK SMS
> will not do speculative memory access.  If that's indeed the case I'll
> submit the following patch.
>
> Thanks,
> Revital
>
> Index: sched-deps.c
> ===
> --- sched-deps.c        (revision 177556)
> +++ sched-deps.c        (working copy)
> @@ -2777,32 +2777,36 @@ sched_analyze_insn (struct deps_desc *de
>             }
>
>          /* All memory writes and volatile reads must happen before the
> -            jump.  Non-volatile reads must happen before the jump iff
> -            the result is needed by the above register used mask.  */
> +            jump unless the analysis is done for the SMS pass.
> +            Non-volatile reads must happen before the jump iff the
> +            result is needed by the above register used mask.  */
>
> -         pending = deps->pending_write_insns;
> -         pending_mem = deps->pending_write_mems;
> -         while (pending)
> +         if (common_sched_info->sched_pass_id != SCHED_SMS_PASS)
>            {
> -             if (! sched_insns_conditions_mutex_p (insn, XEXP (pending, 0)))
> -               add_dependence (insn, XEXP (pending, 0), REG_DEP_OUTPUT);
> -             pending = XEXP (pending, 1);
> -             pending_mem = XEXP (pending_mem, 1);
> -           }
> -
> -         pending = deps->pending_read_insns;
> -         pending_mem = deps->pending_read_mems;
> -         while (pending)
> -           {
> -             if (MEM_VOLATILE_P (XEXP (pending_mem, 0))
> -                 && ! sched_insns_conditions_mutex_p (insn, XEXP (pending, 0)))
> -               add_dependence (insn, XEXP (pending, 0), REG_DEP_OUTPUT);
> -             pending = XEXP (pending, 1);
> -             pending_mem = XEXP (pending_mem, 1);
> +             pending = deps->pending_write_insns;
> +             pending_mem = deps->pending_write_mems;
> +             while (pending)
> +               {
> +                 if (! sched_insns_conditions_mutex_p (insn, XEXP (pending, 0)))
> +                   add_dependence (insn, XEXP (pending, 0), REG_DEP_OUTPUT);
> +                 pending = XEXP (pending, 1);
> +                 pending_mem = XEXP (pending_mem, 1);
> +               }
> +
> +             pending = deps->pending_read_insns;
> +             pending_mem = deps->pending_read_mems;
> +             while (pending)
> +               {
> +                 if (MEM_VOLATILE_P (XEXP (pending_mem, 0))
> +                     && ! sched_insns_conditions_mutex_p (insn, XEXP (pending, 0)))
> +                   add_dependence (insn, XEXP (pending, 0), REG_DEP_OUTPUT);
> +                 pending = XEXP (pending, 1);
> +                 pending_mem = XEXP (pending_mem, 1);
> +               }
> +
> +             add_dependence_list (insn, deps->last_pending_memory_flush, 1,
> +                                  REG_DEP_ANTI);
>            }
> -
> -         add_dependence_list (insn, deps->last_pending_memory_flush, 1,
> -                              REG_DEP_ANTI);
>        }
>     }


Re: ggc_alloc_rtvec_sized allocates spaces more than necessary?

2011-08-15 Thread Laurynas Biveinis
> On Mon, Aug 15, 2011 at 2:16 PM, 王亮  wrote:
>> The size it allocates is
>>
>>  (sizeof (struct rtvec_def) + ((NELT) - 1)) * sizeof (rtx)

>> Originally, the allocated size is
>>
>>  sizeof (struct rtvec_def) + ((NELT) - 1) * sizeof (rtx)

Yes, this is correct, good catch.

>>  (sizeof (struct rtvec_def) + sizeof (rtx) - 1) / sizeof (rtx) +
>> ((NELT) - 1)   // (3)

Due to the way those macros expand, replacing the first arg with "1"
and the second one with the straightforward "sizeof (struct
rtvec_def) + ((NELT) - 1) * sizeof (rtx)" will work right now and will
be easier to read than the division. Liang, would you submit such a patch?
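
A standalone sketch of that suggestion. The allocator below is a mock that only models the size computation described in this thread (first argument times second argument) and ignores the zone; mock_alloc_vec, last_request, and rtvec_alloc_sized are hypothetical names, not GCC's.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for the GCC types; the real definitions live in rtl.h.  */
typedef struct rtx_def *rtx;
struct rtvec_def { int num_elem; rtx elem[1]; };

/* Mock allocator: requests S * N bytes and records the request so
   it can be inspected.  The zone argument is ignored.  */
static size_t last_request;

static struct rtvec_def *mock_alloc_vec (size_t s, size_t n, void *zone)
{
  (void) zone;
  last_request = s * n;
  return calloc (1, last_request);
}

/* First argument 1, second argument the byte size from formula (2).  */
#define rtvec_alloc_sized(NELT)                                   \
  mock_alloc_vec (1,                                              \
                  sizeof (struct rtvec_def)                       \
                  + ((NELT) - 1) * sizeof (rtx),                  \
                  NULL)
```

With the first argument fixed at 1, the product is exactly the byte count of formula (2), so no rounding or division is needed.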

Thanks again,
-- 
Laurynas


Re: ggc_alloc_rtvec_sized allocates spaces more than necessary?

2011-08-15 Thread Laurynas Biveinis
2011/8/15 Andreas Schwab :
> I think it was meant to be this:
>
> #define ggc_alloc_rtvec_sized(NELT)                                     \
>  ggc_alloc_zone_rtvec_def (sizeof (struct rtvec_def)                   \
>                            + ((NELT) - 1) * sizeof (rtx),              \
>                            &rtl_zone)

Note that ggc_alloc_zone_rtvec_def takes three args, not two.

-- 
Laurynas