Hi,
On Fri, 15 Apr 2011, Jerry DeLisle wrote:
> > I'll make the DECL_EXPR conditional on the size being variable. As
> > Tobias already okayed the patch I'm planning to check in the slightly
> > modified variant as below, after a new round of testing.
>
> That's A-OK
r172524
Ciao,
Michael.
On 04/15/2011 07:28 AM, Michael Matz wrote:
--- snip ---
I'll make the DECL_EXPR conditional on the size being variable. As Tobias
already okayed the patch I'm planning to check in the slightly modified
variant as below, after a new round of testing.
That's A-OK
Thanks,
Jerry
Hi,
On Fri, 15 Apr 2011, Dominique Dhumieres wrote:
> Michael,
>
> > Yes, this is due to the DECL_EXPR statement which is rendered by the
> > dumper just the same as a normal decl. The testcase looks for exactly one
> > such decl, but with -fstack-arrays there are exactly two for each such
>
Michael,
> Yes, this is due to the DECL_EXPR statement which is rendered by the
> dumper just the same as a normal decl. The testcase looks for exactly one
> such decl, but with -fstack-arrays there are exactly two for each such
> array.
The testsuite is run without -fstack-arrays, so I don't
Hi,
On Thu, 14 Apr 2011, Dominique Dhumieres wrote:
> I forgot to mention that I have a variant of fatigue
> in which I have done the inlining manually along with a few other
> optimizations and the timing for it is
>
> [macbook] lin/test% gfc -Ofast fatigue_v8.f90
> [macbook] lin/test%
I forgot to mention that I have a variant of fatigue
in which I have done the inlining manually along with a few other
optimizations and the timing for it is
[macbook] lin/test% gfc -Ofast fatigue_v8.f90
[macbook] lin/test% time a.out > /dev/null
2.793u 0.002s 0:02.79 100.0% 0+0k 0+1io
Tobias Burnus wrote:
                                            no stack-arrays   with stack-arrays
+ -fwhole-program -flto:                    10.1s             8.9s
+ -fwhole-program -flto -finline-limit=600  4.8s              3.6s
I wonder whether the following is special to my system* or generally
true. I
Michael Matz wrote:
Try this patch. I've verified that capacita and nf work with it and
-march=native -ffast-math -funroll-loops -fstack-arrays -O3 . In fact all
of polyhedron works for me on these flags. (I've set a ulimit -s of
512MB, but I don't know if such a large amount is required).
T
> See http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01087.html . ...
With this patch on top of revision 172429 (and the second patch for
-fstack-arrays) I now get
Date & Time : 14 Apr 2011 20:48:24
Test Name
Hi,
On Thu, 14 Apr 2011, Michael Matz wrote:
>                           no stack-arrays   with stack-arrays
> no additional options:    10.2s             8.8s
> + -fwhole-program:        7.1s              8.8s
> + -fwhole-program -flto:  10.1s
Hi,
On Tue, 12 Apr 2011, Dominique Dhumieres wrote:
> > The resulting speed up for nf.f90 is rather remarkable. What specific
> > feature of the Fortran code leads to the 30=>15s speedup?
>
> I think it is the automatic array in the subroutine trisolve. Note that the
> speedup is rather 27->19s and may be d
> I have opened PR48590 for at least one issue that I see. ...
The fix for pr48590 (r172427) does not speedup fatigue.f90 compiled
with -fstack-arrays.
Dominique
Dear Richard and Dominique,
> VLAs and malloc based arrays may behave differently with respect to alias
> analysis (I'd have to look at some examples). All effects other than malloc
- hmmm, yes. I was forgetting what might happen with inlining. It's
not evident, is it?
> overhead I would attr
On Wed, Apr 13, 2011 at 12:38 PM, Richard Guenther
wrote:
> On Wed, Apr 13, 2011 at 11:42 AM, Paul Richard Thomas
> wrote:
>> Dear Dominique,
>>
>>> I think it is the automatic array in the subroutine trisolve. Note that the
>>> speedup is rather 27->19s and may be darwin specific (slow malloc).
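To illustrate the malloc-versus-stack point discussed here, the following is a minimal C sketch, not code from the patch. It assumes that a C99 variable-length array is a fair analogy for a Fortran automatic array placed on the stack by -fstack-arrays, and that a malloc/free pair is a fair analogy for heap allocation of the same array. The function names sum_squares_heap and sum_squares_stack are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Heap variant: the scratch array costs one malloc/free pair per call. */
double sum_squares_heap(const double *x, size_t n)
{
    double sum = 0.0;

    if (n == 0)
        return 0.0;

    double *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1.0;    /* illustrative error value for allocation failure */

    for (size_t i = 0; i < n; i++)
        tmp[i] = x[i] * x[i];
    for (size_t i = 0; i < n; i++)
        sum += tmp[i];

    free(tmp);
    return sum;
}

/* Stack variant: the scratch array is a C99 variable-length array, so
   allocation is only a stack-pointer adjustment. A large n can overflow
   the stack, which is why the stack size limit matters. */
double sum_squares_stack(const double *x, size_t n)
{
    double sum = 0.0;

    if (n == 0)         /* a zero-length VLA is undefined behaviour */
        return 0.0;

    double tmp[n];

    for (size_t i = 0; i < n; i++)
        tmp[i] = x[i] * x[i];
    for (size_t i = 0; i < n; i++)
        sum += tmp[i];

    return sum;
}
```

Both functions compute the same result. When such a routine is called many times in a hot loop, the heap variant pays the allocator cost on every call, which is one plausible source of the kind of difference reported for a slow malloc.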
On Wed, Apr 13, 2011 at 11:42 AM, Paul Richard Thomas
wrote:
> Dear Dominique,
>
>> I think it is the automatic array in the subroutine trisolve. Note that the
>> speedup is rather 27->19s and may be darwin specific (slow malloc).
>
> I saw a speed-up of similar order on FC9/x86_64.
>
> I strongly
Dear Dominique,
> I think it is the automatic array in the subroutine trisolve. Note that the
> speedup is rather 27->19s and may be darwin specific (slow malloc).
I saw a speed-up of similar order on FC9/x86_64.
I strongly doubt that it is anything to do with the automatic array -
if it is ther
On Apr 12 2011, Michael Matz wrote:
On Mon, 11 Apr 2011, Steven Bosscher wrote:
Isn't there a way to put a maximum on the size of the arrays on stack,
e.g. -fstack-arrays-limit= or something like that?
Not without generating contorted code. The problem is that these arrays
are variable leng
Hello,
On Mon, 11 Apr 2011, Steven Bosscher wrote:
> > Try this patch. I've verified that capacita and nf work with it and
> > -march=native -ffast-math -funroll-loops -fstack-arrays -O3 . In fact all
> > of polyhedron works for me on these flags. (I've set a ulimit -s of
> > 512MB, but I don'
> The resulting speed up for nf.f90 is rather remarkable. What specific
> feature of the Fortran code leads to the 30=>15s speedup?
I think it is the automatic array in the subroutine trisolve. Note that the
speedup is rather 27->19s and may be darwin specific (slow malloc).
Note also that -fstack-arrays pre
Dear Michael,
Thanks for updating the patch. I am afraid that my attention to
gfortran is somewhat limited at present. However, I see that
Dominique has verified your patch and that all is well.
The resulting speed up for nf.f90 is rather remarkable. What specific
feature of the fortran leads
With the new patch (+the fix for the uninitialized tmp), the polyhedron
tests pass:
Date & Time : 11 Apr 2011 21:20:59
Test Name : pbharness
Compile Command : gfc %n.f90 -Ofast -funroll-loops -ftree-loop-lin
On 04/11/2011 06:04 PM, Michael Matz wrote:
===
*** trans-array.c (revision 172206)
--- trans-array.c (working copy)
*** gfc_trans_auto_array_allocation (tree de
[...]
! gfc_add_init_cleanup (block, inittr
On Apr 11 2011, H.J. Lu wrote:
Is it possible to raise the stack limit when -fstack-arrays is used?
In some systems, yes. In others (including most Unices), not without
evil contortions that will assuredly break something else (like
debuggers and some MPIs).
Regards,
Nick Maclaren.
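As a rough illustration of what "raising the stack limit" can look like on a POSIX system, here is a minimal C sketch using getrlimit and setrlimit with RLIMIT_STACK. It is not part of the patch. The helper name raise_stack_limit is hypothetical, and whether a raised soft limit takes effect for the already running main thread, rather than only for child processes, is system dependent.

```c
#include <sys/resource.h>

/* Raise the soft stack limit to at least `wanted` bytes, capped by the
   hard limit. Returns 0 on success, -1 on error (errno is set). */
int raise_stack_limit(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;

    /* Nothing to do if the soft limit is unlimited or already large enough. */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= wanted)
        return 0;

    /* An unprivileged process cannot go beyond the hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;

    rl.rlim_cur = wanted;
    return setrlimit(RLIMIT_STACK, &rl);
}
```

A caller would invoke something like raise_stack_limit((rlim_t)512 * 1024 * 1024) early in main, or before spawning the program that needs the larger stack. On systems where the hard limit is fixed, the call can only raise the soft limit up to that ceiling.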
On Mon, Apr 11, 2011 at 9:04 AM, Michael Matz wrote:
> On Sat, 9 Apr 2011, Paul Richard Thomas wrote:
>
>> I find that both nf.f90 and capacita.f90 segfault at runtime for any
>> stack size.
>
> Try this patch. I've verified that capacita and nf work with it and
> -march=native -ffast-math -funro
I am trying the new patch. Updating failed with
../../work/gcc/fortran/trans-array.c: In function
'gfc_trans_auto_array_allocation':
../../work/gcc/fortran/trans-array.c:4881:24: error: 'tmp' may be used
uninitialized in this function [-Werror=uninitialized]
cc1: all warnings being treated as er
On Mon, Apr 11, 2011 at 6:04 PM, Michael Matz wrote:
> On Sat, 9 Apr 2011, Paul Richard Thomas wrote:
>
>> I find that both nf.f90 and capacita.f90 segfault at runtime for any
>> stack size.
>
> Try this patch. I've verified that capacita and nf work with it and
> -march=native -ffast-math -funro
On Sat, 9 Apr 2011, Paul Richard Thomas wrote:
> I find that both nf.f90 and capacita.f90 segfault at runtime for any
> stack size.
Try this patch. I've verified that capacita and nf work with it and
-march=native -ffast-math -funroll-loops -fstack-arrays -O3 . In fact all
of polyhedron work
Hi,
On Mon, 11 Apr 2011, Eric Botcazou wrote:
> > See? That's what I meant with having to use a large ulimit for stack
> > size. Usually the symptom of a stack overflow is a simple segfault. What
> > ulimit -s have you used for your capacita tests?
>
> Compiling with -fstack-check should give the seg
> What ulimit -s have you used for your capacita tests?
On *-apple-darwin* the stacksize is limited to (kbytes, -s) 65532
and it is hard coded. It is my (very limited) understanding that
this limit can be bypassed by using something such as
-Wl,-stack_size,0xf000 (for 1Gbytes in 64 bit mode -i
> See? That's what I meant with having to use a large ulimit for stack
> size. Usually the symptom of a stack overflow is a simple segfault. What
> ulimit -s have you used for your capacita tests?
Compiling with -fstack-check should give the segfault reliably.
--
Eric Botcazou
Hi,
On Mon, Apr 11, 2011 at 01:49:16PM +0200, Michael Matz wrote:
> Hi,
>
> On Sun, 10 Apr 2011, Dominique Dhumieres wrote:
>
> > > I find that both nf.f90 and capacita.f90 segfault at runtime for any
> > > stack size.
> >
> > On x86_64-apple-darwin10, nf.f90 "works". However if I run it throug
Hi,
On Sun, 10 Apr 2011, Dominique Dhumieres wrote:
> > I find that both nf.f90 and capacita.f90 segfault at runtime for any stack
> > size.
>
> On x86_64-apple-darwin10, nf.f90 "works". However if I run it through
> valgrind I get
>
> ==64815== Memcheck, a memory error detector
> ==64815== Co
> I find that both nf.f90 and capacita.f90 segfault at runtime for any stack
> size.
On x86_64-apple-darwin10, nf.f90 "works". However if I run it through
valgrind I get
==64815== Memcheck, a memory error detector
==64815== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==64815=
Dear Dominique and Michael,
I find that both nf.f90 and capacita.f90 segfault at runtime for any stack size.
Cheers
Paul
On Sat, Apr 9, 2011 at 12:08 PM, Dominique Dhumieres wrote:
> Michael,
>
> I have applied your patch on top of revision 172217 on
> x86_64-apple-darwin10.7.0.
> So far I ha
Michael,
I have applied your patch on top of revision 172217 on
x86_64-apple-darwin10.7.0.
So far I have run only limited tests on the polyhedron test suite.
The test nf.f90 (containing an automatic array) executes in less than 20s,
compared to ~28s without the patch. However capacita.f90 is miscomp
On Apr 9 2011, Magnus Fromreide wrote:
There is actually a much better approach, which was done in Algol 68
and seems now to be done only in Ada. As far as the compiler
implementation goes, it is a trivial variation on what you have done,
but there is a little more work in the run-time system.
> There is actually a much better approach, which was done in Algol 68
> and seems now to be done only in Ada. As far as the compiler
> implementation goes, it is a trivial variation on what you have done,
> but there is a little more work in the run-time system.
Obviously this depends on the com
On Sat, 2011-04-09 at 09:21 +0100, N.M. Maclaren wrote:
> On Apr 8 2011, Michael Matz wrote:
> >
> >It adds a new option -fstack-arrays which makes the frontend put
> >all local arrays on stack memory. ...
>
> Excellent!
>
> >I haven't rechecked performance now, but four months ago this was the
On Apr 8 2011, Michael Matz wrote:
It adds a new option -fstack-arrays which makes the frontend put
all local arrays on stack memory. ...
Excellent!
I haven't rechecked performance now, but four months ago this was the
result for the fortran benchmarks in cpu2006:
There is actually a muc