Most of the time is spent in this function...
  void
  dlasr(
    str_cref side,
    str_cref pivot,
    str_cref direct,
    int const& m,
    int const& n,
    arr_cref<double> c,
    arr_cref<double> s,
    arr_ref<double, 2> a,
    int const& lda)
in this loop:
  FEM_DOSTEP(j, n - 1, 1, -1) {
    ctemp = c(j);
    stemp = s(j);
    if ((ctemp != one) || (stemp != zero)) {
      FEM_DO(i, 1, m) {
        temp = a(i, j + 1);
        a(i, j + 1) = ctemp * temp - stemp * a(i, j);
        a(i, j) = stemp * temp + ctemp * a(i, j);
      }
    }
  }
a(i, j) is implemented as
  T* elems_; // member

  T const&
  operator()(
    ssize_t i1,
    ssize_t i2) const
  {
    return elems_[dims_.index_1d(i1, i2)];
  }
with
  ssize_t all[Ndims]; // member
  ssize_t origin[Ndims]; // member

  size_t
  index_1d(
    ssize_t i1,
    ssize_t i2) const
  {
    return
        (i2 - origin[1]) * all[0]
      + (i1 - origin[0]);
  }
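To illustrate the mapping, here is a minimal stand-alone sketch of the same column-major arithmetic. The struct name dims2_sketch is hypothetical, and std::ptrdiff_t stands in for ssize_t so the snippet does not depend on POSIX headers; it is not the actual class from the converted sources.

```cpp
#include <cstddef>

// Hypothetical stand-in for the dims_ member; only the 2-D case is shown.
struct dims2_sketch
{
  std::ptrdiff_t all[2];    // extent of each dimension
  std::ptrdiff_t origin[2]; // first valid index per dimension (1 for Fortran)

  // Column-major (Fortran) layout: the first index varies fastest.
  std::size_t
  index_1d(
    std::ptrdiff_t i1,
    std::ptrdiff_t i2) const
  {
    return static_cast<std::size_t>(
        (i2 - origin[1]) * all[0]
      + (i1 - origin[0]));
  }
};
```

With all = {3, 4} and origin = {1, 1}, consecutive values of i1 map to adjacent offsets, so the inner loop over i walks each column with unit stride.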
The array pointer is buried as the elems_ member in the arr_ref<> class template.
How can I apply __restrict in this case?
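To make the question concrete, here is a minimal sketch of the inner loop with the two column pointers pulled out and marked __restrict. The function name rotate_columns and the raw-pointer interface are hypothetical and not part of the converted sources. It assumes columns j and j+1 do not overlap (lda >= m), and whether it changes the generated code would need to be measured.

```cpp
#include <cstddef>

// Hypothetical helper: applies one plane rotation to two adjacent columns.
// col_j and col_j1 point at the first element of columns j and j+1 of a
// column-major array. __restrict is a compiler extension (GCC, Clang, MSVC)
// by which the caller promises that the two ranges do not overlap.
inline void
rotate_columns(
  double* __restrict col_j,
  double* __restrict col_j1,
  std::ptrdiff_t m,
  double ctemp,
  double stemp)
{
  for (std::ptrdiff_t i = 0; i < m; i++) {
    double temp = col_j1[i];
    col_j1[i] = ctemp * temp - stemp * col_j[i];
    col_j[i] = stemp * temp + ctemp * col_j[i];
  }
}
```

The open part is how to get the same effect while still going through arr_ref<>, where the pointer is not a function argument.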
Ralf
----- Original Message ----
From: Andrew Pinski <[email protected]>
To: Ralf W. Grosse-Kunstleve <[email protected]>
Cc: [email protected]
Sent: Tue, August 10, 2010 8:47:18 PM
Subject: Re: food for optimizer developers
On Tue, Aug 10, 2010 at 6:51 PM, Ralf W. Grosse-Kunstleve
<[email protected]> wrote:
> I wrote a Fortran to C++ conversion program that I used to convert selected
> LAPACK sources. Comparing runtimes with different compilers I get:
>
>                     absolute  relative
> ifort 11.1.072      1.790s    1.00
> gfortran 4.4.4      2.470s    1.38
> g++ 4.4.4           2.922s    1.63
I wonder if adding __restrict to some of the arguments of the
functions will help. Fortran aliasing is so different from C
aliasing.
-- Pinski