On Jun 19, 2013, at 12:43 , Berend Hasselman wrote:
>
> On 19-06-2013, at 10:24, peter dalgaard <[email protected]> wrote:
>
>>
>> On Jun 18, 2013, at 21:49 , peter dalgaard wrote:
>>
>>>
>>> On Jun 18, 2013, at 21:23 , Berend Hasselman wrote:
>>>
>>>>
>>>> So it seems that the blocked algorithm is the cause of the error and that
>>>> using the (possibly slow) unblocked algorithm gives the correct result.
>>>
>>> Thanks Berend,
>>>
>>> The finger seems to be pointing to the internal libRlapack/libRblas. The
>>> apparent pattern is that things work when R is linked against external
>>> libraries and not when the internal ones are used. So it could be time to
>>> start looking for differences between R's lapack module and the original
>>> LAPACK code.
>>
>> Now done and no (relevant) differences found. Hmmm....
>>
>> There could be compiler issues. It could also be that the internal LAPACK
>> uses a newer version than the external libs and a bug was introduced in
>> between them.
>>
>> One option could be to bypass R by coding Robin's example in pure Fortran
>> and see whether issues persist when linked against libRlapack, vecLib, and
>> the relevant subset of current LAPACK sources. (The bogus 1.0000 eigenvalue
>> in the 33x33 variant of Robin's example should make it rather easy to spot
>> whether things work or not.)
>
>
> I have made that pure Fortran program.
> I have run it on OS X 10.8.4 with
>
> - libRblas libRlapack
> - my private refblas and Lapack
> - framework Accelerate
> - downloaded zheev.f + dependencies + libRblas
> - downloaded zheev.f + dependencies + my refblas
>
> The Fortran program and the shell script to do the compiling and running and
> the output file follow at the end of this mail.
> The shell scripts runs otool -L on each executable to make sure it is linked
> to the intended libraries.
> The fortran program runs zheev with the minimal lwork and the optimal lwork.
>
> Summary of the output:
>
> - libRblas libRlapack ==> Bad
> - my private refblas and Lapack ==> OK
> - framework Accelerate ==> OK
> - downloaded zheev.f + dependencies + libRblas ==> Bad
> - downloaded zheev.f + dependencies + my refblas ==> OK
>
> It looks like libRblas is the cause of the problem.
> I haven't done any further investigation of the matter.
>
> Berend
>
> Output is:
>
> <output>
> Running hb with Rlapack and Rblas
> ./hbRlapack:
>
> /Library/Frameworks/R.framework/Versions/3.0/Resources/lib/libRblas.dylib
> (compatibility version 0.0.0, current version 0.0.0)
>
> /Library/Frameworks/R.framework/Versions/3.0/Resources/lib/libRlapack.dylib
> (compatibility version 3.0.0, current version 3.0.1)
> /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current
> version 169.3.0)
> lwork= 199
> info= 0
> min eig value= -4.433595556845234E-008
> lwork= 3300
> info= 0
> min eig value= -0.359347003948635
>
> Running hb with Haslapack and Hasblas (Lapack 3.4.2)
> ./hbHaslapack:
> /Users/berendhasselman/Library/lib/x86_64/librefblas.dylib
> (compatibility version 0.0.0, current version 0.0.0)
> /Users/berendhasselman/Library/lib/x86_64/liblapack.dylib
> (compatibility version 0.0.0, current version 0.0.0)
> /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current
> version 169.3.0)
> lwork= 199
> info= 0
> min eig value= -4.433595556845234E-008
> lwork= 3300
> info= 0
> min eig value= -4.433595559424842E-008
>
> Running hb with -framework Accelerate
> ./hbveclib:
> /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate
> (compatibility version 1.0.0, current version 4.0.0)
> /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current
> version 169.3.0)
> lwork= 199
> info= 0
> min eig value= -4.433595568797253E-008
> lwork= 3300
> info= 0
> min eig value= -4.433595550683581E-008
>
> Compile downloaded zheev + dependencies
> /Users/berendhasselman/Documents/Programming/R/problems/bug-error/eigbug
> Running hb with downloaded zheev and Rblas
> ./hbzheev:
>
> /Library/Frameworks/R.framework/Versions/3.0/Resources/lib/libRblas.dylib
> (compatibility version 0.0.0, current version 0.0.0)
> /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current
> version 169.3.0)
> lwork= 199
> info= 0
> min eig value= -4.433595556845234E-008
> lwork= 3300
> info= 0
> min eig value= -0.359347003948635
>
> Running hb with downloaded zheev and Hasrefblas
> ./hbzheev:
> /Users/berendhasselman/Library/lib/x86_64/librefblas.dylib
> (compatibility version 0.0.0, current version 0.0.0)
> /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current
> version 169.3.0)
> lwork= 199
> info= 0
> min eig value= -4.433595556845234E-008
> lwork= 3300
> info= 0
> min eig value= -4.433595559424842E-008
> </output>
>
> Fortran program:
> <fortran>
> program test
>
> integer n
> parameter(n=100)
> complex*16 A(n,n)
> double precision w(n), rwork(3*n-2)
>
> integer lwork, info
>
> complex*16 work, wtmp(1)
> allocatable work(:)
>
> call genmat(A,n)
> lwork = 2*n-1
> allocate(work(lwork))
> print *, "lwork=",lwork
> call zheev("N","L",n,A,n,w,work,lwork,rwork,info)
> print *, "info=",info
> print *, "min eig value=",minval(w)
> deallocate(work)
>
> call genmat(A,n)
> lwork = -1
> call zheev("N","L",n,A,n,w,wtmp,lwork,rwork,info)
> lwork = real(wtmp(1))
> allocate(work(lwork))
> print *, "lwork=",lwork
> call zheev("N","L",n,A,n,w,work,lwork,rwork,info)
> print *, "info=",info
> print *, "min eig value=",minval(w)
>
> stop
> end
>
> subroutine genmat(A,n)
> integer n
> complex*16 A(n,*)
> integer i, j
> double precision tmp
>
> do i=1,n
> do j=1,n
> tmp = exp(-0.1D0 * (i-j)**2)
> A(i,j) = cmplx(tmp,0.0D0)
> enddo
> enddo
>
> return
> end
> </fortran>
>
>
> Shell script to do the compiling and running:
> <script>
> gfortran -c hankinbug.f -o hankinbug.o
> gfortran hankinbug.o -L/Library/Frameworks/R.framework/Libraries -lRblas
> -lRlapack -o hbRlapack
> echo "Running hb with Rlapack and Rblas"
> otool -L ./hbRlapack
> ./hbRlapack
> echo ""
>
> gfortran hankinbug.o -L${HOME}/Library/lib/x86_64 -lrefblas -llapack -o
> hbHaslapack
> echo "Running hb with Haslapack and Hasblas (Lapack 3.4.2)"
> otool -L ./hbHaslapack
> ./hbHaslapack
> echo ""
>
> gfortran hankinbug.o -framework Accelerate -o hbveclib
> echo "Running hb with -framework Accelerate"
> otool -L ./hbveclib
> ./hbveclib
> echo ""
>
> echo "Compile downloaded zheev + dependencies"
> cd zheev
> gfortran -c *.f
> cd -
> gfortran hankinbug.o zheev/*.o -L/Library/Frameworks/R.framework/Libraries
> -lRblas -o hbzheev
> echo "Running hb with downloaded zheev and Rblas"
> otool -L ./hbzheev
> ./hbzheev
> echo ""
>
> gfortran hankinbug.o zheev/*.o -L${HOME}/Library/lib/x86_64 -lrefblas -o
> hbzheev
> echo "Running hb with downloaded zheev and Hasrefblas"
> otool -L ./hbzheev
> ./hbzheev
> echo ""
> </script>
>
Thanks. I think I have it nailed down now. The culprit was indeed in our
reference BLAS (I had only checked the LAPACK code), cmplxblas.f to be
specific. Revision 53001 had a number of IF statements being commented out, but
two of the changes looked like this:
@@ -1561,7 +1561,7 @@
C( J, J ) = DBLE( C( J, J ) )
END IF
DO 170 L = 1, K
- IF( ( A( J, L ).NE.ZERO ) .OR. ( B( J, L ).NE.ZERO ) )
+c IF( ( A( J, L ).NE.ZERO ) .OR. ( B( J, L ).NE.ZERO ) )
$ THEN
TEMP1 = ALPHA*DCONJG( B( J, L ) )
TEMP2 = DCONJG( ALPHA*A( J, L ) )
Notice that the continuation line was NOT commented out. So FORTRAN, being what
it is, continues the line before the comment and parses it as
DO 170 L = 1, KTHEN
with KTHEN uninitialized! and things go downhill from there.
(The uninitialized variable was actually hinted at in PR14964 and the fact that
I could get one of my builds to segfault also helped.)
--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: [email protected] Priv: [email protected]
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel