Thank you, Martin. Yes, on my computer (48Gb memory) it also gives a message "caught segfault" as you got. So it is because the matrix size is too large.
I apologize for the offence I've caused. 2011/8/4 Martin Maechler <maech...@stat.math.ethz.ch>: >>>>>> C6H5NO2 <c6h5...@gmail.com> >>>>>> on Thu, 4 Aug 2011 16:03:54 +0800 writes: > > > Thank you very much, Josh! > > As you suggested, I will contact the developers of "Matrix". > > > PS, C6 are just initial characters of my email account :-) > > > Best wishes, > > C6 > > well, as the posting guide (http://www.R-project.org/posting-guide.html) > says, this is regarded as impolite by many, > and if I wasn't one of the Matrix package authors, > I would not spend time helping 'C6' either. > > > 2011/8/4 Joshua Wiley <jwiley.ps...@gmail.com>: > >> Hi C6 (were C1 - 5 already taken in your family?), > >> > >> I downloaded your data and can replicate your problem. R > >> ceases responding and terminates. This does not occur with all > >> uses of qr on a dgCMatrix object. I know nothing about sparse > >> matrices, but if you believe this should not be occurring, you > >> should contact the package maintainers. Here is my > >> sessionInfo() (FYI, it would probably be helpful to report > >> yours also in case the issue is version dependent): > >> > >> R Under development (unstable) (2011-07-30 r56564) Platform: > >> x86_64-pc-mingw32/x64 (64-bit) > >> > >> locale: [1] LC_COLLATE=English_United States.1252 [2] > >> LC_CTYPE=English_United States.1252 [3] > >> LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] > >> LC_TIME=English_United States.1252 > >> > >> attached base packages: [1] stats graphics grDevices utils > >> datasets methods base > >> > >> other attached packages: [1] Matrix_0.999375-50 lattice_0.19-30 > >> > >> loaded via a namespace (and not attached): [1] grid_2.14.0 > >> tools_2.14.0 > >> > >> Cheers, > >> > >> Josh > >> > >> On Wed, Aug 3, 2011 at 4:26 PM, C6H5NO2 <c6h5...@gmail.com> > >> wrote: > >>> Hello R users, > >>> > >>> I am trying to give the QR decomposition for a large sparse > >>> matrix in the format of dgCMatrix. When I run qr function for > >>> this matrix, the R session simply stops and exits to the > >>> shell. The matrix is of size 108595x108595, and it has > >>> 4866885 non-zeros. I did the experiment on windows 7 and linux > >>> mint 11 (both 64 bit), and the results are the same. > >>> > >>> I have uploaded my data file to > >>> http://ifile.it/elf2p6z/A.RData . The file is 10.681 MB and I > >>> hope someone could kindly download it. The code to see my > >>> problem is: > > >>> library(Matrix) > >>> load("A.RData") > >>> B <- qr(A) > > > >>> Best wishes, C6 > > And what's the size of RAM your two computers have ?? > The answer is of quite some importance. > > Short answer: If you have a large very sparse matrix, > you don't know if the QR decomposition of that matrix is also > very sparse... and if it ain't it will blow up memory, > and that's what I'm pretty sure happened with you. > > What I don't see is why R "simply stops" for you and does not > through a an error message about insufficient memory. > As I show below, I do get a seg.fault > --- which may be considered a bug --- > *BUT* I do get the message about memory problems. > Did you really *not* get any such message? > Is it because you've used a GUI that hides such valuable > information from the user? > > Here's the more detailed reason / analysis about why the above > "kills R". This is commented R code, > you can cut paste after you've got 'A' : > > str(A) > ## Formal class 'dgCMatrix' [package "Matrix"] with 6 slots > ## ..@ i : int [1:4866885] 0 1 2 16 32 33 2392 2417 0 1 ... > ## ..@ p : int [1:108596] 0 8 21 35 44 51 59 63 69 78 ... > ## ..@ Dim : int [1:2] 108595 108595 > ## ..@ Dimnames:List of 2 > ## .. ..$ : NULL > ## .. ..$ : NULL > ## ..@ x : num [1:4866885] 140.03 14.79 14.79 1.78 1.78 ... > ## ..@ factors : list() > > system.time(# the following is still not as fast as it could be): > isSymmetric(A) # yes ! > )# 1.13 {the *2nd* time; 1.9 the 1st time !} > > ## First work with a submatrix: > n <- 10000 > A1 <- A[1:n, 1:n] > system.time( > qr1 <- qr(A1))# on cmath-8, machine with 48 GB RAM memory > ## user system elapsed > ## 59.884 0.316 60.240 !! > > str(qr1) > ## Formal class 'sparseQR' [package "Matrix"] with 6 slots > ## ..@ V :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots > ## .. .. ..@ i : int [1:7692948] 0 1 2 3 4 3 4 5 4 5 ... > ## .. .. ..@ p : int [1:10001] 0 1 2 5 8 10 11 12 18 26 ... > ## .. .. ..@ Dim : int [1:2] 10000 10000 > ## .. .. ..@ Dimnames:List of 2 > ## .. .. .. ..$ : NULL > ## .. .. .. ..$ : NULL > ## .. .. ..@ x : num [1:7692948] 1 1 -3.71 8.68 -8.68 ... > ## .. .. ..@ factors : list() > ## ..@ beta: num [1:10000] 0 0 0.01216 0.00743 0.01758 ... > ## ..@ p : int [1:10000] 61 62 63 64 94 80 161 162 163 164 ... > ## ..@ R :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots > ## .. .. ..@ i : int [1:13581659] 0 1 2 2 3 3 4 2 3 4 ... > ## .. .. ..@ p : int [1:10001] 0 1 2 3 5 7 11 12 13 15 ... > ## .. .. ..@ Dim : int [1:2] 10000 10000 > ## .. .. ..@ Dimnames:List of 2 > ## .. .. .. ..$ : NULL > ## .. .. .. ..$ : NULL > ## .. .. ..@ x : num [1:13581659] 370.37 3.47 22.14 12.48 17.12 ... > ## .. .. ..@ factors : list() > ## ..@ q : int [1:10000] 61 62 63 64 80 94 161 162 163 164 ... > ## ..@ Dim : int [1:2] 10000 10000 > > object.size(A1) > ## 4352184 bytes > object.size(qr1) > ## 255539456 bytes, i.e. 255 MB > c(object.size(qr1) / object.size(A1)) ## 58.715 > c(object.size(A) / object.size(A1)) ## 13.52 > ##--> "predicted size" of qr(A): > c(object.size(A) / object.size(A1))*object.size(qr1) > ## ~ 3 G Bytes > > n <- 20000 > A2 <- A[1:n, 1:n] > system.time( > qr2 <- qr(A2))# on cmath-8, machine with 48 GB RAM memory > ## user system elapsed > ## 1024.068 2.850 1027.488 -- 17 minutes > > object.size(A2) > ## 8504464 bytes > object.size(qr2) > ## 1432'809992 bytes, i.e. 1432.81 MBytes > c(object.size(qr2) / object.size(A2)) ## 168.4774 > ##--> "predicted size" of qr(A): > c(object.size(A) / object.size(A2) *object.size(qr2)) > ## 9912'944757 == 9912.944757 MBytes ~= 10 GBytes --- this will not fit! > > ## Ok: one step further > n <- 30000 > A3 <- A[1:n, 1:n] > system.time( > qr3 <- qr(A3))# on cmath-8, machine with 48 GB RAM memory > ## user system elapsed > ## 3384.183 32.234 3418.392 -- almost one hour ! > > object.size(A3) > ## 11'335112 bytes > object.size(qr3) > ## 3059'252216 bytes, i.e. 3059 MBytes > c(object.size(qr3) / object.size(A3)) ## 269.9 > ##--> "predicted size" of qr(A): > c(object.size(A) / object.size(A3) *object.size(qr3)) > ## 1.588e+10 --- ~ 15 GB -- this is *MORE* than an R object can contain: > .Machine$integer.max > ## 2147483647 = 2.147'483'647 e9 > > system.time(ch1 <- chol(A1)) > ## CHOLMOD warning: > ## Error in .local(x, ...) : CHOLMOD factorization was unsuccessful > ## In addition: Warning message: > ## In .local(x, ...) : > ## Cholmod warning 'not positive definite' at > file:../Cholesky/t_cholmod_rowfac.c, > > system.time(lu1 <- lu(A1)) > ## Error ... : cs_lu(A) failed: near-singular A (or out of memory) > > ##--- Ok, now try the full thing, and see if "R dies without a word" > ## or if it at least says something before death : > system.time( > qrA <- qr(A) > ) > ## > ## *** caught segfault *** > ## address 0x7f3f5f48cf70, cause 'memory not mapped' > ## /u/maechler/bin/R_arg: line 137: 15063 Segmentation fault $exe $@ > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.