Hi all,

I have a large sparse matrix, call it P:

```
> str(P)
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:7868093] 4221 6098 8780 10313 11102 14243 20570 22145 24468 24977 ...
  ..@ p       : int [1:7357] 0 0 269 388 692 2434 3662 4179 4205 4256 ...
  ..@ Dim     : int [1:2] 1303967 7356
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x       : num [1:7868093] 1 1 1 1 1 1 1 1 1 1 ...
  ..@ factors : list()
```

I'd like to row-normalize it (say, with the L2 norm). The straightforward approach would be something like:

```
> row_normalized_P <- P / sqrt(rowSums(P^2))
```

But this causes a memory allocation error, since it appears the `rowSums` result is being recycled (appropriately) into a _dense_ matrix with dimensions equal to `dim(P)`.

Given that P is known to be sparse (or at the very least is stored in sparse format), does anyone know of a non-iterative way to compute the `row_normalized_P` shown above? The resulting matrix will be exactly as sparse as P itself, and I'd like to avoid ever allocating a dense matrix (apart from the `rowSums` vector) during normalization.

The only semi-efficient workaround I've found is to `apply` across rows (more accurately, across blocks of rows coerced into dense sub-matrices of P), but I'd like to remove the looping logic from my codebase if I can. I'm wondering whether there's a built-in in the Matrix package that I'm just not aware of that helps with this type of computation.

Cheers and thanks for any help!

-murat

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
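
For what it's worth, one possible way to do the row normalization asked about above without a dense intermediate is to rescale the stored values directly, using the fact that the `i` slot of a `dgCMatrix` holds the 0-based row index of every stored entry. The sketch below is only an illustration under that assumption: `row_normalize_l2` and `P_small` are hypothetical names (not part of the Matrix package), and the code has been tried only in spirit on a tiny random matrix, not on a matrix of the size shown in the question.

```r
library(Matrix)

# Hypothetical helper (not a Matrix built-in): scale each row of a dgCMatrix
# to unit L2 norm while touching only the stored (nonzero) entries.
row_normalize_l2 <- function(P) {
  stopifnot(methods::is(P, "dgCMatrix"))
  # Dense vector of length nrow(P); P^2 itself stays sparse.
  norms <- sqrt(Matrix::rowSums(P^2))
  # Guard against division by zero; all-zero rows are left unchanged.
  norms[norms == 0] <- 1
  # P@i is the 0-based row index of each stored value, so this divides
  # every stored value by the norm of the row it belongs to.
  P@x <- P@x / norms[P@i + 1L]
  P
}

# Small illustrative input (assumed data, unrelated to the real P)
set.seed(1)
P_small <- rsparsematrix(6, 4, density = 0.4)
N <- row_normalize_l2(P_small)
sqrt(Matrix::rowSums(N^2))  # row norms after scaling
```

The only dense object allocated here is the `norms` vector, and the result has the same sparsity pattern as the input. An alternative that avoids touching slots directly, and which I believe also stays sparse, is to left-multiply by a diagonal matrix, e.g. `Diagonal(x = 1 / norms) %*% P`, with the same zero-norm guard applied first.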