Good day,

Thanks to handy hints from Martin Morgan, I ran R under gdb and checked for numeric overflow. We pinpointed the cause:

    (gdb) info locals
    i = 0
    j = 10738
    m = 200000
    n = 50000
    ans = 0x55555b332790
    aa = 0x55555b3327c0

There is a line of C code in dgeMatrix.c:

    for (i = 0; i < m; i++) aa[i] += xx[i + j * m];

Here i, j and m are all int, so the index i + j * m is computed in int arithmetic. At j = 10738 the true value is 2147600000, which exceeds the largest 32-bit int (2147483647), so the index overflows:

    (lldb) print 0 + 10738 * 200000
    (int) $5 = -2147367296

So either the code should check that this doesn't occur, or it should be adjusted to allow for large indexes.

If anyone is interested, this is in the context of single-cell ATAC-seq data, which typically has about 200000 genomic regions (rows) and perhaps 100000 biological cells (columns).

--------------------------------------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
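
P.S. To make the two options above concrete, here is a minimal sketch in plain C. It is not the code in dgeMatrix.c and not a proposed patch. The function names (fits_int_index, colmajor_offset, row_sums) are hypothetical and exist only for illustration. It assumes a column-major double matrix with m rows and n columns, and it uses size_t for the wide index. Within R itself, R_xlen_t would probably be the more natural type.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Option 1 (check): returns 1 if every index i + j * m with
   0 <= i < m and 0 <= j < n fits in an int, 0 otherwise.
   Hypothetical helper, not part of any package. */
int fits_int_index(int m, int n)
{
    assert(m >= 0 && n >= 0);
    /* Largest index is m * n - 1; test without forming m * n in int. */
    return m == 0 || n <= INT_MAX / m;
}

/* Option 2 (widen): offset of element (i, j) in a column-major
   matrix with m rows. Each operand is converted to size_t before
   the multiplication, so the product is not formed in int. */
size_t colmajor_offset(int i, int j, int m)
{
    return (size_t) i + (size_t) j * (size_t) m;
}

/* Accumulates each row of xx (m rows, n columns, column-major)
   into aa, which must have room for m doubles. */
void row_sums(const double *xx, int m, int n, double *aa)
{
    for (int i = 0; i < m; i++)
        aa[i] = 0.0;
    for (int j = 0; j < n; j++) {
        /* Pointer to the start of column j, computed with a wide offset. */
        const double *col = xx + colmajor_offset(0, j, m);
        for (int i = 0; i < m; i++)
            aa[i] += col[i];
    }
}
```

With the values from the debugger session, fits_int_index(200000, 50000) returns 0, and colmajor_offset(0, 10738, 200000) gives 2147600000 rather than the wrapped negative value.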