On 04/19/2018 11:47 AM, Serguei Sokol wrote:
On 19/04/2018 at 09:30, Tomas Kalibera wrote:
On 04/19/2018 02:06 AM, Duncan Murdoch wrote:
On 18/04/2018 5:08 PM, Tousey, Colton wrote:
Hello,
I want to report a bug in R that prevents me from exporting a matrix
with more than 2,147,483,647 elements (C's int limit) using write.csv
or write.table. This bug has already been reported:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17182. However,
no fix appears to be planned for upcoming R releases.
The error message comes from the writetable function of the utils
package, in the io.c source code
(https://svn.r-project.org/R/trunk/src/library/utils/src/io.c):
    /* quick integrity check */
    if(XLENGTH(x) != (R_len_t)nr * nc)
        error(_("corrupt matrix -- dims not not match length"));
The issue is that nr * nc is computed as an int, and the size of my
matrix, 2.8 billion elements, exceeds the int limit, so the check
fails.
Yes, looks like a typo: R_len_t is an int, and that's how nr was
declared. It should be R_xlen_t, which is bigger on machines that
support big vectors.
I haven't tested the change; there may be something else in that
function that assumes short vectors.
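
A minimal sketch of why the cast matters, assuming R_len_t is a 32-bit
int and R_xlen_t is a 64-bit signed type (as on platforms that support
long vectors). The typedefs len_t and xlen_t and both function names are
hypothetical stand-ins so the example compiles without R's headers; this
is not the code in io.c.

```c
#include <limits.h>
#include <stdint.h>

/* Stand-ins for R's types (assumption: R_len_t is int, R_xlen_t is a
   64-bit signed integer). */
typedef int len_t;
typedef int64_t xlen_t;

/* Returns 1 if nr * nc cannot be represented in len_t, i.e. a product
   computed in len_t would overflow. The product is formed in 64 bits,
   so this function itself does not overflow. */
int product_overflows_len_t(len_t nr, len_t nc)
{
    xlen_t wide = (xlen_t)nr * nc;
    return wide > INT_MAX || wide < INT_MIN;
}

/* The integrity check with the wider cast: the cast applies to nr
   before the multiplication, so nr * nc is carried out in xlen_t. */
int dims_match_length(xlen_t length, len_t nr, len_t nc)
{
    return length == (xlen_t)nr * nc;
}
```

For a 70000 x 40000 matrix (2.8 billion elements),
product_overflows_len_t(70000, 40000) reports an overflow, while
dims_match_length(2800000000LL, 70000, 40000) compares the true product
against the length.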
Indeed, I think the function won't work for long vectors because of
EncodeElement2 and EncodeElement0. Both would have to be changed,
including their signatures.
That would be the definitive fix, but before such a deep rewrite is
undertaken, maybe the following small fix (in addition to
"(R_xlen_t)nr * nc") would be sufficient for cases where nr and nc are
in int range but their product can reach the long-vector range:
replace
    tmp = EncodeElement2(x, i + j*nr, quote_col[j], qmethod,
                         &strBuf, sdec);
by
    tmp = EncodeElement2(VECTOR_ELT(x, (R_xlen_t)i + j*nr), 0,
                         quote_col[j], qmethod,
                         &strBuf, sdec);
Unfortunately we can't do that: x is a matrix of an atomic vector type.
VECTOR_ELT takes elements of a generic vector, so it cannot be applied
to "x". Even if we extracted a single element from "x" (e.g. via a
type switch), we would not be able to pass it to EncodeElement0, which
expects a full atomic vector (that is, including its header). Instead
we would have to call functions like EncodeInteger, EncodeReal0, etc.
on the individual elements, which amounts to changing EncodeElement0 or
implementing a new version of it. This does not seem that hard to fix;
it is just not as trivial as changing the cast.
Tomas
Serguei
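
A rough sketch of the per-element approach discussed in this thread,
under stated assumptions and not R's actual implementation. The Matrix
struct, the ElemType tags, and the functions element_offset and
encode_element_at are hypothetical stand-ins for R's SEXP, its type
codes, and a long-vector-aware replacement for EncodeElement0; snprintf
stands in for EncodeInteger and EncodeReal0. The point is only the
structure: a 64-bit index, a type switch, and one formatter per atomic
type. Note also that in an expression of the form (R_xlen_t)i + j*nr
the product j*nr is still evaluated in int when j and nr are both int,
so the sketch casts j before multiplying.

```c
#include <stdint.h>
#include <stdio.h>

typedef int64_t xlen_t;   /* stand-in for R_xlen_t (assumed 64-bit) */

typedef enum { ELEM_INT, ELEM_REAL } ElemType;   /* hypothetical tags */

/* Hypothetical stand-in for an atomic matrix in column-major order. */
typedef struct {
    ElemType type;
    int nr, nc;
    const void *data;
} Matrix;

/* Column-major offset of element (i, j). The cast is applied to j
   before the multiplication, so j * nr is computed in 64 bits. */
xlen_t element_offset(int i, int j, int nr)
{
    return (xlen_t)i + (xlen_t)j * nr;
}

/* Formats element (i, j) into buf. Returns the number of characters
   snprintf would have written, or -1 for an unknown type. */
int encode_element_at(const Matrix *m, int i, int j,
                      char *buf, size_t buflen)
{
    xlen_t idx = element_offset(i, j, m->nr);

    switch (m->type) {
    case ELEM_INT:
        return snprintf(buf, buflen, "%d",
                        ((const int *)m->data)[idx]);
    case ELEM_REAL:
        return snprintf(buf, buflen, "%.15g",
                        ((const double *)m->data)[idx]);
    }
    return -1;
}
```

With nr = 70000, element_offset(0, 39999, 70000) yields an offset above
the int range without overflowing, which is the case the original
i + j*nr expression cannot handle.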
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel