Fixed in R-devel 74754.
Tomas
On 04/19/2018 12:15 PM, Tomas Kalibera wrote:
On 04/19/2018 11:47 AM, Serguei Sokol wrote:
On 19/04/2018 at 09:30, Tomas Kalibera wrote:
On 04/19/2018 02:06 AM, Duncan Murdoch wrote:
On 18/04/2018 5:08 PM, Tousey, Colton wrote:
Hello,
I want to report a bug in R that prevents me from exporting a matrix
with more than 2,147,483,647 elements (C's int limit) using write.csv
or write.table. I found this bug already reported:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17182.
However, there appears to be no solution or fix in upcoming R
releases.
The error message is coming from the writetable part of the utils
package in the io.c source
code (https://svn.r-project.org/R/trunk/src/library/utils/src/io.c):
    /* quick integrity check */
    if(XLENGTH(x) != (R_len_t)nr * nc)
        error(_("corrupt matrix -- dims not not match length"));
The issue is that nr*nc is computed as an int, and the size of my
matrix, 2.8 billion elements, exceeds C's int limit, so the check
forces the code to fail.
Yes, looks like a typo: R_len_t is an int, and that's how nr was
declared. It should be R_xlen_t, which is bigger on machines that
support big vectors.
I haven't tested the change; there may be something else in that
function that assumes short vectors.
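To illustrate the point about the cast, here is a minimal standalone sketch, not the actual io.c code. The type xlen_t is only a stand-in for R_xlen_t (assumed to be 64 bits wide here), the function names are hypothetical, and the 53000 x 53000 dimensions in the usage note are made up to give about 2.8 billion elements.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Stand-in for R_xlen_t on a 64-bit build (an assumption of this sketch). */
typedef int64_t xlen_t;

/* Returns 1 if nr * nc fits in a C int, 0 otherwise.
   The product is computed in 64 bits, so nothing overflows here. */
int product_fits_int(int nr, int nc)
{
    assert(nr >= 0 && nc >= 0);
    return (xlen_t)nr * nc <= INT_MAX;
}

/* Integrity check with one operand widened before the multiplication,
   so the product is evaluated in the wide type rather than in int. */
int dims_match_length(xlen_t len, int nr, int nc)
{
    assert(nr >= 0 && nc >= 0);
    return len == (xlen_t)nr * nc;
}
```

With nr = nc = 53000, product_fits_int reports that the product does not fit in an int, while dims_match_length still accepts a length of 2809000000.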
Indeed, I think the function won't work for long vectors because of
EncodeElement2 and EncodeElement0. EncodeElement2/0 would have to be
changed, including their signatures.
That would be a definitive fix, but before such a deep rewrite is
undertaken, maybe the following small fix (in addition to "(R_xlen_t)nr
* nc") would be sufficient for cases where nr and nc are in int range
but their product can reach the long vector limit:
replace
tmp = EncodeElement2(x, i + j*nr, quote_col[j], qmethod,
&strBuf, sdec);
by
tmp = EncodeElement2(VECTOR_ELT(x, (R_xlen_t)i + j*nr), 0,
quote_col[j], qmethod,
&strBuf, sdec);
Unfortunately we can't do that: x is a matrix of an atomic vector
type. VECTOR_ELT takes elements of a generic vector, so it cannot
be applied to "x". But even if we extracted a single element from "x"
(e.g. via a type switch), we would not be able to pass it to
EncodeElement0, which expects a full atomic vector (that is, including
its header). Instead we would have to call functions like
EncodeInteger, EncodeReal0, etc. on the individual elements, which is
then the same as changing EncodeElement0 or implementing a new version
of it. This does not seem that hard to fix; it is just not as trivial
as changing the cast.
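A rough standalone sketch of the per-element approach described above might look like the following. It does not use R's internals: atomic_vec, elt_type, matrix_offset and encode_element_at are all hypothetical names invented for illustration, and snprintf stands in for the real Encode* routines. Note that the offset widens one operand before the multiplication, so that j*nr is not evaluated in int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef enum { ELT_INT, ELT_REAL } elt_type;  /* hypothetical type tags */

typedef struct {
    elt_type type;
    const void *data;  /* points at an int[] or a double[] */
} atomic_vec;          /* hypothetical stand-in for an atomic vector */

/* Column-major offset of element (i, j), computed in 64 bits. */
int64_t matrix_offset(int i, int j, int nr)
{
    return (int64_t)j * nr + i;
}

/* Formats the element at a 64-bit index into buf by switching on the
   element type. Returns snprintf's result, or -1 for an unknown type. */
int encode_element_at(const atomic_vec *x, int64_t idx, char *buf, size_t n)
{
    assert(idx >= 0);
    switch (x->type) {
    case ELT_INT:
        return snprintf(buf, n, "%d", ((const int *)x->data)[idx]);
    case ELT_REAL:
        return snprintf(buf, n, "%.15g", ((const double *)x->data)[idx]);
    }
    return -1;
}
```

The design point is that the index travels as a wide integer all the way to the element access, instead of being narrowed to int at a function boundary.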
Tomas
Serguei
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel