Hi all,

Modifying names/dimnames on vectors/matrices of length >= 64 returns an ALTREP 
wrapper. The wrapped object still contains the old names/dimnames even after 
they have been removed, and calling base::serialize() on the result writes 
them out. Unserializing restores the same structure, with the names/dimnames 
still hidden inside the ALTREP wrapper, so the only visible symptom is wasted 
time, bandwidth, or disk space.

Example:
   v1 <- setNames(rnorm(64), paste("element name", 1:64))
   v2 <- unname(v1)
   names(v2)
   # NULL
   length(serialize(v1, NULL))
   # [1] 2039
   length(serialize(v2, NULL))
   # [1] 2132
   length(serialize(v2[TRUE], NULL))
   # [1] 543

   con <- rawConnection(raw(), "w")
   serialize(v2, con)
   v3 <- unserialize(rawConnectionValue(con))
   names(v3)
   # NULL
   length(serialize(v3, NULL))
   # [1] 2132

   # Similarly for matrices:
   m1 <- matrix(rnorm(64), 8, 8,
                dimnames=list(paste("row name", 1:8), paste("col name", 1:8)))
   m2 <- unname(m1)
   dimnames(m2)
   # NULL
   length(serialize(m1, NULL))
   # [1] 918
   length(serialize(m2, NULL))
   # [1] 1035
   length(serialize(m2[TRUE, TRUE], NULL))
   # [1] 582
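
A quick way to check whether a given object is affected is to compare its
serialized size with that of a freshly allocated copy that carries only the
visible attributes. This is a minimal sketch, assuming atomic vectors only;
hidden_bytes() is a made-up helper name, not part of base R:

```r
# Hypothetical helper, not part of base R: number of serialized bytes that
# x carries beyond a fresh copy holding only its visible attributes.
hidden_bytes <- function(x) {
  stopifnot(is.atomic(x))
  visible <- attributes(x)
  # Indexing with an integer vector allocates a new plain vector.
  fresh <- unclass(x)[seq_along(x)]
  # Replace whatever attributes the copy kept with exactly the visible ones.
  if (length(visible)) attributes(fresh) <- visible
  length(serialize(x, NULL)) - length(serialize(fresh, NULL))
}

v <- unname(setNames(rnorm(64), paste("element name", 1:64)))
hidden_bytes(v)   # positive when stale names are still being written out
```

A result of 0 means serialize() writes nothing beyond what is visible.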

Previously discussed here, too:
https://r.789695.n4.nabble.com/Invisible-names-problem-td4764688.html

This happens with other attributes as well, but less predictably:
   x1 <- structure(rnorm(100), data=rnorm(1000000))
   x2 <- structure(x1, data=NULL)
   length(serialize(x1, NULL))
   # [1] 8000952
   length(serialize(x2, NULL))
   # [1] 924

   x1b <- rnorm(100)
   attr(x1b, "data") <- rnorm(1000000)
   x2b <- x1b
   attr(x2b, "data") <- NULL
   length(serialize(x1b, NULL))
   # [1] 8000863
   length(serialize(x2b, NULL))
   # [1] 8000956
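
One possible R-level stopgap is to rebuild the object right before
serializing it, so that only the visible attributes travel with it. This is
a sketch rather than a fix: it assumes atomic vectors, costs one full copy of
the data, and drop_hidden() is a made-up name, not an existing function:

```r
# Hypothetical helper, not part of base R: return a freshly allocated copy of
# an atomic vector that keeps only the attributes visible via attributes().
drop_hidden <- function(x) {
  stopifnot(is.atomic(x))
  visible <- attributes(x)
  # Indexing with an integer vector allocates a new plain vector.
  fresh <- unclass(x)[seq_along(x)]
  # Restore dim, dimnames, names, class, etc. exactly as they are visible.
  if (length(visible)) attributes(fresh) <- visible
  fresh
}

x1b <- rnorm(100)
attr(x1b, "data") <- rnorm(100000)   # smaller than above, same effect
x2b <- x1b
attr(x2b, "data") <- NULL

y <- drop_hidden(x2b)
identical(y, x2b)                    # same values and visible attributes
length(serialize(y, NULL))           # no longer includes the old "data"
```

Calling serialize(drop_hidden(x), con) instead of serialize(x, con) trades
the extra copy for a payload that depends only on what is visible.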

This is pretty severe: serializing a small object can kill the network 
because of large attributes it once had at some point during its lifetime 
around the codebase and that are still secretly tagging along, and tracking 
down the cause is hard.

Is there a plan to resolve this? Any suggestions for maybe a C++ workaround 
until then? Or an alternative performant serialization solution?

Best,
--
Zafer



______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
