Please do not cross-post. You have already raised this on Bugzilla. I will follow up there later today.
luke

On Sat, 3 Jul 2021, Zafer Barutcuoglu wrote:
Hi all,

Setting names/dimnames on vectors/matrices of length>=64 returns an ALTREP wrapper which internally still contains the names/dimnames, and calling base::serialize on the result writes them out. They are unserialized in the same way, with the names/dimnames hidden in the ALTREP wrapper, so the problem is not obvious except in wasted time, bandwidth, or disk space.

Example:

    v1 <- setNames(rnorm(64), paste("element name", 1:64))
    v2 <- unname(v1)
    names(v2)
    # NULL
    length(serialize(v1, NULL))
    # [1] 2039
    length(serialize(v2, NULL))
    # [1] 2132
    length(serialize(v2[TRUE], NULL))
    # [1] 543

    con <- rawConnection(raw(), "w")
    serialize(v2, con)
    v3 <- unserialize(rawConnectionValue(con))
    names(v3)
    # NULL
    length(serialize(v3, NULL))
    # 2132

    # Similarly for matrices:
    m1 <- matrix(rnorm(64), 8, 8, dimnames=list(paste("row name", 1:8), paste("col name", 1:8)))
    m2 <- unname(m1)
    dimnames(m2)
    # NULL
    length(serialize(m1, NULL))
    # [1] 918
    length(serialize(m2, NULL))
    # [1] 1035
    length(serialize(m2[TRUE, TRUE], NULL))
    # 582

Previously discussed here, too:
https://r.789695.n4.nabble.com/Invisible-names-problem-td4764688.html

This happens with other attributes as well, but less predictably:

    x1 <- structure(rnorm(100), data=rnorm(1000000))
    x2 <- structure(x1, data=NULL)
    length(serialize(x1, NULL))
    # [1] 8000952
    length(serialize(x2, NULL))
    # [1] 924

    x1b <- rnorm(100)
    attr(x1b, "data") <- rnorm(1000000)
    x2b <- x1b
    attr(x2b, "data") <- NULL
    length(serialize(x1b, NULL))
    # [1] 8000863
    length(serialize(x2b, NULL))
    # [1] 8000956

This is pretty severe, trying to track down why serializing a small object kills the network, because of which large attributes it may have once had during its lifetime around the codebase that are still secretly tagging along.

Is there a plan to resolve this? Any suggestions for maybe a C++ workaround until then? Or an alternative performant serialization solution?
Best,
--
Zafer

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu