>>>>> Dénes Tóth <toth.de...@ttk.mta.hu>
>>>>>     on Fri, 18 Mar 2016 22:56:23 +0100 writes:

    > Hi Roy,
    > R (usually) makes a copy if the dimensionality of an array is modified, 
    > even if you use this syntax:

    > x <- array(1:24, c(2, 3, 4))
    > dim(x) <- c(6, 4)

    > See also ?tracemem, ?data.table::address, ?pryr::address and other tools 
    > to trace if an internal copy is done.

Well, even without using strange (;-) packages: standard R's
tracemem(), and notably its help page, is indeed a good pointer.

According to the help page memory tracing is enabled in the
default R binaries for Windows and OS X.
For Linux (where I, as R developer, compile R myself anyway),
one needs to configure with --enable-memory-profiling .
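
Whether a particular build has this enabled can be checked from within R. A minimal sketch using base R's capabilities(); the variable name has_profmem is only illustrative:

```r
# TRUE if this R build was configured with memory profiling,
# which tracemem() needs in order to work.
has_profmem <- capabilities("profmem")
print(has_profmem)
```

If this prints FALSE, tracemem() will not work on that build.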

Now, let's try:

   > x <- array(rnorm(47), dim = c(1000,50, 40))
   > tracemem(x)
   [1] "<0x7f79a498a010>"
   > dim(x) <- c(1000* 50, 40)
   > x[5] <- pi
   > tracemem(x)
   [1] "<0x7f79a498a010>"
   > 

So, *BOTH*  the re-dimensioning  *AND*  the  sub-assignment did
*NOT* make a copy.
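
The same check can be written as a small script that compares the two tracemem() results instead of reading them by eye. This is only a sketch: the names vals, before, after and can_trace are made up for illustration, and whether a copy happens may depend on the R version and on how the code is run.

```r
vals <- as.numeric(1:24)
x <- array(vals, dim = c(2, 3, 4))      # array() allocates a fresh object for x

can_trace <- capabilities("profmem")    # tracemem() needs memory profiling support
if (can_trace) before <- tracemem(x)    # start tracing, remember the address string

dim(x) <- c(6, 4)                       # re-dimension
x[5] <- pi                              # sub-assign a single element

if (can_trace) {
  after <- tracemem(x)
  # Identical strings mean x still sits at the same address, i.e. no copy was made.
  cat("same address:", identical(before, after), "\n")
  untracemem(x)                         # stop tracing again
}
```

If a copy had been made along the way, tracemem would also have printed a "tracemem[... -> ...]" line at that point.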

Indeed, R has become much smarter  in these things in recent
years ... not thanks to me, but very much thanks to
Luke Tierney (from R-core), and also thanks to contributions from "outside",
notably Tomas Kalibera.

And hence: *NO* such strange workarounds are needed in this specific case: 

    > Workaround: use data.table::setattr or bit::setattr to modify the 
    > dimensions in place (i.e., without making a copy). Risk: if you modify 
    > an object by reference, all other objects which point to the same memory 
    > address will be modified silently, too.
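
For contrast with the risk described in the quoted text, standard R's usual value semantics protect a second name that refers to the same data. A small sketch in base R only (the names x and y are just for illustration):

```r
x <- array(1:24, dim = c(2, 3, 4))
y <- x                 # y now refers to the same data as x

dim(x) <- c(6, 4)      # x is changed; R duplicates as needed so that y is unaffected

print(dim(x))          # 6 4
print(dim(y))          # 2 3 4
```

Here R has to keep y intact, so a copy may well be made in this situation; the copy-free behaviour shown above applies when x is the only reference to the data.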

Martin Maechler, ETH Zurich  (and R-core)

    > HTH,
    > Denes

(generally, your contributions help indeed, Denes, thank you!)


    > On 03/18/2016 10:28 PM, Roy Mendelssohn - NOAA Federal wrote:
    >> Hi All:
    >> 
    >> I am working with a very large array.  If noLat is the number of
    >> latitudes, noLon the number of longitudes and noTime the number of time
    >> periods, the array is of the form:
    >> 
    >> myData[noLat, noLon, noTime].
    >> 
    >> It is read in this way because that is how it is stored in a (series of)
    >> netcdf files.  For the analysis I need to do, I need instead the array:
    >> 
    >> myData[noLat*noLon, noTime].  Normally this would be easy:
    >> 
    >> myData<- array(myData,dim=c(noLat*noLon,noTime))
    >> 
    >> My question is how this command works in R: does it make a copy of
    >> the existing array, with different indices for the dimensions, or does it just
    >> redo the indices and leave the given array as is?  The reason for this question
    >> is that my array is 30GB in memory, and I don't have enough space to hold a copy of
    >> the array in memory.  If it does make a copy, I will have to figure out a workaround to
    >> bring in only part of the data at a time and put it into the proper locations.
    >> 
    >> Thanks,
    >> 
    >> -Roy

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
