On Mon, Jan 7, 2013 at 3:23 PM, Chris Jewell <chris.jew...@warwick.ac.uk> wrote:
> Hi All,
>
> I'm currently trying to write an S4 class that mimics a data.frame, but
> stores data on disc in HDF5 format. The idea is that the dataset is likely
> to be too large to fit into a standard desktop machine, and by using
> subscripts, the user may load bits of the dataset at a time, e.g.:
>
> > myLargeData <- LargeData("/path/to/file")
> > mySubSet <- myLargeData[1:10, seq(1,15,by=3)]
>
> I've therefore defined my LargeData class thus
>
> > LargeData <- setClass("LargeData", representation(filename="character"))
> > setMethod("initialize","LargeData", function(.Object,filename)
> .Object@filename <- filename)
>

The above function needs to return .Object.

> I've then defined the "[" method to call a C++ function (Rcpp), opening
> the HDF5 file, and returning the required rows/cols as a data.frame.
>
> However, what if the user wants to load the entire dataset into memory?
> Which method do I overload to achieve the following?
>
> > fullData <- myLargeData
> > class(fullData)
> [1] "data.frame"
>
> or apply transformations:
>
> > myEigen <- eigen(myLargeData)
>
> In C++ I would normally overload the "double" or "float" operator to
> achieve this -- can I do the same thing in R?
>

The coercions are going to have to be explicit, since R has no type
declarations. So you'll want an as.data.frame method for coercing to a
data.frame (as well as a coerce method via setAs), and you'll need methods
for many of the base R functions. Some of those you can support implicitly
with an S3 method on as.data.frame, assuming the function calls it.

> Thanks,
>
> Chris
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
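
A minimal sketch of the explicit coercions described above, not a definitive implementation. It assumes a hypothetical readBackend() helper that reads a CSV file in place of the Rcpp/HDF5 reader from the original post, and it writes the initialize method so that it returns .Object.

```r
library(methods)

# Same single-slot class as in the original post.
setClass("LargeData", representation(filename = "character"))

# An initialize method must return the (modified) object.
setMethod("initialize", "LargeData",
          function(.Object, filename = character(), ...) {
  .Object <- callNextMethod(.Object, ...)
  .Object@filename <- filename
  .Object
})

# Hypothetical stand-in for the Rcpp/HDF5 reader: reads a CSV file instead.
readBackend <- function(path) read.csv(path)

# S3 method on as.data.frame: loads the whole dataset into memory.
as.data.frame.LargeData <- function(x, row.names = NULL, optional = FALSE, ...) {
  readBackend(x@filename)
}

# Coerce method, so that as(obj, "data.frame") works as well.
setAs("LargeData", "data.frame", function(from) as.data.frame(from))

# Usage with a small temporary file standing in for the on-disc dataset.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = c(2, 0), b = c(0, 3)), path, row.names = FALSE)

myLargeData <- new("LargeData", filename = path)
fullData <- as.data.frame(myLargeData)      # explicit coercion
sameData <- as(myLargeData, "data.frame")   # via the setAs method
myEigen  <- eigen(as.matrix(fullData))      # coerce explicitly, then transform
```

Plain assignment (fullData <- myLargeData) still just copies the S4 object; the data is only read when one of the coercion functions is called.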