>> -----Original Message-----
>> From: r-devel-boun...@r-project.org
>> [mailto:r-devel-boun...@r-project.org] On Behalf Of Duncan Murdoch
>> Sent: Wednesday, May 12, 2010 11:35 AM
>> To: bull...@stat.berkeley.edu
>> Cc: r-de...@stat.math.ethz.ch
>> Subject: Re: [Rd] ranges and contiguity checking
>>
>> On 12/05/2010 2:18 PM, James Bullard wrote:
>> > Hi All,
>> >
>> > I am interfacing to some C libraries (hdf5) and I have methods
>> > defined for '[', these methods do hyperslab selection, however,
>> > currently I am limiting slab selection to contiguous blocks, i.e.,
>> > things defined like: i:(i+k). I don't do any contiguity checking at
>> > this point, I just grab the max and min of the range and them
>> > potentially do an in-memory subselection which is what I am
>> > definitely trying to avoid. Besides using deparse, I can't see
>> > anyway to figure out that these things (i:(i+k) and c(i, i+1,
>> > ..., i+k)) are different.
>> >
>> > I have always liked how 1:10 was a valid expression in R (as
>> > opposed to python where it is not by itself.), however I'd somehow
>> > like to know that the thing was contiguous range without examining
>> > the un-evaluated expression or worse, all(diff(i:(i+k)) == 1)
>
> You could define a sequence class, say 'hfcSeq'
> and insist that the indices given to [.hfc are
> hfcSeq objects.  E.g., instead of
>     hcf[i:(i+k)]
> the user would use
>     hcf[hfcSeq(i,i+k)]
> or
>     index <- hcfSeq(i,i+k)
>     hcf[index]
> max, min, and range methods for hcfSeq
> would just inspect one or both of its
> elements.
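
For concreteness, the sequence-class idea quoted above might look roughly like the following S3 sketch. The list layout, the from/to field names, and the method bodies are assumptions made for illustration; only the name hfcSeq comes from the quoted suggestion, and nothing here is taken from an existing package.

```r
# Hypothetical constructor: a contiguous range stored as its two endpoints,
# so building it costs the same no matter how long the range is.
hfcSeq <- function(from, to) {
  stopifnot(length(from) == 1L, length(to) == 1L, from <= to)
  structure(list(from = as.integer(from), to = as.integer(to)),
            class = "hfcSeq")
}

# min, max, and range only need to inspect the stored endpoints;
# the full sequence is never materialized.
min.hfcSeq <- function(..., na.rm = FALSE) {
  x <- ..1
  x$from
}
max.hfcSeq <- function(..., na.rm = FALSE) {
  x <- ..1
  x$to
}
range.hfcSeq <- function(..., na.rm = FALSE) {
  x <- ..1
  c(x$from, x$to)
}

# Example: a five-element range described by two integers.
idx <- hfcSeq(5, 9)
range(idx)
```

A '[' method for the dataset class could then test inherits(i, "hfcSeq") and pass the two endpoints straight to the slab reader.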

I could do this, but I wanted it not to matter to the user whether they
were dealing with an HDF5Dataset or a plain old matrix. It seems that I
cannot define methods on ':'. If I could, then I could implement an
immutable 'range' class, which would be good, but then I'd also have to
implement '['(matrix, range), which would be easy but still more work
than I wanted to do.

I guess I was thinking that there is some inherent value in an immutable
native range type that is constant in time and memory to construct. Then
I could define methods on '['(matrix, range) and '['(matrix, integer).
I'm pretty confident this is more or less what is happening in the
IRanges package in Bioconductor, but (maybe for lack of support for
setting methods on ':') it is happening in a way that makes things very
non-transparent to a user. As it stands, I can optimize for performance
by using an IRange-type wrapper, or I can optimize for code clarity by
killing performance.

thanks again, jim

>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>>
>> You can implement all(diff(x) == 1) more efficiently in C, but I don't
>> see how you could hope to do any better than that without putting very
>> un-R-like restrictions on your code.  Do you really want to say that
>>
>>   A[i:(i+k)]
>>
>> is legal, but
>>
>>   x <- i:(i+k)
>>   A[x]
>>
>> is not?  That will be very confusing for your users.  The problem is
>> that objects don't remember where they came from, only arguments to
>> functions do, and functions that make use of this fact mainly do it for
>> decorating the output (nice labels in plots) or making error messages
>> more intelligible.
>>
>> Duncan Murdoch
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
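
As a point of reference for the all(diff(x) == 1) check discussed in this thread, here is a minimal R-level sketch with a cheap early exit. The function name is_contiguous is made up for illustration, and treating an empty or NA-containing index as non-contiguous is an assumption. The endpoint comparison is only a necessary condition, so the full diff() scan still runs whenever it passes.

```r
# Hypothetical helper: TRUE when i is an increasing run with step 1,
# i.e. equivalent to i[1]:i[length(i)].
is_contiguous <- function(i) {
  n <- length(i)
  if (n == 0L || anyNA(i)) return(FALSE)  # assumption: treat these as "no"
  if (n == 1L) return(TRUE)
  # Cheap rejection: a step-1 run of n elements must span exactly n - 1.
  if (i[n] - i[1L] != n - 1L) return(FALSE)
  # The span can match by accident (e.g. c(1, 3, 2, 4)), so confirm.
  all(diff(i) == 1L)
}

is_contiguous(3:7)
is_contiguous(c(1L, 3L, 2L, 4L))
```

This still costs O(n) time and an O(n) temporary for diff() on contiguous input, which is the cost a dedicated range type would avoid.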