On 13-08-01 4:36 AM, Zhang Weiwu wrote:
Hello. readBin is designed to read a batch of data with the same spec, e.g.
read 10000 floats into a vector. In practice I read into a data frame, not a
vector. For each row of the data frame, I need to read an integer and a float.

for (i in 1:1000) {
        dataframe$int[i]   <- readBin(con, integer(), size=2)
        dataframe$float[i] <- readBin(con, numeric(), size=4)
}

I also need to read 100 such data files, so I end up with a for loop inside a
for loop. Something feels wrong here, since it is said that if you use a
double for loop you are not speaking R.

What is the R way of doing this? I can think of writing the content of the
loop into a function and vectorizing it. But the result would be a list of
lists, not exactly a data frame, and the list grows incrementally, which is
inefficient, since I know the size of my data frame at the outset. I am a
new learner who does not yet speak half of the R vocabulary, so kindly
provide some hints, please :)

I don't think there are any functions to do this directly. I'd probably use the loop, since the time to read 1000 entries would be small. If the files were longer, I might read the whole file as raw bytes, then read the integer and float vectors from subsets of those bytes.

For example, the following untested code:

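# readBin() reads only one item by default, so request the whole file:
# 1000 records of 6 bytes each, as in the loop above.
rawvec <- readBin(con, "raw", n = 6 * 1000)
n <- length(rawvec) / 6
i <- 0:(n-1)
# Using sort here is inefficient, but I'm lazy...
indices <- sort( c(6*i + 1, 6*i + 2) )
# Use a separate name so the file connection in con is not overwritten
rcon <- rawConnection(rawvec[indices])
int <- readBin(rcon, "integer", n = n, size = 2)
close(rcon)

indices <- sort( c(6*i + 3, 6*i + 4, 6*i + 5, 6*i + 6) )
rcon <- rawConnection(rawvec[indices])
# n and size must be named: the third positional argument of readBin() is n
float <- readBin(rcon, "numeric", n = n, size = 4)
close(rcon)

dataframe <- data.frame(int=int, float=float)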
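
The same idea can be tried end to end with a small round trip. This is only a sketch under some assumptions: the helper names read_records and write_records are made up for illustration, the files are assumed to be in the machine's native byte order, and each file is assumed to hold a whole number of 6-byte records. It reshapes the raw bytes into a 6-row matrix instead of building sorted indices, and passes the raw vectors straight to readBin(), which accepts a raw vector in place of a connection.

```r
# Hypothetical helper: write interleaved records (2-byte int, 4-byte float)
# so that there is a test file to read back.
write_records <- function(path, ints, floats) {
  con <- file(path, "wb")
  on.exit(close(con))
  for (k in seq_along(ints)) {
    writeBin(as.integer(ints[k]), con, size = 2)
    writeBin(as.numeric(floats[k]), con, size = 4)
  }
}

# Hypothetical helper: read one file of 6-byte records into a data frame.
read_records <- function(path, endian = .Platform$endian) {
  rawvec <- readBin(path, "raw", n = file.size(path))
  n <- length(rawvec) %/% 6
  # One column per record, one row per byte position within the record
  m <- matrix(rawvec[seq_len(6 * n)], nrow = 6)
  data.frame(
    int   = readBin(as.vector(m[1:2, ]), "integer", n = n, size = 2,
                    endian = endian),
    float = readBin(as.vector(m[3:6, ]), "numeric", n = n, size = 4,
                    endian = endian)
  )
}

# Two small test files standing in for the 100 real ones
paths <- file.path(tempdir(), c("a.bin", "b.bin"))
write_records(paths[1], c(1L, -2L, 300L), c(0.5, -1.25, 3.75))
write_records(paths[2], c(7L, 8L), c(2, 4))

# One data frame per file, stacked into a single data frame
dataframe <- do.call(rbind, lapply(paths, read_records))
```

Using lapply() over the file names replaces the outer loop from the question, and each file is read with a fixed number of readBin() calls rather than two calls per record.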

The other way to do this is to read the data in a C function, using .Call or .C to get it into R.

Duncan Murdoch

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
