I'm trying to read CODA/mcmc files (see the coda package), as generated by jags/WinBUGS/OpenBUGS, into a big.matrix. I can't load the whole mcmc object produced by read.coda() into memory since I'm using a laptop for this analysis (currently I'm unfunded).

Right now I'm doing it by creating the filebacked.big.matrix, reading one chunk of data at a time from the chain file using read.table() with "skip" and "nrows" set, and storing it in the big.matrix. While this is memory efficient, the processing overhead seems to be related to the size of the skip value, so the time needed for each chunk grows as I move through the variables.

Any tips on how to do this faster / more efficiently? I'm using a unix system, so a solution that uses grep/sed would be fine.

Here's some sample code of how I do it now:
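        library(bigmemory)  # provides filebacked.big.matrix()

        # Index file: one row per variable (name, first row, last row in the chain file)
        index = read.table("Big.CODAindex.txt", col.names = c("var","start","end"))
        n     = index[1,3] - index[1,2] + 1   # samples per variable
        k     = dim(index)[1]                 # number of variables
        X     = filebacked.big.matrix(nrow = n, ncol = k, backingfile = "Big.CODA.backing")
        for(i in 1:k) {
                # Skip the rows of the previous variables, then read this variable's n rows
                X[,i] = read.table("Big.CODAchain1.txt", skip = (i-1)*n, nrows = n)[,2]
                print(i)
                print(Sys.time())
        }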
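
One possible variation on the loop above, as a minimal sketch rather than a tested solution: open the chain file once as a connection and read n lines per pass, so that each read continues where the previous one stopped and nothing has to be skipped again. The tiny temporary chain file and the plain matrix are assumptions made only so the sketch runs without bigmemory; the filebacked.big.matrix would take the place of X.

```r
# Build a tiny stand-in for the chain file: k variables with n rows each,
# two columns (iteration, value), stacked one variable after another.
n <- 5
k <- 3
chain_file <- tempfile(fileext = ".txt")
iter <- rep(10001L + 10L * (seq_len(n) - 1L), times = k)
vals <- round(seq(-0.3, 0.3, length.out = n * k), 6)
write.table(data.frame(iter, vals), chain_file,
            row.names = FALSE, col.names = FALSE)

# Plain matrix used here only so the sketch runs without bigmemory;
# the filebacked.big.matrix would be used instead.
X <- matrix(NA_real_, nrow = n, ncol = k)

# Open the file once. Each scan() call continues from the current position
# of the connection, so earlier lines are never re-read.
con <- file(chain_file, open = "r")
for (i in seq_len(k)) {
  chunk <- scan(con, what = list(integer(), numeric()),
                nlines = n, quiet = TRUE)
  X[, i] <- chunk[[2]]   # second field holds the sampled values
}
close(con)
```

With this pattern the work per variable should stay roughly constant, because the file position is kept between reads instead of being recomputed from "skip" each time.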

Also, here are the first few rows of the index and chain files, so you can see the formatting. The index file tells you each variable's name and the range of rows in the chain file containing the variable's values. The chain file contains the iteration number the value was taken from, and the sampled value itself.

CODAindex.txt
        egu[1] 1 10000
        egu[2] 10001 20000
        egt[1] 20001 30000
        egt[2] 30001 40000
        ept[1] 40001 50000
        ept[2] 50001 60000
        ...

CODAchain1.txt
        10001  -0.289963
        10011  -0.310657
        10021  -0.290596
        10031  -0.286273
        10041  -0.319877
        10051  -0.299019
        ....
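
To illustrate how the index rows map onto the chain file, here is a rough sketch that pulls a single variable by name using sed through pipe(). It assumes sed is available on the PATH; the helper read_var() and the small temporary files are hypothetical and exist only to make the example self-contained.

```r
# Tiny stand-ins for the index and chain files (made-up contents).
n <- 4
index_file <- tempfile(fileext = ".txt")
chain_file <- tempfile(fileext = ".txt")
writeLines(c("egu[1] 1 4", "egu[2] 5 8", "egt[1] 9 12"), index_file)
writeLines(sprintf("%d  %.6f",
                   rep(10001L + 10L * (seq_len(n) - 1L), times = 3),
                   -(30:19) / 100),
           chain_file)

index <- read.table(index_file, col.names = c("var", "start", "end"),
                    stringsAsFactors = FALSE)

# Hypothetical helper: read the rows belonging to one variable.
read_var <- function(name) {
  row <- index[index$var == name, ]
  # sed prints only lines start..end and quits at 'end',
  # so the remainder of the file is not scanned.
  cmd <- sprintf("sed -n '%d,%dp;%dq' %s",
                 row$start, row$end, row$end, shQuote(chain_file))
  read.table(pipe(cmd), col.names = c("iter", "value"))
}

egu2 <- read_var("egu[2]")
```

Here egu2 holds the four rows that the index assigns to egu[2], that is, lines 5 to 8 of the chain file.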

Thanks in advance for any tips!

--Guy W. Cole
R version 2.14.0 (2011-10-31) x86_64-apple-darwin9.8.0
