Does anyone else have any insights into this issue?
Henrik, thank you for your very quick response. I've examined the readBin help file with respect to endian, and I'm still not sure I'm getting this right. Here is what I'm coding:

library(Hmisc)  # assumption: describe() below is Hmisc's describe()

con <- file(file.choose(), open = "rb")
# be explicit about size, sign and byte order (endian = "little")
Year66 <- readBin(con, what = integer(), size = 2, signed = TRUE,
                  endian = "little", n = 40374840)
length(Year66)
close(con)

# convert millimeters to inches
Year66.in <- Year66 * 0.039370
describe(Year66.in)

Year66.in
      n missing unique   Mean    .05    .10    .25  .50  .75   .90   .95
8185584       0  65511 -21.56 -650.1 -650.1 -162.2  0.0  0.0 636.5 639.1

lowest : -1290 -1290 -1290 -1290 -1290, highest: 1290 1290 1290 1290 1290

# establish cut points using inches
bins <- cut(Year66.in, breaks = 30)
barplot(table(bins))

# number of records read: 8185584, or about 20.3% of the records
# I'm expecting (see next line)
length(Year66.in)

# proportion of the records expected in one year
length(Year66.in) / (419 * 264 * 365)

#### Here I will introduce code to classify the summary statistics using both
#### a clustering and a non-metric scaling function. These procedures will
#### hopefully enable differentiation of cluster groupings, associating the
#### initial input annual values with a separate (not shown) calculated index.

What I eventually want to accomplish is a statistical summary for each of the 37 years in the binary file. Reading the file one year at a time (n = 40374840) should give me all of the records for just the first year, not all of the records in the binary file. I therefore also need to better understand how to read the set of records for years 2, 3, 4, ..., 37.

Any ideas?

Thanks for your assistance,
Steve

Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147


Henrik Bengtsson <h...@stat.berkeley.edu>
Sent by: henrik.bengtsson@gmail.com
To: steve_fried...@nps.gov
cc: r-help@r-project.org
Date: 02/11/2009 09:20 AM PST
Subject: Re: [R] Reading Binary Files

Argument 'size' is what you are looking for, cf. help(readBin). Whenever reading binary files this way, I strongly recommend that you are explicit about all arguments of readBin(), e.g.

readBin(con, what=integer(), size=2, signed=TRUE, endian="little", n=n)

For instance, you probably do not want 'endian' to be dependent on the platform (see help) you run on, but instead be specific to the file format you are reading.

/Henrik

On Wed, Feb 11, 2009 at 8:04 AM, <steve_fried...@nps.gov> wrote:
>
> Hello
>
> I'm encountering some difficulty correctly reading binary files. The binary
> files store data as "short" rather than "double", "int", or any of the
> other modes of the vector being read.
>
> The data represents a regular grid of size 419 rows by 264 columns, to make
> it more interesting, the data are daily records, for a total of 37 years.
> The file size is therefore 419(rows) * 264(columns) * 365(days) * 37(years)
> long.
>
> The product of these dimensions is 1493869080 records.
>
> I'm using the following code to read these into R (windows 2.8.1)
>
> con <- file(file.choose(), open="rb")
> Year66 <- readBin(con, integer, signed=TRUE, n = 40374840)
> close(con)
>
> length(Year66)
>
> returns 2046396
>
> I'm betting that I'm defining the "what" incorrectly, but after numerous
> attempts with different choices I'm wondering if readBin can handle "short"
> values?
>
> Any help is greatly appreciated.
>
> Steve
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
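
On the question of reading years 2, 3, ..., 37: one possible approach, offered only as a sketch, is to open the connection once and call readBin() once per year, since each call on an open connection continues from where the previous one stopped. The helper name summarise_years and its arguments are hypothetical. The sketch also assumes the layout described in the thread (signed little-endian 2-byte values, years stored back to back, no header), which is not confirmed. It is exercised here on a tiny synthetic file rather than the real data.

```r
# Hypothetical helper (not from the thread): per-year summaries of a flat file
# of 16-bit integers. Assumed layout: signed little-endian shorts, n_per_year
# values per year, years stored back to back with no header.
summarise_years <- function(path, n_per_year, n_years) {
  # Compare the size on disk with what the assumed layout implies
  # (2 bytes per value).
  expected_bytes <- 2 * n_per_year * n_years
  actual_bytes <- file.info(path)$size
  if (actual_bytes != expected_bytes) {
    warning("file holds ", actual_bytes, " bytes, expected ", expected_bytes)
  }

  con <- file(path, open = "rb")
  on.exit(close(con))

  out <- vector("list", n_years)
  for (i in seq_len(n_years)) {
    # Each call picks up where the previous call stopped.
    x <- readBin(con, what = integer(), size = 2, signed = TRUE,
                 endian = "little", n = n_per_year)
    if (length(x) == 0) break  # nothing left to read
    if (length(x) < n_per_year) {
      warning("year ", i, ": only ", length(x), " of ", n_per_year,
              " values read")
    }
    out[[i]] <- c(year = i, n = length(x), mean = mean(x),
                  min = min(x), max = max(x))
  }
  do.call(rbind, out)  # one row per year that was read
}

# Tiny synthetic file: 3 "years" of 4 values each.
tmp <- tempfile(fileext = ".bin")
writeBin(as.integer(c(1, 2, 3, 4, -5, -6, -7, -8, 100, 200, 300, 400)),
         tmp, size = 2, endian = "little")
res <- summarise_years(tmp, n_per_year = 4, n_years = 3)
print(res)
unlink(tmp)
```

For the file in the thread the call would presumably be summarise_years(path, n_per_year = 419 * 264 * 365, n_years = 37). If the size warning fires, the assumed layout (for example, 365 days in every year, or no header) may not match the file, and that would be worth checking before trusting any summaries.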