I would gladly examine your example, Mike.

Cheers,
Philippe

> On 18 Sep 2016, at 16:05, Michael Sumner <mdsum...@gmail.com> wrote:
>
>> On Sun, 18 Sep 2016, 19:04 Philippe de Rochambeau <phi...@free.fr> wrote:
>> Please find below code that attempts to read ints, longs and floats from a
>> binary file (which is a simplification of my original program).
>> Please disregard the R inefficiencies, such as using rbind, for now.
>> I’ve also included Java code to generate the binary file.
>> The output shows that, at one point, anInt becomes undefined. Unfortunately,
>> I couldn’t find the correct R function to determine whether anInt is
>> undefined or not, as is.null, is.nan, and is.infinite don’t work.
>> Any help would be much appreciated.
>> Many thanks in advance.
>> Philippe
>>
>> ---------------------
>> [1] "anInt = 1"
>> [1] "is.null FALSE"
>> [1] "is.nan FALSE"
>> [1] "is.infinite FALSE"
>> [1] "aLong = 2"
>> [1] "aFloat = 3.44440007209778"
>> [1] "--------------------------"
>> [1] "anInt = 2"
>> [1] "is.null FALSE"
>> [1] "is.nan FALSE"
>> [1] "is.infinite FALSE"
>> [1] "aLong = 22"
>> [1] "aFloat = 13.4644002914429"
>> [1] "--------------------------"
>> [1] "anInt = 3"
>> [1] "is.null FALSE"
>> [1] "is.nan FALSE"
>> [1] "is.infinite FALSE"
>> [1] "aLong = 55"
>> [1] "aFloat = 45.4444007873535"
>> [1] "--------------------------"
>> [1] "anInt = "
>> [1] "is.null FALSE"
>> [1] "is.nan "
>> [1] "is.infinite "
>> [1] "aLong = "
>> [1] "aFloat = "
>> [1] "--------------------------"
>>      [,1]      [,2]      [,3]
>> [1,] 1         2         3.4444
>> [2,] 2         22        13.4644
>> [3,] 3         55        45.4444
>> [4,] Integer,0 Integer,0 Numeric,0
>>
>> ---------------------
>>
>> readFile <- function(inputPath) {
>>   URL <- file(inputPath, "rb")
>>   PLT <- matrix(nrow=0, ncol=3)
>>   counte <- 0
>>   max <- 4
>>   while (counte < max) {
>>     anInt <- readBin(con=URL, what=integer(), size=4, n=1, endian="big")
>>     print(paste("anInt =", anInt))
>>     #if (! (anInt == 0)) { print(paste("empty int")); break }
>>     print(paste("is.null ", is.null(anInt)))
>>     print(paste("is.nan ", is.nan(anInt)))
>>     print(paste("is.infinite ", is.infinite(anInt)))
>>     aLong <- readBin(URL, integer(), size=8, n=1, endian="big")
>>     print(paste("aLong =", aLong))
>>     aFloat <- readBin(URL, numeric(), size=4, n=1, endian="big")
>>     print(paste("aFloat =", aFloat))
>>     print("--------------------------")
>>     PLT <- rbind(PLT, list(anInt, aLong, aFloat))
>>     counte <- counte + 1
>>   } # end while
>>   close(URL)
>>   PLT
>> }
>> fichier <- "/Users/philippe/Desktop/datatests/data0.bin"
>> PLT2 <- readFile(fichier)
>> print(PLT2)
>>
>> ---------------------
>>
>> import java.io.*;
>>
>> public class Main {
>>
>>     Main() {
>>         writeData();
>>     }
>>
>>     public static void main(String[] args) {
>>         new Main();
>>     }
>>
>>     public void writeData() {
>>
>>         final String path = "/Users/philippe/Desktop/datatests/data0.bin";
>>
>>         DataOutputStream dos;
>>         try {
>>             dos = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(path)));
>>             // big endian write! ("high byte first"), see
>>             // https://docs.oracle.com/javase/7/docs/api/java/io/DataOutputStream.html
>>             dos.writeInt(1);
>>             dos.writeLong(2L);
>>             dos.writeFloat(3.4444F);
>>
>>             dos.writeInt(2);
>>             dos.writeLong(22L);
>>             dos.writeFloat(13.4644F);
>>
>>             dos.writeInt(3);
>>             dos.writeLong(55L);
>>             dos.writeFloat(45.4444F);
>>
>>             dos.close();
>>         } catch (FileNotFoundException e) {
>>             e.printStackTrace();
>>         } catch (IOException ioe) {
>>             ioe.printStackTrace();
>>         }
>>
>>     }
>>
>> }
>>
>> ---------------------
>>
>> > On 17 Sep 2016, at 20:45, Philippe de Rochambeau <phi...@free.fr> wrote:
>> >
>> > Hi Jim,
>> > this is exactly the answer I was looking for. Many thanks. I didn’t know R had a
>> > pack function, as in Perl.
>> > To answer your earlier question, I am trying to update legacy code to read
>> > a binary file with unknown size, over a network, slice it up into rows
>> > each containing an integer, an integer, a long, a short, a float and a
>> > float, and stuff the rows into a matrix.
>
> It's possible to read all rows fast as raw(), then parse in a vectorised way
> with matrix indexing to group the bytes appropriately. There is an example on
> the mailing list somewhere, but otherwise I can show an example if that's of
> interest.
>
> Cheers, Mike
>
>> > Best regards,
>> > Philippe
>> >
>> >> On 17 Sep 2016, at 20:38, jim holtman <jholt...@gmail.com> wrote:
>> >>
>> >> Here is an example of how to do it:
>> >>
>> >> x <- 1:10  # integer values
>> >> xf <- seq(1.0, 2, by = 0.1)  # floating point
>> >>
>> >> setwd("d:/temp")
>> >>
>> >> # create file to write to
>> >> output <- file('integer.bin', 'wb')
>> >> writeBin(x, output)  # write integer
>> >> writeBin(xf, output)  # write reals
>> >> close(output)
>> >>
>> >> library(pack)
>> >> library(readr)
>> >>
>> >> # read all the data at once
>> >> allbin <- read_file_raw('integer.bin')
>> >>
>> >> # decode the data into a list
>> >> (result <- unpack("V V V V V V V V V V d d d d d d d d d d", allbin))
>> >>
>> >> Jim Holtman
>> >> Data Munger Guru
>> >>
>> >> What is the problem that you are trying to solve?
>> >> Tell me what you want to do, not how you want to do it.
>> >>
>> >> On Sat, Sep 17, 2016 at 11:04 AM, Ismail SEZEN <sezenism...@gmail.com> wrote:
>> >> I noticed the same issue but didn't care much :)
>> >>
>> >> On Sat, Sep 17, 2016, 18:01 jim holtman <jholt...@gmail.com> wrote:
>> >> Your example was not reproducible. Also how do you "break" out of the
>> >> "while" loop?
>> >>
>> >> Jim Holtman
>> >> Data Munger Guru
>> >>
>> >> What is the problem that you are trying to solve?
>> >> Tell me what you want to do, not how you want to do it.
>> >>
>> >> On Sat, Sep 17, 2016 at 8:05 AM, Philippe de Rochambeau <phi...@free.fr>
>> >> wrote:
>> >>
>> >>> Hello,
>> >>> the following function, which stores numeric values extracted from a
>> >>> binary file into an R matrix, is very slow, especially when the said
>> >>> file is several MB in size.
>> >>> Should I rewrite the function in inline C or in C/C++ using Rcpp? If the
>> >>> latter case is true, how do you "readBin" in Rcpp (I’m a total Rcpp
>> >>> newbie)?
>> >>> Many thanks.
>> >>> Best regards,
>> >>> phiroc
>> >>>
>> >>> -------------
>> >>>
>> >>> # inputPath is something like http://myintranet/getData?
>> >>> # pathToFile=/usr/lib/xxx/yyy/data.bin
>> >>>
>> >>> PLTreader <- function(inputPath){
>> >>>   URL <- file(inputPath, "rb")
>> >>>   PLT <- matrix(nrow=0, ncol=6)
>> >>>   compteurDePrints = 0
>> >>>   compteurDeLignes <- 0
>> >>>   maxiPrints = 5
>> >>>   displayData <- FALSE
>> >>>   while (TRUE) {
>> >>>     periodIndex <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
>> >>>     eventId <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
>> >>>     dword1 <- readBin(URL, integer(), size=4, signed=FALSE, n=1, endian="little") # int
>> >>>     dword2 <- readBin(URL, integer(), size=4, signed=FALSE, n=1, endian="little") # int
>> >>>     if (dword1 < 0) {
>> >>>       dword1 = dword1 + 2^32-1;
>> >>>     }
>> >>>     eventDate = (dword2*2^32 + dword1)/1000
>> >>>     repNum <- readBin(URL, integer(), size=2, n=1, endian="little") # short (2 bytes)
>> >>>     exp <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes, strangely enough, would expect 8)
>> >>>     loss <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes)
>> >>>     PLT <- rbind(PLT, c(periodIndex, eventId, eventDate, repNum, exp, loss))
>> >>>   } # end while
>> >>>   return(PLT)
>> >>>   close(URL)
>> >>> }
>> >>>
>> >>> ----------------
>> >>> [[alternative HTML version deleted]]
>> >>>
>> >>> ______________________________________________
>> >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Dr. Michael Sumner
> Software and Database Engineer
> Australian Antarctic Division
> 203 Channel Highway
> Kingston Tasmania 7050 Australia
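
On the question of how to detect the "undefined" anInt: readBin() returns a zero-length vector (integer(0)) once the connection has no more data. That is why is.null() is FALSE while is.nan() and is.infinite() print nothing. Testing length(anInt) == 0 is one way to end the loop. Below is a minimal sketch of that idea. The record layout (a 4-byte int followed by a 4-byte float), the temporary file and the name read_records are made up for illustration; they are not Philippe's actual data or code.

```r
# Write three (int, float) records to a temporary file, big-endian.
path <- tempfile(fileext = ".bin")
con <- file(path, "wb")
for (i in 1:3) {
  writeBin(as.integer(i), con, size = 4, endian = "big")
  writeBin(i + 0.5, con, size = 4, endian = "big")  # stored as a 4-byte float
}
close(con)

# Hypothetical reader: loops until readBin() hands back a zero-length vector.
read_records <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))          # the connection is closed even on error
  rows <- list()
  repeat {
    an_int <- readBin(con, integer(), n = 1, size = 4, endian = "big")
    if (length(an_int) == 0) break      # end of file reached
    a_float <- readBin(con, numeric(), n = 1, size = 4, endian = "big")
    if (length(a_float) == 0) break     # truncated final record
    rows[[length(rows) + 1]] <- c(an_int, a_float)
  }
  do.call(rbind, rows)         # bind once at the end instead of in the loop
}

records <- read_records(path)
print(records)
```

Collecting the rows in a list and calling rbind once also avoids growing the matrix on every iteration, which is one likely source of the slowness mentioned in the first post.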
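
Mike's suggestion of reading everything as raw() and then decoding in a vectorised way could look roughly like the sketch below. It assumes the 16-byte big-endian record written by the Java program in the thread (int, long, float). The test file is generated in R rather than Java, the helper read_field is hypothetical, and the long is decoded as two 4-byte words, which assumes non-negative values and is only exact below 2^53.

```r
# Build a small test file with the same layout as the Java writer in the
# thread: int (4 bytes), long (8 bytes), float (4 bytes), all big-endian.
path <- tempfile(fileext = ".bin")
con <- file(path, "wb")
ints   <- c(1L, 2L, 3L)
longs  <- c(2L, 22L, 55L)
floats <- c(3.5, 13.25, 45.75)  # chosen to be exactly representable as floats
for (i in seq_along(ints)) {
  writeBin(ints[i], con, size = 4, endian = "big")
  writeBin(0L, con, size = 4, endian = "big")         # high word of the long
  writeBin(longs[i], con, size = 4, endian = "big")   # low word of the long
  writeBin(floats[i], con, size = 4, endian = "big")
}
close(con)

record_size <- 16L
bytes <- readBin(path, what = "raw", n = file.size(path))  # one read for the whole file
n_rec <- length(bytes) %/% record_size

# One column per record, one row per byte offset within the record.
m <- matrix(bytes[seq_len(n_rec * record_size)], nrow = record_size)

# Decode one field for all records at once. as.vector() on the sliced raw
# matrix runs column by column, i.e. record by record.
read_field <- function(rows, what, size) {
  readBin(as.vector(m[rows, , drop = FALSE]), what = what,
          n = n_rec, size = size, endian = "big")
}

an_int  <- read_field(1:4, integer(), 4)
hi      <- read_field(5:8, integer(), 4)
lo      <- read_field(9:12, integer(), 4)
lo      <- ifelse(lo < 0, lo + 2^32, lo)  # reinterpret the low word as unsigned
a_long  <- hi * 2^32 + lo
a_float <- read_field(13:16, numeric(), 4)

result <- cbind(an_int, a_long, a_float)
print(result)
```

The file is read once and each field is decoded with a single readBin() call, so the number of R-level calls no longer grows with the number of records.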