David,

You still haven't provided a reproducible example. As Duncan already said,
"if you don't post code that allows us to reproduce the crash, it's really
unlikely that we'll be able to fix it."
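Even a script that fabricates its own input and walks through the same
merge/na.locf/to.period sequence would help. For example (a minimal sketch
only: the date and prices below are invented, and it is untested whether
synthetic data exercises the same code path as your files):

  library(xts)

  ## One synthetic day of "tick" data: one observation per second from
  ## the 09:30:00 open (date and prices are made up).
  open.time <- as.POSIXct("2012-07-23 09:30:00")
  idx       <- open.time + 0:23400
  dat       <- xts(cumsum(rnorm(length(idx))), order.by = idx)

  ## Same sequence of calls as in parseTickDataFromDir():
  templateTimes <- as.xts(idx)
  dat <- merge(dat, templateTimes, all = TRUE)
  dat <- na.locf(dat)

  ## Reported to hang or crash when name = NULL is passed explicitly ...
  bars <- to.period(dat, period = "seconds", k = 10, name = NULL)

  ## ... and to run cleanly when name is omitted or set to a string:
  bars.ok <- to.period(dat, period = "seconds", k = 10, name = "dat")

If that doesn't reproduce it, the trigger is likely in your data or in
state accumulated across files, which is all the more reason to post the
smallest script that does.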
And R-devel is not the appropriate venue to discuss this if it's truly an
issue with xts/zoo.

Best,
--
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com


On Mon, Jul 23, 2012 at 12:41 AM, David Terk <david.t...@gmail.com> wrote:
> It looks like the call to
>
>   dat.i <- to.period(dat.i, period=per, k=subper, name=NULL)
>
> is what is causing the issue. If name is not set, or is set to any value
> other than NULL, then no hang occurs.
>
> -----Original Message-----
> From: David Terk [mailto:david.t...@gmail.com]
> Sent: Monday, July 23, 2012 1:25 AM
> To: 'Duncan Murdoch'
> Cc: 'r-devel@r-project.org'
> Subject: RE: [Rd] Reading many large files causes R to crash - Possible
> Bug in R 2.15.1 64-bit Ubuntu
>
> I've isolated the bug. When the seg fault was produced, there was an error
> that memory had not been mapped. Here is the odd part of the bug: if you
> comment out certain code, get a full run, and then comment the offending
> code back in, it will actually run. So I think it is safe to assume
> something is going wrong with memory allocation. For example, while
> testing I was able to get to a point where the code would run, but after
> rebooting the machine and trying again, the code would not run.
>
> The bug itself is happening somewhere in xts or zoo. I will gladly upload
> the data files. It is happening on the 10th data file, which is only 225k
> lines long.
>
> Below is the simplified code. The call to either
>
>   dat.i <- to.period(dat.i, period=per, k=subper, name=NULL)
>   index(dat.i) <- index(to.period(templateTimes, period=per, k=subper))
>
> is what is causing R to hang or crash. I have been able to replicate this
> on Windows 7 64-bit and Ubuntu 64-bit. It seems easiest to replicate
> consistently from RStudio.
>
> The code below will consistently replicate the problem when the
> appropriate files are used.
>
> parseTickDataFromDir = function(tickerDir, per, subper) {
>   tickerAbsFilenames = list.files(tickerDir, full.names=T)
>   tickerNames = list.files(tickerDir, full.names=F)
>   tickerNames = gsub("_[a-zA-Z0-9].csv", "", tickerNames)
>   pb <- txtProgressBar(min = 0, max = length(tickerAbsFilenames), style = 3)
>
>   for (i in 1:length(tickerAbsFilenames)) {
>     dat.i = parseTickData(tickerAbsFilenames[i])
>     dates <- unique(substr(as.character(index(dat.i)), 1, 10))
>     times <- rep("09:30:00", length(dates))
>     openDateTimes <- strptime(paste(dates, times), "%F %H:%M:%S")
>     templateTimes <- NULL
>
>     for (j in 1:length(openDateTimes)) {
>       if (is.null(templateTimes)) {
>         templateTimes <- openDateTimes[j] + 0:23400
>       } else {
>         templateTimes <- c(templateTimes, openDateTimes[j] + 0:23400)
>       }
>     }
>
>     templateTimes <- as.xts(templateTimes)
>     dat.i <- merge(dat.i, templateTimes, all=T)
>     if (is.na(dat.i[1])) {
>       dat.i[1] <- -1
>     }
>     dat.i <- na.locf(dat.i)
>     dat.i <- to.period(dat.i, period=per, k=subper, name=NULL)
>     index(dat.i) <- index(to.period(templateTimes, period=per, k=subper))
>     setTxtProgressBar(pb, i)
>   }
>   close(pb)
> }
>
> parseTickData <- function(inputFile) {
>   DAT.list <- scan(file=inputFile, sep=",", skip=1,
>                    what=list(Date="", Time="", Close=0, Volume=0), quiet=T)
>   index <- as.POSIXct(paste(DAT.list$Date, DAT.list$Time),
>                       format="%m/%d/%Y %H:%M:%S")
>   DAT.xts <- xts(DAT.list$Close, index)
>   DAT.xts <- make.index.unique(DAT.xts)
>   return(DAT.xts)
> }
>
> DATTick <- parseTickDataFromDir(tickerDirSecond, "seconds", 10)
>
> -----Original Message-----
> From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
> Sent: Sunday, July 22, 2012 4:48 PM
> To: David Terk
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] Reading many large files causes R to crash - Possible
> Bug in R 2.15.1 64-bit Ubuntu
>
> On 12-07-22 3:54 PM, David Terk wrote:
>> I am reading several hundred files, anywhere from 50k-400k in size. It
>> appears that when I read these files with R 2.15.1, the process will
>> hang or seg fault on the scan() call. This does not happen on R 2.14.1.
>
> The code below doesn't do anything other than define a couple of
> functions. Please simplify it to code that creates a file (or multiple
> files), reads it or them, and shows a bug.
>
> If you can't do that, then gradually add the rest of the stuff from these
> functions into the mix until you figure out what is really causing the
> bug.
>
> If you don't post code that allows us to reproduce the crash, it's really
> unlikely that we'll be able to fix it.
>
> Duncan Murdoch
>
>> This is happening on the Precise build of Ubuntu.
>>
>> I have included everything, but the issue appears to be when performing
>> the scan in the method parseTickData.
>>
>> Below is the code. Hopefully this is the right place to post.
>>
>> parseTickDataFromDir = function(tickerDir, per, subper, fun) {
>>   tickerAbsFilenames = list.files(tickerDir, full.names=T)
>>   tickerNames = list.files(tickerDir, full.names=F)
>>   tickerNames = gsub("_[a-zA-Z0-9].csv", "", tickerNames)
>>   pb <- txtProgressBar(min = 0, max = length(tickerAbsFilenames), style = 3)
>>
>>   for (i in 1:length(tickerAbsFilenames)) {
>>
>>     # Grab raw tick data
>>     dat.i = parseTickData(tickerAbsFilenames[i])
>>     #Sys.sleep(1)
>>
>>     # Create template
>>     dates <- unique(substr(as.character(index(dat.i)), 1, 10))
>>     times <- rep("09:30:00", length(dates))
>>     openDateTimes <- strptime(paste(dates, times), "%F %H:%M:%S")
>>     templateTimes <- NULL
>>
>>     for (j in 1:length(openDateTimes)) {
>>       if (is.null(templateTimes)) {
>>         templateTimes <- openDateTimes[j] + 0:23400
>>       } else {
>>         templateTimes <- c(templateTimes, openDateTimes[j] + 0:23400)
>>       }
>>     }
>>
>>     # Convert templateTimes to xts, merge with data and convert NA's
>>     templateTimes <- as.xts(templateTimes)
>>     dat.i <- merge(dat.i, templateTimes, all=T)
>>
>>     # If there is no data in the first print, we will have leading NA's.
>>     # So set them to -1, since we do not want these values removed by
>>     # to.period.
>>     if (is.na(dat.i[1])) {
>>       dat.i[1] <- -1
>>     }
>>
>>     # Fix remaining NA's
>>     dat.i <- na.locf(dat.i)
>>
>>     # Convert to desired bucket size
>>     dat.i <- to.period(dat.i, period=per, k=subper, name=NULL)
>>
>>     # Always use the templated index, otherwise the merge with other
>>     # symbols fails
>>     index(dat.i) <- index(to.period(templateTimes, period=per, k=subper))
>>
>>     # If there was missing data at the open, set close to NA
>>     valsToChange <- which(dat.i[,"Open"] == -1)
>>     if (length(valsToChange) != 0) {
>>       dat.i[valsToChange, "Close"] <- NA
>>     }
>>
>>     if (i == 1) {
>>       DAT = fun(dat.i)
>>     } else {
>>       DAT = merge(DAT, fun(dat.i))
>>     }
>>     setTxtProgressBar(pb, i)
>>   }
>>   close(pb)
>>   colnames(DAT) = tickerNames
>>   return(DAT)
>> }
>>
>> parseTickData <- function(inputFile) {
>>   DAT.list <- scan(file=inputFile, sep=",", skip=1,
>>                    what=list(Date="", Time="", Close=0, Volume=0), quiet=T)
>>   index <- as.POSIXct(paste(DAT.list$Date, DAT.list$Time),
>>                       format="%m/%d/%Y %H:%M:%S")
>>   DAT.xts <- xts(DAT.list$Close, index)
>>   DAT.xts <- make.index.unique(DAT.xts)
>>   return(DAT.xts)
>> }

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
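Duncan's suggestion -- code that creates the files, reads them, and shows
the bug -- could start from a sketch like the one below, which fabricates
a directory of tick files in the "Date,Time,Close,Volume" layout that
parseTickData() expects. Every file name, row count, and price here is
invented, and it is untested whether synthetic files trigger the crash.

  library(xts)

  ## Fabricated input directory (name is arbitrary).
  tickerDirSecond <- file.path(tempdir(), "ticks")
  dir.create(tickerDirSecond, showWarnings = FALSE)

  ## Ten consecutive "trading days" of one-second timestamps,
  ## roughly the reported 225k lines per file.
  days <- as.POSIXct("2012-07-02 09:30:00") + 86400 * (0:9)
  tm   <- rep(days, each = 23401) + rep(0:23400, times = length(days))
  n    <- length(tm)

  for (f in 1:12) {   # the crash was reported on the 10th file
    write.csv(data.frame(Date   = format(tm, "%m/%d/%Y"),
                         Time   = format(tm, "%H:%M:%S"),
                         Close  = round(100 + cumsum(rnorm(n)), 2),
                         Volume = sample(1:1000, n, replace = TRUE)),
              file = file.path(tickerDirSecond,
                               sprintf("SYM%02d_a.csv", f)),
              row.names = FALSE, quote = FALSE)
  }

  ## Then run the posted code unchanged:
  ## DATTick <- parseTickDataFromDir(tickerDirSecond, "seconds", 10)

If the posted code crashes on these synthetic files, the thread has its
reproducible example; if it doesn't, the trigger lies in the real data.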