[R] trouble with Vista & reading files

Mike Williamson Thu, 20 Aug 2009 17:52:10 -0700

All,

    I am having trouble with a "read.table()" call that sits inside
another function.  If I make the same call by itself, it works fine.
Moreover, if I run the script on Mac OS X (with the default Mac OS X
version of R installed, rev. 2.8), it works fine.  But it does not work if I
run it on Windows Vista (also the default Windows version of R, rev. 2.8).


    Again, both calls shown below work fine on the Mac, but only the call by
itself works in Vista.  The call embedded in a function does not.

                            Thanks in advance for all the help!!
                                              Regards, Mike

    Below are the calls:

#############################################
Below is the call which DOES work, as long as it is called by itself.
##############################################



  eTestData <-
    read.table("C:/Users/<userID>/Documents/R/eTestDataDir/EtestExample.csv",
               header = TRUE, as.is = TRUE)


#############################################
Below is the call which does NOT work.  The problem call is the first
read.table() inside the function, the one that reads dataFiles[1].
What is especially strange is that the list.files() call a few lines above
it, which asks for all the files in the same directory, works fine.  So I
suspect it is some sort of permissions thing.
##############################################
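
One way to narrow this down is to look at what R itself sees for the
directory before anything is read from it.  The sketch below is only an
assumption about where to look, not a diagnosis: check_data_dir() is a
hypothetical helper that is not part of the function in this post, and the
directory and file it creates under tempdir() are made up for illustration.
It shows that list.files() returns an empty vector without any error when the
path does not exist, so a listing that "works" does not by itself prove that
the relative dataDir resolves against the working directory one expects.

```r
# Hypothetical helper (not from the original post): summarise what R sees
# for a directory before any file inside it is read.
check_data_dir <- function(dataDir) {
  files <- list.files(path = dataDir)
  firstPath <- if (length(files) > 0) file.path(dataDir, files[1]) else NA_character_
  list(
    workingDir = getwd(),              # relative paths are resolved against this
    dirExists  = file.exists(dataDir),
    nFiles     = length(files),
    firstPath  = firstPath,
    # file.access() returns 0 when the requested access (mode 4 = read) is allowed
    readable   = if (is.na(firstPath)) NA else unname(file.access(firstPath, mode = 4) == 0)
  )
}

# A directory that does not exist: list.files() still returns quietly.
absent <- check_data_dir(file.path(tempdir(), "no-such-dir"))

# A directory that does exist, holding one small made-up data file.
demoDir <- file.path(tempdir(), "raw.etest.data")
dir.create(demoDir, showWarnings = FALSE)
writeLines(c("Wafer Site", "1 1"), file.path(demoDir, "CORR682..18524"))
present <- check_data_dir(demoDir)

print(absent$nFiles)     # number of files seen in the missing directory
print(present$firstPath) # full path that read.table() would be given
```

Calling check_data_dir("raw.etest.data") on both machines and comparing
workingDir, dirExists and nFiles should show whether the two systems are even
looking at the same place.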

cro.etest.grab <- function(dataDir="raw.etest.data", header="hdr",
                          dataHeaders="datasets/eTestDataHeaders.txt",
                          slotCol="Wafer", dateFormat="%m/%d/%Y %H:%M:%S",
                          lotCol="eTestLotID") {
### Function: grab data in its raw form from SVTC's HP electrical tester and
###           "munge" it into a format more friendly for analysis in R.
### Requires: dataDir     -- the directory where the raw SVTC data set is
###                          stored
###           header      -- the differentiation between the names of the
###                          data files and the header files.  E.g., if data
###                          file is "CORR682..18524" and header file is
###                          "CORR682.hdr.18524", then the header is "hdr".
###           dataHeaders -- Sometimes the labels for the data are missing,
###                          but they are NEARLY always the same.  If the
###                          labels are ever missing, this fills them in with
###                          the vector of headers given here.  E.g.,
###                          c("Wafer","Site","R2_ET1_M1",etc.)
###           slotCol     -- In the data file, typically column "Wafer" is
###                          actually the slot ID. This renames it to Slot.
###                          So, slotCol is the name IN THE RAW DATA.
###           lotCol      -- The data files have no lot ID column; the lot
###                          ID is grabbed from the file name. This provides
###                          a column header name for the lot ID.
###           dateFormat  -- The test data header file has eTest time,
###                          written in the format month/day/year
###                          hour:minute:sec.  If another format is being
###                          read, it can be altered here.
  dataHeaders <- read.table(dataHeaders, stringsAsFactors = FALSE)[,1]

  print(paste("dataDir:",dataDir,"    header:",header,"    slotCol:",slotCol,
              "    lotCol:",lotCol))
  allFiles <- list.files(path = dataDir)  # this directory listing works fine
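  tmp <- grep("hdr",allFiles,ignore.case = TRUE)
  ## allFiles[-tmp] returns nothing at all when tmp is empty (no file name
  ## contains "hdr"), so only drop entries when there is something to drop.
  dataFiles <- if (length(tmp) > 0) allFiles[-tmp] else allFiles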
  hdrFiles <- sub("\\.(.*)\\.","\\.hdr\\1\\.",dataFiles)
  ## The read.table() call below is the one that fails on Vista.
  eTestData <- read.table(paste(dataDir,"/",dataFiles[1],sep=""),
                          header = TRUE, as.is = TRUE)
  eTestData[,slotCol] <- as.character(eTestData[,slotCol])
  eTestData[,lotCol] <- rep(dataFiles[1],length(eTestData[,1]))
  tmp <- try(scan(paste(dataDir,"/", hdrFiles[1], sep=""),
                  what = "character", sep="\n", quiet=TRUE), silent=TRUE)
  if (is.null(attr(tmp,"class"))) {
    dateCols <- grep("[0-9][0-9]/[0-9][0-9]/20[01][0-9]",tmp)
    hdrDF <- data.frame(tmp[(dateCols-1)],tmp[dateCols],
                        stringsAsFactors=FALSE)
    hdrDF$LotDate <- rep(hdrDF[1,2],length(hdrDF[,1])) ; hdrDF <- hdrDF[-1,]
    hdrDF[1,1] <- tmp[(dateCols-2)][1]
    names(hdrDF) <- c(slotCol,"Date","LotDate")
    hdrDF[,slotCol] <- substring(hdrDF[,slotCol],
                                 (regexpr("=",hdrDF[,slotCol])+2),
                                 nchar(hdrDF[,slotCol]))
    if (any(nchar(hdrDF[,slotCol])==0)) {
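      ## Only the first file is handled here, so the index is 1 (the loop
      ## variable i does not exist yet at this point).
      print(paste("Header file",hdrFiles[1],
                  "has no wafer information.  Headers will not be included."))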
    } else {
      otherCols <- tmp[grep("=",tmp)]
      otherCols <- otherCols[-grep("WAFER", otherCols, ignore.case=TRUE)]
      otherData <- substring(otherCols,(regexpr("=",otherCols)+2),
                             nchar(otherCols))
      otherCols <- substring(otherCols,1,(regexpr("=",otherCols)-2))
      otherData <- as.data.frame(matrix(rep(otherData,length(hdrDF[,1])),
                                        ncol=length(otherCols),
                                        byrow=TRUE),stringsAsFactors=FALSE)
      names(otherData) <- otherCols ; hdrDF <- cbind(hdrDF,otherData)
      hdrDF$Date <- as.POSIXct(hdrDF$Date,format=dateFormat)
      hdrDF$LotDate <- as.POSIXct(hdrDF$LotDate,format=dateFormat)
      hdrDF$TestTime <- NA
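      ## 1:length(x)-1 parses as (1:length(x))-1, i.e. it starts at 0, so
      ## the range 1..(n-1) is spelled out explicitly.
      for (j in seq_len(max(nrow(hdrDF) - 1, 0))) {
        hdrDF$TestTime[j] <- hdrDF$Date[(j+1)]-hdrDF$Date[j]
      }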
      eTestData <- merge(hdrDF,eTestData,by=slotCol)
    }
  }
  if (length(dataFiles) > 1) {
    for (i in c(2:length(dataFiles))) { # i <- 48
#      print(paste("I am at file #",i,", named",dataFiles[i]))
      tmp <- read.table(paste(dataDir,"/",dataFiles[i],sep=""),
                        header = TRUE, as.is = TRUE)
      if (names(tmp)[1] != "Wafer") {
        if (length(names(tmp)) == length(dataHeaders)) {
          print(paste("File",dataFiles[i],"is missing header information."))
          print("      'Typical' headers will be used.")
          tmp <- read.table(paste(dataDir,"/",dataFiles[i],sep=""),
                            header = FALSE, as.is = TRUE)
          names(tmp) <- dataHeaders
        } else {
          print(paste("File",dataFiles[i],
                      "is missing header information and"))
          print("   has a non-standard set of data.  It will not be included.")
          next
        } # end of "if" whether the num of cols of the dataset is standard
      } # end of "if" whether the data headers are missing
      tmp[,slotCol] <- as.character(tmp[,slotCol])
      tmp[,lotCol] <- rep(dataFiles[i],length(tmp[,1]))
###      if (useHeaders) {
      tmp2 <- try(scan(paste(dataDir,"/",hdrFiles[i],sep=""),
                       what = "character",sep="\n", quiet=TRUE),silent=TRUE)
      if (is.null(attr(tmp2,"class"))) {
        dateCols <- grep("[0-9][0-9]/[0-9][0-9]/20[01][0-9]",tmp2)
        hdrDF <- data.frame(tmp2[(dateCols-1)],tmp2[dateCols],
                            stringsAsFactors=FALSE)
        hdrDF$LotDate <- rep(hdrDF[1,2],length(hdrDF[,1]))
        flag <- 1 # This is the flag to warn that the headers are not correct
        ## The check below is to ensure there are full headers
        if ((dateCols[2] - dateCols[1]) > 2) {
          hdrDF <- hdrDF[-1,]
          hdrDF[1,1] <- tmp2[(dateCols-2)][1]
          flag <- 0 # The flag is turned off if the headers are correct
        }
        names(hdrDF) <- c(slotCol,"Date","LotDate")
        hdrDF[,slotCol] <- substring(hdrDF[,slotCol],
                                     (regexpr("=",hdrDF[,slotCol])+2),
                                     nchar(hdrDF[,slotCol]))
        if (flag) {
          print(paste("Header file",hdrFiles[i],
                      "has no wafer information."))
          print("      Headers will not be included.")
        } else {
          otherCols <- tmp2[grep("=",tmp2)]
          otherCols <- otherCols[-grep("WAFER",otherCols,ignore.case=TRUE)]
          otherData <- substring(otherCols,(regexpr("=",otherCols)+2),
                                 nchar(otherCols))
          otherCols <- substring(otherCols,1,(regexpr("=",otherCols)-2))
          otherData <- as.data.frame(matrix(rep(otherData,length(hdrDF[,1])),
                                            ncol=length(otherCols),
                                            byrow=TRUE),
                                     stringsAsFactors=FALSE)
          names(otherData) <- otherCols
          hdrDF <- cbind(hdrDF,otherData)
        } # end the "if" if it was flagged for bad headers
        hdrDF$Date <- as.POSIXct(hdrDF$Date,format=dateFormat)
        hdrDF$LotDate <- as.POSIXct(hdrDF$LotDate,format=dateFormat)
        ## In calculating "TestTime" below, I am assuming that the "Date" is
        ## the time the wafer BEGAN processing.  E.g., therefore I will not
        ## know the test time for the last wafer in the lot.
        hdrDF$TestTime <- NA
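        ## 1:length(x)-1 parses as (1:length(x))-1, i.e. it starts at 0, so
        ## the range 1..(n-1) is spelled out explicitly.
        for (j in seq_len(max(nrow(hdrDF) - 1, 0))) {
          hdrDF$TestTime[j] <- hdrDF$Date[(j+1)]-hdrDF$Date[j]
        }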
        tmp <- merge(hdrDF,tmp,by=slotCol)
      } # end the "if" in case the header file was missing
      eTestData <- merge(tmp,eTestData,all = TRUE)
    } # end of the loop through all data files
  } # end of the "if" statement to see if there is more than 1 data file
  eTestData <- eTestData[order(eTestData$Date),]
  names(eTestData)[grep(slotCol,names(eTestData))] <- "Slot"
  lotCol2 <- which(names(eTestData) == lotCol)
  eTestData <- eTestData[,c(lotCol2,setdiff(c(1:length(eTestData)),lotCol2))]
  eTestData
}
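
The file-name handling near the top of the function can be tried out on its
own, without touching the disk.  The sketch below uses made-up file names
that follow the "CORR682..18524" / "CORR682.hdr.18524" convention described
in the comments; the variable names isHeader, onlyData, viaIndex and viaMask
are introduced here for illustration only.  It derives the header file names
the same way the function does, and shows how negative indexing with an
empty grep() result behaves compared with a logical mask.

```r
# Made-up file names following the naming convention in the comments above.
allFiles <- c("CORR682..18524", "CORR682.hdr.18524", "CORR683..18530")

# Separate data files from header files with a logical mask.
isHeader  <- grepl("hdr", allFiles, ignore.case = TRUE)
dataFiles <- allFiles[!isHeader]

# Build each header file name by inserting "hdr" between the two dots.
hdrFiles <- sub("\\.(.*)\\.", ".hdr\\1.", dataFiles)

# When the directory holds no header files at all, grep() returns an empty
# vector, and x[-integer(0)] selects nothing instead of everything.
onlyData <- c("CORR682..18524", "CORR683..18530")
viaIndex <- onlyData[-grep("hdr", onlyData, ignore.case = TRUE)]
viaMask  <- onlyData[!grepl("hdr", onlyData, ignore.case = TRUE)]

print(hdrFiles)
print(length(viaIndex))
print(viaMask)
```

If dataFiles ever comes back empty, dataFiles[1] is NA and the path handed
to read.table() ends in "/NA", which would fail regardless of permissions.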

