Here is an example of some code that might do it for you. Note that the record "11 WF-100" carries no description text, so its value ends up in a column named NA in the output:

> input <- readLines(textConnection("19 c:/data/WF-100/2008/20080911/trk/20080911.013115.007.17.txt
+ 10 s name of program that wrote this file trkplt name of program that wrote this file
+ 10 GORDON machine that generated this file machine that generated this file
+ 10 3.7 version of program
+ 10 3.6 version of this data file
+ 10 5.81 version of Universal Library
+ 10 20081121.145730 when this file was written
+ 10 Windows_XP operating system used operating system used
+ *
+ * radar characteristics
+ 11 WF-100
+ 11 20000000 A/D rate, samples/second
+ 11 7.5 bin width, m
+ 11 800 nominal PRF, Hz
+ 11 0.25 nominal pulse width, microsec
+ 11 0 tuning, volts
+ 11 3.19779 nominal wave length, cm"))
> closeAllConnections()
>
> # parse out the data: split each line into record code, value, description
> f.parse <- function(line){
+     x <- sub("^(\\S+)\\s+(\\S+)\\s*(.*)", "\\1`\\2`\\3", line)
+     unlist(strsplit(x, "`"))
+ }
>
> fileName <- ''
> result <- NULL
> for (i in input){
+     values <- f.parse(i)
+     switch(values[1],
+         '19'={fileName <- values[2]},  # record 19 holds the file name
+         '*'=NULL,  # ignore comments
+         '10'=,     # falls through to the '11' branch
+         '11'={result <- rbind(result, c(fileName, values[3], values[2]))}
+     )
+ }
> # convert to a data frame in the long ('molten') form that cast() expects
> result <- as.data.frame(result, stringsAsFactors=FALSE)
> names(result) <- c('fileName', 'variable', 'value')
> require(reshape)
> cast(result, fileName ~ variable, c)
                                                     fileName A/D rate, samples/second bin width, m
1 c:/data/WF-100/2008/20080911/trk/20080911.013115.007.17.txt                 20000000          7.5
  machine that generated this file machine that generated this file
1                                                            GORDON
  name of program that wrote this file trkplt name of program that wrote this file nominal PRF, Hz
1                                                                                 s             800
  nominal pulse width, microsec nominal wave length, cm operating system used operating system used
1                          0.25                 3.19779                                  Windows_XP
  tuning, volts version of program version of this data file version of Universal Library
1             0                3.7                       3.6                         5.81
  when this file was written     NA
1            20081121.145730 WF-100
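To get from one parsed header to the one-row-per-file table asked for further down the thread, the same parsing idea could be wrapped in a function and applied to every file in a directory. What follows is only a rough sketch under several assumptions: parse_lines(), wanted, and data_dir are hypothetical names, the column names are taken from the .csv example in the quoted message, and the files are assumed to end in ".txt" and to carry each description only once. Since the sample shows only the first half of a file, the remaining records may need extra handling.

```r
# Map output column names (from the desired CSV header) to the description
# text that follows each value in the data files.
wanted <- c(date_written  = "when this file was written",
            program_ver   = "version of program",
            data_file_ver = "version of this data file",
            bin_width     = "bin width, m")

# parse_lines() is a hypothetical helper: one file's lines in, one row out.
parse_lines <- function(lines, wanted) {
  # Each record is "<code> <value> <description>"; bare "*" lines do not match.
  pattern <- "^(\\S+)\\s+(\\S+)\\s*(.*)$"
  recs  <- lines[grepl(pattern, lines)]
  code  <- sub(pattern, "\\1", recs)
  value <- sub(pattern, "\\2", recs)
  desc  <- sub("\\s+$", "", sub(pattern, "\\3", recs))

  path <- value[code == "19"][1]        # record 19 holds the original path
  vals <- value[match(wanted, desc)]    # NA when a description is absent
  names(vals) <- names(wanted)
  data.frame(File_name = basename(path), as.list(vals),
             stringsAsFactors = FALSE)
}

# Demo: write a shortened copy of the sample header to a temporary directory.
sample_lines <- c(
  "19 c:/data/WF-100/2008/20080911/trk/20080911.013115.007.17.txt",
  "10 3.7 version of program",
  "10 3.6 version of this data file",
  "10 20081121.145730 when this file was written",
  "*",
  "* radar characteristics",
  "11 WF-100",
  "11 7.5 bin width, m")
data_dir <- tempfile("radar")           # stands in for the real data directory
dir.create(data_dir)
writeLines(sample_lines, file.path(data_dir, "sample.txt"))

# Parse every .txt file and stack the rows: one row per data file.
files  <- list.files(data_dir, pattern = "\\.txt$", full.names = TRUE)
rows   <- lapply(files, function(f) parse_lines(readLines(f), wanted))
result <- do.call(rbind, rows)
write.csv(result, file.path(data_dir, "summary.csv"), row.names = FALSE)
print(result)
```

All values come back as character strings, so a column would need converting, for example with as.numeric(result$bin_width), before taking mean/min/max or sorting numerically.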

On Wed, Dec 17, 2008 at 12:21 PM, Titan8883 <jpla...@gmail.com> wrote:
>
> The output I would be looking for would be one row for each data file with
> columns for each variable, so using a .csv example with a few variables
> would be:
> -------------------------------------------------------------------------
> File_name,date_written,program_ver,data_file_ver,bin_width
> 20080911.013115.007.17.txt, 20081121.145730,3.7,3.6,7.5
> --------------------------------------------------------------------------
> My plan is to create a table with all the data files listed. This would
> allow me to find mean/min/max values for different variables, sort by a
> certain variable, etc. I am not limiting myself to R, I have seen awk
> mentioned before, so that sounds like it is worth looking at to prep the
> data.
>
> Hope that helps.
>
> jholtman wrote:
>>
>> It would be helpful if you could show what the output would be for the
>> example given. Exactly what are 'values' and what would be the
>> 'headings'. As mentioned before, you can use readLines and then parse
>> the data you want, but something like Perl might be easier, but it is
>> hard to tell from the mail.
>>
>> On Wed, Dec 17, 2008 at 2:37 PM, Titan8883 <jpla...@gmail.com> wrote:
>>>
>>> Hi all,
>>>
>>> I am a new graduate student who is also new to R. I am ok with the
>>> basics, but the problem I am having right now seems beyond what I can
>>> do..so I am looking for advice. I am trying to pull data from flat
>>> ASCII files, but they do not have a "nice" structure so a simple
>>> "read.table" doesn't work.
>>> An example first half of a data file is below:
>>> ----------------------------------------------------------------------------------------------
>>> 19 c:/data/WF-100/2008/20080911/trk/20080911.013115.007.17.txt
>>> 10 s name of program that wrote this file trkplt name of program that wrote this file
>>> 10 GORDON machine that generated this file machine that generated this file
>>> 10 3.7 version of program
>>> 10 3.6 version of this data file
>>> 10 5.81 version of Universal Library
>>> 10 20081121.145730 when this file was written
>>> 10 Windows_XP operating system used operating system used
>>> *
>>> * radar characteristics
>>> 11 WF-100
>>> 11 20000000 A/D rate, samples/second
>>> 11 7.5 bin width, m
>>> 11 800 nominal PRF, Hz
>>> 11 0.25 nominal pulse width, microsec
>>> 11 0 tuning, volts
>>> 11 3.19779 nominal wave length, cm
>>> -----------------------------------------------------------------------------------------------
>>> ..the file goes on from there...
>>>
>>> How would I go about getting this data into some kind of useful format?
>>> This is one of about 1000 files I will need to go through. I would
>>> ideally like to get these into a format with each data file as a row
>>> with columns for the various values with the description text removed
>>> (version of program, file version, tuning volts, etc...).
>>>
>>> I'm not looking for a cut and paste answer, but perhaps some direction
>>> on where I should start. I have only done basic .csv, table, and line
>>> inputs up until now.
>>>
>>> Thanks for any advice
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Trouble-pulling-data-from-a-messy-ASCII-file...-tp21059239p21059239.html
>>> Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.