Can you at least provide a sample of two files, so we can see how the data is really stored and what separates the 'columns'? Also, how do you determine where the rows you want to extract actually start? That will help in working out how to parse the data.
On Sun, Mar 11, 2012 at 3:07 PM, frauke <fh...@andrew.cmu.edu> wrote:
> Dear R community,
>
> I have the following problem that I hope you can help me with.
>
> My data is saved in thousands of files with a weird extension consisting of
> four numbers and a z, for example *.1405z. With list.files I managed to load
> this data into R. It looks like this (the row numbers are not in the original
> file):
>
> 35 :LATEST STAGE 3.60 FT AT 730 AM CST ON 0102
> 36 .ER ARCT2 0102 C DC200001020813/DH12/HGIFF/DIH6
> 37 :QPF FORECAST 6AM NOON 6PM MDNT
> 38 .E1 :0102: / 3.5/ 3.4/ 3.5
> 39 .E2 :0103: / 3.5/ 3.0/ 2.5/ 2.1
> 40 .E3 :0104: / 1.8/ 1.5/ 1.3/ 1.2
> 41 .E4 :0105: / 1.2/ 1.8/ 2.3/ 2.7
> 42 .E5 :0106: / 3.0/ 3.0/ 3.1/ 3.3
> 43 .E6 :0107: / 3.4
>
> I need the table in rows 37 to 43 in a matrix, for example:
> 0102 NA 3.5 3.4 3.5
> 0103 3.5 3.0 2.5 2.1
> 0104 1.8 1.5 1.3 1.2
> 0105 1.2 1.8 2.3 2.7
> 0106 3.0 3.0 3.1 3.3
> 0107 3.4 NA NA NA
>
> Unfortunately the row numbers vary per file. I can call up each line with
> file[40,1], for line 40 for example. It returns:
> [1] .E3 :0104: / 1.8/ 1.5/ 1.3/ 1.2
> 38 Levels: .E1 :0102: / 3.5/ 3.4/ 3.5 ...
>
> So I have two problems really:
> 1. How do I detect the table in the file (that is, the line where the table
> starts)?
> 2. How do I break up each line to write the values into a matrix?
>
> Feel free to suggest an entirely different approach if you think that is
> helpful.
>
> Thanks a lot! Frauke
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/extracting-data-from-unstructured-text-file-tp4464423p4464423.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
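Until real sample files are available, here is a rough sketch in R of one way the table in the quoted message might be extracted. It is only a guess based on the nine lines shown, and it rests on several assumptions that the actual files may not satisfy: that table rows are exactly the lines starting with `.E` followed by digits, that the date is the four digits between colons, that values are separated by `/`, that a short first row is missing its earliest time slots, and that any other short row is missing its latest ones. The names `parse_values` and `pad` are made up for this illustration.

```r
# Sample lines copied from the post; a real file would be read with readLines().
lines <- c(
  ":LATEST STAGE 3.60 FT AT 730 AM CST ON 0102",
  ".ER ARCT2 0102 C DC200001020813/DH12/HGIFF/DIH6",
  ":QPF FORECAST 6AM NOON 6PM MDNT",
  ".E1 :0102: / 3.5/ 3.4/ 3.5",
  ".E2 :0103: / 3.5/ 3.0/ 2.5/ 2.1",
  ".E3 :0104: / 1.8/ 1.5/ 1.3/ 1.2",
  ".E4 :0105: / 1.2/ 1.8/ 2.3/ 2.7",
  ".E5 :0106: / 3.0/ 3.0/ 3.1/ 3.3",
  ".E6 :0107: / 3.4"
)

# Assumption: table rows are exactly the lines that start with ".E<digits>".
rows <- grep("^\\.E[0-9]+", lines, value = TRUE)

# Assumption: the date is the four digits between the colons.
dates <- sub("^\\.E[0-9]+ +:([0-9]{4}):.*$", "\\1", rows)

# Hypothetical helper: drop the ".En :mmdd:" prefix, split the rest on "/",
# and keep the non-empty fields as numbers.
parse_values <- function(x) {
  rest <- sub("^\\.E[0-9]+ +:[0-9]{4}:", "", x)
  fields <- trimws(strsplit(rest, "/", fixed = TRUE)[[1]])
  as.numeric(fields[nzchar(fields)])
}
values <- lapply(rows, parse_values)

# Hypothetical helper: pad a short row with NA up to n values.
# Assumption: the first row is padded on the left, all others on the right.
pad <- function(v, n = 4, left = FALSE) {
  fill <- rep(NA_real_, n - length(v))
  if (left) c(fill, v) else c(v, fill)
}
mat <- t(mapply(pad, values, left = seq_along(values) == 1))
dimnames(mat) <- list(dates, c("6AM", "NOON", "6PM", "MDNT"))
mat
```

Matching on the `.E` prefix rather than on line numbers sidesteps the problem of the table starting at a different line in each file. If a row ever carries more than four values, `pad` will fail, which is one more reason to check this against real files before relying on it.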
--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.