Can you at least provide a sample of two files so we can see how the
data is really stored and what the separators are between the
'columns' of data?  Also, how do you determine where the data
actually starts for the rows that you want to extract?  This will help
in working out how to parse the data.
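
In the meantime, here is a rough sketch in R based only on the excerpt
quoted below, so treat it as a starting point rather than a finished
answer. It assumes that every forecast row starts with ".E<n> :MMDD:",
that the text after the date is fixed-width (4 characters, then up to
four 10-character values each followed by "/"), and that the lines are
read with readLines() so the spacing is preserved. The helper name
parse_forecast and the column labels are made up for illustration.

```r
# Hypothetical helper: pull the ".E<n> :MMDD:" rows out of one file's lines
# and return a numeric matrix with one row per date.
# Assumption: fixed-width layout as in the two lines printed in the question.
parse_forecast <- function(lines) {
  # A forecast row looks like ".E2 :0103:   /       3.5/       3.0/ ..."
  pat <- "^[[:space:]]*\\.E[0-9]+[[:space:]]+:([0-9]{4}):"

  rows  <- grep(pat, lines, value = TRUE)       # detect the table rows
  dates <- sub(paste0(pat, ".*$"), "\\1", rows) # the MMDD between the colons
  rest  <- sub(pat, "", rows)                   # everything after ":MMDD:"

  # Value k occupies 10 characters starting at position 5 + 11 * (k - 1);
  # blank or missing fields become NA.
  vals <- vapply(1:4, function(k) {
    start <- 5 + 11 * (k - 1)
    as.numeric(trimws(substring(rest, start, start + 9)))
  }, numeric(length(rest)))

  matrix(vals, nrow = length(rest), ncol = 4,
         dimnames = list(dates, c("6AM", "NOON", "6PM", "MDNT")))
}

# Example with lines copied from the question (row numbers removed).
# For a real file one would use something like: lines <- readLines(path)
lines <- c(
  ":LATEST STAGE     3.60 FT AT 730 AM CST ON 0102",
  ".ER ARCT2    0102 C DC200001020813/DH12/HGIFF/DIH6",
  ":QPF FORECAST        6AM       NOON        6PM       MDNT",
  ".E1 :0102:              /       3.5/       3.4/       3.5",
  ".E2 :0103:   /       3.5/       3.0/       2.5/       2.1",
  ".E6 :0107:   /       3.4"
)
m <- parse_forecast(lines)
print(m)
```

Because the rows are found by pattern rather than by line number, it
should not matter that the table starts on a different line in each
file. If the real files are not fixed-width, the substring() step would
have to be replaced by something else, which is why the sample files
would still help.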

On Sun, Mar 11, 2012 at 3:07 PM, frauke <fh...@andrew.cmu.edu> wrote:
> Dear R community,
>
> I have the following problem I hoped you could help me with.
>
> My data is saved in thousands of files with a weird extension consisting of
> four digits and a z, for example *.1405z. With list.files I managed to load
> this data into R. It looks like this (the row numbers are not in the original
> file):
>
> 35  :LATEST STAGE     3.60 FT AT 730 AM CST ON 0102
> 36  .ER ARCT2    0102 C DC200001020813/DH12/HGIFF/DIH6
> 37  :QPF FORECAST        6AM       NOON        6PM       MDNT
> 38  .E1 :0102:              /       3.5/       3.4/       3.5
> 39  .E2 :0103:   /       3.5/       3.0/       2.5/       2.1
> 40  .E3 :0104:   /       1.8/       1.5/       1.3/       1.2
> 41  .E4 :0105:   /       1.2/       1.8/       2.3/       2.7
> 42  .E5 :0106:   /       3.0/       3.0/       3.1/       3.3
> 43  .E6 :0107:   /       3.4
>
> I need the table in rows 37 to 43 in a matrix, for example:
> 0102     NA     3.5    3.4    3.5
> 0103     3.5    3.0    2.5    2.1
> 0104     1.8    1.5    1.3    1.2
> 0105     1.2    1.8    2.3    2.7
> 0106     3.0    3.0    3.1    3.3
> 0107     3.4    NA     NA     NA
>
> Unfortunately the row numbers vary per file. I can call up each line with,
> for example, file[40,1] for line 40. It returns:
> [1] .E3 :0104:   /       1.8/       1.5/       1.3/       1.2
> 38 Levels: .E1 :0102:              /       3.5/       3.4/       3.5 ...
>
> So I really have two problems:
> 1. How do I detect the table in the file (that is, the line where the table
> starts)?
> 2. How do I break up each line to write the values into a matrix?
>
> Feel free to suggest an entirely different approach if you think that is
> helpful.
>
> Thanks a lot! Frauke
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/extracting-data-from-unstructured-text-file-tp4464423p4464423.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
