Hi Frauke,
Try unix commands with R's system() function.
Example:
Let's say you have a matrix like this in a file called hello.txt (note: the
first element is missing):
10 100
2 20 200
3 30 300
4 40 400
5 50 500
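You can try something like:
hello <- system("cut -f1 hello.txt", intern = TRUE)
With intern = TRUE the output comes back as a character vector, one element
per line. Note that cut splits on tabs by default; for a space-separated file,
add -d' '.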
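If you would rather stay inside R, here is a rough sketch for the layout in your message, not a tested solution for your real files. It assumes that table rows always start with ".E" plus a number, that the date sits between colons, and that a short first row is missing its leading values while any other short row is missing its trailing values. That last point is only a guess based on the matrix you want. The names pad and n_col are made up for illustration, and the lines vector stands in for readLines() on one of your files.

```r
# Example lines copied from the message; in practice: lines <- readLines(path)
lines <- c(
  ":LATEST STAGE 3.60 FT AT 730 AM CST ON 0102",
  ".ER ARCT2 0102 C DC200001020813/DH12/HGIFF/DIH6",
  ":QPF FORECAST 6AM NOON 6PM MDNT",
  ".E1 :0102: / 3.5/ 3.4/ 3.5",
  ".E2 :0103: / 3.5/ 3.0/ 2.5/ 2.1",
  ".E3 :0104: / 1.8/ 1.5/ 1.3/ 1.2",
  ".E4 :0105: / 1.2/ 1.8/ 2.3/ 2.7",
  ".E5 :0106: / 3.0/ 3.0/ 3.1/ 3.3",
  ".E6 :0107: / 3.4"
)

# 1. Detect the table: keep the lines that start with ".E" and a number.
rows <- grep("^\\.E[0-9]+", lines, value = TRUE)

# 2. Break up each line: the date is between the colons, and the values
#    follow the first "/" and are separated by "/".
dates <- sub("^\\.E[0-9]+ +:([0-9]+):.*$", "\\1", rows)
vals <- lapply(strsplit(sub("^[^/]*/", "", rows), "/"), as.numeric)

# Hypothetical helper: pad a short row with NA up to n values.
# ASSUMPTION: only the first row is padded at the front.
pad <- function(x, n, front = FALSE) {
  fill <- rep(NA_real_, n - length(x))
  if (front) c(fill, x) else c(x, fill)
}

n_col <- 4  # 6AM, NOON, 6PM, MDNT
m <- t(mapply(pad, vals, n_col, front = seq_along(vals) == 1))
dimnames(m) <- list(dates, c("6AM", "NOON", "6PM", "MDNT"))
print(m)
```

Because grep() looks at the content of each line, it does not matter that the row numbers differ from file to file. To process all files, you could wrap this in a function and call it with lapply() over the result of list.files().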
VP.
On 11 March 2012 19:07, frauke <[email protected]> wrote:
> Dear R community,
>
> I have the following problem I hoped you could help me with.
>
> My data is saved in thousands of files with a weird extension containing four
> digits and a z, for example *.1405z. With list.files I managed to load this
> data into R. It looks like this (the row numbers are not in the original
> file):
>
> 35 :LATEST STAGE 3.60 FT AT 730 AM CST ON 0102
> 36 .ER ARCT2 0102 C DC200001020813/DH12/HGIFF/DIH6
> 37 :QPF FORECAST 6AM NOON 6PM MDNT
> 38 .E1 :0102: / 3.5/ 3.4/ 3.5
> 39 .E2 :0103: / 3.5/ 3.0/ 2.5/ 2.1
> 40 .E3 :0104: / 1.8/ 1.5/ 1.3/ 1.2
> 41 .E4 :0105: / 1.2/ 1.8/ 2.3/ 2.7
> 42 .E5 :0106: / 3.0/ 3.0/ 3.1/ 3.3
> 43 .E6 :0107: / 3.4
>
> I need the table in rows 37 to 43 in a matrix, for example:
> 0102 NA 3.5 3.4 3.5
> 0103 3.5 3.0 2.5 2.1
> 0104 1.8 1.5 1.3 1.2
> 0105 1.2 1.8 2.3 2.7
> 0106 3.0 3.0 3.1 3.3
> 0107 3.4 NA NA NA
>
> Unfortunately the row numbers vary per file. I can call up each line with
> file[40,1] for line 40 for example. It returns:
> [1] .E3 :0104: / 1.8/ 1.5/ 1.3/ 1.2
> 38 Levels: .E1 :0102: / 3.5/ 3.4/ 3.5 ...
>
> So I have two problems really:
> 1. How do I detect the table in the file (that is, the line where the table
> starts)?
> 2. How do I break up each line to write the values into a matrix?
>
> Feel free to suggest an entirely different approach if you think that is
> helpful.
>
> Thanks a lot! Frauke
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/extracting-data-from-unstructured-text-file-tp4464423p4464423.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>