On Oct 3, 2010, at 9:40 PM, Nilza BARROS wrote:

Hi, Michael
Thank you for your help. I have already done what you said.
But I am still having trouble handling my data.

I need to split the data according to station.

I was able to identify where the station information starts using:

my.data<-file("d2010100100.txt",open="rt")
indata <- readLines(my.data, n=20000)
i<-grep("^[837]",indata)  #station number

That would give you the line numbers for any line that has an 8, _or_ a 3, _or_ a 7 as its first character. Was that your intent? My guess is that you did not really want the square brackets and should have used "^837".

?regex  # Paragraph starting "A character class .... "
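
To make the difference concrete, here is a small sketch. The vector x holds illustrative lines only (an assumption for the demo, not the real file), and class.hits and prefix.hits are names made up for this example:

```r
# Illustrative lines only; not read from the real file.
x <- c("83746  -43.25 -22.81      6  51",
       "700.0   3156    290    11.3   280.1   280.1 32",
       "300.0   9640    265    37.0   242.2   216.2 32",
       "1012.0  -9999    320     1.5   299.1   294.4 64")

# Character class: matches any line whose first character is 8, 3 or 7.
class.hits <- grep("^[837]", x)

# Literal prefix: matches only lines that begin with the characters "837".
prefix.hits <- grep("^837", x)
```

With these four lines the character class picks up the two level lines starting with 7 and 3 as well as the station line, while the literal prefix picks up the station line only.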

my.data2<-read.table("d2010100100.txt",fill=TRUE,nrows=20000)
stn<- my.data2$V1[i]

That would give you the first column values for the lines you earlier selected.


====

This does not look like what I would expect as a value for stn. Is that what you wanted us to think this was?

--
David.


2010 10 01 00
*82599  -35.25  -5.91     52   1
* 1008.0  -9999    115     3.1   298.6   294.6 64
2010 10 01 00
*83649  -40.28 -20.26      4  7*
1011.0  -9999      0     0.0   298.4   296.1 64
1000.0     96     40     5.7   297.9   295.1 32
 925.0    782    325     3.1   295.4   294.1 32
 850.0   1520    270     4.1   293.8   289.4 32
 700.0   3171    240     8.7   284.1   279.1 32
 500.0   5890    275     8.2   266.2   262.9 32
 400.0   7600    335     9.8   255.4   242.4 32
===========
As you can see in the data above, the marked line shows the number of levels (or
lines) for each station.
I need to capture these lines so that I can feed my database.
By the way, I didn't understand the regular expression you've used. I've
tried to run it but it did not work.
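
One possible way to split the file by station is sketched below. It assumes that every block starts with a date line of the form "YYYY MM DD HH", that the next line is the station line, and that the last number on the station line is the count of level lines that follow (this agrees with the 1 and 7 in the excerpt above, but it is an assumption about the full file). The helper read_block and the names clean, date.pos and blocks are hypothetical, and the stray "*" characters are simply dropped:

```r
# Sample modelled on the excerpt above.
indata <- c(
  "2010 10 01 00",
  "*82599  -35.25  -5.91     52   1",
  "* 1008.0  -9999    115     3.1   298.6   294.6 64",
  "2010 10 01 00",
  "*83649  -40.28 -20.26      4  7*",
  "1011.0  -9999      0     0.0   298.4   296.1 64",
  "1000.0     96     40     5.7   297.9   295.1 32",
  " 925.0    782    325     3.1   295.4   294.1 32",
  " 850.0   1520    270     4.1   293.8   289.4 32",
  " 700.0   3171    240     8.7   284.1   279.1 32",
  " 500.0   5890    275     8.2   266.2   262.9 32",
  " 400.0   7600    335     9.8   255.4   242.4 32"
)

# Drop every "*" and trim leading/trailing blanks.
clean <- trimws(gsub("*", "", indata, fixed = TRUE))

# Positions of the date lines, which open each station block.
date.pos <- grep("^[0-9]{4} [0-9]{2} [0-9]{2} [0-9]{2}$", clean)

# Hypothetical helper: parse the block whose date line is at position p.
read_block <- function(p) {
  hdr  <- scan(text = clean[p + 1], quiet = TRUE)  # station line as numbers
  nlev <- hdr[length(hdr)]                         # assumed: last field = level count
  lev  <- read.table(text = clean[p + 1 + seq_len(nlev)])
  list(date = gsub(" ", "", clean[p]), station = hdr[1],
       nlev = nlev, levels = lev)
}

blocks <- lapply(date.pos, read_block)
```

Each element of blocks then holds the date string, the station number and a data frame of that station's levels (columns V1 to V7, left unnamed here because the variable names are not given in the thread).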

Hope you can help me!
Best Regards,
Nilza





On Sun, Oct 3, 2010 at 2:18 AM, Michael Bedward
<michael.bedw...@gmail.com> wrote:

Hello Nilza,

If your file is small you can read it into a character vector like this:

indata <- readLines("foo.dat")

If your file is very big you can read it in batches like this...

MAXRECS <- 1000  # for example
fcon <- file("foo.dat", open="r")
indata <- readLines(fcon, n=MAXRECS)

The number of lines read will be given by length(indata).

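You can check whether the end of the file has been reached by testing
whether the last batch came back shorter than requested:
length(indata) < MAXRECS
(isIncomplete( fcon ) is not an end-of-file test; it reports whether the
last read on a non-blocking connection was incomplete.)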

If a leading "*" character is a flag for the start of a station data
block you can find this in the indata vector with grepl...

start.pos <- which(grepl("^\\s*\\*", indata))

When you're finished reading the file...
close(fcon)
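
Put together, the batch-reading steps above might look like the following sketch. The temporary file is a hypothetical stand-in for the real data file, and n.total and n.batches are counters added only for illustration:

```r
# Hypothetical stand-in for the real file: a temp file with 25 lines.
tmp <- tempfile(fileext = ".dat")
writeLines(sprintf("line %d", 1:25), tmp)

MAXRECS <- 10          # batch size, for example
fcon <- file(tmp, open = "r")
n.total <- 0           # lines seen so far
n.batches <- 0         # batches read so far

repeat {
  indata <- readLines(fcon, n = MAXRECS)
  if (length(indata) == 0) break      # nothing left to read
  n.total <- n.total + length(indata)
  n.batches <- n.batches + 1
  # ... process this batch of lines here ...
}

close(fcon)
unlink(tmp)            # remove the temporary file
```

Note that a station block can straddle two batches, so any per-station processing needs to carry the unfinished block over to the next iteration.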

Hope this helps,

Michael


On 3 October 2010 13:31, Nilza BARROS <nilzabar...@gmail.com> wrote:
Dear R-users,

I would like to know how I can read a file whose lines have different
lengths.
I need to read this file and create output to feed my database.
So after reading it I'll need to create output like this:

"INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20100910,837460,
39,390)"

I mean, each line should be read, but I don't know how to do this when the
lines have different lengths.
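
Once the values for a station have been parsed, statements of the kind shown above could be built with sprintf, which is vectorised over its arguments. This is only a sketch: the values below are made up for illustration, the mapping of columns to VAR1 and VAR2 is an assumption, and treating -9999 as a missing value to be written as NULL is a guess from the data. The helper sql_value is hypothetical:

```r
# Hypothetical parsed values for one station block.
date    <- "2010100100"
station <- 83746
levels  <- data.frame(V1 = c(1012.0, 1000.0),   # assumed to map to VAR1
                      V2 = c(-9999, 114))       # assumed to map to VAR2

# Hypothetical helper: write the assumed missing-value code -9999 as NULL.
sql_value <- function(v) ifelse(v == -9999, "NULL", as.character(v))

# One INSERT statement per level line.
sql <- sprintf("INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (%s,%d,%s,%s)",
               date, as.integer(station),
               sql_value(levels$V1), sql_value(levels$V2))
```

The resulting character vector has one statement per row of levels and can be written to a file with writeLines or passed to a database interface.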

I really appreciate any help.

Thanks.



==== Below is the file that should be read ====


*2010 10 01 00
83746  -43.25 -22.81      6  51*
1012.0  -9999    320     1.5   299.1   294.4 64
1000.0    114    250     4.1   298.4   294.8 32
925.0    797      0     0.0   293.6   292.9 32
850.0   1524    195     3.1   289.6   288.9 32
700.0   3156    290    11.3   280.1   280.1 32
500.0   5870    280    20.1   266.1   260.1 32
400.0   7570    265    23.7   256.6   222.7 32
300.0   9670    265    28.8   240.2   218.2 32
250.0  10920    280    27.3   230.2   220.2 32
200.0  12390    260    32.4   218.7   206.7 32
176.0  -9999    255    37.6 -9999.0 -9999.0  8
150.0  14180    245    35.5   205.1   196.1 32
100.0  16560    300    17.0   195.2   186.2 32
*2010 10 01 00
83768  -51.13 -23.33    569  41
* 1000.0     79  -9999 -9999.0 -9999.0 -9999.0 32
946.0  -9999    270     1.0   295.8   292.1 64
925.0    763     15     2.1   296.4   290.4 32
850.0   1497    175     3.6   290.8   288.4 32
700.0   3140    295     9.8   282.9   278.6 32
500.0   5840    285    23.7   267.1   232.1 32
400.0   7550    255    35.5   255.4   231.4 32
300.0   9640    265    37.0   242.2   216.2 32


Best Regards,

--
Warm regards,
Nilza Barros



David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
