Dear Gabor and Jim,
You both gave me amazing solutions.
I will use them.
Thanks!
Nilza

On Tue, Oct 5, 2010 at 2:11 AM, Gabor Grothendieck
<ggrothendi...@gmail.com> wrote:

> On Sat, Oct 2, 2010 at 11:31 PM, Nilza BARROS <nilzabar...@gmail.com>
> wrote:
> > Dear R-users,
> >
> > I would like to know how I can read a file whose lines have different
> > lengths. I need to read this file and create output to feed my database.
> > After reading it, I'll need to create output like this:
> >
> > "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20100910,837460, 39,390)"
> >
>
> Read the data, filling the short lines (i.e. the date and station
> lines) with NAs.  Replace the *s with spaces and count the non-NA
> values in each row (cnt).  Append a column, group, which is 1 for lines
> pertaining to the 1st station, 2 for the 2nd, etc.  Then merge it all
> together into one big data frame, All, and generate a vector of SQL
> strings:
>
> DF <- read.table("d2010100100.txt", fill = TRUE)
> DF[] <- lapply(DF, function(x) as.numeric(chartr("*", " ", x)))
> cnt <- rowSums(!is.na(DF))
> DF$group <- cumsum(cnt == 4)
> Merge <- function(x, y) merge(x, y, by = "group")
> All <- Reduce(Merge, split(DF, cnt))
> fmt <- paste("INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES",
>              "(%04d%02d%02d, %d, %d, %d)")
> with(All, sprintf(fmt, V1.x, V2.x, V3.x, V1.y, V1, V2))
>
> The result looks like this:
>
> [1] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 82599, 1008, -9999)"
> [2] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 1011, -9999)"
> [3] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 1000, 96)"
> [4] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 925, 782)"
> [5] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 850, 1520)"
> [6] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 700, 3171)"
> [7] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 500, 5890)"
> [8] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 400, 7600)"
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>

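For reference, here is a minimal self-contained sketch of the approach quoted above. It assumes a hypothetical layout in which each block is a 4-field date line, a 1-field station line, and then 2-field observation lines. The sample data and the names year, month, day, station, var1 and var2 are illustrative assumptions, not the contents of d2010100100.txt. It also uses two explicit merge() calls instead of Reduce(), only to keep the column names predictable, and it skips the "*" replacement because the sample has none.

```r
# Hypothetical sample: date line (4 fields), station line (1 field),
# then observation lines (2 fields). Not the real d2010100100.txt.
txt <- "2010 10 01 00
82599
1008 -9999
2010 10 01 00
83649
1011 -9999
1000 96"

# fill = TRUE pads the short lines with NA so every row has 4 columns
DF <- read.table(text = txt, fill = TRUE)

# Number of non-NA fields per row identifies the line type
cnt <- rowSums(!is.na(DF))

# A new group starts at every date line (4 non-NA fields)
DF$group <- cumsum(cnt == 4)

# Pull out each line type with descriptive (assumed) column names
dates    <- setNames(DF[cnt == 4, c("group", "V1", "V2", "V3")],
                     c("group", "year", "month", "day"))
stations <- setNames(DF[cnt == 1, c("group", "V1")],
                     c("group", "station"))
obs      <- setNames(DF[cnt == 2, c("group", "V1", "V2")],
                     c("group", "var1", "var2"))

# Attach the date and station of each group to its observation lines
All <- merge(merge(dates, stations, by = "group"), obs, by = "group")

fmt <- paste("INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES",
             "(%04d%02d%02d, %d, %d, %d)")
sql <- with(All, sprintf(fmt, year, month, day, station, var1, var2))
print(sql)
```

This yields one INSERT string per observation line. If a real file marks missing values with "*" or has observation lines of varying length, the cnt-based classification above would need adjusting.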


-- 
Regards,
Nilza Barros

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
