Dear Gabor and Jim,

You both gave me amazing solutions. I will use them. Thanks!

Nilza

On Tue, Oct 5, 2010 at 2:11 AM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:

> On Sat, Oct 2, 2010 at 11:31 PM, Nilza BARROS <nilzabar...@gmail.com> wrote:
> > Dear R-users,
> >
> > I would like to know how could I read a file with different lines lengths.
> > I need read this file and create an output to feed my database.
> > So after reading I'll need create an output like this
> >
> > "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20100910,837460, 39,390)"
> >
>
> Read the data filling the short lines (i.e. the date and station
> lines) with NAs. Replace the *s with spaces and compute how many
> non-NAs are in each row (cnt). Append group which is 1 for lines
> pertaining to the 1st station, 2 for the 2nd, etc. Then merge it all
> together in one big data frame, All, and generate a vector of SQL
> strings:
>
> DF <- read.table("d2010100100.txt", fill = TRUE)
> DF[] <- lapply(DF, function(x) as.numeric(chartr("*", " ", x)))
> cnt <- rowSums(!is.na(DF))
> DF$group <- cumsum(cnt == 4)
> Merge <- function(x, y) merge(x, y, by = "group")
> All <- Reduce(Merge, split(DF, cnt))
> with(All, sprintf("INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (%04d%02d%02d, %d, %d, %d)", V1.x, V2.x, V3.x, V1.y, V1, V2))
>
> The result looks like this:
>
> [1] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 82599, 1008, -9999)"
> [2] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 1011, -9999)"
> [3] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 1000, 96)"
> [4] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 925, 782)"
> [5] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 850, 1520)"
> [6] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 700, 3171)"
> [7] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 500, 5890)"
> [8] "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20101001, 83649, 400, 7600)"
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>

--
Abraço,
Nilza Barros
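
For anyone who wants to try the quoted approach without d2010100100.txt, here is a minimal sketch on a small made-up input. The layout is an assumption, since the real file is not shown in the thread: each station block has a 4-field date line, a 1-field station line, and one or more 2-field data lines. The sample has no asterisks, so the chartr() step is left out. Instead of Reduce() over split(), which yields suffixed column names (V1.x, V1.y, ...) that depend on the file layout, this variant picks out the three row types by field count and names the columns explicitly before merging on group.

```r
# Hypothetical sample data in an assumed layout (not the original file):
# date line (4 fields), station line (1 field), data lines (2 fields).
txt <- "2010 10 01 00
82599
1008 -9999
2010 10 01 00
83649
1011 -9999
1000 96"

# fill = TRUE pads the short lines with NA so every row has 4 columns
DF <- read.table(text = txt, fill = TRUE)

# Number of non-NA fields per row identifies the row type
cnt <- rowSums(!is.na(DF))

# A new group starts at every date line (the rows with 4 fields)
DF$group <- cumsum(cnt == 4)

# Pull out each row type and give its columns descriptive names
dates    <- setNames(DF[cnt == 4, c("group", "V1", "V2", "V3")],
                     c("group", "year", "month", "day"))
stations <- setNames(DF[cnt == 1, c("group", "V1")],
                     c("group", "station"))
obs      <- setNames(DF[cnt == 2, c("group", "V1", "V2")],
                     c("group", "var1", "var2"))

# One row per observation, carrying its date and station
All <- merge(merge(dates, stations, by = "group"), obs, by = "group")

sql <- with(All, sprintf(
  "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (%04d%02d%02d, %d, %d, %d)",
  year, month, day, station, var1, var2))
print(sql)
```

If the real file puts a single date line above several stations, grouping on the date line alone would pair every station with every data line, so group would then have to be built from the station lines instead.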
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.