Here is one way of doing it: it reads the file and creates a 'long' version.

##########
input <- file("/temp/ClinicalReports.txt", 'r')
outFile <- '/temp/output.txt'  #  tempfile()
output <- file(outFile, 'w')
writeLines("ID, Date, variable, value", output)
ID <- NULL
dataSw <- NULL
repeat{
    line <- readLines(input, n = 1)
    if (length(line) == 0) break
    if (!is.null(dataSw)){
        if (line == ''){  # end of data
            ID <- NULL
            dataSw <- NULL
            next
        }
        # now write CSV output file
        cat(ID
          , ','
          , Date
          , ','
          , substring(line, 1, 31)
          , ','
          , substring(line, 32, 43)
          , '\n'
          , sep = ''
          , file = output
          )
        next
    }
    if (grepl("Acc.ne", line)){
        ID <- (substring(line, 29,35))
        Date <- (substring(line, 52,61))
        next
    }
    if (!is.null(ID)){  # looking for Esame
        if (grepl("Esame", line)){
            # skip two lines
            readLines(input, n = 2)
            dataSw <- 1
            next
        }
    }

}

# close both connections, then read the data back in, in long format
close(input)
close(output)
result <- read.csv(outFile, as.is = TRUE)


The result from your test data is:

> str(result)
'data.frame':   43 obs. of  4 variables:
 $ ID      : int  185 185 185 185 185 185 185 185 185 185 ...
 $ Date    : chr  "05/12/2011" "05/12/2011" "05/12/2011" "05/12/2011" ...
 $ variable: chr  "AZOTEMIA                       " "CREATININEMIA                  " "SODIEMIA                       " "POTASSIEMIA                    " ...
 $ value   : num  33.6 0.99 136 4.22 94.2 8.68 1.87 1.79 189 118 ...
> head(result)
   ID       Date                        variable  value
1 185 05/12/2011 AZOTEMIA                         33.60
2 185 05/12/2011 CREATININEMIA                     0.99
3 185 05/12/2011 SODIEMIA                        136.00
4 185 05/12/2011 POTASSIEMIA                       4.22
5 185 05/12/2011 CLOREMIA                         94.20
6 185 05/12/2011 CALCEMIA                          8.68
>
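
The follow-up quoted below asks for one row per report with the parameters as columns. Here is a minimal sketch of how the long result could be reshaped to that wide layout, assuming each ID/Date pair has at most one value per variable. The data frame 'long' is a stand-in for 'result', built from the six rows shown by head() above; the names 'long' and 'wide' are just for illustration.

```r
# Stand-in for `result`: the six rows shown by head(result) above,
# with some of the fixed-width padding kept on purpose.
long <- data.frame(
    ID       = 185,
    Date     = "05/12/2011",
    variable = c("AZOTEMIA      ", "CREATININEMIA ", "SODIEMIA      ",
                 "POTASSIEMIA   ", "CLOREMIA      ", "CALCEMIA      "),
    value    = c(33.6, 0.99, 136, 4.22, 94.2, 8.68),
    stringsAsFactors = FALSE
)

# Drop the padding so the names are usable as column names.
long$variable <- trimws(long$variable)

# One row per (ID, Date); one column per distinct variable.
wide <- reshape(long,
                idvar     = c("ID", "Date"),
                timevar   = "variable",
                direction = "wide")

# reshape() names the new columns "value.<variable>"; strip the prefix.
names(wide) <- sub("^value\\.", "", names(wide))
rownames(wide) <- NULL
print(wide)
```

With the real data you would pass 'result' instead of 'long'. If a report can list the same parameter twice, reshape() keeps only the first value and warns, so check for duplicates first.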


On Thu, Mar 8, 2012 at 8:24 AM, ginger <bi...@igm.cnr.it> wrote:
> Ooops,
> I forgot to specify that each row, containing the records of one clinical
> report, has to report the values of the 22 measured parameters.
> For example, the first row, first few columns:
> ID        DATE         GLICEMIA   AZOTEMIA   CREATININEMIA   SODIEMIA   ...
> 0000185   05/12/2011   115        33.6       0.99            136        ...
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/parsing-text-files-tp4456355p4456389.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
