On Mar 4, 2011, at 9:50 AM, Asan Ramzan wrote:

Hello R-help

I am working with large data tables that have the occasional label
marking a particular time point in an experiment. E.g.:

"Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"
.909, 1.117, 1.225, 1.048, 1.258
3.942, 1.113, 1.230, 1.049, 1.262
3.976, 1.105, 1.226, 1.051, 1.259
4.009, 1.114, 1.231, 1.053, 1.259
4.042, 1.107, 1.230, 1.048, 1.262
4.076, 1.108, 1.226, 1.045, 1.257
4.109, 1.109, 1.227, 1.047, 1.259
4.142, 1.108, 1.225, 1.052, 1.260
4.176, 1.105, 1.222, 1.046, 1.260
4.209, 1.106, 1.226, 1.050, 1.258
4.242, 1.105, 1.224, 1.047, 1.258
4.276, 1.104, 1.223, 1.048, 1.259
4.309, 1.106, 1.228, 1.050, 1.260
4.342, 1.103, 1.219, 1.049, 1.260
4.376, 1.107, 1.225, 1.052, 1.259
4.409, 1.105, 1.222, 1.047, 1.258
4.442, 1.106, 1.227, 1.048, 1.262
4.476, 1.105, 1.222, 1.049, 1.261
4.509, 1.102, 1.222, 1.047, 1.259
4.555, "Gly sar"
4.555, 1.107, 1.224, 1.048, 1.261
4.576, 1.109, 1.228, 1.053, 1.259
4.609, 1.103, 1.218, 1.046, 1.258
4.642, 1.105, 1.223, 1.048, 1.256
4.676, 1.108, 1.217, 1.048, 1.260
4.709, 1.124, 1.222, 1.047, 1.258

When I try to read in the table, I get:

try<-read.table("200810_01.R",header=T,sep=",")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
 line 136 did not have 5 elements

Is there any way to tell R to ignore these labels or better
still interpret them as being label for particular time
points, so when it comes to draw a line graph it is annotated
with these labels.

Option 1:
Fix the data file in an editor before reading it, so that every line has the same five fields.

Option 2:
You could read the file with readLines, identify the offending lines with grep or grepl, then separate the offenders from the non-offenders.

lines <- readLines(textConnection('"Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"
.909, 1.117, 1.225, 1.048, 1.258
3.942, 1.113, 1.230, 1.049, 1.262
3.976, 1.105, 1.226, 1.051, 1.259
4.009, 1.114, 1.231, 1.053, 1.259
4.042, 1.107, 1.230, 1.048, 1.262
4.076, 1.108, 1.226, 1.045, 1.257
4.109, 1.109, 1.227, 1.047, 1.259
4.142, 1.108, 1.225, 1.052, 1.260
4.176, 1.105, 1.222, 1.046, 1.260
4.209, 1.106, 1.226, 1.050, 1.258
4.242, 1.105, 1.224, 1.047, 1.258
4.276, 1.104, 1.223, 1.048, 1.259
4.309, 1.106, 1.228, 1.050, 1.260
4.342, 1.103, 1.219, 1.049, 1.260
4.376, 1.107, 1.225, 1.052, 1.259
4.409, 1.105, 1.222, 1.047, 1.258
4.442, 1.106, 1.227, 1.048, 1.262
4.476, 1.105, 1.222, 1.049, 1.261
4.509, 1.102, 1.222, 1.047, 1.259
4.555, "Gly sar"
4.555, 1.107, 1.224, 1.048, 1.261
4.576, 1.109, 1.228, 1.053, 1.259
4.609, 1.103, 1.218, 1.046, 1.258
4.642, 1.105, 1.223, 1.048, 1.256
4.676, 1.108, 1.217, 1.048, 1.260
4.709, 1.124, 1.222, 1.047, 1.258'))

read.table(textConnection(
        lines[ c(TRUE, !grepl("[[:alpha:]]", lines)[-1]) ]),
             sep=",", skip=1)

# The header line is kept by c(TRUE, ...) and then dropped again with skip=1,
# because the quotes and spaces don't work well with R column naming conventions.
# sep="," is needed so that the commas are not read as part of the values.

      V1    V2    V3    V4    V5
1  0.909 1.117 1.225 1.048 1.258
2  3.942 1.113 1.230 1.049 1.262
3  3.976 1.105 1.226 1.051 1.259

snipped
23 4.642 1.105 1.223 1.048 1.256
24 4.676 1.108 1.217 1.048 1.260
25 4.709 1.124 1.222 1.047 1.258
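
If you would rather keep the column names than end up with V1..V5, one possibility is to parse the header line yourself and hand syntactically valid names to read.table via col.names. This is only a sketch: the short `lines` vector is a cut-down stand-in for the data above, and the object names `hdr`, `nms`, `keep` and `dat` are mine. It uses the `text` argument of read.table, which needs a reasonably recent R; textConnection() should work just as well.

```r
# Cut-down stand-in for the result of readLines() on the real file
lines <- c('"Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"',
           '4.509, 1.102, 1.222, 1.047, 1.259',
           '4.555, "Gly sar"',
           '4.555, 1.107, 1.224, 1.048, 1.261')

# Split the header on commas, then drop the quotes and surrounding blanks
hdr <- strsplit(lines[1], ",")[[1]]
hdr <- gsub("^ +| +$", "", gsub('"', "", hdr))

# make.names() turns "Time (min)" into "Time..min." and "R1 R1" into "R1.R1"
nms <- make.names(hdr)

# Keep only the purely numeric lines (this drops the header and the labels)
keep <- !grepl("[[:alpha:]]", lines)
dat <- read.table(text = lines[keep], sep = ",", col.names = nms)
str(dat)
```

The original header text stays available in `hdr` if you want it for axis labels or a legend.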

Since the header line contains letters too, the same test removes it, so an even more compact version would be:

read.table(textConnection(
        lines[  !grepl("[[:alpha:]]", lines) ] ), sep="," )
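
One caveat with the letter test is that it would also throw away numeric lines that contain NA or values written like 1e-05. A possible alternative, sketched here on a cut-down copy of the data, is to count the comma-separated fields on each line with count.fields() and keep only the lines that have all five. The names `con`, `n`, `full` and `dat` are mine, and I have not tried this on the full file.

```r
# Cut-down stand-in for the result of readLines() on the real file
lines <- c('"Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"',
           '4.509, 1.102, 1.222, 1.047, 1.259',
           '4.555, "Gly sar"',
           '4.555, 1.107, 1.224, 1.048, 1.261')

# Number of comma-separated fields on each line
con <- textConnection(lines)
n <- count.fields(con, sep = ",")
close(con)

# Lines with the full five fields: the header plus the numeric rows
full <- n == 5

# The header is among the kept lines, so skip it as before
dat <- read.table(text = lines[full], sep = ",", skip = 1)

# Line numbers of the short lines, i.e. the labels
which(!full)
```

The same `n` also tells you which line numbers read.table was complaining about in the original error message.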

Using the non-negated grepl expression should get you all the "labels" lines (plus the header line, which also contains letters).
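
To get from there to an annotated line graph, something along these lines might do. It is a sketch under the assumption that every label line has the form time, "text" as in the example. The names `is_label`, `labels` and `dat` are mine, the short `lines` vector stands in for the real data, and only the first trace is plotted.

```r
# Cut-down stand-in for the result of readLines() on the real file
lines <- c('"Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"',
           '4.509, 1.102, 1.222, 1.047, 1.259',
           '4.555, "Gly sar"',
           '4.555, 1.107, 1.224, 1.048, 1.261')

has_alpha <- grepl("[[:alpha:]]", lines)

# Label lines are the ones with letters, except for the header on line 1
is_label <- has_alpha
is_label[1] <- FALSE

# Numeric rows
dat <- read.table(text = lines[!has_alpha], sep = ",")

# Pull the time and the label text out of each label line
labels <- data.frame(
  time = as.numeric(sub(",.*$", "", lines[is_label])),
  text = gsub('^[^,]*, *"|" *$', "", lines[is_label]),
  stringsAsFactors = FALSE
)

# First trace against time, with a dashed vertical line and the text
# at each labelled time point
plot(dat$V1, dat$V2, type = "l", xlab = "Time (min)", ylab = "R1 R1")
abline(v = labels$time, lty = 2)
text(labels$time, max(dat$V2), labels$text, pos = 4)
```

Keeping the labels in their own data frame means the numeric table stays rectangular, and the same `labels` can be reused for the other traces.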


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
