On Sun, Feb 12, 2012 at 10:35 AM, Bert Gunter <gunter.ber...@gene.com> wrote:
> Folks:
>
> Suppose I wish to input a text file with variable length lines and
> possible whitespace as is and then parse the resulting character
> vector in R. Each line of text is terminated with "\n" (newline
> character).
>
> Is there any reason to prefer one or the other of:
>
> scan (filename, what ="a",sep ="\n") ##or
> readLines(filename)
>
> If it makes a difference, I'm on Windows.
>
> Many thanks for any advice/insight.

It depends on whether we need to retain the information about which elements were on the same line. Reading the input line by line and scanning each line separately retains that information:

> text <- "1 2\n3 4 5"
> lapply(readLines(textConnection(text)), function(x) scan(text = x))
Read 2 items
Read 3 items
[[1]]
[1] 1 2

[[2]]
[1] 3 4 5

Scanning the whole input in one call loses it:

> out <- scan(text = text); out
Read 5 items
[1] 1 2 3 4 5

If we want to get back the information lost in the last instance, we need to re-read the input to count the fields on each line:

> num.flds <- count.fields(textConnection(text))
> tapply(out, rep(seq_along(num.flds), num.flds), c)
$`1`
[1] 1 2

$`2`
[1] 3 4 5

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
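
A side note on the two calls in the original question: a rough sketch of one way they can differ, assuming a made-up sample input (`txt`) that contains an empty line. Both calls return whole lines as a character vector, but `scan()` defaults to `blank.lines.skip = TRUE`, so the results need not have the same length. The variable names here are illustrative only.

```r
# Hypothetical sample input: three lines, the middle one empty.
txt <- "1 2\n\n3 4 5"

# readLines() returns one element per line, including the empty one.
con <- textConnection(txt)
via_readLines <- readLines(con)
close(con)

# scan() with sep = "\n" also returns whole lines, but its default
# blank.lines.skip = TRUE drops empty lines.
via_scan <- scan(text = txt, what = "", sep = "\n", quiet = TRUE)

# Asking scan() to keep blank lines makes it behave like readLines() here.
via_scan_keep <- scan(text = txt, what = "", sep = "\n",
                      blank.lines.skip = FALSE, quiet = TRUE)

print(via_readLines)
print(via_scan)
print(via_scan_keep)
```

If the position of each line in the file matters for later parsing (for example, when reporting line numbers), `readLines()` or `scan()` with `blank.lines.skip = FALSE` is probably the safer choice.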