Thanks and with

datlines <- as.data.frame(inp[(grep("<PRE>", inp)[1]+5):(grep("</PRE>", inp)[1]-1)])

I get the data as needed.

Thanks again
H.

----- Original Message -----
On Jun 14, 2012, at 10:23 AM, Halldór Björnsson wrote:

> Hi,
>
> I am trying to read in weather balloon data, where each file has a
> header of fixed length and a trailing section of a fixed length.
> The data section (the table) is of variable length.
>
> An example of the data is on:
>
> http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2011&MONTH=06&FROM=1400&TO=1400&STNM=04018
>
> This data has 97 rows and can be read as:
>
> read.table("http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2011&MONTH=06&FROM=1400&TO=1400&STNM=04018", skip=10, nrows=97)
>
> If I set nrows=98 I run into the trailing section.
>
> From day to day the table length changes. Is there a way to get
> read.table to always read in the correct length and just stop when
> it hits the trailing section?

Looks to be fairly straightforward HTML

inp <- readLines(con=url("http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2011&MONTH=06&FROM=1400&TO=1400&STNM=04018"))

> grep("<PRE>", inp)
[1]   6 109

That was followed by multi-line header.

> inp[grep("<PRE>", inp)[1]+4]
[1] "-----------------------------------------------------------------------------"

The ending can be found similarly:

> grep("</PRE>", inp)
[1] 109 140

datlines <- inp[( grep("<PRE>", inp)[1]+5 ):(grep("</PRE>", inp)[1]-1)]

You may need to use read.fwf for input since the table has missing values.

-- 
David Winsemius, MD
West Hartford, CT
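
For reference, the approach discussed in this thread (find the first <PRE> and </PRE> lines, skip the header, then parse the slice as fixed-width) might be sketched as follows. This is only an illustration under stated assumptions: the `inp` vector is a small made-up stand-in for what readLines() returns, `extract_pre_table` is a hypothetical helper name, and the three 7-character columns are assumed for the sample rather than taken from the real page.

```r
# Hypothetical helper: return the lines between the first <PRE> and the
# first </PRE>, skipping `header_lines` header lines after the opening tag.
extract_pre_table <- function(lines, header_lines = 4) {
  open_idx  <- grep("<PRE>",  lines, fixed = TRUE)[1]
  close_idx <- grep("</PRE>", lines, fixed = TRUE)[1]
  if (is.na(open_idx) || is.na(close_idx)) stop("no <PRE> block found")
  first <- open_idx + header_lines + 1
  last  <- close_idx - 1
  if (last < first) return(character(0))
  lines[first:last]
}

# Made-up sample page (assumption): same layout as described in the thread,
# i.e. <PRE>, dashes, column names, units, dashes, data rows, </PRE>.
inp <- c(
  "<HTML>",
  "<H2>04018 Observations</H2>",
  "<PRE>",
  "---------------------",
  "   PRES   HGHT   TEMP",
  "    hPa      m      C",
  "---------------------",
  " 1000.0     82       ",
  "  994.0    131    8.6",
  "  925.0    718    4.2",
  "</PRE><H3>Station information and sounding indices</H3><PRE>",
  " Station number: 4018",
  "</PRE>",
  "</HTML>"
)

datlines <- extract_pre_table(inp)

# Fixed-width parsing keeps blank fields in their own column, so a missing
# value becomes NA instead of shifting the remaining fields to the left.
dat <- read.fwf(textConnection(datlines),
                widths = rep(7, 3),              # assumed column widths
                col.names = c("PRES", "HGHT", "TEMP"))
print(dat)
```

Because the slice is bounded by the position of </PRE>, the number of rows does not need to be known in advance. For the real page, the widths and column names would have to be adjusted to match the actual table.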
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.