Here is another solution, made slightly shorter by using strapply twice
(Lines is defined in the quoted message below; gsubfn and zoo must be loaded):

  z <- zoo(strapply(Lines, "[0-9]+[.][0-9]+", as.numeric)[[1]],
           strapply(Lines, "....-..-..", as.Date)[[1]])

or to create a data frame:

  DF <- data.frame(date = strapply(Lines, "....-..-..", as.Date)[[1]],
                   price = strapply(Lines, "[0-9]+[.][0-9]+", as.numeric)[[1]])

On Wed, Nov 5, 2008 at 6:22 AM, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> As others have pointed out, it's close to XML but not quite
> there; however, you could use strapply in gsubfn to extract
> the data. It pulls out the data matching the regular expression,
> giving a vector, vec, consisting of: date price date price ...
> Pulling out the odd and even elements separately and
> converting them to Date and numeric, respectively, gives the
> resulting data.frame.
>
> See
> http://gsubfn.googlecode.com
> for more on the gsubfn package and
> the three zoo vignettes in the zoo package for more on zoo.
>
> Lines <- '- <Temp diffgr:id="Temp14" msdata:rowOrder="13">
> <Date>2005-01-17T00:00:00+05:30</Date>
> <SecurityID>10149</SecurityID>
> <PriceClose>1288.40002</PriceClose>
> </Temp>
> - <Temp diffgr:id="Temp15" msdata:rowOrder="14">
> <Date>2005-01-18T00:00:00+05:30</Date>
> <SecurityID>10149</SecurityID>
> <PriceClose>1291.69995</PriceClose>
> </Temp>
> - <Temp diffgr:id="Temp16" msdata:rowOrder="15">
> <Date>2005-01-19T00:00:00+05:30</Date>
> <SecurityID>10149</SecurityID>
> <PriceClose>1288.19995</PriceClose>
> </Temp>'
>
> library(gsubfn)
> vec <- strapply(Lines, "....-..-..|[0-9]+[.][0-9]+")[[1]]
> ix <- seq_along(vec) %% 2 == 1
> DF <- data.frame(date = as.Date(vec[ix]), price = as.numeric(vec[!ix]))
>
> # or, instead of the last line, you could convert it to a zoo object so
> # that it's in a more convenient form for time series manipulation:
>
> library(zoo)
> z <- zoo(as.numeric(vec[!ix]), as.Date(vec[ix]))
>
>
>
> On Wed, Nov 5, 2008 at 1:22 AM, RON70 <[EMAIL PROTECTED]> wrote:
>>
>> Hi everyone,
>>
>> I have this kind of raw dataset :
>>
>> - <Temp diffgr:id="Temp14" msdata:rowOrder="13">
>> <Date>2005-01-17T00:00:00+05:30</Date>
>> <SecurityID>10149</SecurityID>
>> <PriceClose>1288.40002</PriceClose>
>> </Temp>
>> - <Temp diffgr:id="Temp15" msdata:rowOrder="14">
>> <Date>2005-01-18T00:00:00+05:30</Date>
>> <SecurityID>10149</SecurityID>
>> <PriceClose>1291.69995</PriceClose>
>> </Temp>
>> - <Temp diffgr:id="Temp16" msdata:rowOrder="15">
>> <Date>2005-01-19T00:00:00+05:30</Date>
>> <SecurityID>10149</SecurityID>
>> <PriceClose>1288.19995</PriceClose>
>> </Temp>
>>
>> I was looking for some R procedure to extract data from this, that should be
>> in following format :
>>
>> 2005-01-17 1288.40002
>> 2005-01-18 1291.69995
>> 2005-01-19 1288.19995
>>
>> Can R help me to do this?
>>
>> --
>> View this message in context:
>> http://www.nabble.com/How-to-extract-following-data-tp20336690p20336690.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
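
For anyone who cannot install gsubfn, the same extraction can be sketched in base R. This is not from the thread: it swaps strapply for regmatches() with gregexpr(), uses a stricter digits-only date pattern than "....-..-..", and the helper name extract_all is made up for illustration. It assumes the whole input sits in a single character string, as Lines does above.

```r
# Sample input copied from the thread.
Lines <- '- <Temp diffgr:id="Temp14" msdata:rowOrder="13">
<Date>2005-01-17T00:00:00+05:30</Date>
<SecurityID>10149</SecurityID>
<PriceClose>1288.40002</PriceClose>
</Temp>
- <Temp diffgr:id="Temp15" msdata:rowOrder="14">
<Date>2005-01-18T00:00:00+05:30</Date>
<SecurityID>10149</SecurityID>
<PriceClose>1291.69995</PriceClose>
</Temp>
- <Temp diffgr:id="Temp16" msdata:rowOrder="15">
<Date>2005-01-19T00:00:00+05:30</Date>
<SecurityID>10149</SecurityID>
<PriceClose>1288.19995</PriceClose>
</Temp>'

# Hypothetical helper: return every match of `pattern` in the single string `x`.
extract_all <- function(pattern, x) {
  regmatches(x, gregexpr(pattern, x))[[1]]
}

# Dates: four digits, two digits, two digits, separated by hyphens.
dates <- as.Date(extract_all("[0-9]{4}-[0-9]{2}-[0-9]{2}", Lines))

# Prices: digits, a literal dot, digits (SecurityID has no dot, so it is skipped).
prices <- as.numeric(extract_all("[0-9]+[.][0-9]+", Lines))

DF <- data.frame(date = dates, price = prices)
print(DF)
```

Like the strapply versions, this relies on dates and prices appearing in the same order and in equal numbers; if a record could lack a PriceClose, a real XML parser would be the safer route.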