On Nov 4, 2014, at 8:35 AM, CJ Davies wrote:

> On 04/11/14 16:13, PIKAL Petr wrote:
>> Hi
>>
>>> -----Original Message-----
>>> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org]
>>> On Behalf Of CJ Davies
>>> Sent: Tuesday, November 04, 2014 2:50 PM
>>> To: Jim Lemon; r-help@r-project.org
>>> Subject: Re: [R] Variance of multiple non-contiguous time periods?
>>>
>>> On 04/11/14 09:11, Jim Lemon wrote:
>>>> On Mon, 3 Nov 2014 12:45:03 PM CJ Davies wrote:
>>>>> ...
>>>>> On 30/10/14 21:33, Jim Lemon wrote:
>>>>> If I understand, you mean to calculate deviations for each individual
>>>>> 'chunk' of each transition & then aggregate the results? This is what
>>>>> I'd been thinking about, but is there a sensible manner within R to
>>>>> achieve this, or is it something for which it would be easier to
>>>>> preprocess the data in an external tool? Is there some way to subset the
>>>>> data such that I can work over just contiguous 'chunks'?
>>>>>
>>>> Exactly. If there is some combination of existing variables that can
>>>> be combined to make a set of unique values for each "chunk", you can
>>>> calculate the deviations within each "chunk", then average the squared
>>>> deviations for each type of "chunk", weighting by the duration of the
>>>> "chunks" so that you don't bias the pooled variance toward the longer
>>>> "chunks".
>>>>
>>>> Jim
>>>>
>>>
>>> I am stumped for a way of automating this process though. Each line of
>>> log data looks like this;
>>>
>>> 2406 55.4 (-11.2, 1.0, -0.9) (-4.1, 1.0, 0.0) 7.077912 0.9203392 (0.0, 0.7, -0.1, 0.7) 8.129684 89.41537 -8.212769 (0.0, 0.7, -0.1, 0.7) 8.129684 89.41537 351.7872 1 0 0 False 0.15 3 37.76761 True False 0 transition 1
>>
>> First you need to import it to R which could be tricky based on above line.
>> Some values will probably need to process through regular expression.
>>
>> If I understand correctly, the number after transition is a signal which
>> estimates continuous chunks. If that is true, then
>>
>> ?rle is a function which can estimate length of chunks.
>>
>> Cheers
>> Petr
>>
>>>
>>> Where the last variable defines which transition is currently active.
>>> However to separate these data into 'chunks' would involve making a
>>> comparison between each line of data & the preceding line of data to
>>> determine whether it is part of the same contiguous 'chunk'. Is this
>>> something that would be better achieved using external preprocessing
>>> written in a language I am more familiar with, as I haven't the
>>> foggiest how I would approach this within R?
>>>
>>> Regards,
>>> CJ Davies
>>>
>>> ______________________________________________
>> snipped
>>
>
> Importing into R wasn't an issue; some of the fields contain spaces &
> symbols, but all the fields are tab separated so I can simply use;
>
> foo <- read.csv("bar",header=T,sep="\t")
>
> I've just written a hacky bit of Java that gives me the lines of each 'chunk'
> as a separate list & I think I'll then calculate these particular values
> using Java's Math class rather than trying to come up with a sensible way to
> import these 'chunks' back into R. When it comes to string/list manipulation
> like this I think my knowledge in Java & lack of knowledge in R makes the
> former the better option!
>

If you had offered the output of dput(head(foo, 20)) and explained what
defined a "chunk-defining transition", it would have been fairly easy to
show you how to use cumsum in an ave() call to construct a grouping
variable.

> Regards,
> CJ Davies
>
> ______________________________

David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
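
For readers following the thread, here is a minimal sketch of how the
suggestions above (rle, cumsum, ave) might fit together. It is only one
possible reading, not the code any poster had in mind. The data frame
and the column names `transition` and `value` are hypothetical, since
the real column headers never appear in the thread, and the pooling
step assumes the usual pooled-variance estimate (summed squared
within-chunk deviations divided by observations minus chunks), which
may or may not match the weighting Jim intended.

```r
# Hypothetical stand-in for the log: `transition` plays the role of the
# last field on each line, `value` is the measurement of interest.
foo <- data.frame(
  transition = c(1, 1, 1, 2, 2, 1, 1, 3, 3, 3),
  value      = c(2.0, 2.5, 3.0, 10.0, 11.0, 2.2, 2.8, 7.0, 7.5, 8.0)
)

# A new chunk starts wherever `transition` differs from the previous row;
# cumsum() over those start flags numbers the contiguous chunks.
starts <- c(TRUE, foo$transition[-1] != foo$transition[-nrow(foo)])
foo$chunk <- cumsum(starts)

# The same grouping variable built from run lengths via rle().
runs <- rle(foo$transition)
chunk_rle <- rep(seq_along(runs$lengths), runs$lengths)

# Deviation of each row from the mean of its own chunk (ave() defaults
# to the group mean).
foo$dev <- foo$value - ave(foo$value, foo$chunk)

# Pool per transition type: summed squared deviations divided by
# (number of observations - number of chunks).
ss       <- tapply(foo$dev^2, foo$transition, sum)
n_obs    <- tapply(foo$dev, foo$transition, length)
n_chunks <- tapply(foo$chunk, foo$transition,
                   function(x) length(unique(x)))
pooled_var <- ss / (n_obs - n_chunks)
print(pooled_var)
```

Note that a transition type whose chunks all contain a single row has
zero degrees of freedom, so the division yields NaN or Inf for it; real
data would need a guard for that case.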