On Jan 2, 2015, at 12:07 AM, Kate Ignatius wrote:

> Ah, crap.  Yep you're right.  This is not going too well.  Okay - let
> me try that again:
>
> x$childseg<-0
> x<-x$sumchild !=0

That previous line would appear to overwrite the entire dataframe with the value of one vector.

> span<-rle(x)$lengths[rle(x)$values==TRUE]
> x$childseg[x]<-rep(seq_along(span), times = span)
>
> Does this one have any errors?

Even assuming that the code from Jeff Newmiller is creating those objects, I get:

> x$childseg[x]<-rep(seq_along(span), times = span)
Error in `*tmp*`$childseg : $ operator is invalid for atomic vectors

In the last line you are applying $ to x, which by then is an atomic logical vector rather than a dataframe (or perhaps a data.table). If we use Newmiller's object and then change some of the instances of "x" in your code to DT we get:

> DT$childseg<-0
> x<-DT$sumchild !=0   # Try not to overwrite your data-objects
> span<-rle(x)$lengths[rle(x)$values==TRUE]
> DT$childseg[x]<-rep(seq_along(span), times = span)
> DT
    Dad Mum Child Group sumdad summum sumchild childseg
 1:  AA  RR    RA     A      2      2        0        0
 2:  AA  RR    RR     A      2      2        1        1
 3:  AA  AA    AA     B      4      5        5        1
 4:  AA  AA    AA     B      4      5        5        1
 5:  RA  AA    RR     B      0      5        5        1
 6:  RR  AA    RR     B      4      5        5        1
 7:  AA  AA    AA     B      4      5        5        1
 8:  AA  AA    RA     C      3      3        0        0
 9:  AA  AA    RA     C      3      3        0        0
10:  AA  RR    RA     C      3      3        0        0

You persist in posting code where you do not explain what you are trying to do with it. You have already been told that your earlier efforts using `rle` did not make any sense. Post a complete example and then explain what you desire as an object. It's often helpful to provide a scientific background for what the data represents.

--
David.

>
>
> On Fri, Jan 2, 2015 at 2:32 AM, David Winsemius <dwinsem...@comcast.net>
> wrote:
>>
>>> On Jan 1, 2015, at 5:07 PM, Kate Ignatius <kate.ignat...@gmail.com> wrote:
>>>
>>> Apologies - mix up of syntax all over the place, a habit of mine.  The
>>> last line was in there because of code beforehand so it really doesn't
>>> need to be there.  Here is the proper code I hope:
>>>
>>> childseg<-0
>>> x<-sumchild ==0
>>> span<-rle(x)$lengths[rle(x)$values==TRUE]
>>> childseg[x]<-rep(seq_along(span), times = span)
>>>
>>
>> This remains not reproducible.
>> We have no idea what sumchild might be and
>> the code throws an error. My guess is that you are trying to get a result
>> such as would be delivered by:
>>
>> childseg <- sumchild[ sumchild != 0 ]
>>
>> --
>> David.
>>
>>>
>>> On Thu, Jan 1, 2015 at 12:13 PM, Jeff Newmiller
>>> <jdnew...@dcn.davis.ca.us> wrote:
>>>> Thank you for attempting to encode what you want using R syntax, but you
>>>> are not really succeeding yet (too many errors). Perhaps another hand
>>>> generated result would help? A new input data frame might or might not be
>>>> needed to illustrate desired results.
>>>>
>>>> Your second and third lines are syntactically incorrect, and I don't
>>>> understand what you hope to accomplish by assigning an empty string to a
>>>> numeric in your last line.
>>>> ---------------------------------------------------------------------------
>>>> Jeff Newmiller                        The     .....       .....  Go Live...
>>>> DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>>>>                                       Live:   OO#.. Dead: OO#..  Playing
>>>> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>>>> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>>>> ---------------------------------------------------------------------------
>>>> Sent from my phone. Please excuse my brevity.
>>>>
>>>> On January 1, 2015 4:16:52 AM PST, Kate Ignatius <kate.ignat...@gmail.com>
>>>> wrote:
>>>>> Is it possible to add the following code or similar in data.table:
>>>>>
>>>>> childseg<-0
>>>>> x:=sumchild <-0
>>>>> span<-rle(x)$lengths[rle(x)$values==TRUE
>>>>> childseg[x]<-rep(seq_along(span), times = span)
>>>>> childseg[childseg == 0]<-''
>>>>>
>>>>> I was hoping to do this code by Group for mum, dad and
>>>>> child.  The problem I'm having is with the
>>>>> span<-rle(x)$lengths[rle(x)$values==TRUE line which I'm not sure can
>>>>> be added to data.table.
>>>>>
>>>>> [Previous email had incorrect code]
>>>>>
>>>>> On Wed, Dec 31, 2014 at 3:45 AM, Jeff Newmiller
>>>>> <jdnew...@dcn.davis.ca.us> wrote:
>>>>>> I do not understand the value of using the rle function in your description,
>>>>>> but the code below appears to produce the table you want.
>>>>>>
>>>>>> Note that better support for the data.table package might be found at
>>>>>> stackexchange as the documentation specifies.
>>>>>>
>>>>>> x <- read.table( text=
>>>>>> "Dad Mum Child Group
>>>>>> AA RR RA A
>>>>>> AA RR RR A
>>>>>> AA AA AA B
>>>>>> AA AA AA B
>>>>>> RA AA RR B
>>>>>> RR AA RR B
>>>>>> AA AA AA B
>>>>>> AA AA RA C
>>>>>> AA AA RA C
>>>>>> AA RR RA C
>>>>>> ", header=TRUE, stringsAsFactors=FALSE )
>>>>>>
>>>>>> library(data.table)
>>>>>> DT <- data.table( x )
>>>>>> DT[ , cdad := as.integer( Dad %in% c( "AA", "RR" ) ) ]
>>>>>> DT[ , sumdad := 0L ]
>>>>>> DT[ 1==DT$cdad, sumdad := sum( cdad ), by=Group ]
>>>>>> DT[ , cdad := NULL ]
>>>>>> DT[ , cmum := as.integer( Mum %in% c( "AA", "RR" ) ) ]
>>>>>> DT[ , summum := 0L ]
>>>>>> DT[ 1==DT$cmum, summum := sum( cmum ), by=Group ]
>>>>>> DT[ , cmum := NULL ]
>>>>>> DT[ , cchild := as.integer( Child %in% c( "AA", "RR" ) ) ]
>>>>>> DT[ , sumchild := 0L ]
>>>>>> DT[ 1==DT$cchild, sumchild := sum( cchild ), by=Group ]
>>>>>> DT[ , cchild := NULL ]
>>>>>>
>>>>>> > DT
>>>>>>
>>>>>>     Dad Mum Child Group sumdad summum sumchild
>>>>>>  1:  AA  RR    RA     A      2      2        0
>>>>>>  2:  AA  RR    RR     A      2      2        1
>>>>>>  3:  AA  AA    AA     B      4      5        5
>>>>>>  4:  AA  AA    AA     B      4      5        5
>>>>>>  5:  RA  AA    RR     B      0      5        5
>>>>>>  6:  RR  AA    RR     B      4      5        5
>>>>>>  7:  AA  AA    AA     B      4      5        5
>>>>>>  8:  AA  AA    RA     C      3      3        0
>>>>>>  9:  AA  AA    RA     C      3      3        0
>>>>>> 10:  AA  RR    RA     C      3      3        0
>>>>>>
>>>>>>
>>>>>> On Tue, 30 Dec 2014, Kate Ignatius wrote:
>>>>>>
>>>>>>> I'm trying to use both these packages and wondering whether they are
>>>>>>> possible...
>>>>>>>
>>>>>>> To make this simple, my ultimate goal is to determine long stretches of
>>>>>>> 1s, but I want to do this within groups (hence using the data.table as
>>>>>>> I use the "set key" option). However, I'm not having much luck
>>>>>>> making this possible.
>>>>>>>
>>>>>>> For example, for simplicity's sake, I have the following data:
>>>>>>>
>>>>>>> Dad Mum Child Group
>>>>>>> AA  RR  RA    A
>>>>>>> AA  RR  RR    A
>>>>>>> AA  AA  AA    B
>>>>>>> AA  AA  AA    B
>>>>>>> RA  AA  RR    B
>>>>>>> RR  AA  RR    B
>>>>>>> AA  AA  AA    B
>>>>>>> AA  AA  RA    C
>>>>>>> AA  AA  RA    C
>>>>>>> AA  RR  RA    C
>>>>>>>
>>>>>>> And the following code which I know works
>>>>>>>
>>>>>>> hetdad <- as.numeric(x[c(1)]=="AA" | x[c(1)]=="RR")
>>>>>>> sumdad <- rle(hetdad)$lengths[rle(hetdad)$values==1]
>>>>>>>
>>>>>>> hetmum <- as.numeric(x[c(2)]=="AA" | x[c(2)]=="RR")
>>>>>>> summum <- rle(hetmum)$lengths[rle(hetmum)$values==1]
>>>>>>>
>>>>>>> hetchild <- as.numeric(x[c(3)]=="AA" | x[c(3)]=="RR")
>>>>>>> sumchild <- rle(hetchild)$lengths[rle(hetchild)$values==1]
>>>>>>>
>>>>>>> However, I wish to do the above code by Group (though this file is
>>>>>>> millions of rows long and groups will be larger but just wanted to
>>>>>>> simplify the example).
>>>>>>>
>>>>>>> I did something like this but of course I got an error:
>>>>>>>
>>>>>>> LOH[,hetdad:=as.numeric(x[c(1)]=="AA" | x[c(1)]=="RR")]
>>>>>>> LOH[,sumdad:=rle(hetdad)$lengths[rle(hetdad)$values==1],by=Group]
>>>>>>> LOH[,hetmum:=as.numeric(x[c(2)]=="AA" | x[c(2)]=="RR")]
>>>>>>> LOH[,summum:=rle(hetmum)$lengths[rle(hetmum)$values==1],by=Group]
>>>>>>> LOH[,hetchild:=as.numeric(x[c(3)]=="AA" | x[c(3)]=="RR")]
>>>>>>> LOH[,sumchild:=rle(hetchild)$lengths[rle(hetchild)$values==1],by=Group]
>>>>>>>
>>>>>>> The reason being as I want to eventually have something like this:
>>>>>>>
>>>>>>> Dad Mum Child Group sumdad summum sumchild
>>>>>>> AA  RR  RA    A     2      2      0
>>>>>>> AA  RR  RR    A     2      2      1
>>>>>>> AA  AA  AA    B     4      5      5
>>>>>>> AA  AA  AA    B     4      5      5
>>>>>>> RA  AA  RR    B     0      5      5
>>>>>>> RR  AA  RR    B     4      5      5
>>>>>>> AA  AA  AA    B     4      5      5
>>>>>>> AA  AA  RA    C     3      3      0
>>>>>>> AA  AA  RA    C     3      3      0
>>>>>>> AA  RR  RA    C     3      3      0
>>>>>>>
>>>>>>> That is, I would like to have the specific counts next to what I'm
>>>>>>> consecutively counting per group.  So for Group A for dad there are 2
>>>>>>> AAs, there are two RRs for mum but only 1 AA or RR for the child and
>>>>>>> that is RR (so the 1 is next to the RR and not the RA).
>>>>>>>
>>>>>>> Can this be done?
>>>>>>>
>>>>>>> K.
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide
>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>>
>>>>>>
>>>>
>>>
>>

David Winsemius
Alameda, CA, USA
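
For anyone following along without data.table installed, the corrected steps in the reply at the top of this thread can be tried on a plain data frame. What follows is only a minimal sketch under stated assumptions, not code posted by anyone in the thread: the names DF and label_runs are hypothetical, the sumchild values are copied from the table shown above, and the helper assumes its input is a logical vector with no NA values. It keeps the logical index separate from the data object and calls rle() once rather than twice.

```r
# Hypothetical stand-in for the thread's DT, using only two of its columns.
DF <- data.frame(
  Group    = c("A", "A", "B", "B", "B", "B", "B", "C", "C", "C"),
  sumchild = c(0L, 1L, 5L, 5L, 5L, 5L, 5L, 0L, 0L, 0L)
)

# Hypothetical helper (not from the thread): number each run of TRUE values
# 1, 2, 3, ... and leave FALSE positions at 0.
# Assumes `flag` is a logical vector without NA values.
label_runs <- function(flag) {
  runs <- rle(flag)                      # compute the run-length encoding once
  span <- runs$lengths[runs$values]      # lengths of the TRUE runs only
  out  <- integer(length(flag))          # start with all zeros
  out[flag] <- rep(seq_along(span), times = span)
  out
}

# The logical index lives in its own variable, so DF is never overwritten.
nonzero <- DF$sumchild != 0
DF$childseg <- label_runs(nonzero)

print(DF)
```

On this input the childseg column should agree with the one printed in the reply above. Whether numbering runs across the whole column, rather than within each Group, is what the original poster wanted remains unclear from the thread.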